How to fix being unable to initialize a model from the Hugging Face model repository in spaCy
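I have an NER project in which I want to combine spaCy's NER pipeline component with word vectors generated by a pretrained model from transformers. I am using spaCy's spacy-transformers and followed their guide, but it does not work.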
I am using spaCy 2.3.5, spacy-transformers 0.6.2 and Python 3.6 (per the paths in the traceback below), and I am trying to run this in Colab.
spacy-transformers on GitHub: https://github.com/explosion/spacy-transformers
Model name in transformers: vinai/phobert-base
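I have one question: can any pretrained model from transformers be used through spacy-transformers, or only certain models?
According to their guide, a pretrained model has to be initialized before it can be loaded in spaCy. These are the commands I ran: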
! export CUDA_PATH="/opt/nvidia/cuda"
! pip install -U spacy[cuda101]
! pip install spacy-transformers
! git clone -b v0.6.x https://github.com/explosion/spacy-transformers
! python /content/spacy-transformers/examples/init_model.py -n "vinai/phobert-base" \
-l vi vi_vinai_phobert_base
This is the log I got:
2020-12-25 03:46:18.163785: I tensorflow/stream_executor/platform/default/dso_loader.cc:49] Successfully opened dynamic library libcudart.so.10.1
ℹ Creating model for 'vinai/phobert-base' (vi)
Downloading: 100% 557/557 [00:00<00:00,689kB/s]
⠼ Setting up the pipeline...
Traceback (most recent call last):
  File "/content/spacy-transformers/examples/init_model.py", line 32, in <module>
    plac.call(main)
  File "/usr/local/lib/python3.6/dist-packages/plac_core.py", line 367, in call
    cmd, result = parser.consume(arglist)
  File "/usr/local/lib/python3.6/dist-packages/plac_core.py", line 232, in consume
    return cmd, self.func(*(args + varargs + extraopts), **kwargs)
  File "/content/spacy-transformers/examples/init_model.py", line 19, in main
    nlp.add_pipe(TransformersWordPiecer.from_pretrained(nlp.vocab, name))
  File "/usr/local/lib/python3.6/dist-packages/spacy_transformers/pipeline/wordpiecer.py", line 26, in from_pretrained
    model = get_tokenizer(trf_name).from_pretrained(trf_name)
  File "/usr/local/lib/python3.6/dist-packages/transformers/tokenization_utils.py", line 393, in from_pretrained
    return cls._from_pretrained(*inputs, **kwargs)
  File "/usr/local/lib/python3.6/dist-packages/transformers/tokenization_utils.py", line 496, in _from_pretrained
    list(cls.vocab_files_names.values()),
OSError: Model name 'vinai/phobert-base' was not found in tokenizers model name list (roberta-base, roberta-large, roberta-large-mnli, distilroberta-base, roberta-base-openai-detector, roberta-large-openai-detector). We assumed 'vinai/phobert-base' was a path, a model identifier, or url to a directory containing vocabulary files named ['vocab.json', 'merges.txt'] but couldn't find such vocabulary files at this path or url.
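The OSError says the tokenizer was looking for vocabulary files named vocab.json and merges.txt. As a rough way to narrow this down, here is a minimal sketch that checks a locally downloaded copy of a model for those two files. It assumes the model files have already been saved to a local directory; `missing_vocab_files` is a hypothetical helper written for illustration and is not part of spaCy, spacy-transformers or transformers.

```python
from pathlib import Path

# File names copied from the OSError message above: these are the files
# the RoBERTa-style tokenizer reported it could not find.
EXPECTED_VOCAB_FILES = ("vocab.json", "merges.txt")


def missing_vocab_files(model_dir, expected=EXPECTED_VOCAB_FILES):
    """Return the expected vocabulary file names that are absent from model_dir.

    Hypothetical helper: model_dir is assumed to be a local directory
    holding a downloaded copy of the model files.
    """
    directory = Path(model_dir)
    return [name for name in expected if not (directory / name).is_file()]
```

Calling `missing_vocab_files("path/to/local/model")` (the path is a placeholder) returns an empty list when both files are present. A non-empty list would suggest the model ships its vocabulary under different file names than this tokenizer class expects.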
Can anyone help me or give me some advice? Thank you very much!