The model needs coreference resolution, and the most workable option I found was spaCy + neuralcoref. The code itself is simple, but there were many compatibility problems: neuralcoref is essentially a plugin that adds coreference resolution on top of the spaCy pipeline, and many of the version combinations I installed would not run at all, failing with various errors, for example:
“[E002] Can't find factory for 'tok2vec'. This usually happens when spaCy calls nlp.create_pipe with a component name that's not built in - for example, when constructing the pipeline from a model's meta.json. If you're using a custom component, you can write to Language.factories['tok2vec'] or remove it from the model meta and add it via nlp.add_pipe instead.”
The combination that finally worked for me:
python 3.8.16
spacy 2.1.0
neuralcoref 4.0
en_core_web_sm 2.1.0
Here en_core_web_sm is the English model; you can run python -m spacy validate to see which model versions match the installed spaCy version.
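As a quick sanity check you can also read the versions programmatically; a minimal sketch, assuming the packages above are installed:
import spacy

# Installed spaCy version; in this setup it should print 2.1.0
print(spacy.__version__)

# The loaded model carries its own name and version in its meta
nlp = spacy.load("en_core_web_sm")
print(nlp.meta["name"], nlp.meta["version"])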
spacy:https://github.com/explosion/spaCy/tags?after=v3.4.3
en_core_web_sm:https://github.com/explosion/spacy-models/releases
neuralcoref: relatively easy to install from the internal network (intranet).
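A typical install sequence for the versions above (a sketch, assuming pip and the standard spacy-models release URL; adjust for your own environment or internal mirror):
pip install spacy==2.1.0
# If the prebuilt neuralcoref wheel is binary-incompatible with the installed spaCy, rebuild it from source
pip install neuralcoref==4.0 --no-binary neuralcoref
# Install the matching 2.1.0 English model directly from the releases page
pip install https://github.com/explosion/spacy-models/releases/download/en_core_web_sm-2.1.0/en_core_web_sm-2.1.0.tar.gz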
import spacy
import neuralcoref
# Load English tokenizer, tagger, parser and NER
nlp = spacy.load("en_core_web_sm")
neuralcoref.add_to_pipe(nlp)
# Process whole documents
text = ("When Sebastian Thrun started working on self-driving cars at "
        "Google in 2007, few people outside of the company took him "
        "seriously. “I can tell you very senior CEOs of major American "
        "car companies would shake my hand and turn away because I wasn’t "
        "worth talking to,” said Thrun, in an interview with Recode earlier "
        "this week.")
doc = nlp(text)

# Print the coreference clusters found by neuralcoref
for c in doc._.coref_clusters:
    print(c)

# Analyze syntax
print("Noun phrases:", [chunk.text for chunk in doc.noun_chunks])
print("Verbs:", [token.lemma_ for token in doc if token.pos_ == "VERB"])

# Find named entities, phrases and concepts
for entity in doc.ents:
    print(entity.text, entity.label_, entity.start_char, entity.end_char)
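Besides doc._.coref_clusters, neuralcoref registers a few other extension attributes on the Doc; for example, continuing with the same doc:
# True if at least one coreference cluster was found
print(doc._.has_coref)
# The original text with every mention replaced by the main mention of its cluster
print(doc._.coref_resolved)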