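How to solve: Python: extracting subjects and their related phrases from text
I am trying to follow the thread "How to extract subjects in a sentence and their respective dependent phrases?". I also want to extract the subjects in a text together with their dependents.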
import spacy
from textpipeliner import PipelineEngine, Context
from textpipeliner.pipes import *
text = 'No Offline Maps! It used to have offline maps but they disappeared. It now has a menu option to watch a video in exchange for maps but it never downloads the map. Makes the app useless to me.'

pipes_structure = [
    SequencePipe([
        FindTokensPipe("VERB/nsubj/*"),
        NamedEntityFilterPipe(),
        NamedEntityExtractorPipe()
    ]),
    FindTokensPipe("VERB"),
    AnyPipe([
        SequencePipe([
            FindTokensPipe("VBD/dobj/NNP"),
            AggregatePipe([
                NamedEntityFilterPipe("GPE"),
                NamedEntityFilterPipe("PERSON")
            ]),
            NamedEntityExtractorPipe()
        ]),
        SequencePipe([
            FindTokensPipe("VBD/**/*/pobj/NNP"),
            AggregatePipe([
                NamedEntityFilterPipe("LOC"),
                NamedEntityFilterPipe("PERSON")
            ]),
            NamedEntityExtractorPipe()
        ])
    ])
]

engine = PipelineEngine(pipes_structure, Context(text), [0, 1, 2])
engine.process()
When I run the code above, it raises the following error:
---------------------------------------------------------------------------
AttributeError                            Traceback (most recent call last)
<ipython-input-22-5f5a5c9e8e51> in <module>()
----> 1 engine = PipelineEngine(pipes_structure, Context(text), [0, 1, 2])
      2 engine.process()

~/anaconda3/lib/python3.6/site-packages/textpipeliner/context.py in __init__(self, doc)
      4         self._current_sent_idx = -1
      5         self._paragraph = self._sents[0:9]
----> 6         for s in doc.sents:
      7             self._sents.append(s)
      8         self.doc = doc

AttributeError: 'str' object has no attribute 'sents'
I am not sure where I am going wrong. Can anyone help me fix this?
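Solution
Interesting library.
Your context needs to be a different kind of object, and the error message says so explicitly: Context expects a document parsed by spaCy, not a plain string. Check the package's official example: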
nlp = spacy.load("en")  # on spaCy 3+ the "en" shortcut was removed; load "en_core_web_sm" instead
text = nlp('No Offline Maps! It used to have offline maps but they disappeared. It now has a menu option to watch a video in exchange for maps but it never downloads the map. Makes the app useless to me.')
You appear to be passing a string as the text variable on this line:
engine = PipelineEngine(pipes_structure,Context(text),[0,1,2])
Replace line 4 (the text = '...' assignment) with:
nlp = spacy.load("en")
text = nlp('No Offline Maps! It used to have offline maps but they disappeared. It now has a menu option to watch a video in exchange for maps but it never downloads the map. Makes the app useless to me.')
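This is what they did in the post you referenced.
Here text is no longer a string but the object returned by nlp (a spaCy Doc, which has a sents attribute), so the second-to-last line works.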
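To make the failure mode concrete, here is a minimal sketch that runs without spaCy or textpipeliner installed. FakeDoc, collect_sents and ensure_parsed are hypothetical names invented for this illustration and are not part of either library. collect_sents only imitates the doc.sents loop visible in the traceback, and FakeDoc is a crude stand-in for the object that nlp returns.

```python
class FakeDoc:
    """Hypothetical stand-in for a parsed document: it only exposes .sents."""

    def __init__(self, text):
        # Naive split on '.', for illustration only; spaCy performs real
        # sentence segmentation when it builds a Doc.
        self.sents = [s.strip() for s in text.split(".") if s.strip()]


def collect_sents(doc):
    """Imitate the loop from the traceback: iterate doc.sents and store each one."""
    sents = []
    for s in doc.sents:
        sents.append(s)
    return sents


def ensure_parsed(value, nlp):
    """Run raw strings through nlp; pass already-parsed objects through unchanged."""
    return nlp(value) if isinstance(value, str) else value


raw = "It used to have offline maps. They disappeared."

# Passing the raw string reproduces the error from the question,
# because a plain str has no .sents attribute.
try:
    collect_sents(raw)
except AttributeError as err:
    error_message = str(err)

# Parsing first avoids the error. With spaCy installed, the second argument
# would be the loaded pipeline (nlp) instead of FakeDoc.
parsed = ensure_parsed(raw, FakeDoc)
sentences = collect_sents(parsed)
```

A guard like ensure_parsed is optional. In the question's code it is enough to pass nlp(...) output to Context, as both answers suggest.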