How to extract noun-adjective pairs, including conjunctions
Background
I want to extract noun-adjective pairs using an NLP library such as spaCy.
The expected inputs and outputs are as follows.
The pink, beautiful, and small flowers are blown away.
{'flowers': ['pink', 'beautiful', 'small']}
I got a red candy and an interesting book.
{'candy': ['red'], 'book': ['interesting']}
Problem
Following the answer to the similar question How to extract noun adjective pairs from a sentence, I ran the program on my input.
However, it returned no output:
[]
Code
import spacy
nlp = spacy.load('en')
doc = nlp('The beautiful and small flowers are blown away.')
noun_adj_pairs = []
for i, token in enumerate(doc):
    if token.pos_ not in ('NOUN', 'PROPN'):
        continue
    for j in range(i + 1, len(doc)):
        if doc[j].pos_ == 'ADJ':
            noun_adj_pairs.append((token, doc[j]))
            break
print(noun_adj_pairs)
What I tried
I tried writing new code, but I am still unsure how to handle adjectives joined by a conjunction.
Input:
I got a red candy and an interesting book.
Output:
{'candy': 'red', 'book': 'interesting'}
Input:
The pink, and small flowers are blown away.
Output:
{'flowers': 'small'}
Trial code
import spacy
nlp = spacy.load('en')
doc = nlp('I got a red candy and an interesting book.')
noun_adj_pairs = {}
for word in doc:
    if word.pos_ == 'ADJ' and word.dep_ != "cc":
        if word.head.pos_ == "NOUN":
            noun_adj_pairs[str(word.head.text)] = str(word.text)
print(noun_adj_pairs)
Environment
Python 3.6
Solution
You might want to try noun_chunks:
import spacy
nlp = spacy.load('en_core_web_sm')
doc = nlp('I got a red candy and an interesting and big book.')
noun_adj_pairs = {}
for chunk in doc.noun_chunks:
    adj = []
    noun = ""
    for tok in chunk:
        if tok.pos_ == "NOUN":
            noun = tok.text
        if tok.pos_ == "ADJ":
            adj.append(tok.text)
    if noun:
        noun_adj_pairs.update({noun: adj})

# expected output
noun_adj_pairs
{'candy': ['red'], 'book': ['interesting', 'big']}
If you want to include the conjunctions:
noun_adj_pairs = {}
for chunk in doc.noun_chunks:
    adj = []
    noun = ""
    for tok in chunk:
        if tok.pos_ == "NOUN":
            noun = tok.text
        if tok.pos_ == "ADJ" or tok.pos_ == "CCONJ":
            adj.append(tok.text)
    if noun:
        noun_adj_pairs.update({noun: " ".join(adj)})

noun_adj_pairs
{'candy': 'red', 'book': 'interesting and big'}