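How to fix an error when doing topic modeling with Gensim
I have been trying to do topic modeling in Python with gensim. I have the following dataset:
Documents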
"Sugar is bad to consume. My sister likes to have sugar,but not my father."
"My father spends a lot of time driving my sister around to dance practice."
"Doctors suggest that driving may cause increased stress and blood pressure."
"Sometimes I feel pressure to perform well at school,but my father never seems to drive my sister to do better."
"Health experts say that Sugar is not good for your lifestyle."
I tried to lemmatize it as follows:
texts = map(gensim.utils.lemmatize,Docs)
and then ran LDA:
dictionary = gensim.corpora.Dictionary(texts)
corpus = [dictionary.doc2bow(doc) for doc in texts]
Lda = gensim.models.ldamodel.LdaModel
ldamodel = Lda(corpus,num_topics=3,id2word = dictionary,passes=50)
ldamodel.print_topics()
But I get an error. Do you know how to fix it?
Thanks
Error:
---------------------------------------------------------------------------
ModuleNotFoundError Traceback (most recent call last)
<ipython-input-15-b36df3b5374b> in <module>
----> 1 import pattern
2
3 dictionary = gensim.corpora.Dictionary(Docs)
4 corpus = [dictionary.doc2bow(doc) for doc in Docs]
5 Lda = gensim.models.ldamodel.LdaModel
ModuleNotFoundError: No module named 'pattern'
Full error message:
---> 3 dictionary = gensim.corpora.Dictionary(Docs)
4 corpus = [dictionary.doc2bow(doc) for doc in Docs]
5 Lda = gensim.models.ldamodel.LdaModel
/anaconda3/lib/python3.7/site-packages/gensim/corpora/dictionary.py in __init__(self,documents,prune_at)
82
83 if documents is not None:
---> 84 self.add_documents(documents,prune_at=prune_at)
85
86 def __getitem__(self,tokenid):
/anaconda3/lib/python3.7/site-packages/gensim/corpora/dictionary.py in add_documents(self,prune_at)
195
196 """
--> 197 for docno,document in enumerate(documents):
198 # log progress & run a regular check for pruning,once every 10k docs
199 if docno % 10000 == 0:
/anaconda3/lib/python3.7/site-packages/gensim/utils.py in lemmatize(content,allowed_tags,light,stopwords,min_length,max_length)
1676 if not has_pattern():
1677 raise ImportError(
-> 1678 "Pattern library is not installed. Pattern library is needed in order to use lemmatize function"
1679 )
1680 from pattern.en import parse
ImportError: Pattern library is not installed. Pattern library is needed in order to use lemmatize function
Solution
Try installing the pattern package. gensim.utils.lemmatize requires it, as the ImportError in the traceback says.
pip install pattern
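Separately from the missing package, the question's code may hit a second problem on Python 3: map returns a one-shot iterator, so after Dictionary(texts) has consumed it, the doc2bow loop would see nothing. This is also why the traceback shows lemmatize failing inside Dictionary rather than on the map line. The sketch below only illustrates the iterator behaviour; fake_lemmatize is a hypothetical stand-in, not gensim's function.

```python
def fake_lemmatize(doc):
    """Hypothetical stand-in for gensim.utils.lemmatize: lowercase and split on whitespace."""
    return doc.lower().split()


docs = ["Sugar is bad", "My father drives"]

# In Python 3, map() is lazy and can be iterated only once.
lazy_texts = map(fake_lemmatize, docs)
first_pass = list(lazy_texts)   # consumes the iterator (as Dictionary(texts) would)
second_pass = list(lazy_texts)  # already exhausted (as the doc2bow loop would find)

# A list comprehension produces a list that can be iterated repeatedly.
texts = [fake_lemmatize(doc) for doc in docs]

print(len(first_pass), len(second_pass), len(texts))
```

In the question's code, the equivalent change would be to build texts with a list comprehension (or wrap the map call in list()) before passing it to Dictionary.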
Gensim's utils.py uses the following function to check whether pattern is available:
def has_pattern():
    """Check whether the `pattern <https://github.com/clips/pattern>`_ package is installed.

    Returns
    -------
    bool
        Is `pattern` installed?

    """
    try:
        from pattern.en import parse  # noqa:F401
        return True
    except ImportError:
        return False
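To confirm that the package landed in the same environment the notebook is using, a quick check along these lines can help. It is a minimal sketch using only the standard library; is_importable is a hypothetical helper name, and it is weaker than has_pattern, which actually imports pattern.en and can therefore still fail if one of pattern's own dependencies is missing.

```python
import importlib.util
import sys


def is_importable(module_name):
    """Return True if the current interpreter can locate a top-level module."""
    return importlib.util.find_spec(module_name) is not None


# Shows which Python is running, useful when pip and Jupyter use different environments.
print(sys.executable)
print("pattern importable:", is_importable("pattern"))
```

If this prints False right after a successful pip install, pip probably installed into a different interpreter than the one shown by sys.executable.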
I did notice that pip install gensim does not check for this package, which makes the requirement unclear:
Collecting gensim
Using cached https://files.pythonhosted.org/packages/70/cf/87b25b265d23498b2b70ce873495cf7ef91394c4baff240210e26f3bc18a/gensim-3.8.3-cp37-cp37m-macosx_10_9_x86_64.whl
Requirement already satisfied: numpy>=1.11.3 in /Users/username/opt/anaconda3/lib/python3.7/site-packages (from gensim) (1.17.2)
Requirement already satisfied: scipy>=0.18.1 in /Users/username/opt/anaconda3/lib/python3.7/site-packages (from gensim) (1.3.1)
Requirement already satisfied: six>=1.5.0 in /Users/username/opt/anaconda3/lib/python3.7/site-packages (from gensim) (1.12.0)
Collecting smart-open>=1.8.1 (from gensim)
Collecting boto3 (from smart-open>=1.8.1->gensim)
Using cached https://files.pythonhosted.org/packages/c4/24/b9facc760789cf844880c178b64d26d9f4a0ef06af3e99473f38fba94657/boto3-1.14.56-py2.py3-none-any.whl
Requirement already satisfied: requests in /Users/username/opt/anaconda3/lib/python3.7/site-packages (from smart-open>=1.8.1->gensim) (2.22.0)
Requirement already satisfied: boto in /Users/username/opt/anaconda3/lib/python3.7/site-packages (from smart-open>=1.8.1->gensim) (2.49.0)
Collecting jmespath<1.0.0,>=0.7.1 (from boto3->smart-open>=1.8.1->gensim)
Using cached https://files.pythonhosted.org/packages/07/cb/5f001272b6faeb23c1c9e0acc04d48eaaf5c862c17709d20e3469c6e0139/jmespath-0.10.0-py2.py3-none-any.whl
Collecting s3transfer<0.4.0,>=0.3.0 (from boto3->smart-open>=1.8.1->gensim)
Using cached https://files.pythonhosted.org/packages/69/79/e6afb3d8b0b4e96cefbdc690f741d7dd24547ff1f94240c997a26fa908d3/s3transfer-0.3.3-py2.py3-none-any.whl
Collecting botocore<1.18.0,>=1.17.56 (from boto3->smart-open>=1.8.1->gensim)
Using cached https://files.pythonhosted.org/packages/b1/82/499909b818bddde1a4fc1228389d9d29cc2ede766a2a7370aed033dd07f9/botocore-1.17.56-py2.py3-none-any.whl
Requirement already satisfied: certifi>=2017.4.17 in /Users/username/opt/anaconda3/lib/python3.7/site-packages (from requests->smart-open>=1.8.1->gensim) (2019.9.11)
Requirement already satisfied: urllib3!=1.25.0,!=1.25.1,<1.26,>=1.21.1 in /Users/username/opt/anaconda3/lib/python3.7/site-packages (from requests->smart-open>=1.8.1->gensim) (1.24.2)
Requirement already satisfied: chardet<3.1.0,>=3.0.2 in /Users/username/opt/anaconda3/lib/python3.7/site-packages (from requests->smart-open>=1.8.1->gensim) (3.0.4)
Requirement already satisfied: idna<2.9,>=2.5 in /Users/username/opt/anaconda3/lib/python3.7/site-packages (from requests->smart-open>=1.8.1->gensim) (2.8)
Requirement already satisfied: docutils<0.16,>=0.10 in /Users/username/opt/anaconda3/lib/python3.7/site-packages (from botocore<1.18.0,>=1.17.56->boto3->smart-open>=1.8.1->gensim) (0.15.2)
Requirement already satisfied: python-dateutil<3.0.0,>=2.1 in /Users/username/opt/anaconda3/lib/python3.7/site-packages (from botocore<1.18.0,>=1.17.56->boto3->smart-open>=1.8.1->gensim) (2.8.0)
Installing collected packages: jmespath,botocore,s3transfer,boto3,smart-open,gensim
Successfully installed boto3-1.14.56 botocore-1.17.56 gensim-3.8.3 jmespath-0.10.0 s3transfer-0.3.3 smart-open-2.1.1
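If installing pattern is not an option, one possible workaround is to skip lemmatization and tokenize with gensim.utils.simple_preprocess instead. This is a different preprocessing step, not an equivalent of lemmatize (words are lowercased and split but not reduced to lemmas), so the topics may differ. As far as I know, lemmatize was removed altogether in Gensim 4.0, so this may also be needed after an upgrade. build_lda and DOCS are hypothetical names chosen for this sketch.

```python
DOCS = [
    "Sugar is bad to consume. My sister likes to have sugar,but not my father.",
    "My father spends a lot of time driving my sister around to dance practice.",
    "Doctors suggest that driving may cause increased stress and blood pressure.",
    "Sometimes I feel pressure to perform well at school,but my father never "
    "seems to drive my sister to do better.",
    "Health experts say that Sugar is not good for your lifestyle.",
]


def build_lda(docs, num_topics=3, passes=50):
    """Tokenize docs and fit an LDA model without lemmatization (no pattern needed)."""
    # Imported inside the function so this file still loads where gensim is absent.
    from gensim import corpora, models
    from gensim.utils import simple_preprocess

    # A list of token lists, so it can be iterated more than once.
    texts = [simple_preprocess(doc) for doc in docs]
    dictionary = corpora.Dictionary(texts)
    corpus = [dictionary.doc2bow(text) for text in texts]
    return models.LdaModel(
        corpus, num_topics=num_topics, id2word=dictionary, passes=passes
    )
```

Calling build_lda(DOCS).print_topics() would then take the place of the last lines of the question's code. With only five short documents the resulting topics should not be expected to be very meaningful.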