How to store lxml.etree._ElementTree objects in a DataFrame: TypeError: can't pickle lxml.etree._ElementTree objects
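I am trying to store lxml.etree._ElementTree objects in a DataFrame. Unfortunately, pandas cannot pickle these objects. Is there a way to store them in a DataFrame, or is there another way to keep all the information in a single file with good read/write speed and file size?
Here is an example that reproduces the error: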
import pandas as pd
from lxml import etree

s = '''<note>
<to>Tove</to>
<from>Jani</from>
<heading>Reminder</heading>
<body>Don't forget me this weekend!</body>
</note>'''

# Parse the XML string and wrap the root element in an ElementTree.
doc = etree.fromstring(s)
root = etree.ElementTree(doc)

# Store the tree object directly in an object column.
df = pd.DataFrame(data=[["name1", "date1", root]], columns=["name", "date", "root"])
df.to_pickle(r"D:\test\test.pkl")
# TypeError: can't pickle lxml.etree._ElementTree objects
Traceback:
Traceback (most recent call last):
  File "<...>", line 2, in <module>
    df.to_pickle(r"D:\test\test.pkl")
  File "...\Anaconda\envs\...\lib\site-packages\pandas\core\generic.py", line 2771, in to_pickle
    to_pickle(self, path, compression=compression, protocol=protocol)
  File "...\Anaconda\envs\...\lib\site-packages\pandas\io\pickle.py", line 76, in to_pickle
    f.write(pickle.dumps(obj, protocol=protocol))
TypeError: can't pickle lxml.etree._ElementTree objects
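The last frame of the traceback is a plain `pickle.dumps` call, which suggests the limitation comes from pickle and lxml rather than from pandas itself. The sketch below checks this without a DataFrame; `is_picklable` is a hypothetical helper introduced only for illustration, and the exact wording of the error message may differ between lxml and Python versions.

```python
import pickle

from lxml import etree


def is_picklable(obj):
    """Return True if pickle.dumps accepts obj (hypothetical helper)."""
    try:
        pickle.dumps(obj)
        return True
    except (TypeError, pickle.PicklingError):
        return False


tree = etree.ElementTree(etree.fromstring("<note><to>Tove</to></note>"))

# The tree object is rejected by pickle, but its serialized bytes are not.
xml_bytes = etree.tostring(tree)

print(is_picklable(tree))
print(is_picklable(xml_bytes))
```

Since the serialized bytes pickle without trouble, converting the column to bytes before saving is a natural workaround.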
Solution
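For future readers, the following fixes it: serialize each tree to a byte string before pickling, then parse the bytes back into a tree after loading.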
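# Serialize each tree to UTF-8 encoded XML bytes, which pickle can handle.
df["root"] = df["root"].map(lambda x: etree.tostring(x, encoding='utf8', method='xml'))
df.to_pickle(r"D:\test\test.pkl")

# After loading, parse the bytes and wrap each root element in an ElementTree again.
df = pd.read_pickle(r"D:\test\test.pkl")
df["root"] = df["root"].map(etree.fromstring).map(etree.ElementTree)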
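The same round trip can be wrapped in a pair of helpers so the conversion is not repeated at every call site. This is a minimal sketch, not the answer's own code: `save_frame`, `load_frame`, `tree_to_bytes`, and `bytes_to_tree` are hypothetical names, and a temporary directory stands in for the `D:\test` path. The `.gz` suffix relies on pandas inferring gzip compression from the file extension, which may help with the file size concern in the question.

```python
import os
import tempfile

import pandas as pd
from lxml import etree


def tree_to_bytes(tree):
    """Serialize an ElementTree to UTF-8 encoded XML bytes."""
    return etree.tostring(tree, encoding="utf-8", xml_declaration=True)


def bytes_to_tree(data):
    """Parse XML bytes and wrap the root element in an ElementTree."""
    return etree.ElementTree(etree.fromstring(data))


def save_frame(df, path, column="root"):
    """Pickle a copy of df with the tree column converted to bytes."""
    out = df.copy()
    out[column] = out[column].map(tree_to_bytes)
    out.to_pickle(path)  # compression is inferred from the extension


def load_frame(path, column="root"):
    """Load the pickle and rebuild ElementTree objects in the tree column."""
    df = pd.read_pickle(path)
    df[column] = df[column].map(bytes_to_tree)
    return df


doc = etree.fromstring("<note><to>Tove</to><from>Jani</from></note>")
df = pd.DataFrame(
    [["name1", "date1", etree.ElementTree(doc)]],
    columns=["name", "date", "root"],
)

with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "test.pkl.gz")
    save_frame(df, path)
    restored = load_frame(path)

# The restored column holds ElementTree objects that can be queried again.
print(restored["root"].iloc[0].getroot().findtext("to"))
```

Working on a copy in `save_frame` leaves the caller's DataFrame with its original tree objects, so it stays usable after saving.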