How to remove a parent node based on the value of a child element
I have many XML files like the sample input file below.
This is an input file:
<a>
<header>
fruit
</header>
<b>
<fruitlist>
<d>banana</d>
</fruitlist>
<fruitlist>
<d>apple</d>
</fruitlist>
</b>
<b>
<fruitlist>
<d>lemon</d>
</fruitlist>
<fruitlist>
<d>tomato</d>
</fruitlist>
</b>
<b>
<fruitlist>
<d>banana</d>
</fruitlist>
</b>
<b>
<fruitlist>
<d>lemon</d>
</fruitlist>
<fruitlist>
<d>kiwi</d>
</fruitlist>
</b>
<b>
<fruitlist>
<d>strawberry</d>
</fruitlist>
</b>
</a>
My code looks like this:
def removebanana(diretories):
    xmlFiles = diretories + "/*.xml"
    dirloc = directories + "/result"
    for fname in glob.glob(xmlFiles):
        name = os.path.basename(fname)
        content = open(fname,"rt",encoding="utf-8",errors="ignore")
        root = tree.getroot()
        for b in root.findall("b"):
            dlist = []
            for b.find("d") is not None:
                d = str(drug.find("d").text)
                dlist.append(d)
            for dd in dlist:
                dd = dd.strip()
                if dd.lower() == "banana":
                    cnt += 1
            if cnt == 0:
                root.remove(b)
                num += 0
        filename = f"{dirloc}/{name}"
        cnt += 1
        tree.write(filename)
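However, the result is identical to the sample input file.
What I want is to remove every <b> node whose child elements do not include the value banana. So this is what I want: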
<a>
<header>
fruit
</header>
<b>
<fruitlist>
<d>banana</d>
</fruitlist>
<fruitlist>
<d>apple</d>
</fruitlist>
</b>
<b>
<fruitlist>
<d>banana</d>
</fruitlist>
</b>
</a>
How should I fix my code?
Solution
If I understand correctly, this is what you need to do:
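import xml.etree.ElementTree as ET

# Paste the XML from the question here.
fruits = """[your XML above]"""

root = ET.fromstring(fruits)
# findall() returns a list, so removing elements while looping over it is safe.
for target in root.findall('.//b'):
    # Collect the text of every <d> descendant of this <b>.
    f_list = [t.text for t in target.findall('.//d')]
    if "banana" not in f_list:
        # remove() only works on direct children; every <b> here is a direct child of <a>.
        root.remove(target)
print(ET.tostring(root).decode())

# To write to a file:
tree = ET.ElementTree(root)
tree.write("test.xml", encoding="utf-8")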
Output:
<a>
<header>
fruit
</header>
<b>
<fruitlist>
<d>banana</d>
</fruitlist>
<fruitlist>
<d>apple</d>
</fruitlist>
</b>
<b>
<fruitlist>
<d>banana</d>
</fruitlist>
</b>
</a>
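To apply the same idea to a whole directory of files, as the function in the question attempts, something along these lines might work. This is only a sketch under a few assumptions: the names `keep_only_banana_blocks` and `remove_non_banana` are hypothetical, every `<b>` is assumed to be a direct child of the root element, and the filtered files are written to a `result` subdirectory as in the question. The comparison strips whitespace and ignores case, mirroring the `strip()` and `lower()` calls in the question's code.

```python
import glob
import os
import xml.etree.ElementTree as ET


def keep_only_banana_blocks(root, wanted="banana"):
    """Remove each direct <b> child of root that has no <d> equal to `wanted`.

    Returns the number of <b> elements removed.
    """
    removed = 0
    # findall() returns a list, so removing from root while looping is safe.
    for b in root.findall("b"):
        # iter("d") walks all descendants, so <d> inside <fruitlist> is found.
        values = [(d.text or "").strip().lower() for d in b.iter("d")]
        if wanted not in values:
            root.remove(b)
            removed += 1
    return removed


def remove_non_banana(directory):
    """Filter every *.xml file in `directory`; write results to directory/result."""
    out_dir = os.path.join(directory, "result")
    os.makedirs(out_dir, exist_ok=True)
    for fname in glob.glob(os.path.join(directory, "*.xml")):
        tree = ET.parse(fname)
        keep_only_banana_blocks(tree.getroot())
        out_path = os.path.join(out_dir, os.path.basename(fname))
        tree.write(out_path, encoding="utf-8")
```

The likely reason the code in the question leaves the file unchanged is that `find("d")` only looks at direct children of `<b>`, while each `<d>` sits one level deeper inside `<fruitlist>`; in addition, `cnt` is never reset for each `<b>`. Searching descendants and deciding per `<b>` avoids both problems.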