How to create a multi-level hierarchical XML tree with ElementTree in Python

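Given a txt file as input, we have to create an XML tree (or any hierarchical structure) that makes parsing easier, and then find the last employee of the last CEO.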

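The txt file describes the structure of a company, with the columns in this order: name, salary, employer. People whose employer is "NOBODY" are the CEOs of the company. Everyone else works for the employer named in the third column. The txt looks like this: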

Vineel Phatak,520,NOBODY
Ajay Joshi,250,Vineel Phatak
Abhishek Chauhan,120,Ajay Joshi
Jayesh Godse,500,NOBODY
Vijaya Mundada,60,Abhishek Chauhan
Shital Tuteja,45,Jayesh Godse
Rajan Gawli,700,Vineel Phatak
Zeba Khan,300,Jayesh Godse
Chaitali Sood,100,Zeba Khan
Sheila Rodrigues,35,Vineel Phatak

Given this, we have to produce the following:

Company
->Vineel Phatak
-->Ajay Joshi
--->Abhishek Chauhan
---->Vijaya Mundada
-->Rajan Gawli
-->Sheila Rodrigues

->Jayesh Godse
-->Shital Tuteja
-->Zeba Khan
--->Chaitali Sood

In XML format:

<company>
    <Vineel Phatak>
        <Ajay Joshi>
            <Abhishek Chauhan>
                <Vijaya Mundada />
            </Abhishek Chauhan>
        </Ajay Joshi>
        <Rajan Gawli />
        <Sheila Rodrigues />
    </Vineel Phatak>

    <Jayesh Godse>
        <Shital Tuteja />
        <Zeba Khan>
            <Chaitali Sood />
        </Zeba Khan>
    </Jayesh Godse>
</company>

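What I tried: after creating an element named company, and since the subelements have to be added to the root (company), I generated those elements and appended them to a list. I then walk the list and compare tags to find the element each employee belongs under.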

# Find last employee of the last introduced CEO
import xml.etree.ElementTree as ET

# Reading Input
inD = open('input.txt','r')
data = inD.readlines()
inD.close()

# Creating an element and saving all subelement to list
all_element = []
company = ET.Element('Company')
ceos = []
for i in data:
    t = i.strip().split(',')
    if(t[2].strip() == 'NOBODY'):
        ceos.append(t[0])
    all_element.append(ET.SubElement(company,t[0]))
# company.clear()
# Creating a function to add subelements
def findChilds(name,emp):
    global all_element
    for i in all_element:
        if emp == i.tag:
            name = ET.SubElement(i,name)

# If it is a CEO (no employer), add the subelement directly to company; otherwise add it to the employer's subelement
for j in data:
    t = j.strip().split(',')
    if t[2].strip() == 'NOBODY':
        e = ET.SubElement(company,t[0])
    elif t[2].strip() != 'NOBODY':
        findChilds(t[0].strip(),t[2].strip())
        
ET.dump(company)

The result is as follows:

<Company><Vineel Phatak><Ajay Joshi /><Rajan Gawli /><Sheila Rodrigues /></Vineel Phatak><Ajay Joshi><Abhishek Chauhan /></Ajay Joshi><Abhishek Chauhan><Vijaya Mundada /></Abhishek Chauhan><Jayesh Godse><Shital Tuteja /><Zeba Khan /></Jayesh Godse><Vijaya Mundada /><Shital Tuteja /><Rajan Gawli /><Zeba Khan><Chaitali Sood /></Zeba Khan><Chaitali Sood /><Sheila Rodrigues /><Vineel Phatak /><Jayesh Godse /></Company>

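As you can see, this is not quite right. Removing the elements (the commented-out company.clear() on line 18 of the code) does not work either, because the code then refuses to add any subelements other than the CEOs.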

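So, in the end, we need to create this hierarchy and then print the name of the last employee of the last CEO, which in this example is:
Last CEO: Jayesh Godse
Last employee of that CEO (direct or indirect, introduced last in the input): Chaitali Sood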

Output:
Chaitali Sood

The number of CEOs and the number of descendants under each of them are not fixed, and neither are the names.

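I am new to ElementTree, so there may be predefined functions I am not aware of; please excuse my ignorance. Insights and suggestions are much appreciated. Thanks in advance!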

Solution

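Before presenting my example, a note on the XML structure: when creating an XML structure, it is better to use the "object class" as the element tag and store its "properties" (such as name and salary) as XML attributes:
<employee name="Vineel Phatak" salary="520"/>
instead of:
<Vineel Phatak/>
This makes parsing much easier and leaves more room to extend the format. It also avoids a well-formedness problem: XML element names cannot contain spaces, so a parser reads <Vineel Phatak/> as an element named Vineel followed by a malformed attribute.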
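To illustrate the point about tag names, here is a minimal sketch. The helper name parses is made up for this illustration; everything else is standard xml.etree.ElementTree. ElementTree does not check tag names when building or serializing, so the problem only shows up when the output is parsed again.

```python
import xml.etree.ElementTree as ET


def parses(xml_text):
    """Return True if xml_text is well-formed XML (hypothetical helper)."""
    try:
        ET.fromstring(xml_text)
        return True
    except ET.ParseError:
        return False


# A person's name used as the tag: serializes, but cannot be parsed back.
bad = ET.tostring(ET.Element("Vineel Phatak"), encoding="unicode")

# A fixed tag with the person's data stored in attributes.
good = ET.tostring(
    ET.Element("employee", {"name": "Vineel Phatak", "salary": "520"}),
    encoding="unicode",
)

print(bad, parses(bad))
print(good, parses(good))
```

With the attribute form, the name is read back with element.get("name") regardless of which characters it contains.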

My example

An example implementation for your problem:

import csv
from dataclasses import dataclass
import xml.etree.ElementTree as ET


@dataclass
class Employee:
    linenumber: int
    name: str
    salary: str
    manager_name: str
    subordinates: list


employees = {}  # a dictionary to map names to employees

# load employees
with open('company.csv') as csvfile:
    reader = csv.reader(csvfile)
    for linenumber,row in enumerate(reader):
        (name,salary,manager_name) = [value.strip() for value in row]
        # salary must be passed too: the dataclass has five fields
        employees[name] = Employee(linenumber, name, salary, manager_name, [])


# link employees to their subordinates
ceos = []
for employee in employees.values():
    if employee.manager_name == 'NOBODY':
        # store the ceos in a list to start building the xml from later
        ceos.append(employee)
    else:
        # look up the manager by name
        manager = employees[employee.manager_name]
        manager.subordinates.append(employee)

# create xml
companyelement = ET.Element('company')

def add_employees_to_xml_element(xmlelement,employees):
    for employee in employees:
        employee_element = ET.Element("employee",{
            "name": employee.name,"salary": employee.salary
        })
        xmlelement.append(employee_element)
        add_employees_to_xml_element(employee_element,employee.subordinates)


add_employees_to_xml_element(companyelement,ceos)
ET.dump(companyelement)

# find the last entered ceo
def linenumber_key(ceo): return ceo.linenumber


last_entered_ceo = max(ceos,key=linenumber_key)
print(f"Last entered CEO: {last_entered_ceo.name}")

# find the last entered (in)direct subordinate of the last entered ceo
def find_last_entered_subordinate(employee,current_last=None):
    for subordinate in employee.subordinates:
        if not current_last:
            current_last = subordinate  # ensuring an initial value
        else:
            current_last = max([current_last,subordinate],key=linenumber_key)
        # recursive: traverse the subordinate's subordinates
        current_last = find_last_entered_subordinate(subordinate,current_last)
    return current_last


last_employee = find_last_entered_subordinate(last_entered_ceo)
print(f"Last added subordinate of last CEO: {last_employee.name}")

I split the exercise into the following parts:

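Load the employees into a dictionary, to make looking up an employee by name easier (and faster) later on. I also store each employee's line number for later use.
Link the employees to their subordinates. This cannot be merged with the first step, because a manager may be listed after their subordinates. Each employee has a list of their subordinates, and the CEOs are stored in a separate "root" list.
Create the XML using ElementTree and a recursive function that walks the CEO list created above.
Find the last entered CEO. We already have a list of CEOs, but because it was built from a dictionary (which Python only guarantees to iterate in insertion order from version 3.7 on), I do not simply take the last element and instead look for the CEO with the highest line number.
Find the last entered (in)direct subordinate of the last entered CEO. As above, this time using a recursive function that retrieves this employee by line number.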
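The reason the first two steps stay separate can be seen in a small sketch. The sample below is a made-up reordering of three rows from the question, with a subordinate listed before her manager; the function names load and link and the plain dictionaries are assumptions for this illustration, not part of the code above.

```python
import csv
import io

# Hypothetical input: Chaitali Sood appears before her manager Zeba Khan.
SAMPLE = """Chaitali Sood,100,Zeba Khan
Jayesh Godse,500,NOBODY
Zeba Khan,300,Jayesh Godse
"""


def load(text):
    """Pass 1: index every row by name and remember its line number."""
    rows = {}
    for linenumber, row in enumerate(csv.reader(io.StringIO(text))):
        name, salary, manager = (value.strip() for value in row)
        rows[name] = {"line": linenumber, "salary": salary,
                      "manager": manager, "subs": []}
    return rows


def link(rows):
    """Pass 2: attach each person to their manager; return the CEO names."""
    ceos = []
    for name, info in rows.items():
        if info["manager"] == "NOBODY":
            ceos.append(name)
        else:
            # Safe only because pass 1 has already indexed every manager.
            rows[info["manager"]]["subs"].append(name)
    return ceos


rows = load(SAMPLE)
ceos = link(rows)
print(ceos)
print(rows["Zeba Khan"]["subs"])
```

Doing the lookup inside the loading loop would raise a KeyError on the first row here, since Zeba Khan has not been read yet at that point.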

Generated XML:

<company>
    <employee name="Vineel Phatak" salary="520">
        <employee name="Ajay Joshi" salary="250">
            <employee name="Abhishek Chauhan" salary="120">
                <employee name="Vijaya Mundada" salary="60"/>
            </employee>
        </employee>
        <employee name="Rajan Gawli" salary="700"/>
        <employee name="Sheila Rodrigues" salary="35"/>
    </employee>
    <employee name="Jayesh Godse" salary="500">
        <employee name="Shital Tuteja" salary="45"/>
        <employee name="Zeba Khan" salary="300">
            <employee name="Chaitali Sood" salary="100"/>
        </employee>
    </employee>
</company>
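ET.dump writes the tree on a single line; indented output like the listing above can be produced with ET.indent, which exists from Python 3.9 on. The sketch below also shows one possible way to answer the question from the XML tree itself. It assumes an extra line attribute (not present in the XML above) holding each employee's zero-based line number from the input, and it builds only part of the company by hand to stay short.

```python
import xml.etree.ElementTree as ET

company = ET.Element("company")

# "line" is an assumed extra attribute: the zero-based input line number.
vineel = ET.SubElement(company, "employee",
                       {"name": "Vineel Phatak", "salary": "520", "line": "0"})
ET.SubElement(vineel, "employee",
              {"name": "Sheila Rodrigues", "salary": "35", "line": "9"})
jayesh = ET.SubElement(company, "employee",
                       {"name": "Jayesh Godse", "salary": "500", "line": "3"})
ET.SubElement(jayesh, "employee",
              {"name": "Shital Tuteja", "salary": "45", "line": "5"})
zeba = ET.SubElement(jayesh, "employee",
                     {"name": "Zeba Khan", "salary": "300", "line": "7"})
ET.SubElement(zeba, "employee",
              {"name": "Chaitali Sood", "salary": "100", "line": "8"})


def line_of(element):
    """Sort key: the input line number stored on the element."""
    return int(element.get("line"))


# CEOs are the direct children of <company>.
last_ceo = max(company.findall("employee"), key=line_of)

# iter() walks the CEO and all descendants; skip the CEO itself.
descendants = [e for e in last_ceo.iter("employee") if e is not last_ceo]
last_employee = max(descendants, key=line_of)

if hasattr(ET, "indent"):  # ET.indent was added in Python 3.9
    ET.indent(company, space="    ")
print(ET.tostring(company, encoding="unicode"))
print(last_ceo.get("name"), "->", last_employee.get("name"))
```

Sheila Rodrigues has the highest line number overall but reports to the other CEO, so she is not considered; the search stays inside the last CEO's subtree.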
