How to create sublists of indexes, each sublist referencing a unique set of tuples in a list of tuples
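I am trying to create sublists of indexes from a list of tuples: the indexes of tuples that have any element in common are grouped together, and the indexes of unique tuples are kept separate. A tuple is unique (the tuple as a whole, not its elements) when none of its elements equals the element in the same position of any other tuple in the list. Example: build a list that groups the same company together, where "same company" means the same name, or the same registration number, or the same CEO name.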
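# The registration number of companyB and the CEO of companyC were lost in the source;
# "0002" and "ceoX" are assumed here so that the data agree with the desired output.
company_list = [("companyA", "0002", "ceoX"), ("companyB", "0002", "ceoY"), ("companyC", "0003", "ceoX"), ("companyD", "0004", "ceoZ")]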
The desired output would be:
[[0,1,2],[3]]
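To make the definition concrete, here is a minimal sketch of the pairwise test, assuming every company is a 3-tuple of (name, registration number, CEO). The function name `same_company` and the sample values are illustrative assumptions, not part of the original question.

```python
from itertools import combinations


def same_company(a, b):
    """True if two (name, number, ceo) tuples agree in at least one position."""
    return any(x == y for x, y in zip(a, b))


# Illustrative data (assumed values).
companies = [
    ("companyA", "0002", "ceoX"),
    ("companyB", "0002", "ceoY"),
    ("companyC", "0003", "ceoX"),
    ("companyD", "0004", "ceoZ"),
]

# Every pair of indexes whose tuples match directly.
direct_matches = [
    (i, j)
    for (i, a), (j, b) in combinations(enumerate(companies), 2)
    if same_company(a, b)
]
print(direct_matches)  # [(0, 1), (0, 2)]
```

Indexes 1 and 2 never match each other directly, yet both match index 0, so the grouping has to be transitive to put 0, 1 and 2 in one sublist.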
Does anyone know a solution to this problem?
Solution
The companies form a graph. You want to build clusters from the connected companies.
Try this:
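company_list = [
    # The number of companyB and the CEO of companyC were lost in the source;
    # 2 and "ceoX" are assumed here so that the result matches the desired output.
    ("companyA", 2, "ceoX"), ("companyB", 2, "ceoY"), ("companyC", 3, "ceoX"), ("companyD", 4, "ceoZ")
]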

# Prepare indexes: map each name, number and CEO to the positions where it occurs
by_name = {}
by_number = {}
by_ceo = {}
for i, t in enumerate(company_list):
    by_name.setdefault(t[0], []).append(i)
    by_number.setdefault(t[1], []).append(i)
    by_ceo.setdefault(t[2], []).append(i)

# BFS to propagate group to connected companies
# (each company ends up labelled with the smallest index in its cluster)
groups = list(range(len(company_list)))
for i in range(len(company_list)):
    g = groups[i]
    queue = [g]
    while queue:
        x = queue.pop(0)
        groups[x] = g
        t = company_list[x]
        for y in by_name[t[0]]:
            if g < groups[y]:
                queue.append(y)
        for y in by_number[t[1]]:
            if g < groups[y]:
                queue.append(y)
        for y in by_ceo[t[2]]:
            if g < groups[y]:
                queue.append(y)

# Assemble result: collect the indexes by group label, so that members of
# one group do not have to be adjacent in company_list
by_group = {}
for i, g in enumerate(groups):
    by_group.setdefault(g, []).append(i)
result = list(by_group.values())
print(result)
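The same clustering idea can also be sketched with a union-find (disjoint-set) structure instead of a BFS. This is an alternative technique, not the method used in the answer above; the function name `group_companies` and the sample data are assumptions made for illustration.

```python
def group_companies(companies):
    """Group indexes of tuples that share a value in the same position, transitively."""
    parent = list(range(len(companies)))

    def find(i):
        # Walk up to the root, shortening the path along the way.
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    first_seen = {}  # (position, value) -> index of the first company holding it
    for i, company in enumerate(companies):
        for key in enumerate(company):
            j = first_seen.setdefault(key, i)
            root_i, root_j = find(i), find(j)
            if root_i != root_j:
                # Link the two clusters, keeping the smaller index as the root.
                parent[max(root_i, root_j)] = min(root_i, root_j)

    groups = {}
    for i in range(len(companies)):
        groups.setdefault(find(i), []).append(i)
    return list(groups.values())


# Illustrative data (assumed values).
sample = [
    ("companyA", 2, "ceoX"),
    ("companyB", 2, "ceoY"),
    ("companyC", 3, "ceoX"),
    ("companyD", 4, "ceoZ"),
]
print(group_companies(sample))  # [[0, 1, 2], [3]]
```

Each company is visited once and each field is looked up in a dictionary, so no pairwise comparison of companies is needed.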
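Fafl's answer is definitely better. If you are not worried about performance, here is a brute-force solution that may be easier to read. I have tried to make it clear with some comments.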
def find_index(res, target_index):
    for index, sublist in enumerate(res):
        if target_index in sublist:
            # yes, it's present
            return index
    return None  # not present


def main():
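    # Several fields were lost in the source; the numbers of companyB and
    # companyE and the CEO of companyC below are assumed values.
    company_list = [
        ('companyA', '0002', 'CEOX'),
        ('companyB', '0002', 'CEOY'),
        ('companyC', '0003', 'CEOX'),
        ('companyD', '0004', 'CEOZ'),
        ('companyE', '0005', 'CEOM'),
    ]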
    res = []
    for index, company_detail in enumerate(company_list):
        # check if this `index` is already present in a sublist in `res`;
        # if it is, we add to that sublist, otherwise we start a new sublist in `res`
        index_to_add_to = find_index(res, index)
        if index_to_add_to is None:
            # does not exist
            res.append([index])
            index_to_add_to = len(res) - 1
        for c_index, c_company_detail in enumerate(company_list):
            # inner loop to compare company details with the outer loop
            if c_index == index:
                # same, ignore
                continue
            if (company_detail[0] == c_company_detail[0]
                    or company_detail[1] == c_company_detail[1]
                    or company_detail[2] == c_company_detail[2]):
                # something matches
                other = find_index(res, c_index)
                if other is None:
                    # not grouped yet, so append
                    res[index_to_add_to].append(c_index)
                elif other != index_to_add_to:
                    # already in a different sublist: merge the two sublists
                    # so that no index ends up in more than one of them
                    merged = res.pop(other)
                    if other < index_to_add_to:
                        index_to_add_to -= 1
                    res[index_to_add_to].extend(merged)
    print([sorted(sublist) for sublist in res])


if __name__ == '__main__':
    main()
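Check this out; I tried quite a few things for it. I may be missing some test cases. Performance-wise, I think it is fine.
I used set() and popped the tuples off the list one group at a time.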
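# Only fragments of this answer's sample data survive in the source
# ("companyA", "ceoZ", "ceoW"); the list below is an assumed stand-in.
company_list = [
    ("companyA", "0002", "ceoX"), ("companyB", "0002", "ceoY"),
    ("companyC", "0003", "ceoX"), ("companyD", "0004", "ceoZ"),
]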
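# Map each tuple to its original position (assumes no two tuples are identical)
index = {val: key for key, val in enumerate(company_list)}
res = []
while company_list:
    # Start a new group from the first company that is still ungrouped
    first = company_list.pop(0)
    members = [first]
    temp = [index[first]]
    new_idx = 0
    while new_idx < len(company_list):
        candidate = company_list[new_idx]
        # Two 3-tuples with nothing in common give 6 distinct values, so fewer
        # than 6 means a shared value (in any position; this assumes the three
        # values inside one tuple differ from each other)
        if any(len(set(member + candidate)) < 6 for member in members):
            members.append(company_list.pop(new_idx))
            temp.append(index[candidate])
            # Rescan from the start: the new member may link companies skipped earlier
            new_idx = 0
        else:
            new_idx += 1
    res.append(sorted(temp))
print(res)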
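Since the last answer mentions that some test cases may be missing, a small checker can help compare the approaches on trickier inputs. The sketch below is an assumption-laden illustration: the names `share_field` and `is_valid_grouping` and the test data are hypothetical, and "match" is taken to mean equal values in the same position.

```python
def share_field(a, b):
    """True if two tuples hold the same value in at least one position."""
    return any(x == y for x, y in zip(a, b))


def is_valid_grouping(companies, groups):
    """Check that `groups` is exactly the set of linked clusters of `companies`."""
    n = len(companies)
    # Every index must appear exactly once, and no group may be empty.
    if any(not g for g in groups):
        return False
    if sorted(i for g in groups for i in g) != list(range(n)):
        return False
    # Companies that match directly must sit in the same group.
    group_of = {i: gi for gi, g in enumerate(groups) for i in g}
    for i in range(n):
        for j in range(i + 1, n):
            if share_field(companies[i], companies[j]) and group_of[i] != group_of[j]:
                return False
    # Each group must be connected through a chain of direct matches.
    for g in groups:
        reached = {g[0]}
        frontier = [g[0]]
        while frontier:
            i = frontier.pop()
            for j in g:
                if j not in reached and share_field(companies[i], companies[j]):
                    reached.add(j)
                    frontier.append(j)
        if len(reached) != len(g):
            return False
    return True


# Hypothetical chain case: 0 and 2 share a number, 1 and 2 share a CEO,
# so all three belong together even though 0 and 1 share nothing directly.
chain = [("a", 1, "x"), ("b", 2, "y"), ("c", 1, "y")]
print(is_valid_grouping(chain, [[0, 1, 2]]))       # True
print(is_valid_grouping(chain, [[0, 2], [1]]))     # False
print(is_valid_grouping(chain, [[0, 2], [1, 2]]))  # False
```

The chain case is worth including in any test set, because an approach that only compares each company with the first member of its group will split or duplicate indexes on it.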