How to sum values across columns that share a specific column-name prefix
Subs_1718    Count_1718  Subs_1819  Count_1819  Subs_1920  Count_1920
Apple        10.0        Grapes     12          Banana     12.0
Grapes       2.0         Apple      6           Grapes     8.0
Banana       2.0         Pineapple  3           Cashew     1.0
Dragonfruit  1.0         Banana     2           Apple      1.0
Kiwi         1.0         Kiwi       2           Melon      1.0
Melon        1.0         Cashew     1           Grapes     1.0
How can I create a new column that totals, for each name, the values in df['Count_1718'], df['Count_1819'], and df['Count_1920']?
Expected output:
Subs_1720 Count_1720
Apple 17
Banana 16
Cashew 2
Dragonfruit 1
Grapes 22
Melon 2
Pineapple 1
Solution
You can use pd.wide_to_long here, specifying the corresponding stub names, and then aggregate with groupby.sum:
# Reshape every Subs_*/Count_* pair into one long Subs/Count pair, then sum per name.
(pd.wide_to_long(df.reset_index(), stubnames=['Subs', 'Count'], i='index', j='ix', suffix=r'_\d+')
   .groupby('Subs').sum())
Count
Subs
Apple 17.0
Banana 16.0
Cashew 2.0
Dragonfruit 1.0
Grapes 23.0
Kiwi 3.0
Melon 2.0
Pineapple 3.0
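To see what the reshape does before the sum, here is a minimal, self-contained sketch. The DataFrame is rebuilt from the sample in the question; the names long_df, totals, and the j label "year" are my own and are only illustrative. It passes sep="_" so that the suffix is the bare year rather than "_1718".

```python
import pandas as pd

# Sample data copied from the question.
df = pd.DataFrame({
    "Subs_1718": ["Apple", "Grapes", "Banana", "Dragonfruit", "Kiwi", "Melon"],
    "Count_1718": [10.0, 2.0, 2.0, 1.0, 1.0, 1.0],
    "Subs_1819": ["Grapes", "Apple", "Pineapple", "Banana", "Kiwi", "Cashew"],
    "Count_1819": [12, 6, 3, 2, 2, 1],
    "Subs_1920": ["Banana", "Grapes", "Cashew", "Apple", "Melon", "Grapes"],
    "Count_1920": [12.0, 8.0, 1.0, 1.0, 1.0, 1.0],
})

# Step 1: reshape. Each Subs_<year>/Count_<year> pair becomes rows of a
# single Subs/Count pair, indexed by (original row number, year).
long_df = pd.wide_to_long(
    df.reset_index(),          # wide_to_long needs a unique id column ("index")
    stubnames=["Subs", "Count"],
    i="index",
    j="year",                  # illustrative name for the suffix level
    sep="_",
    suffix=r"\d+",
)

# Step 2: sum the single Count column per name.
totals = long_df.groupby("Subs")["Count"].sum()

print(long_df.head())
print(totals)
```

The long frame has 18 rows (6 original rows times 3 year pairs), and the totals should agree with the Count column shown above; Grapes, for instance, is 2 + 12 + 8 + 1 = 23.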
Alternatively, use wide_to_long together with a sum aggregation:
df1 = (pd.wide_to_long(df.reset_index(), stubnames=['Subs', 'Count'], i='index', j='d', sep='_')
         .groupby('Subs')['Count']
         .sum()
         .rename_axis('Subs_1720')
         .reset_index(name='Count_1720'))
print(df1)
Subs_1720 Count_1720
0 Apple 17.0
1 Banana 16.0
2 Cashew 2.0
3 Dragonfruit 1.0
4 Grapes 23.0
5 Kiwi 3.0
6 Melon 2.0
7 Pineapple 3.0
Alternatively, you can convert the columns to a MultiIndex, stack them, and group:
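# Split "Subs_1718" into ("Subs", "1718") so the columns form a two-level MultiIndex.
df.columns = df.columns.str.split("_", expand=True)
# stack() moves the year level into the rows, leaving one Subs and one Count column.
(df.stack()
   .groupby("Subs")
   .sum()
   .reset_index()
   .set_axis(["Subs_1720", "Count_1720"], axis=1))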
Subs_1720 Count_1720
0 Apple 17.0
1 Banana 16.0
2 Cashew 2.0
3 Dragonfruit 1.0
4 Grapes 23.0
5 Kiwi 3.0
6 Melon 2.0
7 Pineapple 3.0
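For anyone who wants to inspect the intermediate steps of the MultiIndex approach, here is a hedged sketch. It rebuilds the sample frame from the question and works on a copy, so the original column names are left untouched; the names wide, long_df, and totals are mine, not part of the answer. Recent pandas versions may print a FutureWarning about the stack implementation, which does not affect the result here.

```python
import pandas as pd

# Sample data copied from the question.
df = pd.DataFrame({
    "Subs_1718": ["Apple", "Grapes", "Banana", "Dragonfruit", "Kiwi", "Melon"],
    "Count_1718": [10.0, 2.0, 2.0, 1.0, 1.0, 1.0],
    "Subs_1819": ["Grapes", "Apple", "Pineapple", "Banana", "Kiwi", "Cashew"],
    "Count_1819": [12, 6, 3, 2, 2, 1],
    "Subs_1920": ["Banana", "Grapes", "Cashew", "Apple", "Melon", "Grapes"],
    "Count_1920": [12.0, 8.0, 1.0, 1.0, 1.0, 1.0],
})

# Work on a copy so df keeps its flat column names.
wide = df.copy()

# "Subs_1718" -> ("Subs", "1718"): level 0 is the stub, level 1 is the year.
wide.columns = wide.columns.str.split("_", expand=True)

# stack() moves the last column level (the year) into the row index,
# leaving exactly one "Subs" column and one "Count" column.
long_df = wide.stack()

# Sum the counts per name.
totals = long_df.groupby("Subs")["Count"].sum()

print(wide.columns.tolist())
print(long_df.head())
print(totals)
```

Selecting ["Count"] explicitly before sum() is a small design choice: it keeps the aggregation independent of the column order that stack() happens to produce.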