How to combine two groupby operations into one
I have two GroupBy computations.
The first:
ser2 = ser.groupby(pd.cut(ser,10)).sum()
(-2620.137,476638.7] 12393813
(476638.7,951152.4] 9479666
(951152.4,1425666.1] 14381033
(1425666.1,1900179.8] 5113056
(1900179.8,2374693.5] 4114429
(2374693.5,2849207.2] 4929537
(2849207.2,3323720.9] 0
(3323720.9,3798234.6] 0
(3798234.6,4272748.3] 3978230
(4272748.3,4747262.0] 4747262
The second:
ser1= pd.cut(ser,10)
print(ser1.value_counts())
(-2620.137,476638.7] 110
(476638.7,951152.4] 15
(951152.4,1425666.1] 12
(1425666.1,1900179.8] 3
(2374693.5,2849207.2] 2
(1900179.8,2374693.5] 2
(4272748.3,4747262.0] 1
(3798234.6,4272748.3] 1
(3323720.9,3798234.6] 0
(2849207.2,3323720.9] 0
Question: is there a way to combine these operations into a single statement, so that both results are stored in the same table?
Solution
Use GroupBy.agg, and use GroupBy.size in place of value_counts:
import numpy as np
import pandas as pd

np.random.seed(2020)
ser = pd.Series(np.random.randint(40, size=100))
df = ser.groupby(pd.cut(ser, 10)).agg(['sum', 'size'])
print(df)
sum size
(-0.039,3.9] 27 14
(3.9,7.8] 49 9
(7.8,11.7] 142 15
(11.7,15.6] 151 11
(15.6,19.5] 159 9
(19.5,23.4] 187 9
(23.4,27.3] 253 10
(27.3,31.2] 176 6
(31.2,35.1] 231 7
(35.1,39.0] 375 10
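As a sanity check, the combined result can be compared against the two separate computations from the question. The following is a minimal sketch, not part of the original answer; it assumes a pandas version in which groupby accepts the observed parameter, and passes observed=False explicitly so that empty bins are kept, as in the question's output.

```python
import numpy as np
import pandas as pd

np.random.seed(2020)
ser = pd.Series(np.random.randint(40, size=100))
bins = pd.cut(ser, 10)

# observed=False keeps bins that contain no values (assumption: this is
# the behaviour wanted, since the question's output shows zero rows)
grouped = ser.groupby(bins, observed=False)
combined = grouped.agg(['sum', 'size'])

# The two separate computations from the question
sums = grouped.sum()
counts = bins.value_counts().sort_index()  # sort_index restores bin order

# Both columns of the combined table match the separate results
assert (combined['sum'].to_numpy() == sums.to_numpy()).all()
assert (combined['size'].to_numpy() == counts.to_numpy()).all()
```

value_counts orders the bins by frequency, which is why the second output in the question is not in bin order; size follows the group order, so no extra sorting is needed in the combined table.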
If you need custom column names:
np.random.seed(2020)
ser = pd.Series(np.random.randint(40, size=100))
df = ser.groupby(pd.cut(ser, 10)).agg([('col1', 'sum'), ('col2', 'size')])
print(df)
col1 col2
(-0.039,3.9] 27 14
(3.9,7.8] 49 9
(7.8,11.7] 142 15
(11.7,15.6] 151 11
(15.6,19.5] 159 9
(19.5,23.4] 187 9
(23.4,27.3] 253 10
(27.3,31.2] 176 6
(31.2,35.1] 231 7
(35.1,39.0] 375 10
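The same renaming can also be written with keyword-style named aggregation, where each keyword is the output column name and its value is the aggregation. This is a sketch rather than the original answer's method, and it assumes pandas 0.25 or later, where this syntax was introduced for Series groupby objects.

```python
import numpy as np
import pandas as pd

np.random.seed(2020)
ser = pd.Series(np.random.randint(40, size=100))

# Keyword names become the column names; observed=False is an explicit
# choice here to keep empty bins in the result
df = ser.groupby(pd.cut(ser, 10), observed=False).agg(col1='sum', col2='size')
print(df)
```

Both spellings are expected to produce the same table; the keyword form is often easier to read when there are several aggregations.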