
python – Pandas DataFrame: group by A, take nlargest of B, output C

For each A, what are the top two C values, ranked by the values in B?

    import pandas as pd

    df = pd.DataFrame({
        'A': ["first", "second", "second", "first",
              "second", "first", "third", "fourth",
              "fifth", "second", "fifth", "first",
              "first", "second", "third", "fourth", "fifth"],
        'B': [1, 1, 1, 2, 2, 3, 3, 3, 3, 4, 4, 5, 6, 6, 6, 7, 7],
        'C': ["a", "b", "c", "d",
              "e", "f", "g", "h",
              "i", "j", "k", "l",
              "m", "n", "o", "p", "q"]})

I am trying:

    x = df.groupby(['A'])['B'].nlargest(2)

    A
    fifth   16    7
            10    4
    first   12    6
            11    5
    fourth  15    7
            7     3
    second  13    6
            9     4
    third   14    6
            6     3

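But this drops column C, which holds the actual values I need.

I want C in the result, not the row index of the original df. Do I have to join? I would even settle for just a list of C values...

I need to act on the top 2 C values (ranked by B) for each A.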

Solution:

IIUC:

    In [42]: df.groupby('A')[['B', 'C']].apply(lambda x: x.nlargest(2, columns='B'))
    Out[42]:
               B  C
    A
    fifth  16  7  q
           10  4  k
    first  12  6  m
           11  5  l
    fourth 15  7  p
           7   3  h
    second 13  6  n
           9   4  j
    third  14  6  o
           6   3  g
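
If only the C values are needed (the question mentions that a plain list would do), one possible alternative is to sort by B and keep the first two rows of each group. This is a minimal sketch and not part of the original answer; the names `top2` and `top2_c` are illustrative only.

```python
import pandas as pd

df = pd.DataFrame({
    'A': ["first", "second", "second", "first",
          "second", "first", "third", "fourth",
          "fifth", "second", "fifth", "first",
          "first", "second", "third", "fourth", "fifth"],
    'B': [1, 1, 1, 2, 2, 3, 3, 3, 3, 4, 4, 5, 6, 6, 6, 7, 7],
    'C': list("abcdefghijklmnopq")})

# Sort rows by B, largest first; a stable sort keeps tied rows
# in their original order.
ranked = df.sort_values('B', ascending=False, kind='mergesort')

# Keep the first two rows of each A group (the two largest B values).
top2 = ranked.groupby('A').head(2)

# Collapse to one list of C values per A, ordered by descending B.
top2_c = top2.groupby('A')['C'].agg(list)

print(top2_c)
```

`top2_c` is a Series indexed by A, so `top2_c['first']` returns the two C values for that group. Unlike the `apply` version, this drops the original row index from the result.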
