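How to find the max pair across two columns while keeping the data frame intact
I have a data frame and want to find the largest pair based on two columns. However, when I group the data frame, any small variation in the other columns affects my result.
Let me show you: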
library(plyr)
# Column values match the printed table below (every vector has 6 entries)
usercsv_data <- data.frame(
  id_str = c("89797","12387231231","1234823432","3483487344","89797","1234823432"),
  screen_name = c("A","B","C","D","A","C"),
  location = c("FL","CO","NYC","MI","FL","NYC"),
  verified = c("Y","N","N","Y","N","Y"),
  created = c("Sun","Mon","Tue","Sun","Tue","Fri"),
  friends_count = c(1,2,5,787,7,5),
  followers_count = c(2,4,6,897,4,3))
# id_str screen_name location verified created friends_count followers_count
# 1 89797 A FL Y Sun 1 2
# 2 12387231231 B CO N Mon 2 4
# 3 1234823432 C NYC N Tue 5 6
# 4 3483487344 D MI Y Sun 787 897
# 5 89797 A FL N Tue 7 4
# 6 1234823432 C NYC Y Fri 5 3
# This gives me the max pairs when I group only by the variables that identify a user
plyr::ddply(usercsv_data,.(id_str,screen_name),numcolwise(max))
# id_str screen_name friends_count followers_count
# 1 1234823432 C 5 6
# 2 12387231231 B 2 4
# 3 3483487344 D 787 897
# 4 89797 A 7 4
# BUT when I apply the same technique while grouping by all the other columns, I get the same data frame back
plyr::ddply(usercsv_data,.(id_str,screen_name,location,verified,created),numcolwise(max))
# id_str screen_name location verified created friends_count followers_count
# 1 1234823432 C NYC N Tue 5 6
# 2 1234823432 C NYC Y Fri 5 3
# 3 12387231231 B CO N Mon 2 4
# 4 3483487344 D MI Y Sun 787 897
# 5 89797 A FL N Tue 7 4
# 6 89797 A FL Y Sun 1 2
But I want something like this:
# id_str screen_name location verified created friends_count followers_count
# 1 1234823432 C NYC N Tue 5 6
# 3 12387231231 B CO N Mon 2 4
# 4 3483487344 D MI Y Sun 787 897
# 5 89797 A FL N Tue 7 4
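How can I group so that all columns are kept, but only the rows that hold the max pair remain? Right now, with more grouping variables, it keeps every unique combination (as it should), and I don't know enough about the topic to even search for the problem.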
Solution
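plyr is retired, so we can use dplyr here: create a column holding the sum of friends_count and followers_count, then keep the row with the largest sum for each id_str and screen_name group.
library(dplyr)
usercsv_data %>%
  mutate(max = rowSums(select(., friends_count, followers_count))) %>%
  group_by(id_str, screen_name) %>%
  slice(which.max(max))
The same result can be obtained without creating the max column.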
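A minimal sketch of the variant without the helper column, not necessarily the answerer's exact code. It assumes the sample data matches the table printed in the question and, like the version above, treats the row with the largest friends_count + followers_count sum as the "max pair". The name result is only illustrative.

```r
library(dplyr)

# Sample data rebuilt from the table printed in the question (assumed values)
usercsv_data <- data.frame(
  id_str = c("89797", "12387231231", "1234823432", "3483487344", "89797", "1234823432"),
  screen_name = c("A", "B", "C", "D", "A", "C"),
  location = c("FL", "CO", "NYC", "MI", "FL", "NYC"),
  verified = c("Y", "N", "N", "Y", "N", "Y"),
  created = c("Sun", "Mon", "Tue", "Sun", "Tue", "Fri"),
  friends_count = c(1, 2, 5, 787, 7, 5),
  followers_count = c(2, 4, 6, 897, 4, 3)
)

result <- usercsv_data %>%
  group_by(id_str, screen_name) %>%
  # Within each user, keep the single row whose two counts sum to the largest value;
  # all other columns of that row come along unchanged
  slice(which.max(friends_count + followers_count)) %>%
  ungroup()

print(result)
```

which.max returns the first maximum, so a tie inside a group keeps the earlier row. If the pair should instead be ordered by friends_count first and followers_count second, the sum is the wrong criterion and a sort on both columns would be needed.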