Can we use R's purrr accumulate function to summarise a dataset?
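I am trying to summarise a dataset with purrr::accumulate, building the summary up step by step by adding one new variable at a time.
That is, I want total_years, total_cnt and the percentage, computed incrementally as each variable is added. Below is what I tried, but I get an "unused argument" error and I am not sure why. Any ideas on how to do this?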
library(purrr)
library(dplyr)
library(tibble)
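# Note: the rows were garbled in this copy of the post; they are
# reconstructed here so that they reproduce the output shown in the answer.
data <- tribble(
  ~age_grp, ~gender, ~eligible, ~years, ~cnt,
  '50-60',  "F",     0,         4,      1,
  '75-80',  "M",     0,         7,      1,
  '80+',    "M",     1,         0,      1,
  '60-70',  "F",     1,         2,      1,
  '60-70',  "F",     1,         3,      1,
  '50-60',  "M",     0,         1,      1
)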
# `eligible` is binary
vars <- c('gender', 'eligible', 'years') %>%
  accumulate(function(x, y) paste(x, y, sep = "+"), .init = 'age_grp') %>%
  set_names(1:length(.))
enframe(vars, name = 'iteration', value = 'values')
fnct <- function(x) {
  data %>%
    summarise(
      total_years = sum(years),
      total_cnt = sum(cnt)) %>%
    mutate(
      percent = total_years / total_cnt)
}
vars %>%
  map(fnct, data = data)
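Answer
This is not exactly your approach, but you can get the same result by grouping. The "unused argument" error comes from map(fnct, data = data): map() forwards data = data to fnct, which only accepts x.
# Your function ignored its argument; I think you meant to use x instead of data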
fnct <- function(x) {
  x %>%
    summarise(
      total_years = sum(years),
      total_cnt = sum(cnt)) %>%
    mutate(
      percent = total_years / total_cnt)
}
# These seem to be your grouping variables
grp_vars <- c('age_grp', 'gender', 'eligible')
# This map() groups first on variable 1, then on variables 1 and 2, and so on
map(seq_along(grp_vars), ~ fnct(group_by_at(data, grp_vars[seq.int(.x)])))
This returns a list of tibbles. You can easily combine them with bind_rows(), or pull out whichever columns you need.
[[1]]
# A tibble: 4 x 4
  age_grp total_years total_cnt percent
  <chr>         <dbl>     <dbl>   <dbl>
1 50-60             5         2     2.5
2 60-70             5         2     2.5
3 75-80             7         1     7
4 80+               0         1     0

[[2]]
# A tibble: 5 x 5
# Groups:   age_grp [4]
  age_grp gender total_years total_cnt percent
  <chr>   <chr>        <dbl>     <dbl>   <dbl>
1 50-60   F                4         1     4
2 50-60   M                1         1     1
3 60-70   F                5         2     2.5
4 75-80   M                7         1     7
5 80+     M                0         1     0

[[3]]
# A tibble: 5 x 6
# Groups:   age_grp, gender [5]
  age_grp gender eligible total_years total_cnt percent
  <chr>   <chr>     <dbl>       <dbl>     <dbl>   <dbl>
1 50-60   F             0           4         1     4
2 50-60   M             0           1         1     1
3 60-70   F             1           5         2     2.5
4 75-80   M             0           7         1     7
5 80+     M             1           0         1     0
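
If you would still like accumulate() to build the grouping sets, as in the question, something along the following lines may work. This is only a sketch under a few assumptions: the data rows are reconstructed to match the output printed above, summarise_by is a hypothetical helper name that does not appear in the original post, and group_by(across(all_of(...))) is swapped in for group_by_at(), which requires dplyr 1.0.0 or later.

```r
library(purrr)
library(dplyr)
library(tibble)

# Assumption: rows reconstructed so that they reproduce the printed output.
data <- tribble(
  ~age_grp, ~gender, ~eligible, ~years, ~cnt,
  "50-60",  "F",     0,         4,      1,
  "50-60",  "M",     0,         1,      1,
  "60-70",  "F",     1,         2,      1,
  "60-70",  "F",     1,         3,      1,
  "75-80",  "M",     0,         7,      1,
  "80+",    "M",     1,         0,      1
)

grp_vars <- c("age_grp", "gender", "eligible")

# accumulate() with c() yields the cumulative grouping sets as a list:
# "age_grp"; then "age_grp", "gender"; then all three variables.
grp_sets <- accumulate(grp_vars, c)

# Hypothetical helper: summarise `df` within the groups named in `vars`.
summarise_by <- function(df, vars) {
  df %>%
    group_by(across(all_of(vars))) %>%
    summarise(
      total_years = sum(years),
      total_cnt   = sum(cnt),
      .groups     = "drop"
    ) %>%
    mutate(percent = total_years / total_cnt)
}

# One summary per grouping set, named e.g. "age_grp+gender".
results <- grp_sets %>%
  set_names(map_chr(grp_sets, paste, collapse = "+")) %>%
  map(~ summarise_by(data, .x))

# Stack the summaries; `grouping` records which set produced each row,
# and grouping columns absent from a given summary are filled with NA.
combined <- bind_rows(results, .id = "grouping")
```

Passing the data frame and the grouping variables as explicit arguments avoids the unused argument error from the question, because nothing extra is forwarded to a function that does not expect it.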