Can we use R's purrr accumulate function to summarise a dataset?

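I am trying to summarise a dataset with purrr::accumulate, building up the summary by adding one new grouping variable at a time.

That is, I want total_years, total_cnt and the percentage, computed step by step as each variable is added. Below is what I tried, but it fails with an "unused argument" error and I am not sure why. Any ideas on how to do this?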

library(purrr)
library(dplyr)
library(tibble)


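# The original literal was garbled; these rows are a reconstruction (an assumption)
# chosen to reproduce the summaries printed in the answer below
data <- tribble(
  ~age_grp, ~gender, ~eligible, ~years, ~cnt,
  '50-60',  "F",     0,         4,      1,
  '50-60',  "M",     0,         1,      1,
  '75-80',  "M",     0,         7,      1,
  '80+',    "M",     1,         0,      1,
  '60-70',  "F",     1,         2,      1,
  '60-70',  "F",     1,         3,      1
)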

#  `eligible` is binary

vars <- c('gender', 'eligible', 'years') %>%
  accumulate(function(x, y) paste(x, y, sep = "+"), .init = 'age_grp') %>%
  set_names(1:length(.))
enframe(vars, name = 'iteration', value = 'values')

fnct <- function(x) {
  data %>%
    summarise(
      total_years = sum(years),
      total_cnt   = sum(cnt)) %>%
    mutate(
      percent = total_years / total_cnt)
}

vars %>%
  map(fnct, data = data)

Solution

This is not exactly your approach, but you can get the same result by grouping.

# There is an error in your function: it pipes from `data` instead of its
# argument. I think you meant to use x.
fnct <- function(x) {
  x %>%
    summarise(
      total_years = sum(years),
      total_cnt   = sum(cnt)) %>%
    mutate(
      percent = total_years / total_cnt)
}
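The "unused argument" message itself can be reproduced in isolation. The following is a minimal sketch, assuming the cause is that `map()` forwards extra named arguments to the mapped function; `add_one` and `add_offset` are hypothetical helpers that do not appear in the original post.

```r
library(purrr)

# Hypothetical one-argument helper, standing in for the question's fnct(x)
add_one <- function(x) x + 1

# map() forwards `data = 10` to add_one(), which has no `data` parameter,
# so the call fails; the condition is captured here instead of stopping the script
failed <- tryCatch(
  map(1:2, add_one, data = 10),
  error = function(e) e
)
inherits(failed, "error")

# Declaring the extra parameter lets the forwarded argument be used
add_offset <- function(x, data) x + data
ok <- map(1:2, add_offset, data = 10)
```

In the question, `fnct` accepts only `x`, so `map(fnct, data = data)` runs into the same situation.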

# these seem to be your grouping variables
grp_vars <- c('age_grp','gender','eligible')

# this map will first group on variable 1, then on 1 and 2, and so on
map(seq_along(grp_vars), ~ fnct(group_by_at(data, grp_vars[seq.int(.x)])))
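Since the question asks about `accumulate()` specifically, the growing sets of grouping columns can also be built with it. This is a sketch rather than the answerer's method. It assumes dplyr 1.0 or later (for `across()` and the `.groups` argument) and uses illustrative rows chosen to be consistent with the summaries shown in this answer; `grp_sets` and `summaries` are names introduced here.

```r
library(purrr)
library(dplyr)
library(tibble)

# Illustrative rows (an assumption), consistent with the answer's summaries
data <- tribble(
  ~age_grp, ~gender, ~eligible, ~years, ~cnt,
  "50-60",  "F",     0,         4,      1,
  "50-60",  "M",     0,         1,      1,
  "75-80",  "M",     0,         7,      1,
  "80+",    "M",     1,         0,      1,
  "60-70",  "F",     1,         2,      1,
  "60-70",  "F",     1,         3,      1
)

grp_vars <- c("age_grp", "gender", "eligible")

# accumulate() with c() yields the cumulative sets of grouping columns:
# "age_grp", then age_grp + gender, then age_grp + gender + eligible
grp_sets <- accumulate(grp_vars, c)

# Summarise once per cumulative grouping set
summaries <- map(grp_sets, function(vars) {
  data %>%
    group_by(across(all_of(vars))) %>%
    summarise(
      total_years = sum(years),
      total_cnt   = sum(cnt),
      .groups     = "drop"
    ) %>%
    mutate(percent = total_years / total_cnt)
})
```

With `.groups = "drop"` each tibble comes back ungrouped, so it prints without a `# Groups:` header.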

This returns a list of tibbles. You can easily combine them with bind_rows() or pull out whichever columns you need.

[[1]]
# A tibble: 4 x 4
  age_grp total_years total_cnt percent
  <chr>         <dbl>     <dbl>   <dbl>
1 50-60             5         2     2.5
2 60-70             5         2     2.5
3 75-80             7         1     7  
4 80+               0         1     0  

[[2]]
# A tibble: 5 x 5
# Groups:   age_grp [4]
  age_grp gender total_years total_cnt percent
  <chr>   <chr>        <dbl>     <dbl>   <dbl>
1 50-60   F                4         1     4  
2 50-60   M                1         1     1  
3 60-70   F                5         2     2.5
4 75-80   M                7         1     7  
5 80+     M                0         1     0  

[[3]]
# A tibble: 5 x 6
# Groups:   age_grp,gender [5]
  age_grp gender eligible total_years total_cnt percent
  <chr>   <chr>     <dbl>       <dbl>     <dbl>   <dbl>
1 50-60   F             0           4         1     4  
2 50-60   M             0           1         1     1  
3 60-70   F             1           5         2     2.5
4 75-80   M             0           7         1     7  
5 80+     M             1           0         1     0
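
The list can be stacked into a single table with `bind_rows()`, as suggested above. This is a sketch under assumptions: the rows are illustrative and chosen to be consistent with the printed output, and the `grouping` column name and `combined` object are introduced here.

```r
library(purrr)
library(dplyr)
library(tibble)

# Illustrative rows (an assumption), consistent with the printed output
data <- tribble(
  ~age_grp, ~gender, ~eligible, ~years, ~cnt,
  "50-60",  "F",     0,         4,      1,
  "50-60",  "M",     0,         1,      1,
  "75-80",  "M",     0,         7,      1,
  "80+",    "M",     1,         0,      1,
  "60-70",  "F",     1,         2,      1,
  "60-70",  "F",     1,         3,      1
)

fnct <- function(x) {
  x %>%
    summarise(total_years = sum(years), total_cnt = sum(cnt)) %>%
    mutate(percent = total_years / total_cnt)
}

grp_vars <- c("age_grp", "gender", "eligible")

res <- map(seq_along(grp_vars), ~ fnct(group_by_at(data, grp_vars[seq.int(.x)])))

# Name each element after its grouping set, e.g. "age_grp+gender"
names(res) <- map_chr(seq_along(grp_vars), ~ paste(grp_vars[seq.int(.x)], collapse = "+"))

# Drop any leftover grouping, then stack; columns that a coarser
# level does not have (gender, eligible) are filled with NA
combined <- res %>%
  map(ungroup) %>%
  bind_rows(.id = "grouping")
```

The `grouping` column records which level of aggregation each row came from, so the stacked table can still be filtered back to a single level.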
