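How to pass column names to dplyr group_by and summarize inside a function in R
I am trying to write a function that takes a data frame and a variable name (or a list of variable names) and outputs summary statistics using the group_by and summarize functions. However, I keep getting the following error: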
Error: Problem with `mutate()` input `..1`.
x Input `..1` must be a vector,not a function.
i Input `..1` is `<fn>`.
Or this error:
Error in (function (x) : object 'ym' not found
The last error is that it cannot find the column named "value", which holds the data frame's values (after melting).
Here is my code:
tested <- melt(test_data,measure.vars = c('TA','PP','US','UD','UE','UG','UH','XR','RW','PA','TB4','TV2','TV4','TV8','TV20','TV40','MV2','MV4','MV8','MV20','MV40','VB'),id.vars = c('TmStamp','year','month','ym','day','hour'))
test_function <- function(data,col){
  stats <- data %>% group_by(!!col,variable) %>%
    summarize(N = length(value[!is.na(value)]),
              Missing = length(value[is.na(value)]),
              Per.Avail = (length(value[!is.na(value)])/(length(value[!is.na(value)]) + length(value[is.na(value)]))) * 100,
              Mean = mean(value,na.rm=TRUE),
              Median = median(value,na.rm=TRUE),
              Min = min(value,na.rm=TRUE),
              Max = max(value,na.rm=TRUE),
              Range = max(value,na.rm=TRUE) - min(value,na.rm=TRUE),
              Variance = var(value,na.rm=TRUE),
              Std.Dev = sd(value,na.rm=TRUE),
              Coef.Var = sd(value,na.rm=TRUE)/mean(value,na.rm=TRUE),
              SE = sd(value,na.rm=TRUE)/sqrt(length(value[!is.na(value)])),
              Skewness = e1071::skewness(value,na.rm=TRUE),
              Kurtosis = e1071::kurtosis(value,na.rm=TRUE),
              IQR = IQR(value,na.rm=TRUE),
              MAD = mad(value,na.rm=TRUE)
    )
  return(stats)
}
test_function(tested,ym)
Here is a small sample of the data. Note that "variable" is a column that will always be passed to group_by, so I decided to hard-code it.
structure(list(TmStamp = c("2019-10-01 12:00:00 AM","2019-10-01 12:05:00 AM","2019-10-01 12:10:00 AM","2019-10-01 12:15:00 AM","2019-10-01 12:20:00 AM","2019-10-01 12:25:00 AM","2019-10-01 12:30:00 AM","2019-10-01 12:35:00 AM","2019-10-01 12:40:00 AM","2019-10-01 12:45:00 AM","2019-10-01 12:50:00 AM","2019-10-01 12:55:00 AM","2019-10-01 01:00:00 AM","2019-10-01 01:05:00 AM","2019-10-01 01:10:00 AM","2019-10-01 01:15:00 AM","2019-10-01 01:20:00 AM","2019-10-01 01:25:00 AM","2019-10-01 01:30:00 AM","2019-10-01 01:35:00 AM"
),year = c(2019,2019,2019
),month = c(10,10,10),ym = c("10-2019","10-2019","10-2019"
),day = structure(c(18170,18170,18170),class = "Date"),hour = c(23L,0L,1L,1L),variable = structure(c(1L,.Label = c("TA","PP","US","UD","UE","UG","UH","XR","RW","PA","TB4","TV2","TV4","TV8","TV20","TV40","MV2","MV4","MV8","MV20","MV40","VB"),class = "factor"),value = c(6.008,6.013,5.915,5.777,5.727,5.679,5.653,5.591,5.479,5.353,5.299,5.249,5.256,5.171,5.01,4.901,4.716,4.487,4.397,4.25)),row.names = c(NA,20L),class = "data.frame")
How can I write this function so that it accepts one or more column names in group_by?
Solution
To make your function work, use e.g. {{col}} instead of !!col. To make the function work for more than one variable, you can use the ... notation, or you can pass symbols to group_by:
...
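The three approaches mentioned in this answer could look roughly like the sketch below. It is not the original answer's code: the data frame `toy` and the function names `summarise_by`, `summarise_by_many` and `summarise_by_names` are hypothetical, and the summary is cut down to three statistics to keep the example short. It assumes dplyr 1.0.0 or later (for `{{ }}` and the `.groups` argument) and uses `rlang::syms()` to turn column names given as strings into symbols.

```r
library(dplyr)

# Hypothetical toy data standing in for the melted data frame `tested`
toy <- data.frame(
  ym       = c("10-2019", "10-2019", "11-2019", "11-2019"),
  hour     = c(0L, 1L, 0L, 1L),
  variable = factor(c("TA", "TA", "TA", "TA")),
  value    = c(6.0, 5.0, NA, 4.0)
)

# Option 1: one bare column name, forwarded with {{ }}
summarise_by <- function(data, col) {
  data %>%
    group_by({{ col }}, variable) %>%
    summarize(
      N       = sum(!is.na(value)),
      Missing = sum(is.na(value)),
      Mean    = mean(value, na.rm = TRUE),
      .groups = "drop"
    )
}

# Option 2: any number of bare column names, forwarded with ...
summarise_by_many <- function(data, ...) {
  data %>%
    group_by(..., variable) %>%
    summarize(
      N       = sum(!is.na(value)),
      Missing = sum(is.na(value)),
      Mean    = mean(value, na.rm = TRUE),
      .groups = "drop"
    )
}

# Option 3: column names as strings, converted to symbols and spliced in
summarise_by_names <- function(data, cols) {
  data %>%
    group_by(!!!rlang::syms(cols), variable) %>%
    summarize(
      N       = sum(!is.na(value)),
      Missing = sum(is.na(value)),
      Mean    = mean(value, na.rm = TRUE),
      .groups = "drop"
    )
}

res_one   <- summarise_by(toy, ym)
res_many  <- summarise_by_many(toy, ym, hour)
res_names <- summarise_by_names(toy, c("ym", "hour"))
```

With `{{ col }}` or `...` the caller writes unquoted names, as in `test_function(tested,ym)`; the string-based variant is probably more convenient when the grouping columns are stored in a character vector.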