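How to fix date subtraction with sweep in R
I have a vector of dates that looks like this:
And a data frame with date columns, like this:
The number of rows in this data frame equals the length of new_first_dates.
I want to subtract new_first_dates from each row of the data frame. I tried using sweep, but could not get it to work.
Does anyone have any ideas?
Cheers
Data attached: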
structure(list(`>50 gatherings` = structure(c(18346,18342,18339,18341,18346,18355,18340,18338,18349,18343,18347,18345,18337,18351,21915,18339),class = "Date"),`>500 gatherings` = structure(c(18346,18335,18334,18336,`public schools` = structure(c(18341,18340),`restaurant dine-in` = structure(c(18339,18354,`entertainment/gym` = structure(c(18339,18350,18359,18344,`federal guidelines` = structure(c(18338,18338),`foreign travel ban` = structure(c(18333,18333,18333),`stay at home rollback` = structure(c(18377,18383,18398,18392,18379,18432,18413,18387,18415,18380,18403,18391,18424,18419,18409,18396,18396),`>50 gatherings rollback` = structure(c(18391,18426,18405,18429,18404,18384,21915),`>500 gatherings rollback` = structure(c(18405,`restaurant dine-in rollback` = structure(c(18405,18394,18390,18411,21915 ),`entertainment/gym rollback` = structure(c(18405,18377,State = c("AK","AL","AZ","CA","CO","FL","GA","IA","IL","IN","KS","MD","MI","MN","MO","MS","NC","NM","OH","OK","OR","PA","RI","SD","TN","TX","UT","VA","WI"),cluster = c(3,1,2,3,1)),row.names = c(NA,-29L),class = c("grouped_df","tbl_df","tbl","data.frame"),groups = structure(list(State = c("AK",.rows = list(1L,2L,3L,4L,5L,6L,7L,8L,9L,10L,11L,12L,13L,14L,15L,16L,17L,18L,19L,20L,21L,22L,23L,24L,25L,26L,27L,28L,29L)),class = c("tbl_df",.drop = TRUE))
structure(c(18338,18325,18332,18337),class = "Date")
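Solution
Some of the columns are not of class Date, so we need to exclude those before doing the subtraction. Also, judging by the structure in the OP's post, the dataset carries grouping attributes, which can be removed with ungroup: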
library(dplyr) # version >= 1.0.0
df1 %>%
  ungroup %>%
  mutate(across(-c(State, cluster), ~ . - as.Date(v1)))
Output:
# A tibble: 29 x 14
# `>50 gatherings` `>500 gathering… `public schools` `restaurant din… `entertainment/… `federal guidel… `foreign travel… `stay at home r…
# <drtn> <drtn> <drtn> <drtn> <drtn> <drtn> <drtn> <drtn>
# 1 8 days 8 days 3 days 1 days 1 days 0 days -5 days 39 days
# 2 -3 days -10 days -7 days -4 days 5 days -7 days -12 days 38 days
# 3 -1 days -1 days -2 days 1 days 1 days -2 days -7 days 58 days
# 4 16 days 16 days 16 days 12 days 12 days 13 days 8 days 67 days
# 5 14 days 3 days 13 days 10 days 7 days 6 days 1 days 47 days
# 6 5 days 5 days 4 days 8 days 8 days 4 days -1 days 98 days
# 7 6 days 6 days 7 days 9 days 9 days 5 days 0 days 50 days
# 8 22 days 22 days 22 days 6 days 6 days 5 days 0 days 3582 days
# 9 1 days 1 days 0 days 1 days 1 days -1 days -6 days 74 days
#10 7 days -4 days 2 days 0 days 0 days -1 days -6 days 48 days
# … with 19 more rows,and 6 more variables: `>50 gatherings rollback` <drtn>,`>500 gatherings rollback` <drtn>,`restaurant dine-in
# rollback` <drtn>,`entertainment/gym rollback` <drtn>,State <chr>,cluster <dbl>
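As a minimal sketch of the same idea on toy data, the Date columns can also be selected by class instead of by excluding State and cluster by name. The names toy, ref, and res and the values below are hypothetical and are not taken from the OP's data; the sketch assumes dplyr >= 1.0.0.

```r
library(dplyr)

# Hypothetical toy data: two Date columns and one non-Date column
toy <- tibble(
  start = as.Date(c("2020-03-10", "2020-03-15")),
  end   = as.Date(c("2020-04-01", "2020-04-20")),
  State = c("AK", "AL")
)

# Hypothetical reference dates, one per row of toy
ref <- as.Date(c("2020-03-01", "2020-03-05"))

# Subtract ref from every column that inherits from class "Date";
# element i of ref is subtracted from row i of each selected column
res <- toy %>%
  mutate(across(where(~ inherits(.x, "Date")), ~ .x - ref))

res
```

Selecting by class avoids having to list the non-Date columns by hand, at the cost of also touching any Date column that was not meant to be shifted.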
Or, if the OP prefers sweep:
df1[1:12] <- sweep(as.data.frame(df1[1:12]),1,as.Date(v1),'-')
Or with lapply in base R:
df1[1:12] <- lapply(df1[1:12],`-`,as.Date(v1))
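The lapply approach can be tried on a small example without relying on column positions such as 1:12. This is only a sketch: toy, ref, and is_date are hypothetical names, and the values are made up for illustration.

```r
# Hypothetical toy data: two Date columns and one character column
toy <- data.frame(
  a = as.Date(c("2020-03-10", "2020-03-15")),
  b = as.Date(c("2020-04-01", "2020-04-20")),
  State = c("AK", "AL")
)

# Hypothetical reference dates, one per row of toy
ref <- as.Date(c("2020-03-01", "2020-03-05"))

# Logical index of the columns that inherit from class "Date"
is_date <- vapply(toy, inherits, logical(1), what = "Date")

# Each Date column is a vector with one element per row, so subtracting
# ref from it lines up element i of ref with row i
toy[is_date] <- lapply(toy[is_date], `-`, ref)

toy
```

Subtracting one Date vector from another yields a difftime, which is why the columns print as "days" afterwards.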
If the dataset is large, you may want to use the data.table package for its modify-in-place semantics.
library(data.table)
dates_dataset = data.table(dates_dataset)
# Find the columns that are date class or otherwise select the columns you want to subtract
date_columns = names(dates_dataset)[sapply(dates_dataset, class) == "Date"]
# Use the := in-place assignment operator to update the columns
dates_dataset[, eval(date_columns) := {
  lapply(date_columns, function(x) {
    get(x) - new_first_dates
  })
}]
If you want to create new columns instead, you can change eval(date_columns) to a different set of column names.
The output is:
get_vars = c(">50 gatherings",">500 gatherings")
head( dates_dataset[,mget(get_vars) ] )
>50 gatherings >500 gatherings
1: 8 days 8 days
2: -3 days -10 days
3: -1 days -1 days
4: 16 days 16 days
5: 14 days 3 days
6: 5 days 5 days
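The option of writing the differences to new columns, mentioned above, might look like the following sketch. It uses .SD and .SDcols rather than get(), which is a substitution for illustration and not the answer's original code; toy, ref, date_cols, and the "_diff" suffix are all hypothetical.

```r
library(data.table)

# Hypothetical toy data: two Date columns and one character column
toy <- data.table(
  a = as.Date(c("2020-03-10", "2020-03-15")),
  b = as.Date(c("2020-04-01", "2020-04-20")),
  State = c("AK", "AL")
)

# Hypothetical reference dates, one per row of toy
ref <- as.Date(c("2020-03-01", "2020-03-05"))

# Names of the Date columns, and the names for the new difference columns
date_cols <- names(toy)[vapply(toy, inherits, logical(1), what = "Date")]
new_cols  <- paste0(date_cols, "_diff")

# Add the new columns by reference; the original Date columns are kept
toy[, (new_cols) := lapply(.SD, function(x) x - ref), .SDcols = date_cols]

toy
```

Wrapping new_cols in parentheses on the left-hand side of := tells data.table to use the character vector's contents as the column names.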