Date subtraction with sweep in R

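I have a vector of dates, new_first_dates, and a data frame with date columns. Both are included in the attached data below.

The number of rows in the data frame equals the length of new_first_dates.

I want to subtract new_first_dates from each column of the data frame, so that row i of every date column has new_first_dates[i] subtracted from it. I tried using sweep, but could not get it to work.

Does anyone have any ideas?

Cheers

Attached data (the data frame first, then new_first_dates):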

structure(list(`>50 gatherings` = structure(c(18346,18342,18339,18341,18346,18355,18340,18338,18349,18343,18347,18345,18337,18351,21915,18339),class = "Date"),`>500 gatherings` = structure(c(18346,18335,18334,18336,`public schools` = structure(c(18341,18340),`restaurant dine-in` = structure(c(18339,18354,`entertainment/gym` = structure(c(18339,18350,18359,18344,`federal guidelines` = structure(c(18338,18338),`foreign travel ban` = structure(c(18333,18333,18333),`stay at home rollback` = structure(c(18377,18383,18398,18392,18379,18432,18413,18387,18415,18380,18403,18391,18424,18419,18409,18396,18396),`>50 gatherings rollback` = structure(c(18391,18426,18405,18429,18404,18384,21915),`>500 gatherings rollback` = structure(c(18405,`restaurant dine-in rollback` = structure(c(18405,18394,18390,18411,21915    ),`entertainment/gym rollback` = structure(c(18405,18377,State = c("AK","AL","AZ","CA","CO","FL","GA","IA","IL","IN","KS","MD","MI","MN","MO","MS","NC","NM","OH","OK","OR","PA","RI","SD","TN","TX","UT","VA","WI"),cluster = c(3,1,2,3,1)),row.names = c(NA,-29L),class = c("grouped_df","tbl_df","tbl","data.frame"),groups = structure(list(State = c("AK",.rows = list(1L,2L,3L,4L,5L,6L,7L,8L,9L,10L,11L,12L,13L,14L,15L,16L,17L,18L,19L,20L,21L,22L,23L,24L,25L,26L,27L,28L,29L)),class = c("tbl_df",.drop = TRUE))

structure(c(18338,18325,18332,18337),class = "Date")

Solution

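Some columns are not of class Date, so they need to be excluded before doing the subtraction. Also, judging by the structure in the OP's post, the dataset carries grouping attributes (it is a grouped_df), which can be removed with ungroup. Below, df1 stands for the data frame and v1 for the new_first_dates vector.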

library(dplyr) # version >= 1.0.0
df1 %>%
  ungroup %>%
  mutate(across(-c(State,cluster),~ . - as.Date(v1)))

Output:

# A tibble: 29 x 14
#   `>50 gatherings` `>500 gathering… `public schools` `restaurant din… `entertainment/… `federal guidel… `foreign travel… `stay at home r…
#   <drtn>           <drtn>           <drtn>           <drtn>           <drtn>           <drtn>           <drtn>           <drtn>          
# 1  8 days            8 days          3 days           1 days           1 days           0 days           -5 days           39 days       
# 2 -3 days          -10 days         -7 days          -4 days           5 days          -7 days          -12 days           38 days       
# 3 -1 days           -1 days         -2 days           1 days           1 days          -2 days           -7 days           58 days       
# 4 16 days           16 days         16 days          12 days          12 days          13 days            8 days           67 days       
# 5 14 days            3 days         13 days          10 days           7 days           6 days            1 days           47 days       
# 6  5 days            5 days          4 days           8 days           8 days           4 days           -1 days           98 days       
# 7  6 days            6 days          7 days           9 days           9 days           5 days            0 days           50 days       
# 8 22 days           22 days         22 days           6 days           6 days           5 days            0 days         3582 days       
# 9  1 days            1 days          0 days           1 days           1 days          -1 days           -6 days           74 days       
#10  7 days           -4 days          2 days           0 days           0 days          -1 days           -6 days           48 days       
# … with 19 more rows,and 6 more variables: `>50 gatherings rollback` <drtn>,`>500 gatherings rollback` <drtn>,`restaurant dine-in
#   rollback` <drtn>,`entertainment/gym rollback` <drtn>,State <chr>,cluster <dbl>
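As a smaller illustration of the same idea, here is a minimal sketch on made-up data. The names toy_df, toy_v1 and is_date are hypothetical and not part of the original post. Instead of excluding State and cluster by name, it selects the Date columns by class, which may be more convenient when the non-date columns are not known in advance.

```r
library(dplyr)

# Hypothetical toy data standing in for df1 and v1
toy_df <- tibble(
  closed   = as.Date(c("2020-03-16", "2020-03-20", "2020-03-18")),
  reopened = as.Date(c("2020-05-01", "2020-05-15", "2020-05-04")),
  State    = c("AK", "AL", "AZ")
)
toy_v1 <- as.Date(c("2020-03-10", "2020-03-12", "2020-03-15"))

# Helper (an assumption, not from the post): TRUE for Date columns
is_date <- function(x) inherits(x, "Date")

# Subtract toy_v1 element-wise from every Date column; other columns are untouched
toy_diff <- toy_df %>%
  mutate(across(where(is_date), ~ .x - toy_v1))

toy_diff
```

Each Date column becomes a difftime column measured in days, while State is left as it was.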

Or, if the OP prefers sweep:

df1[1:12] <- sweep(as.data.frame(df1[1:12]),1,as.Date(v1),'-')
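Since sweep is designed for arrays, the Date class may not survive its internal reshaping of the STATS argument. A cautious variant, sketched here on made-up data (toy_df, toy_v1, day_matrix and day_diff are hypothetical names), converts the dates to numeric day counts first and sweeps over a plain numeric matrix. This is an alternative illustration, not the answer's exact method.

```r
# Hypothetical toy data standing in for the date columns and new_first_dates
toy_df <- data.frame(
  closed   = as.Date(c("2020-03-16", "2020-03-20", "2020-03-18")),
  reopened = as.Date(c("2020-05-01", "2020-05-15", "2020-05-04"))
)
toy_v1 <- as.Date(c("2020-03-10", "2020-03-12", "2020-03-15"))

# Convert each Date column to days since 1970-01-01; the result is a numeric matrix
day_matrix <- sapply(toy_df, as.numeric)

# MARGIN = 1 aligns STATS with rows: row i has toy_v1[i] subtracted in every column
day_diff <- sweep(day_matrix, 1, as.numeric(toy_v1), "-")

day_diff
```

The result is a numeric matrix of day differences rather than difftime columns, so it needs to be wrapped with as.data.frame if a data frame is wanted.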

Or in base R, using lapply:

df1[1:12] <- lapply(df1[1:12],`-`,as.Date(v1))
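The positional index 1:12 relies on the date columns coming first. A minimal sketch that picks the columns by class instead, on made-up data (toy_df, toy_v1 and is_date are hypothetical names):

```r
# Hypothetical toy data standing in for df1 and v1
toy_df <- data.frame(
  closed   = as.Date(c("2020-03-16", "2020-03-20", "2020-03-18")),
  reopened = as.Date(c("2020-05-01", "2020-05-15", "2020-05-04")),
  State    = c("AK", "AL", "AZ")
)
toy_v1 <- as.Date(c("2020-03-10", "2020-03-12", "2020-03-15"))

# Logical vector marking which columns are of class Date
is_date <- vapply(toy_df, inherits, logical(1), what = "Date")

# Replace only the Date columns with their difference from toy_v1
toy_df[is_date] <- lapply(toy_df[is_date], `-`, toy_v1)

toy_df
```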

If the dataset is large, you may want to look into the data.table package for its modify-in-place semantics.

library( data.table )
dates_dataset = data.table( dates_dataset )

# Find the columns that are date class or otherwise select the columns you want to subtract
date_columns = names(dates_dataset)[ sapply( dates_dataset,class ) == "Date" ]

# Use the := in-place assignment operator to update the columns
dates_dataset[,eval(date_columns) := {
    lapply( date_columns,function(x){
        get( x ) - new_first_dates
    } )
} ]

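If you want to create new columns instead of overwriting the existing ones, you can replace eval(date_columns) with a vector of new column names.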

The output is:

get_vars = c(">50 gatherings",">500 gatherings")
head( dates_dataset[,mget(get_vars) ] )

   >50 gatherings >500 gatherings
1:         8 days          8 days
2:        -3 days        -10 days
3:        -1 days         -1 days
4:        16 days         16 days
5:        14 days          3 days
6:         5 days          5 days
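The new-columns variant mentioned above could look roughly like the following sketch on made-up data. The names toy_dt, toy_first, date_cols and the "_diff" suffix are assumptions chosen for illustration. It uses .SD with .SDcols, a common data.table idiom, in place of get().

```r
library(data.table)

# Hypothetical toy data standing in for dates_dataset and new_first_dates
toy_dt <- data.table(
  closed   = as.Date(c("2020-03-16", "2020-03-20", "2020-03-18")),
  reopened = as.Date(c("2020-05-01", "2020-05-15", "2020-05-04")),
  State    = c("AK", "AL", "AZ")
)
toy_first <- as.Date(c("2020-03-10", "2020-03-12", "2020-03-15"))

# Pick the Date columns by class
date_cols <- names(toy_dt)[sapply(toy_dt, inherits, "Date")]

# Hypothetical naming scheme for the result columns
new_cols <- paste0(date_cols, "_diff")

# Add the differences as new columns by reference; the original columns are kept
toy_dt[, (new_cols) := lapply(.SD, function(x) x - toy_first), .SDcols = date_cols]

toy_dt
```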
