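How to re-aggregate data: from coarse to fine temporal resolution
I would like to follow up on the question answered by @r2evans: Interpolation in R: retrieving hourly values. I am trying to re-aggregate 3-hourly data to hourly. I am using the following small reproducible dataset ("tair"):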
tair <- structure(list(Year = rep(1991L, 8), Month = rep(1L, 8), DoY = c(rep(1L, 7), 2L), Hour = c(3L, 6L, 9L, 12L, 15L, 18L, 21L, 0L), Kobb = c(3.032776, 3.076996, 3.314209, 1.760345, 1.473724, 1.295837, 2.72229, 3.209503), DateTime = structure(c(662698800, 662709600, 662720400, 662731200, 662742000, 662752800, 662763600, 662774400), class = c("POSIXct", "POSIXt"), tzone = "UTC")), row.names = c(NA, 8L), class = "data.frame")
with the following code:
library(zoo)
newdt <- seq.POSIXt(tair$DateTime[1],tail(tair$DateTime,n=1),by='1 hour');newdt
tair_hourly<-data.frame(datetime=newdt,Kobb=approx(tair$DateTime,tair$Kobb,newdt)$y)
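It does what it is meant to do, i.e. I successfully interpolated the 3-hourly data to hourly. This works for variables such as temperature or radiation. However, for variables such as precipitation (stochastic), I would like to keep the value constant across the hourly steps within each 3-hour interval (perhaps dividing it by 3). I only need the data at hourly resolution, which is the reason for all of this.
Any ideas on how to implement this in the small code above?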
Solution
Two suggestions.
Base R
tair2_list <- lapply(seq_len(nrow(tair) - 1), function(ind) {
  # hourly timestamps from this row up to one second before the next row
  times <- seq(tair$DateTime[ind], tair$DateTime[ind + 1] - 1, by = "1 hour")
  data.frame(
    DateTime = times,
    # split the 3-hourly value evenly across the hourly steps
    NewKobb = rep(tair$Kobb[ind] / length(times), length(times)),
    # for reference only: the original value on the first hour of each interval
    Kobb = c(tair$Kobb[ind], rep(NA, length(times) - 1))
  )
})
tair2 <- do.call(rbind, tair2_list)
tair2
#               DateTime   NewKobb     Kobb
# 1  1991-01-01 03:00:00 1.0109253 3.032776
# 2  1991-01-01 04:00:00 1.0109253       NA
# 3  1991-01-01 05:00:00 1.0109253       NA
# 4  1991-01-01 06:00:00 1.0256653 3.076996
# 5  1991-01-01 07:00:00 1.0256653       NA
# 6  1991-01-01 08:00:00 1.0256653       NA
# 7  1991-01-01 09:00:00 1.1047363 3.314209
# 8  1991-01-01 10:00:00 1.1047363       NA
# 9  1991-01-01 11:00:00 1.1047363       NA
# 10 1991-01-01 12:00:00 0.5867817 1.760345
# 11 1991-01-01 13:00:00 0.5867817       NA
# 12 1991-01-01 14:00:00 0.5867817       NA
# 13 1991-01-01 15:00:00 0.4912413 1.473724
# 14 1991-01-01 16:00:00 0.4912413       NA
# 15 1991-01-01 17:00:00 0.4912413       NA
# 16 1991-01-01 18:00:00 0.4319457 1.295837
# 17 1991-01-01 19:00:00 0.4319457       NA
# 18 1991-01-01 20:00:00 0.4319457       NA
# 19 1991-01-01 21:00:00 0.9074300 2.722290
# 20 1991-01-01 22:00:00 0.9074300       NA
# 21 1991-01-01 23:00:00 0.9074300       NA
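tair$DateTime[ind+1] - 1
subtracts one second so that each new sequence stops before the next 3-hourly timestamp, which ensures we do not inadvertently keep it as the last element of the sequence.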
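To see what the one-second offset does, here is a small illustration. The two timestamps t0 and t1 are made up for this sketch and stand in for two consecutive rows of tair:

```r
# Two consecutive 3-hourly timestamps (illustrative values, not read from tair)
t0 <- as.POSIXct("1991-01-01 03:00:00", tz = "UTC")
t1 <- as.POSIXct("1991-01-01 06:00:00", tz = "UTC")

# Without the offset, the sequence also contains the next timestamp
with_end <- seq(t0, t1, by = "1 hour")

# Subtracting one second makes the sequence stop before 06:00
without_end <- seq(t0, t1 - 1, by = "1 hour")

length(with_end)              # 4
length(without_end)           # 3
format(without_end, "%H:%M")  # "03:00" "04:00" "05:00"
```

Without the offset, 06:00 would appear both at the end of one interval and at the start of the next, and each value would be divided by 4 instead of 3.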
tidyverse
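library(dplyr)
library(purrr)
library(tidyr)
tair %>%
  # for each row, build the hourly timestamps up to one second before the next row;
  # the last row has no successor, so it yields a single timestamp
  mutate(DateTime2 = purrr::map2(
    DateTime, lead(DateTime - 1, default = last(DateTime)),
    ~ tibble(DateTime2 = seq(.x, .y, by = "1 hour"))
  )) %>%
  unnest(DateTime2) %>%
  group_by(DateTime) %>%
  # split each 3-hourly value evenly over the hourly rows in its group
  mutate(NewKobb = Kobb / n()) %>%
  ungroup()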
# # A tibble: 22 x 8
# Year Month DoY Hour Kobb DateTime DateTime2 NewKobb
# <int> <int> <int> <int> <dbl> <dttm> <dttm> <dbl>
# 1 1991 1 1 3 3.03 1991-01-01 03:00:00 1991-01-01 03:00:00 1.01
# 2 1991 1 1 3 3.03 1991-01-01 03:00:00 1991-01-01 04:00:00 1.01
# 3 1991 1 1 3 3.03 1991-01-01 03:00:00 1991-01-01 05:00:00 1.01
# 4 1991 1 1 6 3.08 1991-01-01 06:00:00 1991-01-01 06:00:00 1.03
# 5 1991 1 1 6 3.08 1991-01-01 06:00:00 1991-01-01 07:00:00 1.03
# 6 1991 1 1 6 3.08 1991-01-01 06:00:00 1991-01-01 08:00:00 1.03
# 7 1991 1 1 9 3.31 1991-01-01 09:00:00 1991-01-01 09:00:00 1.10
# 8 1991 1 1 9 3.31 1991-01-01 09:00:00 1991-01-01 10:00:00 1.10
# 9 1991 1 1 9 3.31 1991-01-01 09:00:00 1991-01-01 11:00:00 1.10
# 10 1991 1 1 12 1.76 1991-01-01 12:00:00 1991-01-01 12:00:00 0.587
# # ... with 12 more rows
(I suspect there is a better way to do this ...)
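One possible shorter route, offered only as a sketch, is to reuse approx() from the question with method = "constant", which carries each 3-hourly value forward as a step function. It assumes that every interval spans exactly 3 hours and that each value belongs to the interval starting at its timestamp, as in the base R variant above. The data frame is rebuilt here with only the two columns needed so that the block runs on its own:

```r
# Rebuild the two relevant columns of the question's data
tair <- data.frame(
  DateTime = as.POSIXct("1991-01-01 03:00:00", tz = "UTC") + 3 * 3600 * (0:7),
  Kobb = c(3.032776, 3.076996, 3.314209, 1.760345,
           1.473724, 1.295837, 2.72229, 3.209503)
)

# Hourly grid that stops one second before the last timestamp, so the final
# 3-hourly value (whose interval is incomplete) is left out
newdt <- seq(tair$DateTime[1], tail(tair$DateTime, 1) - 1, by = "1 hour")

# method = "constant" with f = 0 repeats the left-hand value until the next knot
step <- approx(tair$DateTime, tair$Kobb, xout = newdt,
               method = "constant", f = 0)$y

# Assumption: regular 3-hour spacing, so each value is split into three equal parts
tair_hourly <- data.frame(DateTime = newdt, NewKobb = step / 3)

# The hourly values should add up to the first seven 3-hourly totals
all.equal(sum(tair_hourly$NewKobb), sum(tair$Kobb[-nrow(tair)]))
```

If the spacing between timestamps is irregular, the fixed divisor of 3 no longer holds and the per-interval length(times) from the base R variant is the safer choice.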