如何解决如何计算不同时间间隔内有多少个孩子?
我有一个数据集,其中包含幼儿园的孩子们什么时候送达和接送的信息。我想计算一下30分钟内有多少个孩子。因此,即有7.30-7.59 AM或16.00-16.29 PM。
在我的数据集中,有18个孩子,但这只是其中的一小部分。 问题是我在kommet_antal和afhentet_antal列表中的每个条目上都得到18的总和。
在底部,我写了一些有效的代码,但实际上并不是那么漂亮!
[1] "08:09:00" "09:00:00" "07:37:00" "07:51:00"
lst_afhentet
[1] "15:38:00" "15:19:00" "15:56:00" "14:24:00"
W1M_antal <- for(i in 1:nrow(df_W1M)){ #W1m is the dataframe with week1 and mondays selected
lst_kommet <- df_W1M$Kommet
lst_afhentet <- df_W1M$Afhentet
kommet_antal <- vector("list",21) #21 timeintervals,open from 6.30 to 17.00
afhentet_antal <- vector("list",21)
tid<- as.ITime("07:00") #initial start time,T1
n <- 0
for(k in 1:length(lst_kommet)){ #runs over when children is delivered
if(k < tid){ #If the time delivered is before time (tid),then count it
n <- n + 1
} else n <- n
for(t in 1:length(kommet_antal)){ #want to save number of kids delivered in the different intervals
kommet_antal[t] <- n
}
tid = tid + as.ITime("00:30") #add 30 min to the time so we have next time interval
}
tid <- as.ITime("07:00") #Do the same for pick up
m <- 0
for(a in 1:length(lst_afhentet)){
if(a < tid){
m <- m + 1
} else m <- m
for(l in 1:length(afhentet_antal)){ #Save number of kids in intervals until they are picked up
afhentet_antal[l] <- m
}
tid <- tid + as.ITime("00:30")
}
}
tid
total_antal <- vector("list",21)
total_antal <- as.numeric(kommet_antal) - as.numeric(afhentet_antal)
total_antal
This code works,and give me the correct number,but with datasets from a year with 5 days per week it is going to take a long time to count number of kids present.
T1 <- count(subset(Mandag,Kommet < "07:00")) - count(subset(Mandag,Afhentet <"07:00"))
T2 <- count(subset(Mandag,Kommet < "07:30")) - count(subset(Mandag,Afhentet <"07:30"))
T3 <- count(subset(Mandag,Kommet < "08:00")) - count(subset(Mandag,Afhentet <"08:00"))
T4 <- count(subset(Mandag,Kommet < "08:30")) - count(subset(Mandag,Afhentet <"08:30"))
T5 <- count(subset(Mandag,Kommet < "09:00")) - count(subset(Mandag,Afhentet <"09:00"))
T6 <- count(subset(Mandag,Kommet < "09:30")) - count(subset(Mandag,Afhentet <"09:30"))
T7 <- count(subset(Mandag,Kommet < "10:00")) - count(subset(Mandag,Afhentet <"10:00"))
T8 <- count(subset(Mandag,Kommet < "10:30")) - count(subset(Mandag,Afhentet <"10:30"))
T9 <- count(subset(Mandag,Kommet < "11:00")) - count(subset(Mandag,Afhentet <"11:00"))
T10 <- count(subset(Mandag,Kommet < "11:30")) - count(subset(Mandag,Afhentet <"11:30"))
T11 <- count(subset(Mandag,Kommet < "12:00")) - count(subset(Mandag,Afhentet <"12:00"))
T12 <- count(subset(Mandag,Kommet < "12:30")) - count(subset(Mandag,Afhentet <"12:30"))
T13 <- count(subset(Mandag,Kommet < "13:00")) - count(subset(Mandag,Afhentet <"13:00"))
T14 <- count(subset(Mandag,Kommet < "13:30")) - count(subset(Mandag,Afhentet <"13:30"))
T15 <- count(subset(Mandag,Kommet < "14:00")) - count(subset(Mandag,Afhentet <"14:00"))
T16 <- count(subset(Mandag,Kommet < "14:30")) - count(subset(Mandag,Afhentet <"14:30"))
T17 <- count(subset(Mandag,Kommet < "15:00")) - count(subset(Mandag,Afhentet <"15:00"))
T18 <- count(subset(Mandag,Kommet < "15:30")) - count(subset(Mandag,Afhentet <"15:30"))
T19 <- count(subset(Mandag,Kommet < "16:00")) - count(subset(Mandag,Afhentet <"16:00"))
T20 <- count(subset(Mandag,Kommet < "16:30")) - count(subset(Mandag,Afhentet <"16:30"))
T21 <- count(subset(Mandag,Kommet < "17:00")) - count(subset(Mandag,Afhentet <"17:00"))
#Laver output i dataframe
W <- c(rep("Week1",22*5),rep("Week2",rep("Week3",rep("Week4",22*5))
D <- c(rep("Monday",22*4),rep("Tuesday",rep("Wednesday",rep("Thursday",rep("Friday",22*4))
Time <- c(rep(1:22,20))
Value1 <- c(T1,T2,T3,T4,T5,T6,T7,T8,T9,T10,T11,T12,T13,T14,T15,T16,T17,T18,T19,T20,T21,rep(0,419))
Value <- do.call(rbind,Value1)
Output <- data.frame(W,D,Time,Value)
View(Output) ```
解决方法
lubridate
软件包提供了一些有用的功能来解决此类问题。
您没有可复制的示例,因此我的帮助必须是通用的,您可以将原则应用于您的情况。
设置上下文:日期和时间在编程和R中也非常复杂。您正在检查时间是否落在两个端点之间,以进行接送。函数%within%
负责这种类型的操作。一个示例是time_check %within% dropoff_pickup_intervals
。当时间检查假设07:30
介于学生的接送时间之间时,它将返回TRUE。
但是我们首先需要正确的日期和间隔格式。这是一些示例代码:(请注意,最初的大多数代码都是为了帮助创建示例,功能的实质是较低的)
library(lubridate)
# Make example data
# ----
# set seed for repeatable results for random processes
set.seed(20201023)
# set a time for start and ending for dropoffs and pickups using "ymd_hm" lubridate function
daystart <- ymd_hm("2020-01-01 07:00")
dayend <- ymd_hm("2020-01-01 16:00")
# Create a sequence of dates
time_set <- seq(daystart,dayend,by="min")
# data frame sampling times from
kinder <- data.frame(student_id=1:10,dropoff=sample(time_set,10),pickup=sample(time_set,10))
# remove dates that don't make sense
kinder <- kinder %>% filter(pickup > dropoff)
# --- Example data complete
# create time intervals for student arrival and leave times
dropoff_pickup <- interval(kinder$dropoff,kinder$pickup)
# Create sequence to check times every 30 minutes
time_checks <- seq(daystart,by="30 min")
# for every student check whether present at time checks
student_present <- sapply(k$intervals,function(x) time_checks %within% x)
# (Bonus: Make into a nice looking data frame)
df1 <- as.data.frame(t(student_present))
names(df1) <- substr(as.character(time_checks),12,16)
df1 <- cbind(k$id,df1)
df1
有关%within% function和lubridate的更多信息
,谢谢 我可以运行您的示例并且它可以正常工作,然后我想可以运行一个count函数来计算在不同时间间隔内出现的孩子的实际数量。
但是我无法在我的数据上运行它,我试图更改它,但是当我想使用间隔函数时,我遇到了麻烦。即使我的变量是字符类型。该文档说应该可以正常工作-如果我理解正确的话。
版权声明:本文内容由互联网用户自发贡献,该文观点与技术仅代表作者本人。本站仅提供信息存储空间服务,不拥有所有权,不承担相关法律责任。如发现本站有涉嫌侵权/违法违规的内容, 请发送邮件至 dio@foxmail.com 举报,一经查实,本站将立刻删除。