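How can I efficiently check for specific values in R and flag the rows that contain them?
I want to create a variable that flags whether one or more of several variables has a specific value.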
week Mon Tues Weds Thurs Fri Sat
1 jon jon jon jon mary mary
2 jane jane jane jane jane jane
3 mary mary mary mary mary jane
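I want to create a binary variable that flags, for each week, whether Mon, Weds, or Sat equals "jon" or "mary". Is there a way to do this without writing a long ifelse statement that checks each variable individually?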
week Mon Tues Weds Thurs Fri Sat flag
1 jon jon jon jon mary mary 1
2 jane jane jane jane jane jane 0
3 mary mary mary mary mary jane 1
I tried
df %>%
  rowwise() %>%
  mutate(flag = +any(c_across(Mon, Weds, Sat) %in% c("jon", "mary"))) %>%
  ungroup()
but I get an error
Error: Problem with `mutate()` input `flag`.
x unused arguments (Mon,Sat)
i Input `flag` is `+...`.
i The error occurred in row 1.
Solution
df %>%
mutate(flag = colSums(apply(cbind(Mon,Weds,Sat),1,`%in%`,c("jon","mary"))) > 0)
# week Mon Tues Weds Thurs Fri Sat flag
# 1 1 jon jon jon jon mary mary TRUE
# 2 2 jane jane jane jane jane jane FALSE
# 3 3 mary mary mary mary mary jane TRUE
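I think the problem with across is that it tries to operate on each column individually rather than aggregating over all of them. Let's try purrr::pmap instead (here its integer-returning variant, pmap_int):
library(dplyr)
library(purrr)
df %>%
  # pmap_int walks the three columns in parallel, one row at a time
  mutate(flag = pmap_int(list(Mon, Weds, Sat), ~ +any(c(...) %in% c("jon", "mary"))))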
# week Mon Tues Weds Thurs Fri Sat flag
# 1 1 jon jon jon jon mary mary 1
# 2 2 jane jane jane jane jane jane 0
# 3 3 mary mary mary mary mary jane 1
A third option (using c_across, as you requested):
df %>%
  rowwise() %>%
  # c_across takes a single column selection, so wrap the columns in c()
  mutate(flag = +any(c_across(c(Mon, Weds, Sat)) %in% c("jon", "mary"))) %>%
  ungroup()
# # A tibble: 3 x 8
# week Mon Tues Weds Thurs Fri Sat flag
# <int> <chr> <chr> <chr> <chr> <chr> <chr> <int>
# 1 1 jon jon jon jon mary mary 1
# 2 2 jane jane jane jane jane jane 0
# 3 3 mary mary mary mary mary jane 1
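The remark above about across working column by column suggests a related route that is not part of the original answers: dplyr 1.0.4 and later provide if_any, which applies a predicate to each selected column and then combines the per-column results with OR for each row. The following is a minimal sketch under that version assumption; the names targets and flagged are hypothetical, and the data frame is rebuilt from the table in the question.

```r
library(dplyr)

# Example data rebuilt from the table in the question
df <- data.frame(
  week  = 1:3,
  Mon   = c("jon", "jane", "mary"),
  Tues  = c("jon", "jane", "mary"),
  Weds  = c("jon", "jane", "mary"),
  Thurs = c("jon", "jane", "mary"),
  Fri   = c("mary", "jane", "mary"),
  Sat   = c("mary", "jane", "jane"),
  stringsAsFactors = FALSE
)

# Hypothetical name for the values to look for
targets <- c("jon", "mary")

# if_any() returns TRUE for a row when the predicate holds in at least
# one of the selected columns; unary + converts the logical to 0/1
flagged <- df %>%
  mutate(flag = +if_any(c(Mon, Weds, Sat), ~ .x %in% targets))

flagged$flag
```

Because if_any is vectorized over rows, this avoids rowwise() entirely, which may matter on larger data frames.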
It can be more efficient to skip rowwise, loop over the columns with map instead, and then combine the per-column results with reduce:
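library(purrr)
library(dplyr)
df %>%
  # one logical vector per column, OR-ed together, then converted to 0/1
  mutate(flag = map(select(., Mon, Weds, Sat), ~ .x %in% c("jon", "mary")) %>%
           reduce(`|`) %>% `+`)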
# week Mon Tues Weds Thurs Fri Sat flag
#1 1 jon jon jon jon mary mary 1
#2 2 jane jane jane jane jane jane 0
#3 3 mary mary mary mary mary jane 1
The corresponding option in base R is lapply/Reduce:
df$flag <- +(Reduce(`|`, lapply(df[c('Mon', 'Weds', 'Sat')], `%in%`, c("jon", "mary"))))
Data
df <- structure(list(week = 1:3, Mon = c("jon", "jane", "mary"), Tues = c("jon", "jane", "mary"), Weds = c("jon", "jane", "mary"), Thurs = c("jon", "jane", "mary"), Fri = c("mary", "jane", "mary"), Sat = c("mary", "jane", "jane")), class = "data.frame", row.names = c(NA, -3L))
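The lapply/Reduce idea can be wrapped in a small function so that the columns and the values to look for become arguments. This is only a sketch: flag_rows is a hypothetical helper name that does not come from the original answers, and the data frame is rebuilt from the table in the question.

```r
# Hypothetical helper (not from the original answers):
# returns 1 for rows where any of `cols` contains one of `values`, else 0
flag_rows <- function(data, cols, values) {
  # One logical vector per column: TRUE where the cell matches any value
  hits <- lapply(data[cols], `%in%`, values)
  # Combine the per-column vectors with element-wise OR, then convert to 0/1
  as.integer(Reduce(`|`, hits))
}

# Example data rebuilt from the table in the question
df <- data.frame(
  week  = 1:3,
  Mon   = c("jon", "jane", "mary"),
  Tues  = c("jon", "jane", "mary"),
  Weds  = c("jon", "jane", "mary"),
  Thurs = c("jon", "jane", "mary"),
  Fri   = c("mary", "jane", "mary"),
  Sat   = c("mary", "jane", "jane"),
  stringsAsFactors = FALSE
)

df$flag <- flag_rows(df, c("Mon", "Weds", "Sat"), c("jon", "mary"))

# A different column/value combination reuses the same helper
no_jon <- flag_rows(df, c("Fri", "Sat"), "jon")
```

Passing the column names as a character vector keeps the helper in base R and makes it straightforward to change which days are checked.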
Here is an approach using rowSums + Reduce:
df$flag <- +(rowSums(
  Reduce(
    `+`, lapply(
      c("jon", "mary"), `==`, df[c("Mon", "Weds", "Sat")]
    )
  )
) > 0)
which gives
week Mon Tues Weds Thurs Fri Sat flag
1 1 jon jon jon jon mary mary 1
2 2 jane jane jane jane jane jane 0
3 3 mary mary mary mary mary jane 1