How to extract words matching a regex pattern from strings
I have this vector of antibiotic names to use as a pattern
atb <- c("acefa","ampicilin","fortum")
and this data frame
DF1 <- structure(list(ID = 1:3,Text = c("Person 1 take acefa and ampicilin","fortum and acefa are antibiotics","Person 3 has no antibiotics but ampicilin")),class = "data.frame",row.names = c(NA,-3L))
DF1
ID Text
1 Person 1 take acefa and ampicilin
2 fortum and acefa are antibiotics
3 Person 3 has no antibiotics but ampicilin
I want this
DF1
ID Text atb
1 Person 1 take acefa and ampicilin c("acefa","ampicilin")
2 fortum and acefa are antibiotics c("fortum","acefa")
3 Person 3 has no antibiotics but ampicilin ampicilin
I tried
DF1%>%
mutate(atb = regmatches(Text,regexec(atb,Text)))
and
DF1%>%
mutate(atb = str_extract_all(Text,atb))
but neither works.
However, it does run with grepl, like this
DF1%>%
mutate(atb = grepl(atb,Text))
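Can I get a column containing the words from the pattern that appear in each text?
Solution
Collapse the names into a single regular expression and use strapplyc: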
library(dplyr)
library(gsubfn)
result <- DF1 %>%
mutate(atb = strapplyc(Text,paste(atb,collapse = "|")))
str(result$atb)
giving:
List of 3
$ : chr [1:2] "acefa" "ampicilin"
$ : chr [1:2] "fortum" "acefa"
$ : chr "ampicilin"
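If adding gsubfn as a dependency is not desirable, the same idea can probably be expressed in base R by adapting the regmatches attempt from the question. The sketch below assumes that the two changes needed are collapsing atb into one alternation pattern and using gregexpr (all matches per string) instead of regexec (first match only). The variable names pattern and matches, the \\b word boundaries, and perl = TRUE are additions for illustration and are not part of the original answer.

```r
atb <- c("acefa", "ampicilin", "fortum")
DF1 <- data.frame(
  ID = 1:3,
  Text = c("Person 1 take acefa and ampicilin",
           "fortum and acefa are antibiotics",
           "Person 3 has no antibiotics but ampicilin"),
  stringsAsFactors = FALSE
)

# Build one alternation pattern from the names.
# The \\b anchors (an assumption, not in the original answer) restrict
# matches to whole words.
pattern <- paste0("\\b(", paste(atb, collapse = "|"), ")\\b")

# gregexpr locates every match in each string;
# regmatches extracts them as a list with one character vector per row.
matches <- regmatches(DF1$Text, gregexpr(pattern, DF1$Text, perl = TRUE))

# Store the result as a list column, one element per row.
DF1$atb <- matches
str(DF1$atb)
```

As with the strapplyc version, atb becomes a list column, so rows with several matches keep all of them in the order they appear in the text.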