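How to stop separate_rows from producing quotes around the results
After the operation, separate_rows shows quotes (") around the resulting values. Is this normal behaviour? How can I prevent it within the same operation, without having to remove the quotes explicitly afterwards?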
df <- data.frame(a = c("c_1","c_2","c_3","c_4","c_5"),b = c("a (+1)","b (+2)","a (+2), c (+5)","e (+2)","b (+2), e (+5)"))
a b
1 c_1 a (+1)
2 c_2 b (+2)
3 c_3 a (+2), c (+5)
4 c_4 e (+2)
5 c_5 b (+2), e (+5)
df %>% tidyr::separate_rows(b,sep = ",",convert = TRUE)
# # A tibble: 7 x 2
# a b
# <chr> <chr>
# 1 c_1 "a (+1)"
# 2 c_2 "b (+2)"
# 3 c_3 "a (+2)"
# 4 c_3 " c (+5)"
# 5 c_4 "e (+2)"
# 6 c_5 "b (+2)"
# 7 c_5 " e (+5)"
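The question is not about splitting one row into multiple rows. My attempt above already shows that, and the code achieves it.
Solution
Those quotes are not what you think they are: they are not part of the data. It is just how the tidyverse prints a tibble, in an attempt to show that whitespace is present. See below: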
library(tidyverse)
x1 <- df %>% separate_rows(b,sep = ",",convert = TRUE)
x2 <- as.data.frame(x1)
x1
# # A tibble: 7 x 2
# a b
# <chr> <chr>
# 1 c_1 "a (+1)"
# 2 c_2 "b (+2)"
# 3 c_3 "a (+2)"
# 4 c_3 " c (+5)"
# 5 c_4 "e (+2)"
# 6 c_5 "b (+2)"
# 7 c_5 " e (+5)"
x2
# a b
# 1 c_1 a (+1)
# 2 c_2 b (+2)
# 3 c_3 a (+2)
# 4 c_3 c (+5)
# 5 c_4 e (+2)
# 6 c_5 b (+2)
# 7 c_5 e (+5)
identical(x1$b,x2$b)
# [1] TRUE
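To double-check that the quotes exist only in the printed output, the values can be inspected directly. The following is a minimal base R sketch; the vector b is typed in by hand to mirror the column shown above (an assumption made for illustration, it is not produced by separate_rows here).

```r
# Values copied by hand from the split column shown above (illustrative only)
b <- c("a (+1)", "b (+2)", "a (+2)", " c (+5)", "e (+2)", "b (+2)", " e (+5)")

# No element contains a literal double-quote character
has_quote <- grepl('"', b, fixed = TRUE)

# Some elements start with whitespace; that is what the tibble print highlights
has_leading_space <- grepl("^\\s", b)

# nchar() counts the leading space, but there are no quote characters to count
n_chars <- nchar(b)

print(data.frame(b, has_quote, has_leading_space, n_chars))
```

If has_quote is FALSE everywhere, there is nothing to strip from the data; only the leading whitespace remains to be dealt with.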
Add the whitespace after the comma to sep:
tidyr::separate_rows(df, b, sep = ",\\s", convert = TRUE)
# a b
# <chr> <chr>
#1 c_1 a (+1)
#2 c_2 b (+2)
#3 c_3 a (+2)
#4 c_3 c (+5)
#5 c_4 e (+2)
#6 c_5 b (+2)
#7 c_5 e (+5)
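The effect of including the whitespace in the separator can be seen with base strsplit alone. This is only a sketch: the pattern ",\\s*" (a comma followed by optional whitespace) is a slightly more tolerant variant chosen for illustration, not the pattern used in the answer above, and the input strings are made up.

```r
# Made-up inputs: one with a space after the comma, one without, one with no comma
b <- c("a (+2), c (+5)", "b (+2),e (+5)", "e (+2)")

# Splitting on the comma only leaves the leading space on the second piece
comma_only <- unlist(strsplit(b, ","))

# Splitting on comma plus optional whitespace consumes the space as part of the separator
comma_ws <- unlist(strsplit(b, ",\\s*"))

print(comma_only)
print(comma_ws)
```

Because the whitespace is consumed by the separator, no piece starts with a space, so a tibble has no reason to print quotes around the values.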
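Here is a data.table option:
library(data.table)
setDT(df)
# split b on a comma plus any following whitespace, once per value of a
df[, strsplit(b, ",\\s*"), by = a]
which gives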
a V1
1: c_1 a (+1)
2: c_2 b (+2)
3: c_3 a (+2)
4: c_3 c (+5)
5: c_4 e (+2)
6: c_5 b (+2)
7: c_5 e (+5)
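For completeness, the same reshaping can be sketched without any packages. This is an illustrative base R version, not something taken from the answers above; the names pieces and out are hypothetical, and the data frame is rebuilt with a space after each comma, as the printed output suggests.

```r
# Example data, assuming a space follows each comma in column b
df <- data.frame(
  a = c("c_1", "c_2", "c_3", "c_4", "c_5"),
  b = c("a (+1)", "b (+2)", "a (+2), c (+5)", "e (+2)", "b (+2), e (+5)"),
  stringsAsFactors = FALSE
)

# Split each value of b on a comma plus optional whitespace
pieces <- strsplit(df$b, ",\\s*")

# Repeat each a once per piece, then flatten the pieces into one column
out <- data.frame(
  a = rep(df$a, lengths(pieces)),
  b = unlist(pieces),
  stringsAsFactors = FALSE
)

print(out)
```

The result has one row per piece and no leading whitespace in b, so it prints without quotes whether it is kept as a data.frame or converted to a tibble.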