dplyr::mutate() -- how to ignore NULL nested lists within a tibble's nested lists?

Sometimes the nested lists inside my higher-level tibble are NULL. I want dplyr::mutate() to skip those lists.

Example

Recoding values to lowercase with underscores

Data

library(tibble)

df <-
  tibble(
    movies = c("The Shawshank Redemption", "The Godfather", "The Godfather: Part II", "The Dark Knight", "12 Angry Men"),
    continents = c("Asia", "Australia", "America", "Africa", "Europe"),
    michaels = c("Michael Jackson", "Michael Jordan", "Mike Tyson", "Michael Phelps", "Michael Schumacher")
  )

df <- add_column(df,ignore_me = list(NULL))

df

## # A tibble: 5 x 4
##   movies                   continents michaels           ignore_me
##   <chr>                    <chr>      <chr>              <list>   
## 1 The Shawshank Redemption Asia       Michael Jackson    <NULL>   
## 2 The Godfather            Australia  Michael Jordan     <NULL>   
## 3 The Godfather: Part II   America    Mike Tyson         <NULL>   
## 4 The Dark Knight          Africa     Michael Phelps     <NULL>   
## 5 12 Angry Men             Europe     Michael Schumacher <NULL>

Attempting to recode the values

library(dplyr) # version 1.0.2
library(snakecase)

df %>%
  mutate(across(everything(),snakecase::to_any_case))

Error: Problem with `mutate()` input `..1`.
x argument is not a character vector
i Input `..1` is `across(everything(),snakecase::to_any_case)`.
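
The error message points at the column type: across(everything(), ...) also passes the list-column ignore_me to the function, and that column is not a character vector. A minimal base R sketch of that check follows. It uses a plain named list called cols as a hypothetical stand-in for the tibble (a tibble is a list of columns underneath), so it does not depend on the df defined above.

```r
# Hypothetical stand-in for the tibble: a named list of columns.
cols <- list(
  movies    = c("The Godfather", "12 Angry Men"),
  ignore_me = list(NULL, NULL)   # list-column whose elements are all NULL
)

# Which columns are character vectors and therefore safe to recode?
is_chr <- vapply(cols, is.character, logical(1))
print(is_chr)
```

Only the columns flagged TRUE can be handed to a function that insists on character input; the list-column has to be excluded or handled separately.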

Obviously, either of the following works:

df %>% mutate(across(c(movies,continents,michaels),snakecase::to_any_case))
# or
df %>% mutate(across(-ignore_me,snakecase::to_any_case))

##   movies                   continents michaels           ignore_me
##   <chr>                    <chr>      <chr>              <list>   
## 1 the_shawshank_redemption asia       michael_jackson    <NULL>   
## 2 the_godfather            australia  michael_jordan     <NULL>   
## 3 the_godfather_part_ii    america    mike_tyson         <NULL>   
## 4 the_dark_knight          africa     michael_phelps     <NULL>   
## 5 12_angry_men             europe     michael_schumacher <NULL>

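In practice, however, I cannot know in advance which column or nested list will be NULL, so I need code that simply skips such NULL columns while still applying the function to the non-NULL ones.

EDIT

The original df above is easily handled by ignoring list columns altogether. But the data can often also look like this: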

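df_2 <-
  tibble(
    movies = c("The Shawshank Redemption", "The Godfather", "The Godfather: Part II", "The Dark Knight", "12 Angry Men"),
    continents = c("Asia", "Australia", "America", "Africa", "Europe"),
    michaels = c("Michael Jackson", "Michael Jordan", "Mike Tyson", "Michael Phelps", "Michael Schumacher")
  )

df_2 <- add_column(df_2,ignore_me = list(NULL))

# turn one randomly chosen column into a list-column (the output below shows michaels)
set.seed(2021) ; df_2 <- mutate(df_2,across(sample(colnames(df_2),1),as.list))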

df_2

##   movies                   continents michaels  ignore_me
##   <chr>                    <chr>      <list>    <list>   
## 1 The Shawshank Redemption Asia       <chr [1]> <NULL>   
## 2 The Godfather            Australia  <chr [1]> <NULL>   
## 3 The Godfather: Part II   America    <chr [1]> <NULL>   
## 4 The Dark Knight          Africa     <chr [1]> <NULL>   
## 5 12 Angry Men             Europe     <chr [1]> <NULL>

Solution

You can ignore all list columns:

library(dplyr)
df %>% mutate(across(where(Negate(is.list)),snakecase::to_any_case))

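Or, if not every list column will be NULL, you can specifically find the columns that hold only NULL values by checking the lengths of their elements, and skip the columns whose elements all have length 0.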

df %>% mutate(across(where(~!all(lengths(.) == 0)),snakecase::to_any_case))


#  movies                   continents michaels           ignore_me
#  <chr>                    <chr>      <chr>              <list>   
#1 the_shawshank_redemption asia       michael_jackson    <NULL>   
#2 the_godfather            australia  michael_jordan     <NULL>   
#3 the_godfather_part_ii    america    mike_tyson         <NULL>   
#4 the_dark_knight          africa     michael_phelps     <NULL>   
#5 12_angry_men             europe     michael_schumacher <NULL>
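
To see what the lengths() predicate does, here is a small base R sketch. The three vectors and the helper keep() are hypothetical illustrations, not part of the original answer; keep() just wraps the same test used inside where() above.

```r
all_null  <- list(NULL, NULL, NULL)         # like ignore_me
some_null <- list("Michael Jackson", NULL)  # hypothetical partly empty list-column
plain_chr <- c("Asia", "Europe")            # ordinary character column

# lengths() returns the length of each element; NULL has length 0.
print(lengths(all_null))
print(lengths(some_null))
print(lengths(plain_chr))

# Same test as in where(): keep a column unless every element is empty.
keep <- function(col) !all(lengths(col) == 0)

print(c(all_null = keep(all_null),
        some_null = keep(some_null),
        plain_chr = keep(plain_chr)))
```

Note that a partly empty list-column passes this test, so the function applied afterwards still has to cope with list input. A stricter predicate such as all(lengths(.) > 0) would skip that column as well.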

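For the modified df_2, we can use the following (the first line makes one list element hold two strings, to show that the nesting is preserved):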

df_2$michaels[[3]] <- c(df_2$michaels[[3]],df_2$michaels[[4]]) 

df_2 %>% 
  mutate(across(where(~all(lengths(.) > 0)),~relist(to_any_case(unlist(.)),.)))


#  movies                   continents michaels  ignore_me
#  <chr>                    <chr>      <list>    <list>   
#1 the_shawshank_redemption asia       <chr [1]> <NULL>   
#2 the_godfather            australia  <chr [1]> <NULL>   
#3 the_godfather_part_ii    america    <chr [2]> <NULL>   
#4 the_dark_knight          africa     <chr [1]> <NULL>   
#5 12_angry_men             europe     <chr [1]> <NULL>
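
The unlist()/relist() round trip is what keeps the nested shape intact. Below is a base R sketch of that idea on its own. The function to_snake_demo() is a simplified, hypothetical stand-in for snakecase::to_any_case() (it only lowercases and replaces non-alphanumeric runs with underscores), used here so the example runs without extra packages.

```r
# Simplified stand-in for snakecase::to_any_case(); NOT the real algorithm.
to_snake_demo <- function(x) gsub("[^a-z0-9]+", "_", tolower(x))

# A list-column-like object whose second element holds two strings.
michaels <- list("Michael Jackson", c("Mike Tyson", "Michael Phelps"))

flat    <- unlist(michaels)            # flatten to one character vector
recoded <- to_snake_demo(flat)         # recode every string at once
nested  <- relist(recoded, michaels)   # restore the original nesting

str(nested)
```

Because relist() uses the original object as a skeleton, each element keeps its original length, which is why row 3 still shows <chr [2]> in the output above.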

An option that adds purrr might be:

library(purrr)

df %>%
 mutate(across(where(~ !all(map_lgl(.,is.null))),to_any_case))

  movies                   continents michaels           ignore_me
  <chr>                    <chr>      <chr>              <list>   
1 the_shawshank_redemption asia       michael_jackson    <NULL>   
2 the_godfather            australia  michael_jordan     <NULL>   
3 the_godfather_part_ii    america    mike_tyson         <NULL>   
4 the_dark_knight          africa     michael_phelps     <NULL>   
5 12_angry_men             europe     michael_schumacher <NULL>

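For the second dataset:

# assumes every list element holds exactly one string (df_2 as first constructed),
# so unlist() returns one value per row
df_2 %>%
 mutate(across(where(~ !all(map_lgl(.,is.null))),~ to_any_case(unlist(.))))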

  movies                   continents michaels           ignore_me
  <chr>                    <chr>      <chr>              <list>   
1 the_shawshank_redemption asia       michael_jackson    <NULL>   
2 the_godfather            australia  michael_jordan     <NULL>   
3 the_godfather_part_ii    america    mike_tyson         <NULL>   
4 the_dark_knight          africa     michael_phelps     <NULL>   
5 12_angry_men             europe     michael_schumacher <NULL>
