How to fill a new variable based on conditions from two other variables

I want to create a variable that holds, for each country, the value from a specific year.

Country Year Price Price_2018
   A    2016  1      4
   A    2017  3      4
   A    2018  4      4
   B    2016  1      5
   B    2017  7      5
   B    2018  5      5
   C    2016  1      3
   C    2017  6      3
   C    2018  3      3

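As shown above, I want to create a variable Price_2018 that takes, for each country, the price in 2018 and fills every observation with that value. Can anyone help here? Thanks a lot in advance.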

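Solution

It looks like you want to create one variable per year, so you can reshape the data to wide format and then join it back using tidyverse functions: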

library(tidyverse)
#Code
newdf <- df %>% left_join(df %>%
  mutate(Year=paste0('Price_',Year)) %>%
  pivot_wider(names_from = Year,values_from=Price))

Output:

  Country Year Price Price_2016 Price_2017 Price_2018
1       A 2016     1          1          3          4
2       A 2017     3          1          3          4
3       A 2018     4          1          3          4
4       B 2016     1          1          7          5
5       B 2017     7          1          7          5
6       B 2018     5          1          7          5
7       C 2016     1          1          6          3
8       C 2017     6          1          6          3
9       C 2018     3          1          6          3

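If you only need 2018:

#Code 2
# Keep only the 2018 rows, turn them into a Price_2018 column, then join back by Country
newdf <- df %>% left_join(df %>% filter(Year==2018) %>%
  mutate(Year=paste0('Price_',Year)) %>%
  pivot_wider(names_from = Year,values_from=Price))

Output: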

  Country Year Price Price_2018
1       A 2016     1          4
2       A 2017     3          4
3       A 2018     4          4
4       B 2016     1          5
5       B 2017     7          5
6       B 2018     5          5
7       C 2016     1          3
8       C 2017     6          3
9       C 2018     3          3
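
The same result can probably be reached without reshaping. The following is a minimal sketch, not part of the original answer: it assumes each country has exactly one 2018 row, rebuilds the sample data from the table in the question, and uses a hypothetical helper name price_2018 for the lookup table.

```r
library(dplyr)

# Sample data reconstructed from the table in the question
df <- data.frame(
  Country = rep(c("A", "B", "C"), each = 3),
  Year    = rep(2016:2018, times = 3),
  Price   = c(1L, 3L, 4L, 1L, 7L, 5L, 1L, 6L, 3L)
)

# One row per country holding its 2018 price, renamed while selecting
# (price_2018 is an illustrative name, not from the original answer)
price_2018 <- df %>%
  filter(Year == 2018) %>%
  select(Country, Price_2018 = Price)

# Attach that value to every row of the same country
newdf <- df %>% left_join(price_2018, by = "Country")
```

Passing by = "Country" explicitly avoids the message dplyr prints when it has to guess the join columns.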

Data used:

#Data
df <- structure(list(Country = c("A","A","A","B","B","B","C","C","C"),Year = c(2016L,2017L,2018L,2016L,2017L,2018L,2016L,2017L,2018L),Price = c(1L,3L,4L,1L,7L,5L,1L,6L,3L)),row.names = c(NA,-9L),class = "data.frame")

We can use == (assuming each 'Country' has only unique 'Year' values)

library(dplyr)
df1 %>%
    group_by(Country) %>%
    mutate(Price_2018 = Price[Year == 2018])

-Output

# A tibble: 9 x 4
# Groups:   Country [3]
#  Country  Year Price Price_2018
#  <chr>   <int> <int>      <int>
#1 A        2016     1          4
#2 A        2017     3          4
#3 A        2018     4          4
#4 B        2016     1          5
#5 B        2017     7          5
#6 B        2018     5          5
#7 C        2016     1          3
#8 C        2017     6          3
#9 C        2018     3          3

Or with match

df1 %>%
     group_by(Country) %>%
     mutate(Price_2018 = Price[match(2018,Year)])
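
One practical difference between the two variants shows up when a country has no 2018 row. The sketch below uses hypothetical data (df2, in which country C lacks 2018) to illustrate that match() returns NA in that case, so the group is filled with NA. The == version would yield a zero-length vector for that group, which recent dplyr versions are expected to reject with a size error.

```r
library(dplyr)

# Hypothetical data: country C has no 2018 observation
df2 <- data.frame(
  Country = c("A", "A", "A", "C", "C"),
  Year    = c(2016L, 2017L, 2018L, 2016L, 2017L),
  Price   = c(1L, 3L, 4L, 1L, 6L)
)

out <- df2 %>%
  group_by(Country) %>%
  # match() gives the position of 2018 within the group, or NA if it is absent
  mutate(Price_2018 = Price[match(2018, Year)]) %>%
  ungroup()
```

Here Price_2018 is 4 for every row of country A and NA for every row of country C.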

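If we need to create columns for multiple 'Year' values, an easier option is map

library(dplyr)
library(purrr)
library(stringr)  # for str_c
# Build one Price_<year> column per unique year, then bind them onto df1
map_dfc(unique(df1$Year),~ df1 %>%  group_by(Country) %>%
    transmute(!! str_c('Price_',.x) :=  Price[Year == .x]) %>%
    ungroup %>%
    select(-Country)) %>%
    mutate(df1,.)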
#  Country Year Price Price_2016 Price_2017 Price_2018
#1       A 2016     1          1          3          4
#2       A 2017     3          1          3          4
#3       A 2018     4          1          3          4
#4       B 2016     1          1          7          5
#5       B 2017     7          1          7          5
#6       B 2018     5          1          7          5
#7       C 2016     1          1          6          3
#8       C 2017     6          1          6          3
#9       C 2018     3          1          6          3

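Data

df1 <- structure(list(Country = c("A","A","A","B","B","B","C","C","C"),Year = c(2016L,2017L,2018L,2016L,2017L,2018L,2016L,2017L,2018L),Price = c(1L,3L,4L,1L,7L,5L,1L,6L,3L)),row.names = c(NA,-9L),class = "data.frame")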
