Creating random ratios that add up to 1 by group
I have the following dataset:
panelID= c(1:50)
year= c(2005,2010)
country = c("A","B","C","D","E","F","G","H","I","J")
urban = c("A","C")
indust = c("D","F")
sizes = c(1,2,3,4,5)
n <- 2
library(AER)
library(data.table)
library(dplyr)
set.seed(123)
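DT <- data.table(
  country = rep(sample(country, length(panelID), replace = TRUE), each = n),
  year = c(replicate(length(panelID), sample(year, n))),
  sales = round(rnorm(10, 10, 10), 2),
  # Assumption: the remaining columns follow the same pattern as `country`.
  industry = rep(sample(indust, length(panelID), replace = TRUE), each = n),
  urbanisation = rep(sample(urban, length(panelID), replace = TRUE), each = n),
  size = rep(sample(sizes, length(panelID), replace = TRUE), each = n)
)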
DT <- DT %>%
group_by(country) %>%
mutate(base_rate = as.integer(runif(1,12.5,37.5))) %>%
group_by(country,year) %>%
mutate(taxrate = base_rate + as.integer(runif(1,-2.5,+2.5)))
DT <- DT %>%
group_by(country,year) %>%
mutate(vote = sample(c(0, 1), 1),
       # Assumption: votewon is a second 0/1 draw when vote == 1, otherwise 0.
       votewon = ifelse(vote == 1, sample(c(0, 1), 1), 0))
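I want to add a variable named ratio to this dataset. ratio should be a random number between 0 and 1, and the ratios within each country should sum to 1.
How would I create such a column? The only approach I can think of is to manually create vectors that add up to 1 and then sample from those vectors.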
Edit: the countries do not have equal numbers of entries:
> table(DT$country)
A B C D E F G H I J
6 10 14 6 14 10 10 8 10 12
ratio_sample_6 <- c(0.1,0.2,0.3,0.05,0.15,0.2)
DT[,ratio:=sample(ratio_sample_6,replace = FALSE),by="country"]
But I could not get even that to work. Any suggestions?
Solution
Draw random numbers, then normalize them within each country by dividing by the country's total.
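## data.table version
## The dplyr pipelines above return a grouped tibble, so convert back to a data.table first
DT <- as.data.table(ungroup(DT))
DT[, ratio := runif(.N)][, ratio := ratio / sum(ratio), by = "country"]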
## dplyr version
DT <- DT %>%
  group_by(country) %>%
  mutate(
    ratio = runif(n()),
    ratio = ratio / sum(ratio)
  )
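Why this works: if a country has draws u_1, ..., u_k, each ratio is u_i / (u_1 + ... + u_k), so the ratios in that country sum to 1 whatever the group size. As a quick sanity check, the sketch below applies the same normalization to a small made-up table. The table `toy`, its group sizes, and the seed are illustrative assumptions, not part of the original data.

```r
library(data.table)

set.seed(42)  # arbitrary seed, only for reproducibility

# Hypothetical toy table with unequal group sizes (2, 3 and 5 rows)
toy <- data.table(country = rep(c("A", "B", "C"), times = c(2, 3, 5)))

# Draw one uniform number per row, then divide by the group total
toy[, ratio := runif(.N)]
toy[, ratio := ratio / sum(ratio), by = country]

# Each country's ratios should now sum to 1 (up to floating-point error)
check <- toy[, .(total = sum(ratio), rows = .N), by = country]
print(check)
stopifnot(isTRUE(all.equal(check$total, rep(1, 3))))

# With at least two rows per group, every ratio lies strictly between 0 and 1
stopifnot(all(toy$ratio > 0 & toy$ratio < 1))
```

Note that a country with a single row would get a ratio of exactly 1, which is the only value that satisfies the sum constraint for a group of one.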