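How to extract coefficients from a linear model in R without repeating code?
I am using a Monte Carlo simulation to predict mpg in the mtcars data. I want to extract the coefficients for all variables in the data frame so that I can count how many times each car's predicted mpg is lower than another car's. For example, how many times is the predicted mpg of the Toyota Corona lower than that of the Datsun 710? Below is my initial code, which uses only two independent variables. I would like to extend it to use all variables in the data frame without having to list each one by hand. Is there a way to do this?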
library(pacman)
pacman::p_load(data.table,fixest,stargazer,dplyr,magrittr)
df <- mtcars
fit <- lm(mpg~cyl + hp,data = df)
fit$coefficients[1]
beta_0 = fit$coefficients[1] # Intercept
beta_1 = fit$coefficients[2] # Slope
beta_2 = fit$coefficients[3]
set.seed(1) # Seed
n = 1000 # Sample size
M = 500 # Number of experiments/iterations
estimates_DT <- do.call("rbind",lapply(1:M,function(i) {
  # Generate data
  U_i = rnorm(n,mean = 0,sd = 2) # Error
  X_i_1 = rnorm(n,mean = 5,sd = 5) # First independent variable
  X_i_2 = rnorm(n,sd = 5) # Second independent variable
  Y_i = beta_0 + beta_1*X_i_1 + beta_2*X_i_2 + U_i # Dependent variable
  # Formulate data.table
  data_i = data.table(Y = Y_i,X1 = X_i_1,X2 = X_i_2)
  # Run regressions
  ols_i <- fixest::feols(data = data_i,Y ~ X1 + X2)
  ols_i$coefficients
}))
estimates_DT <- setNames(data.table(estimates_DT),c("beta_0","beta_1","beta_2"))
compareCarEstimations <- function(carname1="Mazda RX4",carname2="Datsun 710") {
  # Predictor values (cyl, hp) for each of the two cars
  car1data <- mtcars[rownames(mtcars) == carname1,c("cyl","hp")]
  car2data <- mtcars[rownames(mtcars) == carname2,c("cyl","hp")]
  # One prediction per simulated coefficient set
  predsCar1 <- estimates_DT[["beta_0"]] + car1data$cyl*estimates_DT[["beta_1"]]+car1data$hp*estimates_DT[["beta_2"]]
  predsCar2 <- estimates_DT[["beta_0"]] + car2data$cyl*estimates_DT[["beta_1"]]+car2data$hp*estimates_DT[["beta_2"]]
  list(
    car1LowerCar2 = sum(predsCar1 < predsCar2),car2LowerCar1 = sum(predsCar1 >= predsCar2)
  )
}
compareCarEstimations("Toyota Corona","Datsun 710")
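Solution
I haven't worked through your example completely, but here is the core of it: build a matrix of random predictor variables and multiply it by the coefficient vector to get the predicted values.
Setup: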
df <- mtcars
fit <- lm(mpg~cyl + hp,data = df)
n <- 1000
beta <- coef(fit) ## parameter vector (includes intercept)
npar <- length(beta)
X <- matrix(rnorm(n*npar),ncol=npar) ## includes intercept
## scale columns by the corresponding sd
## (all identical in this case)
X <- sweep(X,MARGIN=2,FUN="*",STATS=rep(5,npar))
## shift columns by the corresponding mean
## (all identical in this case)
X <- sweep(X,MARGIN=2,FUN="+",STATS=rep(5,npar))
Y0 <- X %*% beta
Y <- rnorm(n,mean=Y0,sd=2)
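One way the idea above might be carried through to every variable in mtcars is sketched below. This is an illustration under assumptions, not the answer's own code: the names `simulate_once`, `estimates` and `compare_cars` are made up for this example, every simulated predictor is assumed to be drawn from a normal distribution with mean 5 and sd 5, the intercept column of the design matrix is fixed at 1 rather than drawn at random, and base R's `lm.fit` stands in for `fixest::feols` so that no formula has to be written out.

```r
# Sketch: generalise the simulation to every predictor in mtcars.
df   <- mtcars
fit  <- lm(mpg ~ ., data = df)   # "." uses all remaining columns as predictors
beta <- coef(fit)                # named vector: intercept plus one slope per column
npar <- length(beta)

set.seed(1)
n <- 1000   # sample size per experiment
M <- 500    # number of experiments

# Hypothetical helper: one simulated data set, refitted by OLS
simulate_once <- function() {
  # Assumption: every predictor is drawn from N(mean = 5, sd = 5)
  X <- matrix(rnorm(n * (npar - 1), mean = 5, sd = 5), ncol = npar - 1)
  X <- cbind(1, X)                               # intercept column of ones
  Y <- rnorm(n, mean = drop(X %*% beta), sd = 2) # true model plus noise
  est <- lm.fit(X, Y)$coefficients               # refit without a formula
  setNames(est, names(beta))
}

# M x npar matrix: one row of estimated coefficients per experiment
estimates <- t(replicate(M, simulate_once()))

# Hypothetical helper: count how often car1's prediction is below car2's
compare_cars <- function(car1, car2, est = estimates) {
  # Design-matrix rows (including the intercept) for the two cars
  mm <- model.matrix(fit)[c(car1, car2), , drop = FALSE]
  preds <- est %*% t(mm)                         # M x 2 matrix of predictions
  list(car1LowerCar2 = sum(preds[, 1] <  preds[, 2]),
       car2LowerCar1 = sum(preds[, 1] >= preds[, 2]))
}

compare_cars("Toyota Corona", "Datsun 710")
```

Because the coefficients stay in a matrix whose columns line up with the columns of `model.matrix(fit)`, adding or removing predictors only changes the formula passed to `lm`; nothing else has to be edited by hand.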