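How to manually compute instrumental variables with a tobit model in the second stage: different results after correcting for robust standard errors
Cross-posted on CrossValidated.
I am trying to correct my standard errors when using OLS in the first stage and a tobit model in the second stage. For some reason, I get different estimates after the correction, and I don't know why.
A few things to make clear. In this example, the IV estimate is only 0.05 off. In my actual data, it goes from 14% to 22%. The intercept and logSigma are also very different. I am not sure how much this matters, but I wanted to point it out.
Data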
set.seed(2)
a <- 2 # structural parameter of interest
b <- 1 # strength of instrument
rho <- 0.5 # degree of endogeneity
N <- 1000
z <- rnorm(N)
res1 <- rnorm(N)
res2 <- res1*rho + sqrt(1-rho*rho)*rnorm(N)
x <- z*b + res1
ys <- x*a + res2
d <- (ys>0) #dummy variable
y <- round(10-(d*ys))
random_variable <- rnorm(100,mean = 0,sd = 1)
library(data.table)
DT_1 <- data.frame(y,x,z,random_variable)
DT_2 <- structure(list(ID = c(1,2,3,4,5,6,7,8,9,10,11,12,13,14,15,16,17,18,19,20,21,22,23,24,25,26,27,28,29,30,31,32,33,34,35,36,37,38,39,40,41,42,43,44,45,46,47,48,49,50),year = c(1995,1995,2000,2005,2010,2015,2015),Group = c("A","A","B","C","C"),event = c(1,1,1),win_or_lose = c(-1,-1,0)),row.names = c(NA,-50L),class = c("tbl_df","tbl","data.frame"))
DT_1 <- setDT(DT_1)
DT_2 <- setDT(DT_2)
DT_2 <- rbind(DT_2,DT_2 [rep(1:50,19),])
sandboxA <- cbind(DT_1,DT_2)
sandboxB <- cbind(DT_1,DT_2)
Regression
require(AER)
require(censReg)
first_stage_ols <- lm(x ~ z + random_variable + year,data=sandboxA)
yhat <- first_stage_ols$fitted.values
attr(yhat,"class")[1] <- "numeric"
yhat <- as.data.frame(yhat)
yhat <- unlist(yhat)
dataset <- cbind(sandboxA,yhat)
form_2st_yhat <- as.formula("y ~ yhat + random_variable + year")
second_stage_tobit <<- AER::tobit(form_2st_yhat,left=0,right=10,data=sandboxA,na.action = na.exclude)
second_stage_tobit_b <<- censReg(form_2st_yhat,data=sandboxA)
summary(second_stage_tobit)
summary(second_stage_tobit_b)
Coefficients:
                Estimate Std. Error z value Pr(>|z|)
(Intercept)     33.34972   31.49314   1.059    0.290
yhat            -2.20394    0.12052 -18.287   <2e-16 ***
random_variable -0.03412    0.11147  -0.306    0.760
year            -0.01146    0.01571  -0.730    0.466
Log(scale)       1.08955    0.03628  30.035   <2e-16 ***

                Estimate Std. error t value Pr(> t)
(Intercept)     33.34972   31.49313   1.059   0.290
yhat            -2.20394    0.12052 -18.287  <2e-16 ***
random_variable -0.03412    0.11147  -0.306   0.760
year            -0.01146    0.01571  -0.730   0.466
logSigma         1.08955    0.03628  30.035  <2e-16 ***
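In equation form, the plug-in procedure above can be summarised roughly as follows. The symbols are notation introduced here for illustration and do not appear in the code: $w_i$ stands for `random_variable`, $t_i$ for `year`, and $y_i^{*}$ for the latent outcome behind the censored `y`.

```latex
\begin{aligned}
\text{First stage (OLS):}\quad
  & x_i = \pi_0 + \pi_1 z_i + \pi_2 w_i + \pi_3 t_i + v_i,
  \qquad \hat{x}_i = \hat{\pi}_0 + \hat{\pi}_1 z_i + \hat{\pi}_2 w_i + \hat{\pi}_3 t_i \\
\text{Second stage (tobit):}\quad
  & y_i^{*} = \beta_0 + \beta_1 \hat{x}_i + \beta_2 w_i + \beta_3 t_i + e_i,
  \qquad y_i = \min\{10,\ \max\{0,\ y_i^{*}\}\}
\end{aligned}
```

Because $\hat{x}_i$ is itself estimated, the standard errors printed by the second-stage fit treat it as fixed data and do not reflect the first-stage sampling error, which is what the correction in the next section is meant to address.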
Correcting the standard errors (Link)
reduced.form <- lm(x ~ z + random_variable + year,data=sandboxB)
summary(reduced.form)
consistent.tobit <- censReg(y~fitted(reduced.form)+residuals(reduced.form),data=sandboxB)
summary(consistent.tobit)
FUN <- function(x) {
reduced.form <- lm(x ~ z + random_variable + year,data=x)
censReg(y ~ fitted(reduced.form) + residuals(reduced.form))$estimate
}
library(censReg)
set.seed(42)
R <- 200
res <- t(replicate(R,FUN(sandboxB[sample(nrow(sandboxB),nrow(sandboxB),replace=TRUE),])))
library(matrixStats)
b <- consistent.tobit$estimate
SE <- colSds(res)
z <- consistent.tobit$estimate/SE
p <- 2 * pt(-abs(z),df = Inf)
ci <- colQuantiles(res,probs=c(.025,.975))
res <- signif(cbind(b,SE,p,ci),4)
res
                               b        SE       z p     2.5%  97.5%
(Intercept)             10.26000 0.0055910 1835.00 0  8.54300 8.5690
fitted(reduced.form)    -2.15500 0.0560100  -38.48 0 -0.09655 0.1241
residuals(reduced.form) -2.71400 0.0689700  -39.35 0 -0.12450 0.1522
logSigma                 0.05015 0.0009665   51.88 0  0.72270 0.7259
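For comparison, here is a minimal sketch of a bootstrap in which both stages are re-estimated on each resample and the two stages share the same controls and censoring limits. It is offered because the code above differs from the first tobit in three visible ways that may contribute to the gap: the corrected second stage drops `random_variable` and `year`, it uses the `censReg` default limits rather than `left=0, right=10`, and inside `FUN` the `censReg` call has no `data` argument, so `y` is looked up outside the resampled rows. The data frame `sim`, the control `w`, and the helper `cf_tobit` are hypothetical names made up for this sketch, and `AER::tobit` is used for the second stage as an assumption of convenience.

```r
library(AER)  # provides tobit()

# Simulated data in the spirit of the example above.
# `w` is a hypothetical exogenous control standing in for the extra regressors.
set.seed(2)
N <- 1000
z <- rnorm(N)
res1 <- rnorm(N)
res2 <- 0.5 * res1 + sqrt(1 - 0.5^2) * rnorm(N)
x <- z + res1
ys <- 2 * x + res2
sim <- data.frame(y = round(10 - (ys > 0) * ys), x = x, z = z, w = rnorm(N))

# Control-function tobit on one data set (hypothetical helper).
# Both stages read from `d`, and the second stage keeps the same
# exogenous control and the same censoring limits throughout.
cf_tobit <- function(d) {
  first <- lm(x ~ z + w, data = d)
  d$xhat <- fitted(first)
  d$vhat <- residuals(first)
  coef(tobit(y ~ xhat + vhat + w, left = 0, right = 10, data = d))
}

est <- cf_tobit(sim)

# Nonparametric bootstrap: resample rows, then rerun BOTH stages each time.
set.seed(42)
R <- 100
boot <- t(replicate(R, cf_tobit(sim[sample(nrow(sim), replace = TRUE), ])))

SE <- apply(boot, 2, sd)
ci <- t(apply(boot, 2, quantile, probs = c(0.025, 0.975)))
zval <- est / SE
signif(cbind(b = est, SE = SE, z = zval, p = 2 * pnorm(-abs(zval)), ci), 4)
```

With the data passed explicitly, the percentile intervals are computed from the same model as the point estimate, so one would expect them to bracket it. Whether this accounts for the whole discrepancy in the real data is not something this sketch can confirm.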