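How to fix a problem when plotting confidence intervals with the predict function
I am trying to plot the results of a linear model with a 95% confidence interval, as follows: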
fem:
+---------------------+------------+-------------------+--------+-------------------+--------------------+-------------------+---------------------+--------------------+
| "Sitio" | "Zona" | "ID" | "Wg_g" | "GSI" | "K" | "Klog" | "Wglog" | "GSIlog" |
+---------------------+------------+-------------------+--------+-------------------+--------------------+-------------------+---------------------+--------------------+
| "Tablas de Daimiel" | "Tablazo" | "L. gibbosus -27" | 2.692 | 9.15646258503401 | 0.0261364929449249 | -1.58275268748418 | 0.430075055551939 | 0.961727725139782 |
| "Tablas de Daimiel" | "Tablazo" | "L. gibbosus -29" | 1.162 | 6.24731182795699 | 0.0255144032921811 | -1.59321458410006 | 0.0652061280543119 | 0.795693183836396 |
| "Tablas de Daimiel" | "Tablazo" | "L. gibbosus -30" | 2.669 | 12.0769230769231 | 0.0257763522379356 | -1.58877854218143 | 0.426348573787508 | 1.0819563001024 |
| "Tablas de Daimiel" | "Tablazo" | "L. gibbosus -32" | 2.104 | 8.99145299145299 | 0.0248620897755187 | -1.60446236966734 | 0.323045735481701 | 0.953829878071559 |
| "Tablas de Daimiel" | "Tablazo" | "L. gibbosus -33" | 2.52 | 10.9565217391304 | 0.0259964554398148 | -1.58508586310111 | 0.401400540781544 | 1.03967270476395 |
| "Tablas de Daimiel" | "Tablazo" | "L. gibbosus -34" | 1.434 | 5.64566929133858 | 0.0278303401108612 | -1.5554814861788 | 0.156549151331781 | 0.751715434711843 |
| "Tablas de Daimiel" | "Tablazo" | "L. gibbosus -36" | 0.253 | 1.28426395939086 | 0.0244916125551217 | -1.61098261950021 | -0.596879478824182 | 0.108654295014225 |
| "Tablas de Daimiel" | "Tablazo" | "L. gibbosus -38" | 0.302 | 1.5978835978836 | 0.0259259259259259 | -1.58626572414473 | -0.519993057042849 | 0.203545138783906 |
| "Tablas de Daimiel" | "Tablazo" | "L. gibbosus -39" | 4.357 | 12.342776203966 | 0.0272580768461556 | -1.56450478843405 | 0.639187559935754 | 1.09141285454793 |
| "Tablas de Daimiel" | "Tablazo" | "L. gibbosus -40" | 2.276 | 8.75384615384615 | 0.026 | -1.58502665202918 | 0.357172257723034 | 0.942198909752216 |
| "Tablas de Daimiel" | "Tablazo" | "L. gibbosus -41" | 3.358 | 11.2307692307692 | 0.0244073065190365 | -1.6124801447312 | 0.52608069180203 | 1.0504095034776 |
| "Las Madres" | "Butrones" | "L.gibbosus -05" | 0.027 | 0.673316708229426 | 0.0176666769465286 | -1.75284513241212 | -1.56863623584101 | -0.171780608461195 |
| "Las Madres" | "Butrones" | "L.gibbosus -10" | 0.03 | 0.761421319796954 | 0.0157570376769167 | -1.80252542653517 | -1.52287874528034 | -0.118374967105912 |
| "Las Madres" | "Butrones" | "L.gibbosus -21" | 0.183 | 1.04214123006834 | 0.0192401878876662 | -1.71579069122865 | -0.737548910269571 | 0.0179265781603458 |
| "Las Madres" | "Butrones" | "L.gibbosus -23" | 1.143 | 5.94383775351014 | 0.0224289254993439 | -1.64919153162806 | 0.0580462303952817 | 0.774066946156802 |
| "Las Madres" | "Butrones" | "L.gibbosus -25" | 0.793 | 5.98490566037736 | 0.0194432052967693 | -1.71123213817768 | -0.100726812682396 | 0.777057309044777 |
| "Las Madres" | "Butrones" | "L.gibbosus -26" | 0.989 | 3.81853281853282 | 0.0153694695871428 | -1.81334112009634 | -0.0048037084028206 | 0.581896527515928 |
| "Las Madres" | "Butrones" | "L.gibbosus -27" | 0.069 | 0.745945945945946 | 0.0187611933335902 | -1.72673954113229 | -1.16115090926274 | -0.127292642001777 |
+---------------------+------------+-------------------+--------+-------------------+--------------------+-------------------+---------------------+--------------------+
lm1 <- lm(Wglog ~ Klog,data = fem)
newx <- seq(min(fem$Klog),max(fem$Klog),length.out = length(fem$Klog))
pred1 <- predict(lm1,new=data.frame(x=newx),level=.95,interval="confidence")
But the predicted values make no sense, because they are completely erratic and somehow out of order:
plot(fem$Wglog ~ fem$Klog,ylab = "Log gonad weight",xlab = "",xaxt = "n",ylim = c(-3,3),pch = c(4,20),font.lab = 2)
abline(lm1,col = "grey",lwd = 2) #a straight line with the actual coefficients of the model
lines(x = newx,y = as.vector(pred1$fit[,1]),col="blue",lty=2,lwd = 2) #this line,if I didn´t get it wrong,should be the same as the abline
lines(x = newx,y = as.vector(pred1$fit[,2]),col="black",lwd = 2) #these represent the confidence interval
lines(x = newx,y = as.vector(pred1$fit[,3]),lwd = 2)
you can check the resulting plot here
If I sort the predicted values they make more sense, but they still seem to be wrong:
lines(x = newx,y = sort(as.vector(pred1$fit[,1])),lwd = 2)
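lines(x = newx,y = sort(as.vector(pred1$fit[,2])),lwd = 2)
lines(x = newx,y = sort(as.vector(pred1$fit[,3])),lwd = 2)
Does anyone know what I am doing wrong? Thanks a lot!
Solution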
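This line
pred1 <- predict(lm1,new=data.frame(x=newx),level=.95,interval="confidence")
should be
pred1 <- predict(lm1,newdata=data.frame(Klog=newx),interval="confidence")
(level defaults to 0.95, so it can be omitted). The new data frame you supplied does not contain the variable used in the model (it has no column named Klog), so you were just getting the predictions for the original data rather than for newx. Also, the result of predict() with interval="confidence" is a matrix, not a list or data frame, so its columns are accessed as pred1[,1] rather than pred1$fit[,1]. Here is the code that works: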
fem <- tibble::tribble(
  ~"Klog",~"Wglog",
  -1.58275268748418,0.430075055551939,
  -1.59321458410006,0.0652061280543119,
  -1.58877854218143,0.426348573787508,
  -1.60446236966734,0.323045735481701,
  -1.58508586310111,0.401400540781544,
  -1.5554814861788,0.156549151331781,
  -1.61098261950021,-0.596879478824182,
  -1.58626572414473,-0.519993057042849,
  -1.56450478843405,0.639187559935754,
  -1.58502665202918,0.357172257723034,
  -1.6124801447312,0.52608069180203,
  -1.75284513241212,-1.56863623584101,
  -1.80252542653517,-1.52287874528034,
  -1.71579069122865,-0.737548910269571,
  -1.64919153162806,0.0580462303952817,
  -1.71123213817768,-0.100726812682396,
  -1.81334112009634,-0.0048037084028206,
  -1.72673954113229,-1.16115090926274
)
lm1 <- lm(Wglog ~ Klog,data = fem)
newx <- seq(min(fem$Klog),max(fem$Klog),length.out = length(fem$Klog))
pred1 <- predict(lm1,newdata=data.frame(Klog=newx),interval="confidence")
plot(fem$Wglog ~ fem$Klog,ylab = "Log gonad weight",xlab = "",xaxt = "n",ylim = c(-3,3),pch = c(4,20),font.lab = 2)
abline(lm1,col = "grey",lwd = 2) #a straight line with the actual coefficients of the model
lines(x = newx,y = pred1[,1],col="blue",lty=2,lwd = 2) #this line,if I didn't get it wrong,should be the same as the abline
lines(x = newx,y = pred1[,2],col="black",lwd = 2) #these represent the confidence interval
lines(x = newx,y = pred1[,3],lwd = 2)
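As a side note, here is a minimal base-R sketch of the same idea that also guards against the column-name mismatch before plotting. It is only one possible way to write it: the name newdat, the 100-point grid, and the use of matlines() are my own choices and do not come from the question, while the Klog and Wglog values are copied from the question's data.

```r
# Klog and Wglog values copied from the question's table
fem <- data.frame(
  Klog = c(-1.58275268748418, -1.59321458410006, -1.58877854218143,
           -1.60446236966734, -1.58508586310111, -1.5554814861788,
           -1.61098261950021, -1.58626572414473, -1.56450478843405,
           -1.58502665202918, -1.6124801447312, -1.75284513241212,
           -1.80252542653517, -1.71579069122865, -1.64919153162806,
           -1.71123213817768, -1.81334112009634, -1.72673954113229),
  Wglog = c(0.430075055551939, 0.0652061280543119, 0.426348573787508,
            0.323045735481701, 0.401400540781544, 0.156549151331781,
            -0.596879478824182, -0.519993057042849, 0.639187559935754,
            0.357172257723034, 0.52608069180203, -1.56863623584101,
            -1.52287874528034, -0.737548910269571, 0.0580462303952817,
            -0.100726812682396, -0.0048037084028206, -1.16115090926274)
)

lm1 <- lm(Wglog ~ Klog, data = fem)

# Hypothetical name: newdat. The grid size of 100 is an arbitrary choice.
newdat <- data.frame(Klog = seq(min(fem$Klog), max(fem$Klog), length.out = 100))

# Guard: every predictor in the model formula must be a column of newdat
predictors <- all.vars(delete.response(terms(lm1)))
stopifnot(all(predictors %in% names(newdat)))

pred <- predict(lm1, newdata = newdat, interval = "confidence", level = 0.95)

# predict() returns a numeric matrix with columns "fit", "lwr" and "upr"
stopifnot(is.matrix(pred), identical(colnames(pred), c("fit", "lwr", "upr")))

plot(Wglog ~ Klog, data = fem, ylab = "Log gonad weight", ylim = c(-3, 3))
# Draw the fitted line and both confidence bounds in a single call
matlines(newdat$Klog, pred, lty = c(1, 2, 2),
         col = c("blue", "black", "black"), lwd = 2)
```

Because newdat$Klog is already increasing, the three lines come out in order and no sort() is needed.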