How to solve a survey design problem in R
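I used the full_join function from the dplyr package to join five datasets. The first dataset has 6,165 rows and the second has 5,827 rows. The final joined dataset has 33,503 rows.
I used the following code to combine the five datasets.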
n2<-full_join(n96,n01)
n3<-full_join(n2,n06)
n4<-full_join(n3,n11)
nf<-full_join(n4,n16)
View(nf)
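When full_join is called without a `by` argument, it matches on every column name the two inputs share, so rows that differ in any shared column are kept separately. The following is only a minimal sketch with made-up toy data frames (`w1`, `w2` and all their values are hypothetical, not the survey files from this post) showing how the row count after such a join can be checked:

```r
library(dplyr)

# Hypothetical toy "waves"; column names are borrowed from the post, values are invented
w1 <- data.frame(v021 = c(101, 102), wgtv = c(0.4, 0.5), year = 1996)
w2 <- data.frame(v021 = c(101, 103, 104), wgtv = c(0.7, 0.8, 0.9), year = 2001)

# No `by` given: dplyr joins on all shared columns (v021, wgtv, year)
joined <- full_join(w1, w2)

# No row of w1 is identical to a row of w2 on the shared columns,
# so the result keeps every row of both inputs
n_joined <- nrow(joined)
n_expected <- nrow(w1) + nrow(w2)
```

Comparing `n_joined` with `n_expected` on the real data is a quick way to see whether any rows were matched across datasets.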
The final dataset looks like this. ...
v000 v005 age v021 v022 v023 v024 resi region v102 education pregnant v445 v501 v717 wealth occupation marital wgtv BMI obov
<chr> <dbl> <dbl+l> <dbl> <dbl> <dbl> <dbl+l> <dbl+l> <dbl+l> <dbl+l> <dbl+lbl> <dbl+lb> <dbl> <dbl+l> <dbl+l> <dbl+l> <dbl+lbl> <dbl+l> <dbl> <dbl> <fct>
1 NP3 412612 6 [40-~ 101 51 0 1 [pro~ 2 [rur~ 1 [pro~ 2 [rur~ 0 [no ed~ 0 [no o~ 2285 1 [mar~ 4 [agr~ 1 [poo~ 2 [cleric~ 1 [mar~ 0.413 22.8 0
2 NP3 412612 3 [25-~ 101 51 0 1 [pro~ 2 [rur~ 1 [pro~ 2 [rur~ 0 [no ed~ 0 [no o~ 2159 1 [mar~ 4 [agr~ 1 [poo~ 2 [cleric~ 1 [mar~ 0.413 21.6 0
3 NP3 412612 4 [30-~ 101 51 0 1 [pro~ 2 [rur~ 1 [pro~ 2 [rur~ 0 [no ed~ 0 [no o~ 2167 1 [mar~ 4 [agr~ 3 [mid~ 2 [cleric~ 1 [mar~ 0.413 21.7 0
4 NP3 412612 5 [35-~ 101 51 0 1 [pro~ 2 [rur~ 1 [pro~ 2 [rur~ 0 [no ed~ 0 [no o~ 2039 1 [mar~ 4 [agr~ 4 [ric~ 2 [cleric~ 1 [mar~ 0.413 20.4 0
5 NP3 412612 2 [20-~ 101 51 0 1 [pro~ 2 [rur~ 1 [pro~ 2 [rur~ 1 [prima~ 0 [no o~ 2163 1 [mar~ 4 [agr~ 3 [mid~ 2 [cleric~ 1 [mar~ 0.413 21.6 0
6 NP3 412612 5 [35-~ 101 51 0 1 [pro~ 2 [rur~ 1 [pro~ 2 [rur~ 0 [no ed~ 0 [no o~ 3785 1 [mar~ 4 [agr~ 2 [poo~ 2 [cleric~ 1 [mar~ 0.413 37.8 2
# ... with 6 more variables: over <fct>,age1 <dbl+lbl>,working_status <dbl+lbl>,education1 <dbl+lbl>,year <dbl>,stra <fct>
Because this is a complex survey dataset, I used a survey design.
svs<-svydesign(id=nf$v021,strata=nf$stra,nest=TRUE,weights=nf$wgtv,data=nf)
That worked. During the analysis, I ran into object-related errors. To fix them, I used the following code:
svs1 <-
  update(
    svs,
    one = 1,
    edu = factor(education, levels = c(0, 1, 2, 3),
                 labels = c("no edu", "primary", "secondary", "higher")),
    wealth = factor(wealth, levels = c(1, 2, 3, 4, 5),
                    labels = c("poorest", "poorer", "middle", "richer", "richest")),
    marital = factor(marital, levels = c(0, 1), labels = c("never married", "married")),
    occu = factor(occu, labels = c("not working", "professional/technical/manageral/clerial/sale/services", "agricultural", "skilled/unskilled manual")),
    age1 = factor(age1, labels = c("early", "mid", "late")),
    obov = factor(obov, levels = c(0, 1, 2), labels = c("normal", "overweight", "obese")),
    over = factor(over, labels = c("normal", "overweight/obese")),
    working_status = factor(working_status, labels = c("not working", "working")),
    education1 = factor(education1, labels = c("no education", "secondary/secondry+")),
    resi = factor(resi, levels = c(0, 1), labels = c("urban", "rural"))
  )
Now I get the following error:
Error in `[<-.data.frame`(`*tmp*`,newnames[j],value = c(3L,3L,:
replacement has 12674 rows,data has 33503
Could you suggest how to resolve this error?
Solution
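I'm not sure how the update function works, but it looks like you want to change the factor levels of some variables. You can make those changes in the nf data frame first and then pass it to the svydesign function.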
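One thing that may be worth checking before recoding (this is an observation from the printed column list, not something the post confirms): `occu` does not appear among the columns shown, while `occupation` does. When a name used in a recode is not a column of the data, R falls back to looking it up in the calling environment, and an object of a different length found there could produce a row-count mismatch like the one reported. A minimal sketch, using a hypothetical toy data frame `toy` and a hypothetical helper `missing_vars`:

```r
# Hypothetical toy data standing in for nf; only the column names matter here
toy <- data.frame(
  education  = c(0, 1, 2),
  occupation = c(0, 1, 2),
  wealth     = c(1, 3, 5)
)

# Hypothetical helper: which of the names used in a recode are not columns of the data?
missing_vars <- function(data, vars) setdiff(vars, names(data))

# Names referenced on the right-hand side of the recode
used <- c("education", "wealth", "occu")

# Names that would have to be looked up outside the data frame
not_found <- missing_vars(toy, used)
print(not_found)
```

With the real data, the same check would be `missing_vars(nf, ...)` with the variable names used in the recode; any name it returns is a candidate for the mismatch.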
library(dplyr)

# Recode in the data frame; levels must match the codes stored in each column
# (for example, the preview above shows resi coded 2 for rural)
nf <- nf %>%
  mutate(
    edu = factor(education, levels = c(0, 1, 2, 3),
                 labels = c("no edu", "primary", "secondary", "higher")),
    wealth = factor(wealth, levels = c(1, 2, 3, 4, 5),
                    labels = c("poorest", "poorer", "middle", "richer", "richest")),
    marital = factor(marital, levels = c(0, 1), labels = c("never married", "married")),
    occu = factor(occu, labels = c("not working", "professional/technical/manageral/clerial/sale/services", "agricultural", "skilled/unskilled manual")),
    age1 = factor(age1, labels = c("early", "mid", "late")),
    obov = factor(obov, levels = c(0, 1, 2), labels = c("normal", "overweight", "obese")),
    over = factor(over, labels = c("normal", "overweight/obese")),
    working_status = factor(working_status, labels = c("not working", "working")),
    education1 = factor(education1, labels = c("no education", "secondary/secondry+")),
    resi = factor(resi, levels = c(0, 1), labels = c("urban", "rural"))
  )
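Once the factors are recoded in the data frame, the design can be built from the updated data. The following is a minimal sketch, assuming the survey package is installed and using a hypothetical toy data frame `toy` in place of nf (all values are invented, and the 1/2 coding of `resi` is an assumption for this toy data). It uses the formula interface (`~v021` rather than `nf$v021`), so the variables are looked up in `data`:

```r
library(survey)

# Hypothetical toy data; column names follow the post, values are invented
toy <- data.frame(
  v021 = c(1, 1, 2, 2, 3, 3, 4, 4),          # cluster (PSU) id
  stra = factor(c(1, 1, 1, 1, 2, 2, 2, 2)),  # stratum
  wgtv = c(0.4, 0.4, 0.6, 0.6, 0.5, 0.5, 0.7, 0.7),
  resi = c(1, 2, 1, 2, 1, 2, 1, 2)
)

# Recode in the data frame first (codes 1/2 are an assumption for this toy data)
toy$resi <- factor(toy$resi, levels = c(1, 2), labels = c("urban", "rural"))

# Build the design afterwards; the formulas are evaluated inside `data`
des <- svydesign(id = ~v021, strata = ~stra, weights = ~wgtv,
                 nest = TRUE, data = toy)

# The recoded factor is now available to the analysis functions
tab <- svytable(~resi, des)
```

Building the design after the recode means every variable in it has the same number of rows as the data, which avoids the length mismatch that can arise when a design is updated piece by piece.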