Problem related to survey design in R

How to solve a problem related to survey design in R

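I joined five datasets using the full_join function from the dplyr package. The first dataset has 6,165 rows; the second has 5,827 rows. The final joined dataset has 33,503 rows. I used the following code to join the five datasets.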

n2 <- full_join(n96, n01)
n3 <- full_join(n2, n06)
n4 <- full_join(n3, n11)
nf <- full_join(n4, n16)
View(nf)

The final dataset looks like this:

 v000    v005     age  v021  v022  v023    v024    resi  region    v102 education pregnant  v445    v501    v717  wealth occupation marital  wgtv   BMI obov 
  <chr>  <dbl> <dbl+l> <dbl> <dbl> <dbl> <dbl+l> <dbl+l> <dbl+l> <dbl+l> <dbl+lbl> <dbl+lb> <dbl> <dbl+l> <dbl+l> <dbl+l>  <dbl+lbl> <dbl+l> <dbl> <dbl> <fct>
1 NP3   412612 6 [40-~   101    51     0 1 [pro~ 2 [rur~ 1 [pro~ 2 [rur~ 0 [no ed~ 0 [no o~  2285 1 [mar~ 4 [agr~ 1 [poo~ 2 [cleric~ 1 [mar~ 0.413  22.8 0    
2 NP3   412612 3 [25-~   101    51     0 1 [pro~ 2 [rur~ 1 [pro~ 2 [rur~ 0 [no ed~ 0 [no o~  2159 1 [mar~ 4 [agr~ 1 [poo~ 2 [cleric~ 1 [mar~ 0.413  21.6 0    
3 NP3   412612 4 [30-~   101    51     0 1 [pro~ 2 [rur~ 1 [pro~ 2 [rur~ 0 [no ed~ 0 [no o~  2167 1 [mar~ 4 [agr~ 3 [mid~ 2 [cleric~ 1 [mar~ 0.413  21.7 0    
4 NP3   412612 5 [35-~   101    51     0 1 [pro~ 2 [rur~ 1 [pro~ 2 [rur~ 0 [no ed~ 0 [no o~  2039 1 [mar~ 4 [agr~ 4 [ric~ 2 [cleric~ 1 [mar~ 0.413  20.4 0    
5 NP3   412612 2 [20-~   101    51     0 1 [pro~ 2 [rur~ 1 [pro~ 2 [rur~ 1 [prima~ 0 [no o~  2163 1 [mar~ 4 [agr~ 3 [mid~ 2 [cleric~ 1 [mar~ 0.413  21.6 0    
6 NP3   412612 5 [35-~   101    51     0 1 [pro~ 2 [rur~ 1 [pro~ 2 [rur~ 0 [no ed~ 0 [no o~  3785 1 [mar~ 4 [agr~ 2 [poo~ 2 [cleric~ 1 [mar~ 0.413  37.8 2    
# ... with 6 more variables: over <fct>,age1 <dbl+lbl>,working_status <dbl+lbl>,education1 <dbl+lbl>,year <dbl>,stra <fct>

Because this is a complex survey dataset, I used a survey design.

svs<-svydesign(id=nf$v021,strata=nf$stra,nest=TRUE,weights=nf$wgtv,data=nf)

That works. During the analysis I ran into an object-related error. To fix it, I used the following code:

svs1 <- update(
  svs,
  one = 1,
  edu = factor(education, levels = c(0, 1, 2, 3),
               labels = c("no edu", "primary", "secondary", "higher")),
  wealth = factor(wealth, levels = c(1, 2, 3, 4, 5),
                  labels = c("poorest", "poorer", "middle", "richer", "richest")),
  marital = factor(marital, levels = c(0, 1),
                   labels = c("never married", "married")),
  occu = factor(occu, labels = c("not working",
                                 "professional/technical/manageral/clerial/sale/services",
                                 "agricultural", "skilled/unskilled manual")),
  age1 = factor(age1, labels = c("early", "mid", "late")),
  obov = factor(obov, levels = c(0, 1, 2),
                labels = c("normal", "overweight", "obese")),
  over = factor(over, labels = c("normal", "overweight/obese")),
  working_status = factor(working_status, labels = c("not working", "working")),
  education1 = factor(education1, labels = c("no education", "secondary/secondry+")),
  resi = factor(resi, levels = c(0, 1), labels = c("urban", "rural"))
)

Now I get the following error:

Error in `[<-.data.frame`(`*tmp*`,newnames[j],value = c(3L,3L,: 
  replacement has 12674 rows,data has 33503

Could you please suggest how to resolve this error?

Solution

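I am not sure how the update function works, but it looks like you want to change the factor levels of some variables. You can do that in the nf data frame first, and then pass it to svydesign.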

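library(dplyr)
library(survey)
# Recode in the data frame first. Note: occu is not among the columns printed
# in the question, so it must actually exist in nf for this to work.
nf <- nf %>%
  mutate(
    edu = factor(education, levels = c(0, 1, 2, 3), labels = c("no edu", "primary", "secondary", "higher")),
    wealth = factor(wealth, levels = c(1, 2, 3, 4, 5), labels = c("poorest", "poorer", "middle", "richer", "richest")),
    marital = factor(marital, levels = c(0, 1), labels = c("never married", "married")),
    occu = factor(occu, labels = c("not working", "professional/technical/manageral/clerial/sale/services", "agricultural", "skilled/unskilled manual")),
    age1 = factor(age1, labels = c("early", "mid", "late")),
    obov = factor(obov, levels = c(0, 1, 2), labels = c("normal", "overweight", "obese")),
    over = factor(over, labels = c("normal", "overweight/obese")),
    working_status = factor(working_status, labels = c("not working", "working")),
    education1 = factor(education1, labels = c("no education", "secondary/secondry+")),
    resi = factor(resi, levels = c(0, 1), labels = c("urban", "rural"))  # printed rows show rural as 2; use c(1, 2) if resi is coded 1/2
  )
# Then build the design from the recoded data frame
svs <- svydesign(id = ~v021, strata = ~stra, nest = TRUE, weights = ~wgtv, data = nf)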
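
One possible reading of the error, offered as an assumption rather than something the question confirms: a name used in the recoding call (for example occu, which does not appear in the printed column list) may not be a column of nf, so R falls back to a same-named object of a different length in the workspace. The sketch below uses only base R and toy data. The data frame nf_toy, the leftover vector occu, and the helper missing_columns() are all hypothetical and exist only for illustration.

```r
# Toy stand-in for nf (the real data has 33,503 rows)
nf_toy <- data.frame(
  education  = c(0, 1, 2, 3, 1),
  occupation = c(2, 2, 4, 1, 2)
)

# Hypothetical leftover workspace object whose length differs from nrow(nf_toy)
occu <- c(1, 2, 3)

# Hypothetical helper: which of the names we plan to recode are not columns?
missing_columns <- function(data, wanted) {
  setdiff(wanted, names(data))
}

absent <- missing_columns(nf_toy, c("education", "occu"))
print(absent)

# Assigning the leftover object to the data frame fails with the same kind of
# "replacement has ... rows, data has ..." message as in the question
msg <- tryCatch(
  {
    nf_toy[["occu"]] <- factor(occu)
    "no error"
  },
  error = function(e) conditionMessage(e)
)
print(msg)
```

If a check like this reports a missing name on the real data, creating that column in nf (or removing the stale workspace object) before recoding would be the first thing to try.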
