How to resolve a preProcess function error in R
I am very new to R.
I am trying to run the following code:
imputedData <- preProcess(
  select(train, -SalePrice),
  method = c("center", "scale", "knnImpute", "nzv", "YeoJohnson")
)
#install.packages('RANN')
library(RANN)
trainTrans <- predict(imputedData,train)
I get this error:
Must subset rows with a valid subscript vector.
x Subscript nn$nn.idx must be a simple vector, not a matrix.
I have installed the caret package.
The training dataset is a table that I imported from a CSV file.
Solution
Here is the program.
Loading packages
knitr::opts_chunk$set(echo = TRUE,cache = TRUE,message = FALSE,warning = FALSE)
library(tidyverse)
library(caret)
library(GGally)
library(lattice)
library(corrplot)
library(factoextra)
library(FactoMineR)
library(magrittr)
theme_set(theme_bw())
set.seed(181019)
Data loading
train <- readr::read_csv("train.csv")
test <- readr::read_csv("test.csv")
Missing values
missing_threshold <- .4
# Flag each predictor whose share of missing values exceeds the threshold
is_too_scarce <- map_lgl(select(train, -SalePrice), ~ mean(is.na(.x)) > missing_threshold)
not_too_scarce <- names(is_too_scarce)[!is_too_scarce]
# Keep SalePrice plus the retained predictors in train; keep the same predictors in test
train %<>% select(SalePrice, all_of(not_too_scarce))
test %<>% select(all_of(not_too_scarce))
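The filtering step above can also be sketched in base R. This is a minimal illustration on a made-up data frame, not the poster's data: the names `toy`, `missing_share`, `keep`, and `toy_kept` are hypothetical, and only the 0.4 threshold comes from the code above.

```r
# Toy data frame with made-up columns, standing in for the real training table
toy <- data.frame(
  a = c(1, 2, NA, 4, 5),          # 1 of 5 values missing
  b = c(NA, NA, NA, 4, 5),        # 3 of 5 values missing
  c = c("x", "y", "z", NA, "w")   # 1 of 5 values missing
)

missing_threshold <- 0.4

# Share of missing values per column: is.na() yields a logical matrix,
# and colMeans() averages each column of it
missing_share <- colMeans(is.na(toy))

# Keep the columns whose missing share does not exceed the threshold
keep <- names(missing_share)[missing_share <= missing_threshold]
toy_kept <- toy[, keep, drop = FALSE]

print(missing_share)
print(keep)
```

Column `b` is the only one above the threshold here, so it is the only one dropped. The comparison mirrors the original logic: a column counts as too scarce only when its missing share is strictly greater than the threshold.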
Preprocessing
imputedData <- preProcess(
  select(train, -SalePrice),
  method = c("center", "scale", "knnImpute", "nzv", "YeoJohnson")
)
#install.packages('RANN')
library(RANN)
testTrans <- predict(imputedData,test)
trainTrans <- predict(imputedData,train)
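One more thing that may be worth checking, offered as an assumption rather than something stated in the answer: readr::read_csv returns a tibble, and the error text (a row subscript that must be a simple vector, not a matrix) reads like a tibble subsetting complaint. If the error persists, converting to a base data.frame before calling preProcess is a plausible workaround. The sketch below uses made-up toy data; `toy`, `toy_df`, `pp`, and `toy_trans` are hypothetical names.

```r
# Sketch only: toy data with made-up columns stands in for train.csv
library(caret)
library(RANN)   # nearest-neighbour search used by knnImpute

set.seed(181019)
toy <- tibble::tibble(
  x1 = rnorm(60),
  x2 = rnorm(60),
  x3 = rnorm(60)
)
toy$x1[c(5, 20)] <- NA   # introduce a couple of missing values

# Assumption: hand preProcess() a base data.frame instead of a tibble
toy_df <- as.data.frame(toy)

pp <- preProcess(toy_df, method = c("center", "scale", "knnImpute"))
toy_trans <- predict(pp, toy_df)

# Check whether any missing values remain after imputation
print(anyNA(toy_trans))
```

With the real data this would amount to calling as.data.frame() on train and test right after loading them, leaving the rest of the program unchanged.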