How to perform the same operations on multiple datasets in R
I am trying to write a function in R that performs the same set of operations on many different datasets.
When I run it, I get the following error message:
Error in checkForRemoteErrors(val) : 2 nodes produced errors; first error: File 'Adresser og distancer\kommune.csv' does not exist or is non-readable. getwd()=='C:/Users/KSAlb/OneDrive/Dokumenter'
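I need the function to change the file name on each run. Here it should insert Albertslund in place of kommune in the file name, perform the operations, write out a CSV file (changing "final_kommune.csv" to "final_Albertslund.csv"), clear the environment, and then move on to the next dataset, Alleroed.
Albertslund and Alleroed are just examples; in total I need to process 98 datasets.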
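The file-name substitution described above can be sketched as follows. This is only an illustration: `build_paths` is a hypothetical helper, and the two directory names are taken from the paths that appear in this post.

```r
# Hypothetical helper (not from the original post): build the input and
# output paths for one kommune by pasting its name into the file names.
build_paths <- function(kommune,
                        inpath = "Adresser og distancer",
                        outpath = "D:/Speciale/Analysedata") {
  list(
    infile  = file.path(inpath, paste0(kommune, ".csv")),
    outfile = file.path(outpath, paste0("final_", kommune, ".csv"))
  )
}

kommune_vec <- c("Albertslund", "Alleroed")
paths <- lapply(kommune_vec, build_paths)
names(paths) <- kommune_vec

print(paths$Albertslund$infile)
print(paths$Alleroed$outfile)
```

The error message above mentions the literal file name kommune.csv, which suggests the name was never substituted; building the path from a function argument, as in this sketch, avoids that.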
Solution
Maybe the code below will help. It is untested, since there is no data.
library(parallel)
library(dplyr)
library(data.table)
library(tidyr)
data_func <- function(kommune, inpath = "Adresser og distancer", turbines, outpath = "D:/Speciale/Analysedata") {
  # Build the input path from the kommune name
  filename <- paste0(kommune, ".csv")
  filename <- file.path(inpath, filename)
  # Load the address/distance dataset for this kommune
  distances <- fread(
    file = filename, header = TRUE, sep = ",",
    colClasses = c("longitude" = "character", "latitude" = "character", "min_distance" = "character", "distance_turbine" = "character", "id_turbine" = "character"),
    encoding = "Latin-1"
  )
  # Some cleaning of the data and construction of new variables
  # (omitted here; this step is assumed to create mock_final)
  # Write out the dataset as final_<kommune>.csv
  outfile <- paste0("final_", kommune, ".csv")
  outfile <- file.path(outpath, outfile)
  fwrite(mock_final, file = outfile, row.names = FALSE)
}
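# Set the working directory before starting the cluster so that the workers start in it
setwd("D:\\Speciale")
cluster <- makeCluster(2)
# Load data.table on each worker so fread() and fwrite() are available there
# (add dplyr and tidyr here as well if the cleaning step uses them)
clusterEvalQ(cluster, library(data.table))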
# Read turbines file just once
turbines <- fread(
  file = "turbines_DK.csv", sep = ",",
  colClasses = c("lon" = "character", "lat" = "character", "id_turbine" = "character", "total_height" = "character", "location" = "character"),
  encoding = "Latin-1"
)
kommune_vec <- c("Albertslund", "Alleroed")
do.call(rbind, parLapply(cl = cluster, kommune_vec, data_func, turbines = turbines))
# Shut the workers down once all datasets are processed
stopCluster(cluster)
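With 98 datasets, a single missing file would otherwise abort the whole run with an error like the one in the question. The sketch below shows one possible way to record failures per kommune instead. Both `safe_run` and `demo_func` are hypothetical names introduced here for illustration; `demo_func` merely stands in for `data_func` so the example runs without data.

```r
# Hypothetical wrapper: call a per-kommune worker function and return a
# one-row status table instead of stopping on the first error.
safe_run <- function(kommune, worker, ...) {
  tryCatch({
    worker(kommune, ...)
    data.frame(kommune = kommune, ok = TRUE, message = NA_character_)
  }, error = function(e) {
    data.frame(kommune = kommune, ok = FALSE, message = conditionMessage(e))
  })
}

# Stand-in for data_func, used only to demonstrate the wrapper:
# it fails for one kommune and succeeds for the other.
demo_func <- function(kommune) {
  if (kommune == "Alleroed") stop("file does not exist")
  invisible(NULL)
}

status <- do.call(
  rbind,
  lapply(c("Albertslund", "Alleroed"), function(k) safe_run(k, demo_func))
)
print(status)
```

Under the same assumptions, the parallel call could become `parLapply(cl = cluster, kommune_vec, safe_run, worker = data_func, turbines = turbines)`, and the rows with `ok == FALSE` would then list the kommuner that need another look.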