R function composition for replacing values in a data frame

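Given the following reproducible example:

My goal is to replace, row by row, the original values in the adjacent columns of a data frame with NA. I know this question has already been posted (in many variants), but I have not found a solution that uses the approach I am trying to follow, namely function composition.

In the reproducible example, the column that drives the replacement of original values with NA is column a.

This is what I have done so far.

The last code snippet is a failed attempt at what I am actually looking for...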

#-----------------------------------------------------------
# ifelse approach: it works, but...
# it's error prone, i.e. copying and pasting for every column invites mistakes

df <- data.frame(a = c(1, 2, NA), b = c(3, NA, 4), c = c(NA, 5, 6))
df

df$b <- ifelse(is.na(df$a), NA, df$b)
df$c <- ifelse(is.na(df$a), NA, df$c)

df

#--------------------------------------------------------
# extraction and substitution approach
# same drawback as above

df <- data.frame(a = c(1, 2, NA), b = c(3, NA, 4), c = c(NA, 5, 6))
df

df$b[is.na(df$a)]<-NA
df$c[is.na(df$a)]<-NA

df

#----------------------------------------------------------
# definition of a function
# it's a bit better, but still error prone because of the copy and paste

df <- data.frame(a = c(1, 2, NA), b = c(3, NA, 4), c = c(NA, 5, 6))
df

fix <- function(x, y) {
  ifelse(is.na(x), NA, y)
}

df$b<-fix(df$a,df$b)
df$c<-fix(df$a,df$c)

df

#------------------------------------------------------------
# this approach is not working as expected!
# the idea behind it is function composition:
# lapply applies the fix to the columns of the data frame

df <- data.frame(a = c(1, 2, NA), b = c(3, NA, 4), c = c(NA, 5, 6))
df

fix2<-function(x){
  x[is.na(x[1])]<-NA
  x
}

df[]<-lapply(df,fix2)

df

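Any help with this particular approach? I am stuck on how to properly write the replacement function passed to lapply.

Thanks

Solution

Try this function. The input is your original data set and the output is the cleaned data set:

Input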

df<-data.frame(a=c(1,2,NA),b=c(3,NA,4),c=c(NA,5,6))
> df
   a  b  c
1  1  3 NA
2  2 NA  5
3 NA  4  6

Function

fix <- function(df, var_x, list_y) {
  # set the columns named in list_y to NA wherever column var_x is NA
  df[is.na(df[, var_x]), list_y] <- NA
  return(df)
}

Output

fix(df,"a",c("b","c"))
   a  b  c
1  1  3 NA
2  2 NA  5
3 NA NA NA
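
As a quick sanity check, the single indexed assignment can be compared with the column-by-column substitution from the question. This is only a sketch on the example data. The name `fix_rows` is hypothetical and used here just to avoid masking `utils::fix`; its body is the same as the function above.

```r
df <- data.frame(a = c(1, 2, NA), b = c(3, NA, 4), c = c(NA, 5, 6))

# Hypothetical name; same body as the function above.
fix_rows <- function(df, var_x, list_y) {
  df[is.na(df[, var_x]), list_y] <- NA
  df
}

# Column-by-column version from the question, for comparison.
expected <- df
expected$b[is.na(expected$a)] <- NA
expected$c[is.na(expected$a)] <- NA

result <- fix_rows(df, "a", c("b", "c"))
isTRUE(all.equal(result, expected))
```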

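Using lexical closures

With a lexical closure you define a function that first builds the function you need. You can then use the returned function wherever you like.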

# given a driver column, the values of all other columns in a row should
# become NA if the driver column's value in that row is NA

# R's lexical scoping of function definitions makes this possible.

df <- data.frame(a = c(1, 2, NA), b = c(3, NA, 4), c = c(NA, 5, 6))
df

# whatever vector is given, its values should be changed
# according to the driver column's values

na_accustomizer <- function(df,driver_col) {
  ## Returns a function which will accustomize any vector/column
  ## to driver column's NAs
  function(vec) {
    vec[is.na(df[,driver_col])] <- NA
    vec
  }
}

df[] <- lapply(df,na_accustomizer(df,"a"))

df
##    a  b  c
## 1  1  3 NA
## 2  2 NA  5
## 3 NA NA NA

# 
# na_accustomizer(df,"a") returns
# 
#   function(vec) {
#     vec[is.na(df[,"a"])] <- NA
#     vec
#   }
# 
# which can then be used the way you want:
# df[] <- lapply(df,na_accustomizer(df,"a"))
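
It may help to see why `fix2` from the question does not do the job. Inside `lapply` the argument `x` is a single column, so `x[1]` is that column's first element rather than the driver column. The sketch below reproduces this on the example data and then applies the closure to the target columns only. The names `broken`, `fixed` and `targets` are just illustrative.

```r
df <- data.frame(a = c(1, 2, NA), b = c(3, NA, 4), c = c(NA, 5, 6))

# fix2 from the question: x is ONE column, so x[1] is its first element.
fix2 <- function(x) {
  x[is.na(x[1])] <- NA   # length-1 logical index, recycled over all of x
  x
}

broken <- df
broken[] <- lapply(df, fix2)
broken   # only column c is blanked, because c[1] happens to be NA

# The closure version carries the driver column with it.
# Repeated here so the sketch is self-contained.
na_accustomizer <- function(df, driver_col) {
  function(vec) {
    vec[is.na(df[, driver_col])] <- NA
    vec
  }
}

# Restricting the call to the target columns never touches 'a' at all.
targets <- c("b", "c")
fixed <- df
fixed[targets] <- lapply(df[targets], na_accustomizer(df, "a"))
fixed
```

The closure works because the returned function still sees `df` and `driver_col` from the call that created it, so each column passed by `lapply` can be compared against column a.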

Using regular functions

df <- data.frame(a = c(1, 2, NA), b = c(3, NA, 4), c = c(NA, 5, 6))
df

# define it for one column
overtake_NA <- function(df, driver_col, target_col) {
  df[, target_col] <- ifelse(is.na(df[, driver_col]), NA, df[, target_col])
  df
}

# define it for all columns of df
overtake_driver_col_NAs <- function(df, driver_col) {
  for (i in seq_len(ncol(df))) {
    df <- overtake_NA(df, driver_col, i)
  }
  df
}

overtake_driver_col_NAs(df,"a")
#    a  b  c
# 1  1  3 NA
# 2  2 NA  5
# 3 NA NA NA
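
If every column should follow the driver column, the loop can probably be replaced by one row-wise assignment. This is a minimal sketch, not part of the original answer. The helper name `blank_rows_where_na` is hypothetical, and it assumes that blanking the whole row (driver column included, where the value is NA anyway) is acceptable.

```r
df <- data.frame(a = c(1, 2, NA), b = c(3, NA, 4), c = c(NA, 5, 6))

# Hypothetical helper: blank every column in the rows where the driver is NA.
blank_rows_where_na <- function(df, driver_col) {
  df[is.na(df[[driver_col]]), ] <- NA
  df
}

res <- blank_rows_where_na(df, "a")
res
```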

Generalizing to any predicate function

driver_col_to_other_cols <- function(df, driver_col, pred) {
  ## copy the value of the driver column into the other columns of df
  ## wherever the predicate function (pred) is fulfilled.
  # define it for one column
  overtake_ <- function(df, driver_col, target_col, pred) {
    selectors <- do.call(pred, list(df[, driver_col]))
    if (deparse(substitute(pred)) != "is.na") {
      # this 'corrects' NAs which intrude into the selector vector
      # when driver_col has NAs. "is.na" is surely not the only possible
      # way to check for NA, so this edge case is not fully covered
      selectors[is.na(selectors)] <- FALSE
    }
    df[, target_col] <- ifelse(selectors, df[, driver_col], df[, target_col])
    df
  }
  for (i in seq_len(ncol(df))) {
    df <- overtake_(df, driver_col, i, pred)
  }
  df
}


driver_col_to_other_cols(df, "a", function(x) x == 1)
#    a  b c
# 1  1  1 1
# 2  2 NA 5
# 3 NA  4 6

## if the NA correction is not done, this would give
## (because of the NA in the selector vector):
#    a  b  c
# 1  1  1  1
# 2  2 NA  5
# 3 NA NA NA
## hence, when pred does not itself check for NA in 'a',
## these NA values have to be reverted to the original columns' values.

driver_col_to_other_cols(df, "a", is.na)
#    a  b  c
# 1  1  3 NA
# 2  2 NA  5
# 3 NA NA NA
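
The predicate version can also be combined with the closure idea from above. The sketch below is an assumption-laden variant, not the original answer's code: `make_overtaker` is a hypothetical name, and it treats an NA predicate result as FALSE for every predicate via `%in% TRUE`. That is harmless for `is.na`, which never returns NA, so no name check on the predicate is needed. As far as I can tell, the `deparse(substitute(pred))` test in the nested helper sees the symbol `pred` rather than `is.na`, so that branch runs for every predicate anyway.

```r
df <- data.frame(a = c(1, 2, NA), b = c(3, NA, 4), c = c(NA, 5, 6))

# Hypothetical factory: returns a function that copies the driver value
# into a vector wherever pred(driver) is TRUE.
make_overtaker <- function(driver, pred) {
  hit <- pred(driver) %in% TRUE   # NA and FALSE both become FALSE
  function(vec) {
    vec[hit] <- driver[hit]
    vec
  }
}

by_equal_one <- df
by_equal_one[] <- lapply(df, make_overtaker(df$a, function(x) x == 1))
by_equal_one

by_na <- df
by_na[] <- lapply(df, make_overtaker(df$a, is.na))
by_na
```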
