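How to work out the mean absolute deviation with a numpy ndarray
I work with a 4D numpy array and compute the statistics mean, median and std along the 3rd dimension of the array, as follows: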
import numpy as np
input_shape = (1,10,4)
n_sample =20
X = np.random.uniform(0,1,(n_sample,)+input_shape)
X.shape
(20, 1, 10, 4)
Then I compute the mean, median and std-dev this way:
sta_fuc = (np.mean,np.median,np.std)
stat = np.concatenate([func(X,axis=2,keepdims=True) for func in sta_fuc],axis=2)
So that:
stat.shape
(20, 1, 3, 4)
which holds the values of mean, median and std along that dimension.
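But then I want to add the mean absolute deviation (mad) of the columns, so that the statistics are (mean, median, std, mad), but it seems numpy does not provide a function for that. How can I add mad to my statistics?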
EDIT
The first answer, using the function defined there, i.e.:
def mad(arr,axis=None,keepdims=True):
    median = np.median(arr,axis=axis,keepdims=True)
    mad = np.median(np.abs(arr-median,axis=axis,keepdims=keepdims),
                    axis=axis,keepdims=keepdims)
    return mad
and then adding mad to the statistics produces an error, as follows:
sta_fuc = (np.mean,np.std,mad)
stat = np.concatenate([func(X,axis=2,keepdims=True) for func in sta_fuc],axis=2)
---------------------------------------------------------------------------
TypeError                                 Traceback (most recent call last)
<ipython-input-22-dab51665f952> in <module>()
      1 sta_fuc = (np.mean,np.std,mad)
----> 2 stat = np.concatenate([func(X,axis=2,keepdims=True) for func in sta_fuc],axis=2)

<ipython-input-21-84d735c8c516> in mad(arr,axis,keepdims)
      1 def mad(arr,axis=None,keepdims=True):
      2     median = np.median(arr,axis=axis,keepdims=True)
----> 3     mad = np.median(np.abs(arr-median,axis=axis,keepdims=keepdims),
      4                     axis=axis,keepdims=keepdims)
      5     return mad

TypeError: 'axis' is an invalid keyword to ufunc 'absolute'
EDIT-2
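Using the scipy function suggested by @Jussi also produces an error, as follows:
from scipy.stats import median_absolute_deviation as mad
sta_fuc = (np.mean,np.std,mad)
stat = np.concatenate([func(X,axis=2,keepdims=True) for func in sta_fuc],axis=2)
TypeError: median_absolute_deviation() got an unexpected keyword argument 'keepdims'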
Solution
Usually, when I see MAD it refers to the median absolute deviation. If that is what you want, it is available in the SciPy library as scipy.stats.median_absolute_deviation().
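As the error in the question shows, that SciPy function does not accept keepdims. One possible workaround, sketched here under assumptions, is to wrap any reduction that lacks keepdims and put the reduced axis back with np.expand_dims. The names with_keepdims and plain_median are hypothetical, and plain_median is only a NumPy stand-in so the sketch runs without SciPy; a SciPy MAD function could presumably be wrapped the same way.

```python
import numpy as np

def with_keepdims(func):
    """Wrap a reduction that has no `keepdims` parameter (hypothetical helper)."""
    def wrapped(arr, axis, keepdims=False):
        out = func(arr, axis=axis)
        # Re-insert the reduced axis as a length-1 dimension if requested.
        return np.expand_dims(out, axis) if keepdims else out
    return wrapped

# Stand-in for a reduction without keepdims (assumption: a SciPy MAD
# function taking `axis` could be passed to with_keepdims instead).
def plain_median(arr, axis):
    return np.median(arr, axis=axis)

X = np.random.uniform(0, 1, (20, 1, 10, 4))
wrapped_median = with_keepdims(plain_median)
print(wrapped_median(X, axis=2, keepdims=True).shape)  # (20, 1, 1, 4)
```

The wrapped function has the same call shape as np.mean or np.std with keepdims=True, so it can sit in the same tuple of functions that is fed to np.concatenate.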
It is also easy to write a suitable function yourself.
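If the mean absolute deviation (about the mean) is what is wanted, as the question title literally says, a minimal sketch could look like the following. The name mean_abs_dev is hypothetical and not a NumPy function; the input array is made up for illustration.

```python
import numpy as np

def mean_abs_dev(arr, axis=None, keepdims=False):
    """Mean absolute deviation about the mean (hypothetical helper)."""
    # keepdims=True here so the centre broadcasts against arr
    center = np.mean(arr, axis=axis, keepdims=True)
    return np.mean(np.abs(arr - center), axis=axis, keepdims=keepdims)

a = np.array([[1.0, 2.0, 3.0, 4.0, 10.0]])
# mean is 4, absolute deviations are 3, 2, 1, 0, 6, so their mean is 2.4
print(mean_abs_dev(a, axis=1))
```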
Edit: here is a MAD function with a keepdims argument:
import numpy as np

def mad(data, axis=None, scale=1.4826, keepdims=False):
    """Median absolute deviation (MAD).

    Defined as the median absolute deviation from the median of the data. A
    robust alternative to stddev. Results should be identical to
    scipy.stats.median_absolute_deviation(), which does not take a keepdims
    argument.

    Parameters
    ----------
    data : array_like
        The data.
    axis : numpy axis spec, optional
        Axis or axes along which to compute MAD.
    scale : float, optional
        Scaling of the result. By default, it is scaled to give a consistent
        estimate of the standard deviation of values from a normal
        distribution.
    keepdims : bool, optional
        If this is set to True, the axes which are reduced are left in the
        result as dimensions with size one.

    Returns
    -------
    ndarray
        The MAD.
    """
    # keep dims here so that broadcasting works
    med = np.median(data, axis=axis, keepdims=True)
    abs_devs = np.abs(data - med)
    # reduce along the same axis; keepdims is passed through as requested
    return scale * np.median(abs_devs, axis=axis, keepdims=keepdims)
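Plugged into the code from the question, this might be used as follows. This is a compact sketch: the function body is condensed from the one above, and the random X simply mirrors the shape used in the question.

```python
import numpy as np

def mad(data, axis=None, scale=1.4826, keepdims=False):
    # condensed version of the function above
    med = np.median(data, axis=axis, keepdims=True)
    return scale * np.median(np.abs(data - med), axis=axis, keepdims=keepdims)

X = np.random.uniform(0, 1, (20, 1, 10, 4))
sta_fuc = (np.mean, np.median, np.std, mad)
# every function reduces axis 2 to length 1; concatenating stacks them there
stat = np.concatenate([f(X, axis=2, keepdims=True) for f in sta_fuc], axis=2)
print(stat.shape)  # (20, 1, 4, 4)
```

Index 3 along axis 2 of stat then holds the scaled MAD, next to mean, median and std at indices 0 to 2.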
I am not aware of a built-in solution in numpy. However, you can easily implement it on top of numpy functions using mad = median(abs(a - median(a))).
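def mad(arr,axis=None,keepdims=True):
    # keepdims=True here so the median broadcasts against arr
    median = np.median(arr,axis=axis,keepdims=True)
    # np.abs is elementwise; axis and keepdims go to np.median only
    mad = np.median(np.abs(arr-median),axis=axis,keepdims=keepdims)
    return mad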
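A small hand-checkable example may help to confirm the behaviour. The TypeError quoted in the question arises because np.abs is an elementwise ufunc and not a reduction, so axis and keepdims have to be passed to np.median alone. The array below is made up for illustration.

```python
import numpy as np

def mad(arr, axis=None, keepdims=True):
    median = np.median(arr, axis=axis, keepdims=True)
    # np.abs takes no axis/keepdims; only the reduction np.median does
    return np.median(np.abs(arr - median), axis=axis, keepdims=keepdims)

a = np.array([[1.0, 2.0, 3.0, 4.0, 100.0],
              [5.0, 5.0, 5.0, 5.0, 5.0]])
# row 0: median 3, absolute deviations 2, 1, 0, 1, 97, whose median is 1
# row 1: all values equal, so the MAD is 0
result = mad(a, axis=1)
print(result.shape)    # (2, 1)
print(result.ravel())  # [1. 0.]
```

The outlier 100 barely affects the result for the first row, which is why the median-based MAD is often preferred as a robust spread estimate.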