XGBoost not running with a calibrated classifier

I am trying to run XGBoost with a calibrated classifier. Here is the code snippet that produces the error:

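from sklearn.calibration import CalibratedClassifierCV
from xgboost import XGBClassifier
import numpy as np

# Toy data (illustrative values): 10 samples, 5 per class,
# so that the default 5-fold cross-validation can split them.
x_train = np.array([1, 2, 2, 3, 4, 5, 6, 3, 4, 10]).reshape(-1, 1)
y_train = np.array([1, 1, 1, 1, 1, 3, 3, 3, 3, 3])

x_cfl = XGBClassifier(n_estimators=1)
x_cfl.fit(x_train, y_train)
sig_clf = CalibratedClassifierCV(x_cfl, method="sigmoid")
sig_clf.fit(x_train, y_train)  # raises the TypeError shown below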

Error:

TypeError: predict_proba() got an unexpected keyword argument 'X'

Full traceback:

TypeError                                Traceback (most recent call last)
<ipython-input-48-08dd0b4ae8aa> in <module>
----> 1 sig_clf.fit(x_train,y_train)

~/anaconda3/lib/python3.8/site-packages/sklearn/calibration.py in fit(self,X,y,sample_weight)
    309                 parallel = Parallel(n_jobs=self.n_jobs)
    310 
--> 311                 self.calibrated_classifiers_ = parallel(
    312                     delayed(_fit_classifier_calibrator_pair)(
    313                         clone(base_estimator),train=train,test=test,

~/anaconda3/lib/python3.8/site-packages/joblib/parallel.py in __call__(self,iterable)
   1039             # remaining jobs.
   1040             self._iterating = False
-> 1041             if self.dispatch_one_batch(iterator):
   1042                 self._iterating = self._original_iterator is not None
   1043 

~/anaconda3/lib/python3.8/site-packages/joblib/parallel.py in dispatch_one_batch(self,iterator)
    857                 return False
    858             else:
--> 859                 self._dispatch(tasks)
    860                 return True
    861 

~/anaconda3/lib/python3.8/site-packages/joblib/parallel.py in _dispatch(self,batch)
    775         with self._lock:
    776             job_idx = len(self._jobs)
--> 777             job = self._backend.apply_async(batch,callback=cb)
    778             # A job can complete so quickly than its callback is
    779             # called before we get here,causing self._jobs to

~/anaconda3/lib/python3.8/site-packages/joblib/_parallel_backends.py in apply_async(self,func,callback)
    206     def apply_async(self,func,callback=None):
    207         """Schedule a func to be run"""
--> 208         result = ImmediateResult(func)
    209         if callback:
    210             callback(result)

~/anaconda3/lib/python3.8/site-packages/joblib/_parallel_backends.py in __init__(self,batch)
    570         # Don't delay the application,to avoid keeping the input
    571         # arguments in memory
--> 572         self.results = batch()
    573 
    574     def get(self):

~/anaconda3/lib/python3.8/site-packages/joblib/parallel.py in __call__(self)
    260         # change the default number of processes to -1
    261         with parallel_backend(self._backend,n_jobs=self._n_jobs):
--> 262             return [func(*args,**kwargs)
    263                     for func,args,kwargs in self.items]
    264 

~/anaconda3/lib/python3.8/site-packages/joblib/parallel.py in <listcomp>(.0)
    260         # change the default number of processes to -1
    261         with parallel_backend(self._backend,n_jobs=self._n_jobs):
--> 262             return [func(*args,**kwargs)
    263                     for func,args,kwargs in self.items]
    264 

~/anaconda3/lib/python3.8/site-packages/sklearn/utils/fixes.py in __call__(self,*args,**kwargs)
    220     def __call__(self,*args,**kwargs):
    221         with config_context(**self.config):
--> 222             return self.function(*args,**kwargs)

~/anaconda3/lib/python3.8/site-packages/sklearn/calibration.py in _fit_classifier_calibrator_pair(estimator,X,y,train,test,supports_sw,method,classes,sample_weight)
    443     n_classes = len(classes)
    444     pred_method = _get_prediction_method(estimator)
--> 445     predictions = _compute_predictions(pred_method,X[test],n_classes)
    446 
    447     sw = None if sample_weight is None else sample_weight[test]

~/anaconda3/lib/python3.8/site-packages/sklearn/calibration.py in _compute_predictions(pred_method,X,n_classes)
    499         (X.shape[0],1).
    500     """
--> 501     predictions = pred_method(X=X)
    502     if hasattr(pred_method,'__name__'):
    503         method_name = pred_method.__name__

TypeError: predict_proba() got an unexpected keyword argument 'X'

This surprises me, because it was working for me until yesterday, and the same code still runs when I use a different classifier:

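from sklearn.calibration import CalibratedClassifierCV
from lightgbm import LGBMClassifier
import numpy as np

# Toy data (illustrative values): 10 samples, 5 per class.
x_train = np.array([1, 2, 2, 3, 4, 5, 6, 3, 4, 10]).reshape(-1, 1)
y_train = np.array([1, 1, 1, 1, 1, 3, 3, 3, 3, 3])

x_cfl = LGBMClassifier(n_estimators=1)
x_cfl.fit(x_train, y_train)
sig_clf = CalibratedClassifierCV(x_cfl, method="sigmoid")
sig_clf.fit(x_train, y_train)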

Output:

CalibratedClassifierCV(base_estimator=LGBMClassifier(n_estimators=1))

Is something wrong with my XGBoost installation? I installed it with conda, and as far as I remember I uninstalled xgboost yesterday and reinstalled it.

My xgboost version:

1.3.0

Solution

I believe the problem comes from XGBoost. It is explained here: https://github.com/dmlc/xgboost/pull/6555

XGBoost defines:

predict_proba(self,data,...

instead of:

predict_proba(self,X,...

Since sklearn 0.24 calls clf.predict_proba(X=X), passing the argument by keyword, the call raises an exception.
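The mismatch can be reproduced without either library. The following is a minimal sketch in plain Python: both classes are hypothetical stand-ins (not real XGBoost or scikit-learn code), and call_like_sklearn_024 only imitates the keyword call visible in the traceback.

```python
class LegacySignatureClassifier:
    """Hypothetical stand-in that names its input `data`, as XGBoost 1.3.0 does."""

    def predict_proba(self, data):
        return [[0.5, 0.5] for _ in data]


class SklearnSignatureClassifier:
    """Hypothetical stand-in that names its input `X`, as sklearn expects."""

    def predict_proba(self, X):
        return [[0.5, 0.5] for _ in X]


def call_like_sklearn_024(clf, X):
    # Pass the samples by keyword, like `pred_method(X=X)` in the traceback.
    return clf.predict_proba(X=X)


samples = [[1], [2], [3]]

# Works: the parameter is really called `X`.
ok_result = call_like_sklearn_024(SklearnSignatureClassifier(), samples)

# Fails: there is no parameter called `X`, only `data`.
try:
    call_like_sklearn_024(LegacySignatureClassifier(), samples)
    error_message = None
except TypeError as exc:
    error_message = str(exc)

print(len(ok_result), error_message)
```

A positional call, clf.predict_proba(samples), would succeed for both classes, which is why only the parameter name matters here.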

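Here is an idea for working around the problem without changing package versions: create a class that inherits from XGBClassifier, override predict_proba with the correct parameter name, and call super().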

It is fixed for me now. scikit-learn 0.24 seems to have a bug.

I downgraded to 0.22.2.post1 and that fixed it!
