How to fix ValueError: 'mean_squared_error' is not a valid scoring value
I've been working on my first ML project, and as part of it I'm trying out various models from scikit-learn. I wrote this code for a random forest model:
#Random Forest
reg = RandomForestRegressor(random_state=0,criterion = 'mse')
#Apply grid search for best parameters
params = {'randomforestregressor__n_estimators' : range(100,500,200),'randomforestregressor__min_samples_split' : range(2,10,3)}
pipe = make_pipeline(reg)
grid = GridSearchCV(pipe,param_grid = params,scoring='mean_squared_error',n_jobs=-1,iid=False,cv=5)
reg = grid.fit(X_train,y_train)
print('Best MSE: ',grid.best_score_)
print('Best Parameters: ',grid.best_estimator_)
y_train_pred = reg.predict(X_train)
y_test_pred = reg.predict(X_test)
tr_err = mean_squared_error(y_train_pred,y_train)
ts_err = mean_squared_error(y_test_pred,y_test)
print(tr_err,ts_err)
results_train['random_forest'] = tr_err
results_test['random_forest'] = ts_err
However, when I run this code, I get the following error:
KeyError Traceback (most recent call last)
~\anaconda3\lib\site-packages\sklearn\metrics\_scorer.py in get_scorer(scoring)
359 else:
--> 360 scorer = SCORERS[scoring]
361 except KeyError:
KeyError: 'mean_squared_error'
During handling of the above exception,another exception occurred:
ValueError Traceback (most recent call last)
<ipython-input-149-394cd9e0c273> in <module>
5 pipe = make_pipeline(reg)
6 grid = GridSearchCV(pipe,param_grid = params,scoring='mean_squared_error',n_jobs=-1,iid=False,cv=5)
----> 7 reg = grid.fit(X_train,y_train)
8 print('Best MSE: ',grid.best_score_)
9 print('Best Parameters: ',grid.best_estimator_)
~\anaconda3\lib\site-packages\sklearn\utils\validation.py in inner_f(*args,**kwargs)
71 FutureWarning)
72 kwargs.update({k: arg for k,arg in zip(sig.parameters,args)})
---> 73 return f(**kwargs)
74 return inner_f
75
~\anaconda3\lib\site-packages\sklearn\model_selection\_search.py in fit(self,X,y,groups,**fit_params)
652 cv = check_cv(self.cv,classifier=is_classifier(estimator))
653
--> 654 scorers,self.multimetric_ = _check_multimetric_scoring(
655 self.estimator,scoring=self.scoring)
656
~\anaconda3\lib\site-packages\sklearn\metrics\_scorer.py in _check_multimetric_scoring(estimator,scoring)
473 if callable(scoring) or scoring is None or isinstance(scoring,
474 str):
--> 475 scorers = {"score": check_scoring(estimator,scoring=scoring)}
476 return scorers,False
477 else:
~\anaconda3\lib\site-packages\sklearn\utils\validation.py in inner_f(*args,**kwargs)
71 FutureWarning)
72 kwargs.update({k: arg for k,arg in zip(sig.parameters,args)})
---> 73 return f(**kwargs)
74 return inner_f
75
~\anaconda3\lib\site-packages\sklearn\metrics\_scorer.py in check_scoring(estimator,scoring,allow_none)
403 "'fit' method,%r was passed" % estimator)
404 if isinstance(scoring,str):
--> 405 return get_scorer(scoring)
406 elif callable(scoring):
407 # Heuristic to ensure user has not passed a metric
~\anaconda3\lib\site-packages\sklearn\metrics\_scorer.py in get_scorer(scoring)
360 scorer = SCORERS[scoring]
361 except KeyError:
--> 362 raise ValueError('%r is not a valid scoring value. '
363 'Use sorted(sklearn.metrics.SCORERS.keys()) '
364 'to get valid options.' % scoring)
ValueError: 'mean_squared_error' is not a valid scoring value. Use sorted(sklearn.metrics.SCORERS.keys()) to get valid options.
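So I tried running it after removing scoring='mean_squared_error' from the GridSearchCV(...) call. With that change the code runs fine and gives reasonably good training and test errors.
Even so, I don't understand why passing scoring='mean_squared_error' to GridSearchCV raises this error. What am I doing wrong?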
Solution
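All scorer objects follow the convention that higher return values are better than lower return values. Metrics that measure the distance between the model and the data, such as metrics.mean_squared_error, are therefore available as neg_mean_squared_error, which returns the negated value of the metric.
This means you must pass scoring='neg_mean_squared_error' to evaluate the grid search results with mean squared error. Because the score is negated, grid.best_score_ will be zero or negative; flip its sign to read it as an MSE.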
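As a rough illustration, here is a minimal sketch of the grid search with the corrected scoring string. The synthetic data from make_regression and the small parameter grid are assumptions made so the snippet runs on its own; they stand in for the X_train/y_train and the grid in the question. The criterion='mse' and iid=False arguments are left out here because recent scikit-learn releases no longer accept them.

```python
from sklearn.datasets import make_regression
from sklearn.ensemble import RandomForestRegressor
from sklearn.metrics import mean_squared_error
from sklearn.model_selection import GridSearchCV, train_test_split
from sklearn.pipeline import make_pipeline

# Synthetic stand-in data (assumption: the question's real data is not shown).
X, y = make_regression(n_samples=200, n_features=5, noise=10.0, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

# make_pipeline names the step after the lowercased class name,
# which is why the parameter keys start with "randomforestregressor__".
pipe = make_pipeline(RandomForestRegressor(random_state=0))
params = {
    "randomforestregressor__n_estimators": [20, 40],
    "randomforestregressor__min_samples_split": [2, 5],
}

# The scorer name carries the "neg_" prefix: higher (closer to 0) is better.
grid = GridSearchCV(pipe, param_grid=params, scoring="neg_mean_squared_error", cv=3)
grid.fit(X_train, y_train)

# best_score_ is the negated MSE, so flip the sign to report a plain MSE.
best_mse = -grid.best_score_
test_mse = mean_squared_error(y_test, grid.predict(X_test))

print("Best CV MSE:", best_mse)
print("Best parameters:", grid.best_params_)
print("Test MSE:", test_mse)
```

Note that grid.best_score_ is a cross-validated score on the training folds, so it will generally differ from the MSE computed on X_test.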