How to Correctly Configure the Multi-Metric scoring Parameter in GridSearchCV

2026-01-13

The `scoring` parameter of GridSearchCV does not accept a Python set; it must be a string, list, tuple, dict, or callable. Changing `{'precision', 'f1', 'recall', 'accuracy'}` to `['precision', 'f1', 'recall', 'accuracy']` resolves the InvalidParameterError.

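When tuning hyperparameters with sklearn.model_selection.GridSearchCV, it is common to evaluate several performance metrics at once (for example accuracy, precision, recall, and F1), and a frequent mistake is to pass the scoring parameter as a Python set such as {'precision', 'f1', 'recall', 'accuracy'}. This fails because scoring does not accept the set type: even if every element is a valid metric name, parameter validation raises InvalidParameterError when fit is called.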
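The failure described above can be reproduced with a small sketch. The dataset, the LogisticRegression estimator, and the `C` grid are illustrative assumptions, not part of the original question. The rejection of sets comes from the parameter validation in recent scikit-learn releases, so the code records the error message rather than assuming it is always raised.

```python
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import GridSearchCV

# Illustrative data and grid (assumptions for this sketch).
X, y = make_classification(n_samples=200, n_features=4, random_state=0)
param_grid = {"C": [0.1, 1.0]}

# Every element is a valid metric name, but the container is a set.
scoring_set = {"accuracy", "f1"}

error_message = None
try:
    GridSearchCV(
        LogisticRegression(), param_grid, scoring=scoring_set, refit="f1", cv=3
    ).fit(X, y)
except (ValueError, TypeError) as exc:
    # InvalidParameterError is caught here through its built-in base classes.
    error_message = str(exc)

print("set rejected:", error_message is not None)

# Converting the set to a list makes the same search valid.
search = GridSearchCV(
    LogisticRegression(),
    param_grid,
    scoring=sorted(scoring_set),
    refit="f1",
    cv=3,
).fit(X, y)
print(search.best_params_)
```

Catching the built-in base classes avoids importing the exception from a private scikit-learn module.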

✅ The correct approach is to use a list or a tuple:

from sklearn.model_selection import GridSearchCV
from sklearn.ensemble import RandomForestClassifier
from sklearn.datasets import make_classification

X, y = make_classification(n_samples=1000, n_features=4, n_classes=2, random_state=42)
clf = RandomForestClassifier(random_state=42)

# ✅ Correct: use a list (recommended)
grid_search = GridSearchCV(
    estimator=clf,
    param_grid={'n_estimators': [50, 100]},
    scoring=['accuracy', 'precision', 'recall', 'f1'],  # ← note: square brackets, not curly braces
    cv=3,
    refit='f1'  # with several metrics, refit must name the one metric used to pick the best parameters
)
)
grid_search.fit(X, y)
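Once a search like the one above has been fitted, the per-metric results can be read back from cv_results_. The following is a minimal self-contained sketch; the smaller dataset and the `n_estimators` values are assumptions chosen only to keep it fast.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import GridSearchCV

X, y = make_classification(n_samples=300, n_features=4, n_classes=2, random_state=42)
metrics = ["accuracy", "precision", "recall", "f1"]

search = GridSearchCV(
    RandomForestClassifier(random_state=42),
    param_grid={"n_estimators": [10, 20]},  # illustrative grid
    scoring=metrics,
    cv=3,
    refit="f1",
).fit(X, y)

# One mean_test_<metric> array per metric, one entry per parameter combination.
summary = {m: search.cv_results_[f"mean_test_{m}"] for m in metrics}
for params, *scores in zip(search.cv_results_["params"], *summary.values()):
    print(params, dict(zip(metrics, (round(float(s), 3) for s in scores))))

# best_params_ and best_score_ refer to the metric named in refit ("f1" here).
print(search.best_params_, round(float(search.best_score_), 3))
```

The other metrics are still reported for every candidate, but only the refit metric decides which candidate becomes best_estimator_.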

⚠️ Key points to note:

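When scoring is a list or tuple, GridSearchCV computes each metric separately and reports it in cv_results_ under keys such as 'mean_test_accuracy' and 'mean_test_f1';
With several metrics, refit is normally a single string (such as 'f1' or 'accuracy') naming the metric used to select the best parameter combination; it cannot be a list, and refit=False is appropriate only if you do not need .best_estimator_ afterwards;
To compute several metrics in one pass or to add custom combined logic (for example a weighted score), pass a callable that returns a dict: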

from sklearn.metrics import accuracy_score, precision_score, recall_score, f1_score

def multi_scorer(estimator, X, y):
    y_pred = estimator.predict(X)
    return {
        'accuracy': accuracy_score(y, y_pred),
        'precision': precision_score(y, y_pred, zero_division=0),
        'recall': recall_score(y, y_pred, zero_division=0),
        'f1': f1_score(y, y_pred, zero_division=0)
    }

# refit names one of the keys in the dict returned by multi_scorer
grid_search = GridSearchCV(clf, param_grid={'n_estimators': [50, 100]}, scoring=multi_scorer, refit='f1')
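The weighted score mentioned in the notes above could be sketched as follows. The weights, the key name `weighted`, and the scorer name `weighted_scorer` are hypothetical and chosen only for illustration; the one mechanism relied on is that a callable may return a dict and refit may name one of its keys.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import precision_score, recall_score
from sklearn.model_selection import GridSearchCV

# Hypothetical weights: recall is treated as more important than precision.
W_PRECISION, W_RECALL = 0.3, 0.7


def weighted_scorer(estimator, X, y):
    """Return precision, recall, and their weighted combination for one fold."""
    y_pred = estimator.predict(X)
    precision = precision_score(y, y_pred, zero_division=0)
    recall = recall_score(y, y_pred, zero_division=0)
    return {
        "precision": precision,
        "recall": recall,
        "weighted": W_PRECISION * precision + W_RECALL * recall,
    }


X, y = make_classification(n_samples=300, n_features=4, random_state=42)
search = GridSearchCV(
    RandomForestClassifier(random_state=42),
    param_grid={"n_estimators": [10, 20]},  # illustrative grid
    scoring=weighted_scorer,
    cv=3,
    refit="weighted",  # select parameters by the combined score
).fit(X, y)

print(search.best_params_, round(float(search.best_score_), 3))
```

Because the combination is computed inside the scorer, predict runs once per fold, and the individual components remain available in cv_results_ for inspection.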

Advanced tip: building scorers with make_scorer lets you pass extra arguments to the metric (such as average='macro'), which adds flexibility:

from sklearn.metrics import make_scorer, f1_score
scoring = {
    'f1_macro': make_scorer(f1_score, average='macro'),
    'accuracy': 'accuracy'
}
# the dict keys become the metric names, so refit refers to 'f1_macro'
grid_search = GridSearchCV(clf, param_grid={'n_estimators': [50, 100]}, scoring=scoring, refit='f1_macro')
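One case where the average argument matters is a multiclass target, because the plain 'f1' string scorer uses binary averaging. The sketch below uses a three-class dataset generated purely for illustration; the dataset settings and the extra `recall_macro` entry are assumptions, not part of the original example.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import f1_score, make_scorer, recall_score
from sklearn.model_selection import GridSearchCV

# Illustrative three-class problem.
X, y = make_classification(
    n_samples=300,
    n_features=6,
    n_informative=4,
    n_classes=3,
    n_clusters_per_class=1,
    random_state=42,
)

# Dict keys become the metric names used in cv_results_ and in refit.
scoring = {
    "f1_macro": make_scorer(f1_score, average="macro"),
    "recall_macro": make_scorer(recall_score, average="macro", zero_division=0),
    "accuracy": "accuracy",  # string scorers and make_scorer objects can be mixed
}

search = GridSearchCV(
    RandomForestClassifier(random_state=42),
    param_grid={"n_estimators": [10, 20]},  # illustrative grid
    scoring=scoring,
    cv=3,
    refit="f1_macro",
).fit(X, y)

print(sorted(k for k in search.cv_results_ if k.startswith("mean_test_")))
```

The dict form also gives each metric an explicit name, which keeps the cv_results_ keys readable when several variants of the same metric are compared.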

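Summary: the valid types for scoring are str, callable, list, tuple, and dict (or None, which falls back to the estimator's own score method); a set is not among them. Prefer the list form for multi-metric evaluation, and set refit to the metric that should decide the final model so that best_estimator_ is available for deployment.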

https://www.php.cn/faq/1976432.html
