Hyperparameter Tuning 1: Bayesian Optimization for Random Forest

2023-10-27

For a tutorial on Bayesian hyperparameter tuning, see: https://blog.csdn.net/weixin_35757704/article/details/118480135

Install the Bayesian optimization package:

pip install bayesian-optimization

Algorithm overview

Paper: http://papers.nips.cc/paper/4522-practical-bayesian%20-optimization-of-machine-learning-algorithms.pdf

Snoek, Jasper, Hugo Larochelle, and Ryan P. Adams. "Practical Bayesian Optimization of Machine Learning Algorithms." Advances in Neural Information Processing Systems 25 (2012).
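The paper above centers on choosing the next point to evaluate with an acquisition function, and expected improvement (EI) is the one it discusses most. As a rough illustration only, here is a minimal sketch of EI for a maximization problem, assuming a surrogate model has already produced a predicted mean `mu` and standard deviation `sigma` at a candidate point. The function names and the `xi` margin are illustrative choices, not taken from the paper's code, and the `bayesian-optimization` package may use a different acquisition function by default.

```python
import math


def normal_pdf(z):
    """Standard normal probability density."""
    return math.exp(-0.5 * z * z) / math.sqrt(2.0 * math.pi)


def normal_cdf(z):
    """Standard normal cumulative distribution."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


def expected_improvement(mu, sigma, best, xi=0.0):
    """Expected improvement over `best` when maximizing.

    mu, sigma: surrogate's predicted mean and standard deviation at a candidate.
    best: best objective value observed so far.
    xi: optional exploration margin (illustrative default of 0).
    """
    if sigma <= 0.0:
        # No uncertainty: the improvement is deterministic.
        return max(mu - best - xi, 0.0)
    z = (mu - best - xi) / sigma
    return (mu - best - xi) * normal_cdf(z) + sigma * normal_pdf(z)


# Two candidates with the same predicted mean: the more uncertain one scores higher.
ei_certain = expected_improvement(mu=1.0, sigma=0.1, best=1.0)
ei_uncertain = expected_improvement(mu=1.0, sigma=1.0, best=1.0)
print(ei_certain, ei_uncertain)
```

The comparison at the end shows why the search keeps exploring: with equal predicted means, the candidate the surrogate is less sure about has the larger expected improvement.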

Random forest is a Bagging ensemble of tree models (for Bagging ensembles, see: https://blog.csdn.net/weixin_35757704/article/details/119848453).
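To make the Bagging idea concrete, here is a small pure-Python sketch: each base model is trained on a bootstrap sample (drawn with replacement) and the predictions are averaged. It assumes 1-D inputs and uses one-split regression stumps as base models to stay short; a real random forest grows full trees and also samples features at each split. All function names here are illustrative.

```python
import random


def fit_stump(xs, ys):
    """Fit a one-split regression stump on 1-D data by minimizing squared error."""
    best = None
    for threshold in sorted(set(xs)):
        left = [y for x, y in zip(xs, ys) if x <= threshold]
        right = [y for x, y in zip(xs, ys) if x > threshold]
        if not left or not right:
            continue
        left_mean = sum(left) / len(left)
        right_mean = sum(right) / len(right)
        error = (sum((y - left_mean) ** 2 for y in left)
                 + sum((y - right_mean) ** 2 for y in right))
        if best is None or error < best[0]:
            best = (error, threshold, left_mean, right_mean)
    if best is None:
        # All x values are identical: fall back to predicting the mean.
        mean = sum(ys) / len(ys)
        return (xs[0], mean, mean)
    return best[1:]


def predict_stump(stump, x):
    threshold, left_mean, right_mean = stump
    return left_mean if x <= threshold else right_mean


def fit_bagging(xs, ys, n_models=25, seed=0):
    """Train n_models stumps, each on its own bootstrap sample."""
    rng = random.Random(seed)
    n = len(xs)
    models = []
    for _ in range(n_models):
        idx = [rng.randrange(n) for _ in range(n)]  # sample with replacement
        models.append(fit_stump([xs[i] for i in idx], [ys[i] for i in idx]))
    return models


def predict_bagging(models, x):
    """Average the predictions of all base models."""
    return sum(predict_stump(m, x) for m in models) / len(models)


# Toy data: a step from 0 to 1 at x = 2.0.
xs = [i / 10 for i in range(40)]
ys = [0.0 if x < 2.0 else 1.0 for x in xs]
models = fit_bagging(xs, ys)
print(predict_bagging(models, 0.5), predict_bagging(models, 3.5))
```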

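In classification, the Gini index (Gini impurity) is used as the split criterion: the larger the Gini index, the greater the uncertainty in a node; the smaller it is, the lower the uncertainty.
In regression, squared error (SE, the last two letters of MSE) is used as the split criterion.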
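The two split criteria above can be computed by hand for a single node. The following is a minimal sketch with illustrative function names; it scores one node only and does not search over candidate splits.

```python
from collections import Counter


def gini_impurity(labels):
    """Gini impurity of a node: 1 minus the sum of squared class proportions."""
    n = len(labels)
    if n == 0:
        return 0.0
    return 1.0 - sum((count / n) ** 2 for count in Counter(labels).values())


def squared_error(values):
    """Sum of squared deviations from the node mean (the SE of a regression node)."""
    if not values:
        return 0.0
    mean = sum(values) / len(values)
    return sum((v - mean) ** 2 for v in values)


pure = gini_impurity(["a", "a", "a", "a"])   # all samples in one class
mixed = gini_impurity(["a", "a", "b", "b"])  # classes evenly mixed
print(pure, mixed)

flat = squared_error([1.0, 1.0, 1.0])  # identical targets
spread = squared_error([0.0, 2.0])     # targets far from their mean
print(flat, spread)
```

A pure node scores 0 under both criteria, and a tree picks the split that reduces the criterion the most across the resulting child nodes.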

Example code

Here we tune three parameters:

  • n_estimators
  • max_depth
  • max_leaf_nodes

from bayes_opt import BayesianOptimization
from sklearn.datasets import make_regression
from sklearn.ensemble import RandomForestRegressor
from sklearn.metrics import mean_squared_error
from sklearn.model_selection import train_test_split


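def train_model(n_estimators, max_depth, max_leaf_nodes):
    # Objective function: train a model and return the negative test MSE.
    # The optimizer proposes floats, so integer hyperparameters are cast with int().
    try:
        model = RandomForestRegressor(
            n_estimators=int(n_estimators),
            max_depth=int(max_depth),
            max_leaf_nodes=int(max_leaf_nodes),
            n_jobs=4,  # train on multiple cores
        )
        model.fit(x_train, y_train)
        # BayesianOptimization maximizes its target, so negate the MSE.
        score = - mean_squared_error(y_test, model.predict(x_test))
        with open(param_save_file, 'a') as file:
            file.write("neg_mse:{},n_estimators:{},max_depth:{},max_leaf_nodes:{}".format(
                score, n_estimators, max_depth, max_leaf_nodes
            ) + '\n')
        return score
    except Exception as e:
        # Report the failure instead of hiding it, then penalize this point heavily.
        print("train_model failed:", e)
        return -1000000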


if __name__ == '__main__':
    # Build a synthetic regression dataset
    x, y = make_regression(n_samples=1000, n_features=5)
    param_save_file = "random_forest_param.txt"
    x_train, x_test, y_train, y_test = train_test_split(x, y, test_size=0.2)

    # Search range (lower, upper) for each parameter
    pbounds = {
        'n_estimators': (100, 1000),
        'max_depth': (18, 40),
        'max_leaf_nodes': (20, 200),
    }

    # Start tuning
    optimizer = BayesianOptimization(
        f=train_model,  # black-box objective function
        pbounds=pbounds,  # search space
        verbose=2,  # 2 prints every step, 1 prints only when a new maximum is found, 0 prints nothing
        random_state=1,
    )
    optimizer.maximize(  # run the search
        init_points=10,  # number of random exploration steps
        n_iter=30,  # number of Bayesian optimization iterations
    )
    with open(param_save_file, 'a') as file:
        file.write("optimizer_params: " + str(optimizer.max['params']) + " optimizer_target: " + str(
            optimizer.max['target']) + '\n')
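Because the search space is continuous, the best parameters come back as floats, while `train_model` casts them to `int` before building the model. A small helper along these lines can apply the same cast before retraining a final model. The name `to_int_params` and the sample values are illustrative, not part of the original script.

```python
def to_int_params(params):
    """Cast each tuned value to int, matching the int(...) casts in train_model."""
    return {name: int(value) for name, value in params.items()}


# Illustrative values, shaped like optimizer.max['params'].
best_params = {"n_estimators": 512.7, "max_depth": 25.3, "max_leaf_nodes": 199.9}
final_params = to_int_params(best_params)
print(final_params)
```

In the script above, the result could then be passed on as `RandomForestRegressor(**to_int_params(optimizer.max['params']))`.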
