The AdaCost Algorithm
Reference: "AdaCost: Misclassification Cost-sensitive Boosting" (Fan, Stolfo, Zhang, and Chan, ICML 1999).
Cost-sensitive learning focuses on examples whose misclassification is especially expensive, for instance a COVID-19 case that is truly positive but is tested negative.
The cost-sensitive idea matches real applications: in practice, each kind of misclassification carries a different cost, and by extension each kind of correct classification yields a different benefit. Because of this, the sample-weight update needs some extra handling that accounts for per-example costs.
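As a toy illustration (the cost values here are my own, not from the paper), each training example gets a cost reflecting how expensive it is to misclassify, with positives costlier to miss:

import numpy as np

# Hypothetical per-example costs: positives (y = +1) are costlier to miss.
y = np.array([+1, +1, -1, -1, -1])
costs = np.where(y == +1, 1.0, 0.2)  # illustrative values only
costs = costs / costs.max()          # the paper normalizes each c_i to [0, 1]
print(costs)                         # [1.  1.  0.2 0.2 0.2]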
AdaCost vs. AdaBoost
1. Goal
AdaBoost: the final result leans toward examples that are easy to misclassify.
AdaCost: "The final voted ensemble will also correctly predict more costly instances." That is, the final result leans toward correctly classifying high-cost examples.
2. Weight update rule
AdaBoost: "At each round, AdaBoost increases the weights of wrongly classified training instances and decreases those of correctly predicted instances."
AdaCost: "In AdaCost, the weight updating rule increases the weights of costly wrong classifications more aggressively, but decreases the weights of costly correct classifications more conservatively." Put plainly, high-cost examples receive a smaller reward when correct and a larger penalty when wrong (see the numeric sketch after this list).
3. Weight initialization rule
AdaBoost: uniform initialization, or larger initial weights for under-represented classes.
AdaCost: higher-cost examples are initialized with larger weights.
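A minimal numeric sketch (my own, using the β functions from the paper's experiments discussed below) of how AdaCost's weight multipliers change as the cost c grows, for a fixed margin y·h(x) = ±0.6 and α = 0.8:

import numpy as np

alpha = 0.8
for c in (0.0, 0.5, 1.0):
    beta_wrong = 0.5 * c + 0.5        # β−: used when sign(y * h) = -1
    beta_right = -0.5 * c + 0.5       # β+: used when sign(y * h) = +1
    up = np.exp(-alpha * (-0.6) * beta_wrong)    # misclassified: y * h = -0.6
    down = np.exp(-alpha * (+0.6) * beta_right)  # correct:       y * h = +0.6
    print(c, round(up, 3), round(down, 3))
# As c grows, the penalty multiplier "up" grows (1.271 -> 1.616) while the
# reward multiplier "down" rises toward 1.0, i.e. less and less weight reduction.

In AdaBoost the two multipliers would be exp(∓α·y·h) regardless of cost.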
The AdaCost algorithm flow
Meaning of the symbols in the algorithm:
S: the training sample set; D: the weight distribution over the samples; β: the cost adjustment function; H(x): the final hypothesis, i.e. the prediction.
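A sketch of the flow, reconstructed from the paper's description (a paraphrase, not a verbatim copy):

Given: (x_1, c_1, y_1), ..., (x_m, c_m, y_m) with y_i ∈ {-1, +1} and cost c_i ≥ 0.
Initialize D_1(i) = c_i / Σ_j c_j.
For t = 1, ..., T:
  1. Train a weak learner using distribution D_t.
  2. Obtain a weak hypothesis h_t: X → [-1, +1].
  3. Set δ(i) = sign(y_i · h_t(x_i)) and compute β_δ(i)(c_i).
  4. Choose α_t (see the α update rule below).
  5. Update D_{t+1}(i) = D_t(i) · exp(-α_t · y_i · h_t(x_i) · β_δ(i)(c_i)) / Z_t,
     where Z_t normalizes D_{t+1} back to a distribution.
Output the final hypothesis H(x) = sign(Σ_t α_t · h_t(x)).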
The authors also give an alternative way of computing the weight update for D.
A closer look at the β update function in AdaCost
The paper's requirement on β: "we require β−(c_i) to be non-decreasing with respect to c_i, β+(c_i) to be non-increasing, and both are non-negative." Here β+ applies when an example is classified correctly (sign(y·h(x)) = +1) and β− when it is misclassified. For the experiments, the paper states: "We normalized each c_i to [0, 1] for all data sets. The cost adjustment function β is chosen as: β−(c) = 0.5·c + 0.5 and β+(c) = −0.5·c + 0.5." In practice the β functions can be defined flexibly for the problem at hand, as long as they follow the same idea: give high-cost examples a larger misclassification penalty and a smaller correct-classification reward.
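A quick numeric check of these two functions (the function names are my own):

def beta_minus(c):   # misclassified: the penalty factor grows with cost
    return 0.5 * c + 0.5

def beta_plus(c):    # correctly classified: the reward factor shrinks with cost
    return -0.5 * c + 0.5

for c in (0.0, 0.5, 1.0):
    print(c, beta_minus(c), beta_plus(c))
# c = 1 (most costly): β− = 1.0 (maximum penalty), β+ = 0.0 (no reward);
# c = 0: both equal 0.5, so cost no longer influences the update.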
Two other β update rules:
Karakoulas and Shawe-Taylor: if y = +1 then β = 1; if y = -1 then β = v, with v < 1.
Ting and Zheng: use different misclassification costs but reuse the induced model.
(Note: the paper only describes these two rules briefly; read the original papers for a deeper understanding.)
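A sketch of the Karakoulas and Shawe-Taylor rule as described above (the value of v and the function name are my own; consult their paper before relying on this):

def beta_kst(y, v=0.5):
    # β depends only on the true label: v < 1 down-weights the negative class
    return 1.0 if y == +1 else v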
A closer look at the α update rule in AdaCost
For a weak hypothesis h with range [-1, +1] and a cost adjustment function β_δ(i) in the range [0, 1], the choice of α is

α_t = (1/2) · ln((1 + r) / (1 - r)),  where  r = Σ_i D_t(i) · y_i · h_t(x_i) · β_δ(i)(c_i),

which is exactly what update_alpha computes in the implementation below.
Implementing AdaCost
# -*- coding: utf-8 -*-
# @Use     : AdaCost implementation (a quick sketch, not fully debugged)
# @Time    : 2022/5/30 22:30
# @FileName: adacost.py
# @Software: PyCharm
import numpy as np
from sklearn.preprocessing import MinMaxScaler


class AdaCost:
    """
    AdaBoost improved with the cost-sensitive idea: AdaCost.
    Currently implements binary classification only.
    """

    def __init__(self, T):
        """
        @param T: number of boosting rounds
        """
        self.T = T

    def fit(self, x: np.ndarray, y: np.ndarray, costs: np.ndarray, create_model):
        """
        @param x: training features
        @param y: labels; two classes, encoded as -1 and +1
        @param costs: per-example misclassification cost c_i, normalized to [0, 1]
        @param create_model: factory that returns a fresh weak learner each round
        """
        assert x.shape[0] == costs.shape[0]
        assert x.shape[0] == y.shape[0]
        sample_num = x.shape[0]
        T = self.T
        # initialize D (weights): D_1(i) = c_i / sum_j(c_j)
        weights = costs / np.sum(costs)
        alpha_ts = []
        model_ts = []
        for t in range(T):
            # build and train the weak learner on the current distribution
            # (assumes an sklearn-style estimator that accepts sample_weight)
            model = create_model()
            model.fit(x, y, sample_weight=weights)
            # compute the weak hypothesis; the paper requires h_t in [-1, +1],
            # so rescale the positive-class probability into that range
            h_t = model.predict_proba(x)[:, -1]
            h_t = MinMaxScaler(feature_range=(-1, 1)).fit_transform(
                h_t.reshape(-1, 1)).ravel()
            model_ts.append(model)
            # update betas: beta depends on delta = sign(y_i * h_t(x_i));
            # treat the boundary case y_i * h_t(x_i) == 0 as correct
            betas = []
            for i in range(sample_num):
                delta = 1 if y[i] * h_t[i] >= 0 else -1
                betas.append(self.update_beta(delta, costs[i]))
            # alpha_t
            alpha_t = self.update_alpha(weights, y, h_t, betas)
            alpha_ts.append(alpha_t)
            # update weights, then divide by Z_t (the sum of the *new*
            # unnormalized weights) so that D stays a distribution
            new_weights = weights * np.exp(-alpha_t * y * h_t * np.array(betas))
            weights = new_weights / np.sum(new_weights)
        return model_ts, alpha_ts

    def predict(self, x, models, alpha_ts):
        """
        Model prediction.
        @param x: test data
        @param models: the weak learners returned by fit
        @param alpha_ts: the alpha values returned by fit
        """
        # final hypothesis: H(x) = sign(sum_t alpha_t * h_t(x))
        assert len(models) == len(alpha_ts)
        f_sum = 0
        for i in range(len(models)):
            h_t = models[i].predict_proba(x)[:, -1]
            # the paper requires h_t in [-1, +1]
            h_t = MinMaxScaler(feature_range=(-1, 1)).fit_transform(
                h_t.reshape(-1, 1)).ravel()
            f_sum += alpha_ts[i] * h_t
        h_final = np.sign(f_sum)
        return h_final

    @staticmethod
    def update_beta(sign_value, cost):
        """
        Compute beta. The paper requires beta_-(c_i) to be non-decreasing in
        c_i, beta_+(c_i) to be non-increasing, and both non-negative; here we
        use the paper's choice beta_+(c) = -0.5*c + 0.5, beta_-(c) = 0.5*c + 0.5.
        """
        assert sign_value == 1 or sign_value == -1
        if sign_value == 1:
            return -0.5 * cost + 0.5
        return 0.5 * cost + 0.5

    @staticmethod
    def update_alpha(weights, y: np.ndarray, h_t: np.ndarray, betas: list):
        """
        Compute alpha. Precondition: the weak hypothesis h has range [-1, +1]
        and the cost adjustment function beta(i) has range [0, 1].
        """
        assert (len(weights) == y.shape[0] and
                len(weights) == h_t.shape[0] and
                len(weights) == len(betas))
        r = float(np.sum(np.asarray(weights) * y * h_t * np.asarray(betas)))
        # clip r away from +/-1 so the log stays finite
        r = np.clip(r, -1 + 1e-10, 1 - 1e-10)
        alpha = 0.5 * np.log((1 + r) / (1 - r))
        return alpha
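A usage sketch (the dataset, the cost assignment, and the weak-learner factory below are my own choices for illustration):

import numpy as np
from sklearn.datasets import make_classification
from sklearn.tree import DecisionTreeClassifier

# toy imbalanced data, labels mapped from {0, 1} to {-1, +1}
X, y01 = make_classification(n_samples=500, weights=[0.9], random_state=0)
y = 2 * y01 - 1
# hypothetical costs: missing a positive is the expensive error
costs = np.where(y == 1, 1.0, 0.1)

clf = AdaCost(T=10)
models, alphas = clf.fit(X, y, costs, lambda: DecisionTreeClassifier(max_depth=2))
pred = clf.predict(X, models, alphas)
print("train accuracy:", (pred == y).mean())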