TensorFlow, PyTorch, and Paddle: configuring class weights and label smoothing for the cross-entropy loss

2023-05-16

Configuring class weights and label smoothing for the cross-entropy loss

  • 1 No class weight, no label smooth
    • 1.1 pytorch output
    • 1.2 paddle output
    • 1.3 tensorflow output
  • 2 Label smooth without class weight
    • 2.1 pytorch output
    • 2.2 paddle output
    • 2.3 tensorflow output
  • 3 Class weight without label smooth
    • 3.1 pytorch output
    • 3.2 paddle output
    • 3.3 tensorflow output
  • 4 Both class weight and label smooth
    • 4.1 pytorch output
    • 4.2 paddle output
    • 4.3 tensorflow output
  • 5 A class weight implementation for tensorflow
    • 5.1 WeightedCategoricalCrossentropy1 results
    • 5.2 WeightedCategoricalCrossentropy2 results
    • 5.3 Summary

This article summarizes practical experience with the categorical cross-entropy loss across different deep learning frameworks: how to configure label smoothing and class weights for the loss used by classification models in each framework. Label smoothing is a regularization technique, while class weights are a way of handling unbalanced data when the classes are imbalanced. The configurations are covered one by one below.
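As a quick numeric illustration of label smoothing (a minimal sketch using the standard smoothing formula, with epsilon = 0.1 and three classes as example values): the one-hot target is mixed with a uniform distribution, so the true class keeps most of the probability mass and every other class receives a small share.

import numpy as np

eps, num_classes = 0.1, 3
onehot = np.array([0., 1., 0.])
# target class: (1 - eps) + eps/num_classes, other classes: eps/num_classes
smoothed = onehot * (1 - eps) + eps / num_classes
print(smoothed)  # [0.03333333 0.93333333 0.03333333]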

First, import the required libraries and print the versions used.

import torch 
import torch.nn as tnn
import paddle 
import paddle.nn as pnn
import copy
import numpy as np
import tensorflow as tf 
print(torch.__version__)
print(paddle.__version__)
print(tf.__version__)
1.13.0+cpu
2.4.0
2.11.0

The versions above were the latest at the time of writing (2022-12-09). Since the experiments focus on the loss functions themselves, the inputs are fixed up front: two samples, three classes, single-label (not multi-label).

1 No class weight, no label smooth

y_true = np.array([1,2],dtype=np.int64) # class index
y_pred = np.array([[0.05, 0.95, 0], [0.1, 0.8, 0.1]],dtype=np.float32) # pred

1.1 pytorch output

y_true_torch = torch.from_numpy(y_true)
y_pred_torch = torch.from_numpy(y_pred)
ce_torch = tnn.CrossEntropyLoss(weight=None,label_smoothing=0.0,reduction="mean")
loss_torch=ce_torch(y_pred_torch,y_true_torch)
print("torch loss:",loss_torch.numpy())  
ce_torch = tnn.CrossEntropyLoss(weight=None,label_smoothing=0.0,reduction='none')
loss_torch=ce_torch(y_pred_torch,y_true_torch)
print("torch loss separate:", loss_torch.numpy())  
torch loss: 0.9868951
torch loss separate: [0.5840635 1.3897266]

1.2 paddle output

y_true_paddle = paddle.to_tensor(y_true)
y_pred_paddle = paddle.to_tensor(y_pred)
y_true_paddle = pnn.functional.one_hot(y_true_paddle,num_classes=3)
ce_paddle = pnn.CrossEntropyLoss(weight=None,soft_label=True,reduction='mean')
loss_paddle=ce_paddle(y_pred_paddle,y_true_paddle)
print("paddle loss:",loss_paddle.numpy())  
ce_paddle = pnn.CrossEntropyLoss(weight=None,soft_label=True,reduction='none')
loss_paddle=ce_paddle(y_pred_paddle,y_true_paddle)
print("paddle loss separate:",loss_paddle.numpy())  
paddle loss: [0.9868951]
paddle loss separate: [[0.58406353]
 [1.3897266 ]]

1.3 tensorflow output

y_true_tf = tf.convert_to_tensor(y_true)
y_pred_tf = tf.convert_to_tensor(y_pred)
y_true_tf = tf.one_hot(y_true_tf,3)
ce_tf = tf.keras.losses.CategoricalCrossentropy(from_logits=True,label_smoothing=0.0,reduction=tf.keras.losses.Reduction.AUTO)
loss_tf=ce_tf(y_true_tf,y_pred_tf,sample_weight=None)
print("tensorflow loss:",loss_tf.numpy())  
ce_tf= tf.keras.losses.CategoricalCrossentropy(from_logits=True,label_smoothing=0.0,reduction=tf.keras.losses.Reduction.NONE)
loss_tf=ce_tf(y_true_tf,y_pred_tf,sample_weight=None)
print("tensorflow loss separate:",loss_tf.numpy()) 
tensorflow loss: 0.9868951
tensorflow loss separate: [0.5840635 1.3897266]

As the outputs above show, all three frameworks produce identical results.
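As a sanity check, the same numbers fall out of a minimal NumPy reimplementation (a sketch assuming from_logits=True, i.e. log-softmax followed by the negative log-likelihood of the true class):

logits = np.array([[0.05, 0.95, 0], [0.1, 0.8, 0.1]], dtype=np.float32)
labels = np.array([1, 2])
# log_softmax, then pick the log-probability of the true class per sample
log_probs = logits - np.log(np.exp(logits).sum(axis=-1, keepdims=True))
per_sample = -log_probs[np.arange(len(labels)), labels]
print(per_sample)         # ~[0.5840635 1.3897266]
print(per_sample.mean())  # ~0.9868951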

2 Label smooth without class weight

y_true = np.array([1,2],dtype=np.int64) # class index
y_pred = np.array([[0.05, 0.95, 0], [0.1, 0.8, 0.1]],dtype=np.float32) # pred

2.1 pytorch output

y_true_torch = torch.from_numpy(y_true)
y_pred_torch = torch.from_numpy(y_pred)
ce_torch = tnn.CrossEntropyLoss(weight=None,label_smoothing=0.1,reduction="mean")
loss_torch=ce_torch(y_pred_torch,y_true_torch)
print("torch loss:",loss_torch.numpy())  
ce_torch = tnn.CrossEntropyLoss(weight=None,label_smoothing=0.1,reduction='none')
loss_torch=ce_torch(y_pred_torch,y_true_torch)
print("torch loss separate:", loss_torch.numpy())  
torch loss: 1.0060617
torch loss separate: [0.64573014 1.3663933 ]

2.2 paddle output

y_true_paddle = paddle.to_tensor(y_true)
y_pred_paddle = paddle.to_tensor(y_pred)
y_true_paddle = pnn.functional.one_hot(y_true_paddle,num_classes=3)
y_true_paddle = pnn.functional.label_smooth(y_true_paddle,epsilon=0.1)
ce_paddle = pnn.CrossEntropyLoss(weight=None,soft_label=True,reduction='mean')
loss_paddle=ce_paddle(y_pred_paddle,y_true_paddle)
print("paddle loss:",loss_paddle.numpy())  
ce_paddle = pnn.CrossEntropyLoss(weight=None,soft_label=True,reduction='none')
loss_paddle=ce_paddle(y_pred_paddle,y_true_paddle)
print("paddle loss separate:",loss_paddle.numpy())  
paddle loss: [1.0060618]
paddle loss separate: [[0.6457302]
 [1.3663934]]

2.3 tensorflow output

y_true_tf = tf.convert_to_tensor(y_true)
y_pred_tf = tf.convert_to_tensor(y_pred)
y_true_tf = tf.one_hot(y_true_tf,3)
ce_tf = tf.keras.losses.CategoricalCrossentropy(from_logits=True,label_smoothing=0.1,reduction=tf.keras.losses.Reduction.AUTO)
loss_tf=ce_tf(y_true_tf,y_pred_tf,sample_weight=None)
print("tensorflow loss:",loss_tf.numpy())  
ce_tf= tf.keras.losses.CategoricalCrossentropy(from_logits=True,label_smoothing=0.1,reduction=tf.keras.losses.Reduction.NONE)
loss_tf=ce_tf(y_true_tf,y_pred_tf,sample_weight=None)
print("tensorflow loss separate:",loss_tf.numpy())
tensorflow loss: 1.0060618
tensorflow loss separate: [0.64573014 1.3663933 ]

As the code above shows, the three frameworks again agree to within floating-point precision.
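The same sanity check extends to label smoothing: build the smoothed target distribution first, then take the cross entropy against it (again a sketch assuming from_logits=True):

eps, C = 0.1, 3
logits = np.array([[0.05, 0.95, 0], [0.1, 0.8, 0.1]], dtype=np.float32)
smoothed = np.eye(C)[[1, 2]] * (1 - eps) + eps / C  # smoothed one-hot targets
log_probs = logits - np.log(np.exp(logits).sum(axis=-1, keepdims=True))
per_sample = -(smoothed * log_probs).sum(axis=-1)
print(per_sample)         # ~[0.6457301 1.3663933]
print(per_sample.mean())  # ~1.0060617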

3 Class weight without label smooth

From the code's point of view, "no label smooth" simply means setting label_smoothing to 0.0.

y_true = np.array([1,2],dtype=np.int64) # class index
y_pred = np.array([[0.05, 0.95, 0], [0.1, 0.8, 0.1]],dtype=np.float32) # pred

3.1 pytorch output

y_true_torch = torch.from_numpy(y_true)
y_pred_torch = torch.from_numpy(y_pred)
weight_torch = torch.from_numpy(np.array([1,2,3],dtype=np.float32))
ce_torch = tnn.CrossEntropyLoss(weight=weight_torch,label_smoothing=0.0,reduction="mean")
loss_torch=ce_torch(y_pred_torch,y_true_torch)
print("torch loss:",loss_torch.numpy())  
ce_torch = tnn.CrossEntropyLoss(weight=weight_torch,label_smoothing=0.0,reduction='none')
loss_torch=ce_torch(y_pred_torch,y_true_torch)
print("torch loss separate:", loss_torch.numpy())  
torch loss: 1.0674614
torch loss separate: [1.168127 4.16918 ]

3.2 paddle output

y_true_paddle = paddle.to_tensor(y_true)
y_pred_paddle = paddle.to_tensor(y_pred)
weight_paddle = paddle.to_tensor(np.array([1,2,3],dtype=np.float32))
y_true_paddle = pnn.functional.one_hot(y_true_paddle,num_classes=3)
y_true_paddle = pnn.functional.label_smooth(y_true_paddle,epsilon=0.0)
ce_paddle = pnn.CrossEntropyLoss(weight=weight_paddle,soft_label=True,reduction='mean')
loss_paddle=ce_paddle(y_pred_paddle,y_true_paddle)
print("paddle loss:",loss_paddle.numpy())  
ce_paddle = pnn.CrossEntropyLoss(weight=weight_paddle,soft_label=True,reduction='none')
loss_paddle=ce_paddle(y_pred_paddle,y_true_paddle)
print("paddle loss separate:",loss_paddle.numpy()) 
paddle loss: [1.0674614]
paddle loss separate: [[1.1681271]
 [4.16918  ]]

3.3 tensorflow output

Here, because tensorflow's loss has no class-weight argument, the class weights must be converted into per-sample weights.

y_true_tf = tf.convert_to_tensor(y_true)
y_pred_tf = tf.convert_to_tensor(y_pred)
y_true_tf = tf.one_hot(y_true_tf,3)
weight_tf= tf.constant([1.,2.,3.])
weights = tf.reduce_sum(weight_tf * y_true_tf, axis=-1)  # convert class weight to sample weight
ce_tf = tf.keras.losses.CategoricalCrossentropy(from_logits=True,label_smoothing=0.0,reduction=tf.keras.losses.Reduction.AUTO)
loss_tf=ce_tf(y_true_tf,y_pred_tf,sample_weight=weights)
print("tensorflow loss:",loss_tf.numpy())  
ce_tf= tf.keras.losses.CategoricalCrossentropy(from_logits=True,label_smoothing=0.0,reduction=tf.keras.losses.Reduction.NONE)
loss_tf=ce_tf(y_true_tf,y_pred_tf,sample_weight=weights)
print("tensorflow loss separate:",loss_tf.numpy())
tensorflow loss: 2.6686535
tensorflow loss separate: [1.168127 4.16918 ]

With reduction set to none, tensorflow's per-sample losses match the other two frameworks exactly, and paddle agrees with pytorch completely. The difference lies in the averaging: tensorflow divides by the number of samples, while the other two frameworks divide by the sum of the sample weights, as the following shows.

print("tensorflow output:",tf.reduce_mean(loss_tf).numpy())
print("pytorch and paddle output:",(tf.reduce_sum(loss_tf)/tf.reduce_sum(weights)).numpy())
tensorflow output: 2.6686535
pytorch and paddle output: 1.0674614

4 Both class weight and label smooth

y_true = np.array([1,2],dtype=np.int64) # class index
y_pred = np.array([[0.05, 0.95, 0], [0.1, 0.8, 0.1]],dtype=np.float32) # pred

4.1 pytorch output

y_true_torch = torch.from_numpy(y_true)
y_pred_torch = torch.from_numpy(y_pred)
weight_torch = torch.from_numpy(np.array([1,2,3],dtype=np.float32))
ce_torch = tnn.CrossEntropyLoss(weight=weight_torch,label_smoothing=0.1,reduction="mean")
loss_torch=ce_torch(y_pred_torch,y_true_torch)
print("torch loss:",loss_torch.numpy())  
ce_torch = tnn.CrossEntropyLoss(weight=weight_torch,label_smoothing=0.1,reduction='none')
loss_torch=ce_torch(y_pred_torch,y_true_torch)
print("torch loss separate:", loss_torch.numpy())  
torch loss: 1.0553335
torch loss separate: [1.293127  3.9835405]

4.2 paddle output

y_true_paddle = paddle.to_tensor(y_true)
y_pred_paddle = paddle.to_tensor(y_pred)
weight_paddle = paddle.to_tensor(np.array([1,2,3],dtype=np.float32))
y_true_paddle = pnn.functional.one_hot(y_true_paddle,num_classes=3)
y_true_paddle = pnn.functional.label_smooth(y_true_paddle,epsilon=0.1)
ce_paddle = pnn.CrossEntropyLoss(weight=weight_paddle,soft_label=True,reduction='mean')
loss_paddle=ce_paddle(y_pred_paddle,y_true_paddle)
print("paddle loss:",loss_paddle.numpy())  
ce_paddle = pnn.CrossEntropyLoss(weight=weight_paddle,soft_label=True,reduction='none')
loss_paddle=ce_paddle(y_pred_paddle,y_true_paddle)
print("paddle loss separate:",loss_paddle.numpy()) 
paddle loss: [1.0722452]
paddle loss separate: [[1.2914604]
 [3.9625409]]


d:\ProgramData\Anaconda3\lib\site-packages\paddle\fluid\dygraph\math_op_patch.py:275: UserWarning: The dtype of left and right variables are not the same, left dtype is paddle.float32, but right dtype is paddle.bool, the right dtype will convert to paddle.float32
  warnings.warn(

4.3 tensorflow output

y_true_tf = tf.convert_to_tensor(y_true)
y_pred_tf = tf.convert_to_tensor(y_pred)
y_true_tf = tf.one_hot(y_true_tf,3)
weight_tf= tf.constant([1.,2.,3.])
weights = tf.reduce_sum(weight_tf * y_true_tf, axis=-1)  # convert class weight to sample weight
ce_tf = tf.keras.losses.CategoricalCrossentropy(from_logits=True,label_smoothing=0.1,reduction=tf.keras.losses.Reduction.AUTO)
loss_tf=ce_tf(y_true_tf,y_pred_tf,sample_weight=weights)
print("tensorflow loss:",loss_tf.numpy())  
ce_tf= tf.keras.losses.CategoricalCrossentropy(from_logits=True,label_smoothing=0.1,reduction=tf.keras.losses.Reduction.NONE)
loss_tf=ce_tf(y_true_tf,y_pred_tf,sample_weight=weights)
print("tensorflow loss separate:",loss_tf.numpy())
tensorflow loss: 2.6953201
tensorflow loss separate: [1.2914603 4.09918  ]

Printing the two reductions as in 3.3:

print("tensorflow output:",tf.reduce_mean(loss_tf).numpy())
print("weight output:",(tf.reduce_sum(loss_tf)/tf.reduce_sum(weights)).numpy())
tensorflow output: 2.6953201
weight output: 1.0781281

From these results, the three frameworks now all differ. As for the pytorch result, I have looked at the formula in the official documentation but have not fully worked it out, and I have not read the source code, so I will not go deeper here; a hypothesis that does reproduce the numbers is sketched below. The paddle and tensorflow results I do understand; see the code in the next section.
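That hypothesis (an assumption checked only against the outputs above, not against the pytorch source): apply the class weight to every class term inside the smoothed cross-entropy sum, and for the "mean" reduction divide by the summed weights of the hard target classes.

eps, C = 0.1, 3
w = np.array([1., 2., 3.])
logits = np.array([[0.05, 0.95, 0], [0.1, 0.8, 0.1]], dtype=np.float32)
labels = np.array([1, 2])
smoothed = np.eye(C)[labels] * (1 - eps) + eps / C
log_probs = logits - np.log(np.exp(logits).sum(axis=-1, keepdims=True))
# hypothesis: weight each class term, not just the target class
per_sample = -(w * smoothed * log_probs).sum(axis=-1)
print(per_sample)                          # ~[1.293127 3.9835405]
print(per_sample.sum() / w[labels].sum())  # ~1.0553335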

5 A class weight implementation for tensorflow

Personally, I find the behavior in 4.3 the most convincing when class weight and label smooth are used together; the reduction can be either a plain mean or a weighted mean, and I lean toward the plain mean, though I have no solid theoretical backing for that preference.

y_true = np.array([1,2],dtype=np.int64) # class index
y_pred = np.array([[0.05, 0.95, 0], [0.1, 0.8, 0.1]],dtype=np.float32) # pred


# For tensorflow, a loss with class weights has to be written by hand
# @tf.keras.utils.register_keras_serializable(package="weightedcrossentropyloss")
class WeightedCategoricalCrossentropy1(tf.keras.losses.Loss):
    """Implements WeightedCategoricalCrossentropy.
    
    Args:
        class_weight: a manual rescaling weight given to each class. If given, has to be a Tensor of size C
        from_logits: Whether y_pred is expected to be a logits tensor. By default, we assume that y_pred encodes a probability distribution.
        label_smoothing:Float in [0, 1]. When > 0, label values are smoothed, meaning the confidence on label values are relaxed. For example, 
        if 0.1, use 0.1 / num_classes for non-target labels and 0.9 + 0.1 / num_classes for target labels.
    """

    def __init__(self, class_weight=None, from_logits=True, label_smoothing=0.0, reduction=tf.keras.losses.Reduction.AUTO, **kwargs):
        super().__init__(**kwargs)
        self.label_smoothing = label_smoothing
        self.from_logits=from_logits
        self.class_weight=class_weight
        self.reduction=reduction


    def _labelsmoothing(self, y_true, class_num):
        if len(y_true.shape) == 1 or y_true.shape[-1] != class_num:
            raise Exception("Please use one hot label")
        y_true = y_true*(1 - self.label_smoothing)+self.label_smoothing / class_num

        return y_true

    def call(self, y_true, y_pred):
        y_pred = tf.convert_to_tensor(y_pred)
        y_true = tf.cast(y_true, y_pred.dtype)
        
        class_num = y_pred.shape[-1]
        if self.class_weight is not None:
            # sample weights computed from the hard labels, BEFORE smoothing
            weights = tf.reduce_sum(self.class_weight * y_true, axis=-1)
        if self.label_smoothing:
            y_true = self._labelsmoothing(y_true, class_num)
        if self.from_logits:
            y_pred = -tf.nn.log_softmax(y_pred, axis=-1)
        else:
            y_pred = -tf.math.log(y_pred)  # tf.math.log takes no axis argument

        loss = tf.reduce_sum(y_pred * y_true, axis=-1)
        if self.class_weight is not None:
            loss = loss * weights
        if self.reduction == tf.keras.losses.Reduction.AUTO:
            loss = tf.reduce_mean(loss, axis=-1)
        elif self.reduction == tf.keras.losses.Reduction.SUM:
            loss = tf.reduce_sum(loss, axis=-1)  # SUM should sum, not average
        return loss
    def get_config(self):
        config = super().get_config()
        config.update(
            {
                "class_weight": self.class_weight,
                "from_logits": self.from_logits,
                "label_smoothing": self.label_smoothing,
            }
        )
        return config

class WeightedCategoricalCrossentropy2(tf.keras.losses.Loss):
    """Implements WeightedCategoricalCrossentropy.
    
    Args:
        class_weight: a manual rescaling weight given to each class. If given, has to be a Tensor of size C
        from_logits: Whether y_pred is expected to be a logits tensor. By default, we assume that y_pred encodes a probability distribution.
        label_smoothing:Float in [0, 1]. When > 0, label values are smoothed, meaning the confidence on label values are relaxed. For example, 
        if 0.1, use 0.1 / num_classes for non-target labels and 0.9 + 0.1 / num_classes for target labels.
    """

    def __init__(self, class_weight=None, from_logits=True, label_smoothing=0.0, reduction=tf.keras.losses.Reduction.AUTO, **kwargs):
        super().__init__(**kwargs)
        self.label_smoothing = label_smoothing
        self.from_logits=from_logits
        self.class_weight=class_weight
        self.reduction=reduction


    def _labelsmoothing(self, y_true, class_num):
        if len(y_true.shape) == 1 or y_true.shape[-1] != class_num:
            raise Exception("Please use one hot label")
        y_true = y_true*(1 - self.label_smoothing)+self.label_smoothing / class_num

        return y_true

    def call(self, y_true, y_pred):
        y_pred = tf.convert_to_tensor(y_pred)
        y_true = tf.cast(y_true, y_pred.dtype)
        
        class_num = y_pred.shape[-1]
        if self.label_smoothing:
            y_true = self._labelsmoothing(y_true, class_num)
        if self.class_weight is not None:
            # sample weights computed from the smoothed labels, AFTER smoothing
            weights = tf.reduce_sum(self.class_weight * y_true, axis=-1)
        if self.from_logits:
            y_pred = -tf.nn.log_softmax(y_pred, axis=-1)
        else:
            y_pred = -tf.math.log(y_pred)  # tf.math.log takes no axis argument

        loss = tf.reduce_sum(y_pred * y_true, axis=-1)
        if self.class_weight is not None:
            loss = loss * weights
        if self.reduction == tf.keras.losses.Reduction.AUTO:
            loss = tf.reduce_mean(loss, axis=-1)
        elif self.reduction == tf.keras.losses.Reduction.SUM:
            loss = tf.reduce_sum(loss, axis=-1)  # SUM should sum, not average
        return loss
    def get_config(self):
        config = super().get_config()
        config.update(
            {
                "class_weight": self.class_weight,
                "from_logits": self.from_logits,
                "label_smoothing": self.label_smoothing,
            }
        )
        return config

In the code above, the two classes differ in only one small detail: the point at which the sample weights are computed, before label smoothing in the first class and after it in the second. Now let's look at the results.

One additional note:

if self.from_logits:
    y_pred = -tf.nn.log_softmax(y_pred, axis=-1)
else:
    y_pred = -tf.math.log(y_pred)
    
loss = tf.reduce_sum(y_pred * y_true, axis=-1)

In the from_logits=True case, the lines above can be replaced with a single call:

 tf.nn.softmax_cross_entropy_with_logits(labels=y_true, logits=y_pred)

5.1 WeightedCategoricalCrossentropy1 results

y_true_tf = tf.convert_to_tensor(y_true)
y_pred_tf = tf.convert_to_tensor(y_pred)
y_true_tf = tf.one_hot(y_true_tf,3)
weight_tf = tf.constant([1.,2.,3.])
ce_tf = WeightedCategoricalCrossentropy1(class_weight=weight_tf,from_logits=True,label_smoothing=0.1,reduction=tf.keras.losses.Reduction.AUTO)
loss_tf=ce_tf(y_true_tf,y_pred_tf)
print("tensorflow loss:",loss_tf.numpy())  
ce_tf= WeightedCategoricalCrossentropy1(class_weight=weight_tf,from_logits=True,label_smoothing=0.1,reduction=tf.keras.losses.Reduction.NONE)
loss_tf=ce_tf(y_true_tf,y_pred_tf)
print("tensorflow loss separate:",loss_tf.numpy())
tensorflow loss: 2.6953201
tensorflow loss separate: [1.2914603 4.09918  ]
y_true_tf = tf.convert_to_tensor(y_true)
y_true_tf = tf.one_hot(y_true_tf,3)
weight_tf = tf.constant([1.,2.,3.])
weights = tf.reduce_sum(weight_tf * y_true_tf, axis=-1)
print("weights:",weights)
print("tensorflow output:",tf.reduce_mean(loss_tf).numpy())
print("weight output:",(tf.reduce_sum(loss_tf)/tf.reduce_sum(weights)).numpy())
weights: tf.Tensor([2. 3.], shape=(2,), dtype=float32)
tensorflow output: 2.6953201
weight output: 1.0781281

This matches the 4.3 result exactly; in other words, the tensorflow sample_weight approach computes the class weight before label smoothing.

5.2 WeightedCategoricalCrossentropy2 results

y_true_tf = tf.convert_to_tensor(y_true)
y_pred_tf = tf.convert_to_tensor(y_pred)
y_true_tf = tf.one_hot(y_true_tf,3)
weight_tf = tf.constant([1.,2.,3.])
ce_tf = WeightedCategoricalCrossentropy2(class_weight=weight_tf,from_logits=True,label_smoothing=0.1,reduction=tf.keras.losses.Reduction.AUTO)
loss_tf=ce_tf(y_true_tf,y_pred_tf)
print("tensorflow loss:",loss_tf.numpy())  
ce_tf= WeightedCategoricalCrossentropy2(class_weight=weight_tf,from_logits=True,label_smoothing=0.1,reduction=tf.keras.losses.Reduction.NONE)
loss_tf=ce_tf(y_true_tf,y_pred_tf)
print("tensorflow loss separate:",loss_tf.numpy())
tensorflow loss: 2.6270003
tensorflow loss separate: [1.2914603 3.9625404]
y_true_tf = tf.convert_to_tensor(y_true)
y_true_tf = tf.one_hot(y_true_tf,3)
y_true_tf = y_true_tf*(1 - 0.1)+0.1 / 3. # label smooth
weight_tf = tf.constant([1.,2.,3.])
weights = tf.reduce_sum(weight_tf * y_true_tf, axis=-1)

print("weights:",weights)
print("tensorflow output:",tf.reduce_mean(loss_tf).numpy())
print("weight output:",(tf.reduce_sum(loss_tf)/tf.reduce_sum(weights)).numpy())
weights: tf.Tensor([2.        2.8999999], shape=(2,), dtype=float32)
tensorflow output: 2.6270003
weight output: 1.0722451

This matches paddle's result in 4.2; in other words, paddle computes the class weight after label smoothing.

5.3 Summary

Both custom class-weight variants can also be expressed with the official tf.keras.losses.CategoricalCrossentropy. The first variant is exactly what 4.3 does; the second only requires computing the sample weights from the smoothed labels, as follows.

y_true_tf = tf.convert_to_tensor(y_true)
y_pred_tf = tf.convert_to_tensor(y_pred)
y_true_tf = tf.one_hot(y_true_tf,3)
y_true_tf = y_true_tf*(1 - 0.1)+0.1 / 3. # label smooth
weight_tf= tf.constant([1.,2.,3.])
weights = tf.reduce_sum(weight_tf * y_true_tf, axis=-1)  # convert class weight to sample weight
ce_tf = tf.keras.losses.CategoricalCrossentropy(from_logits=True,label_smoothing=0.0,reduction=tf.keras.losses.Reduction.AUTO)
loss_tf=ce_tf(y_true_tf,y_pred_tf,sample_weight=weights)
print("tensorflow loss:",loss_tf.numpy())  
ce_tf= tf.keras.losses.CategoricalCrossentropy(from_logits=True,label_smoothing=0.0,reduction=tf.keras.losses.Reduction.NONE)
loss_tf=ce_tf(y_true_tf,y_pred_tf,sample_weight=weights)
print("tensorflow loss separate:",loss_tf.numpy())
tensorflow loss: 2.6270003
tensorflow loss separate: [1.2914603 3.9625404]
print("weights:",weights)
print("tensorflow output:",tf.reduce_mean(loss_tf).numpy())
print("weight output:",(tf.reduce_sum(loss_tf)/tf.reduce_sum(weights)).numpy())
weights: tf.Tensor([2.        2.8999999], shape=(2,), dtype=float32)
tensorflow output: 2.6270003
weight output: 1.0722451

The numbers line up. Once the two custom class-weight implementations are understood, the tensorflow and paddle results both follow.

In short, all of section 5 can be read as computing an unweighted loss and a per-sample weight, and then combining the two into the weighted loss.
