Weakness of adversarial training: the model overfits to the attack used during training and hence does not generalize to test data.
Curriculum adversarial training
Idea: train the model against progressively stronger attacks, from weak to strong.
Method
Let $l$ denote the attack strength and $K$ the maximal attack strength. $\mathcal{A}(l)$ denotes an attack class parameterized by $l$.
Basic curriculum learning
i). Start from no attack ($l = 0$);
ii). train the model for one epoch and, once finished, compute the $\tilde{l}$-accuracy;
iii-a). if the $\tilde{l}$-accuracy has increased at least once over the last 10 epochs, continue training;
iii-b). if the $\tilde{l}$-accuracy has not increased over the last 10 epochs, reset the model parameters to the best ones (i.e., those from 10 epochs ago) and increase $l$ by 1;
iv). stop when $l > K$.
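A minimal PyTorch sketch of this loop, assuming (as in the paper) that the strength $l$ of a PGD attack is its number of iterations; `pgd_attack`, the perturbation budget `eps`, the step size `step`, and the training/evaluation plumbing are our own illustrative choices, not the authors' implementation.

```python
import copy

import torch
import torch.nn.functional as F


def pgd_attack(model, x, y, strength, eps=8 / 255, step=2 / 255):
    """PGD with `strength` iterations; strength 0 returns x unchanged.

    Attack strength is the number of PGD steps (as in the paper);
    eps and step are illustrative L-infinity budget and step size.
    """
    x_adv = x.clone().detach()
    for _ in range(strength):
        x_adv.requires_grad_(True)
        loss = F.cross_entropy(model(x_adv), y)
        grad = torch.autograd.grad(loss, x_adv)[0]
        x_adv = x_adv.detach() + step * grad.sign()
        # Project back into the eps-ball around x and the valid pixel range.
        x_adv = torch.max(torch.min(x_adv, x + eps), x - eps).clamp(0.0, 1.0)
    return x_adv.detach()


def curriculum_adversarial_training(model, opt, train_loader, val_loader, K,
                                    patience=10, max_epochs=1000):
    """Basic curriculum loop: raise the attack strength l from 0 to K,
    moving on only once the validation accuracy under the current
    attack stops improving."""
    l, best_acc, best_state, stale = 0, -1.0, None, 0
    for _ in range(max_epochs):
        model.train()
        for x, y in train_loader:                  # one epoch at strength l
            loss = F.cross_entropy(model(pgd_attack(model, x, y, l)), y)
            opt.zero_grad()
            loss.backward()
            opt.step()
        model.eval()
        correct = total = 0
        for x, y in val_loader:                    # accuracy under attack
            pred = model(pgd_attack(model, x, y, l)).argmax(dim=1)
            correct += (pred == y).sum().item()
            total += y.numel()
        acc = correct / total
        if acc > best_acc:                         # improved within window
            best_acc, stale = acc, 0
            best_state = copy.deepcopy(model.state_dict())
        else:
            stale += 1
        if stale >= patience:                      # 10 epochs w/o improvement:
            model.load_state_dict(best_state)      # roll back to best params,
            l, best_acc, stale = l + 1, -1.0, 0    # then strengthen the attack
            if l > K:
                break                              # curriculum finished
    return model
```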
Benefit: Training efficiency
Additional optimization technique: batch mixing
Motivation: Although basic curriculum training significantly reduces training time, it does not increase robustness. One issue is forgetting: when the model is trained with a larger $l$, it forgets the adversarial examples generated for a smaller $l$.
Solution: Generate adversarial examples using $\mathrm{PGD}(i)$ for each $i \in \{0, 1, \dots, l\}$, and combine them to form a batch. The loss function is updated accordingly as:
$$\sum_{i=0}^{l} \alpha_i \sum_{(x,y) \sim \mathcal{D}} \mathcal{L}\big(f_\theta(\mathcal{A}_i(x)), y\big),$$
where the $\alpha_i$'s are hyperparameters such that $\alpha_i \in [0,1]$ and $\sum_i \alpha_i = 1$. The authors set $\alpha_i = \frac{1}{l+1}$ and generate the same amount of adversarial examples for each attack strength.
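A sketch of this mixed-batch objective, reusing the hypothetical `pgd_attack` helper from the sketch above. Splitting each batch into $l+1$ equal chunks, one per strength, realizes the uniform weights $\alpha_i = \frac{1}{l+1}$; the chunking scheme itself is our illustrative choice.

```python
def batch_mixing_loss(model, x, y, l):
    """Mixed-batch loss sum_{i=0}^{l} alpha_i * L(f_theta(A_i(x)), y)
    with alpha_i = 1/(l+1), so weaker attacks keep contributing and
    their adversarial examples are not forgotten."""
    loss = 0.0
    # One equal-sized chunk per strength i = 0..l (i = 0 stays clean).
    for i, (cx, cy) in enumerate(zip(x.chunk(l + 1), y.chunk(l + 1))):
        adv = pgd_attack(model, cx, cy, strength=i)   # A_i applied to chunk i
        loss = loss + F.cross_entropy(model(adv), cy) / (l + 1)
    return loss
```

In the curriculum loop above, this loss would replace the single-strength cross-entropy term in the inner training loop.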
Additional optimization technique: quantization
Motivation: The model trained with CAT may not defend against attacks that are stronger than the strongest attack used during training.
Solution: Employ quantization, i.e., restrict each input $x \in [0,1]$ to a $b$-bit integer representation.
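A minimal sketch of the quantization step (the bit width `b = 4` is an example value, not the paper's setting):

```python
def quantize(x, b=4):
    """Round each component of x in [0, 1] to the nearest of 2^b levels,
    i.e. represent it as a b-bit integer rescaled back to [0, 1]."""
    levels = 2 ** b - 1
    return torch.round(x * levels) / levels
```

At inference time the model would then be evaluated on `quantize(x)` instead of `x`.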
Rationale: Quantization reduces the space of adversarial examples. Specifically, let $x^\star$ denote the adversarial example. The difference $x^\star - x$ takes values from an infinite space if $x$ is real-valued; in contrast, it takes values from a finite space if $x$ is quantized to an integer vector.
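As a back-of-the-envelope illustration (our own, not from the paper): for a $d$-dimensional input quantized to $b$ bits and an attack bounded by $\|x^\star - x\|_\infty \le \epsilon$, each coordinate of $x^\star - x$ can take at most $2\lfloor\epsilon(2^b-1)\rfloor + 1$ grid values, so there are at most $(2\lfloor\epsilon(2^b-1)\rfloor + 1)^d$ candidate perturbations, rather than a continuum.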
Remark: Quantization is a generic inference-time defense technique. On its own, it has not been shown to provide resilience against strong white-box attacks. However, it is effective when used together with CAT, since the model remembers the adversarial examples generated by weak attacks. Although a stronger attack can better optimize the loss function, the adversarial examples it generates are highly likely to coincide with those generated by a weaker attack, because the entire adversarial-example space is small.
Experiments: CAT improves both efficiency and the empirical worst-case accuracy against adversarial examples (termed resilience).
Reference:
Cai, Qi-Zhi, Chang Liu, and Dawn Song. “Curriculum adversarial training.” In Proceedings of the 27th International Joint Conference on Artificial Intelligence, pp. 3740-3747. 2018.