Review — Adversarial Examples Improve Image Recognition

Training with Both Clean Samples and Adversarial Samples

Jul 4, 2022

Adversarial Examples Improve Image Recognition
AdvProp, by Google and Johns Hopkins University
2020 CVPR, Over 280 Citations (Sik-Ho Tsang @ Medium)
Image Classification, Adversarial Attack

Adversarial examples are commonly viewed as a threat to ConvNets.
In this paper, AdvProp, an enhanced adversarial training scheme, is proposed, which treats adversarial examples as additional examples to improve the classification accuracy.
This is the first paper to use adversarial samples to improve the performance.

Outline

1. Adversarial Attack Preliminaries
2. AdvProp
3. Experimental Results

1. Adversarial Attack Preliminaries

**Adversarial Attack by Adding Image Noise**

The above shows a typical example of adversarial attack.
Originally, the input image x is correctly classified as panda.
With added noise, the network misclassifies panda as gibbon.
Thus, approaches proposed for defending against such attacks usually sacrifice accuracy on clean images in exchange for robustness.

However, in this paper, adversarial samples are used to improve the accuracy.
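
To make the panda example concrete, here is a minimal sketch of how such noise can be constructed. It assumes the fast gradient sign method (FGSM) and a toy logistic-regression "network"; the review does not state which attack produced the figure, and the weights, input, and `eps` below are hypothetical values chosen only for illustration.

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def predict(w, b, x):
    """Probability of class 1 under a toy logistic-regression model."""
    return sigmoid(sum(wi * xi for wi, xi in zip(w, x)) + b)

def loss(w, b, x, y):
    """Binary cross-entropy for a single example."""
    p = predict(w, b, x)
    return -(y * math.log(p) + (1 - y) * math.log(1 - p))

def fgsm(w, b, x, y, eps):
    """One signed-gradient step on the input: x + eps * sign(dL/dx)."""
    p = predict(w, b, x)
    grad_x = [(p - y) * wi for wi in w]  # dL/dx for logistic regression
    return [xi + eps * ((g > 0) - (g < 0)) for xi, g in zip(x, grad_x)]

# Hypothetical toy model and input (not from the paper).
w, b = [2.0, -3.0, 1.0], 0.0
x, y = [0.5, 0.1, 0.2], 1

x_adv = fgsm(w, b, x, y, eps=0.3)
print("clean prediction:", predict(w, b, x))            # above 0.5, class 1
print("adversarial prediction:", predict(w, b, x_adv))  # below 0.5, class 0
```

Each input coordinate moves by at most `eps`, yet the predicted class flips, which is the same effect as the panda being classified as a gibbon.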

2. AdvProp

2.1. Adversarial Training

The vanilla training without adversarial samples is:

$$\arg\min_{\theta} \; \mathbb{E}_{(x,y)\sim D} \big[ L(\theta, x, y) \big]$$

where D is the underlying data distribution, L(·) is the loss function, θ is the network parameter, and x is a training sample with ground-truth label y.
Madry’s adversarial training framework [23], instead of training with the original samples, trains networks with maliciously perturbed samples:

$$\arg\min_{\theta} \; \mathbb{E}_{(x,y)\sim D} \Big[ \max_{\epsilon \in S} L(\theta, x + \epsilon, y) \Big]$$

where ε is an adversarial perturbation and S is the allowed perturbation range.
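
Finding a perturbation ε within S that maximizes the loss is typically approximated iteratively. The sketch below uses projected gradient steps on a toy logistic-regression model, assuming S is an L∞ ball of radius `eps`. The model, the step size, the iteration count, and starting from the clean input (rather than a random point inside S) are all illustrative assumptions, not the paper's settings.

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def loss(w, b, x, y):
    """Binary cross-entropy of a toy logistic-regression model."""
    p = sigmoid(sum(wi * xi for wi, xi in zip(w, x)) + b)
    return -(y * math.log(p) + (1 - y) * math.log(1 - p))

def input_grad(w, b, x, y):
    """Gradient of the loss with respect to the input x."""
    p = sigmoid(sum(wi * xi for wi, xi in zip(w, x)) + b)
    return [(p - y) * wi for wi in w]

def pgd_attack(w, b, x, y, eps, step, n_iter):
    """Approximate max over |e|_inf <= eps of L(theta, x + e, y)."""
    x_adv = list(x)
    for _ in range(n_iter):
        g = input_grad(w, b, x_adv, y)
        # Ascent step along the sign of the gradient.
        x_adv = [xa + step * ((gi > 0) - (gi < 0)) for xa, gi in zip(x_adv, g)]
        # Projection back into S: clip each coordinate to [x - eps, x + eps].
        x_adv = [min(max(xa, xi - eps), xi + eps) for xa, xi in zip(x_adv, x)]
    return x_adv

# Hypothetical toy values.
w, b = [2.0, -3.0, 1.0], 0.0
x, y = [0.5, 0.1, 0.2], 1
x_adv = pgd_attack(w, b, x, y, eps=0.1, step=0.04, n_iter=5)
print(loss(w, b, x, y), loss(w, b, x_adv, y))  # the second value is larger
```

The projection step is what keeps the perturbation inside the allowed range S, no matter how many ascent steps are taken.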
Unlike Madry’s adversarial training, the main goal here is to improve network performance on clean images by leveraging the regularization power of adversarial examples.
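Therefore, adversarial images are treated as additional training samples, and the networks are trained with a mixture of adversarial examples and clean images:

$$\arg\min_{\theta} \; \mathbb{E}_{(x,y)\sim D} \Big[ L(\theta, x, y) + \max_{\epsilon \in S} L(\theta, x + \epsilon, y) \Big]$$
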
However, directly optimizing the above equation generally yields lower performance than the vanilla training setting on clean images.
It is hypothesized that the distribution mismatch between adversarial examples and clean images prevents networks from accurately and effectively distilling valuable features from both domains.

Thus, an auxiliary batch norm (BN) design is proposed to properly disentangle the different distributions.

2.2. Disentangled Learning via An Auxiliary BN

Specifically, BN normalizes input features by the mean and variance computed within each mini-batch.

This normalization behavior can be problematic if the mini-batch contains data from different distributions, because the estimated statistics then describe neither distribution accurately.

An auxiliary BN is therefore introduced, whose normalization statistics are computed exclusively on the adversarial examples.

The proposed auxiliary BN helps to disentangle the mixed distributions by keeping separate BNs for features that belong to different domains.
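
The effect of separate statistics can be illustrated with a small sketch. The class name `DualBatchNorm`, the branch labels, and the toy feature values are hypothetical, and the learnable scale and shift of a real BN layer are omitted for brevity; this is an illustration of the idea, not the authors' implementation.

```python
import math
from statistics import fmean, pvariance

class DualBatchNorm:
    """1-D batch norm that keeps separate running statistics per branch."""

    def __init__(self, momentum=0.1, eps=1e-5):
        self.momentum = momentum
        self.eps = eps
        # [running mean, running variance] for each branch.
        self.running = {"main": [0.0, 1.0], "aux": [0.0, 1.0]}

    def __call__(self, values, branch):
        mu = fmean(values)
        var = pvariance(values, mu)
        run = self.running[branch]
        m = self.momentum
        run[0] = (1 - m) * run[0] + m * mu
        run[1] = (1 - m) * run[1] + m * var
        return [(v - mu) / math.sqrt(var + self.eps) for v in values]

# Hypothetical features: the adversarial ones are shifted relative to the clean ones.
clean = [-1.0, 0.0, 1.0, 0.0]
adv = [2.0, 3.0, 4.0, 3.0]

# A single BN over the mixed batch: the clean features are no longer zero-mean.
single = DualBatchNorm()
mixed = single(clean + adv, "main")
print(fmean(mixed[:4]))  # clearly negative

# Separate BNs: each domain is normalized with its own statistics.
dual = DualBatchNorm()
clean_out = dual(clean, "main")
adv_out = dual(adv, "aux")
print(fmean(clean_out), fmean(adv_out))  # both approximately zero
```

With one shared BN, the statistics sit between the two domains and fit neither; with two BNs, the main branch only ever sees clean statistics.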

2.3. Overall Approach

For each clean mini-batch, the network is first attacked using the auxiliary BNs to generate its adversarial counterpart.
Next, the clean mini-batch and the adversarial mini-batch are fed to the same network but normalized by different BNs for the loss calculation, i.e., the main BNs are used for the clean mini-batch and the auxiliary BNs are used for the adversarial mini-batch.
Finally, the total loss, i.e., the sum of the clean loss and the adversarial loss, is minimized with respect to the network parameters.
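
The three steps above can be sketched on a toy model. Everything here is an assumption made for illustration: a 1-D input, a single BN with a per-branch scale and shift (`"main"` and `"aux"`), shared weights `w` and `b`, finite differences standing in for backpropagation in the attack, and hypothetical hyperparameters. It is not the authors' implementation.

```python
import math
from statistics import fmean, pvariance

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def forward(params, xs, branch):
    """Shared w, b; the BN scale/shift (gamma, beta) is chosen by branch."""
    gamma, beta = params[branch]
    mu = fmean(xs)
    sd = math.sqrt(pvariance(xs, mu) + 1e-5)
    zhat = [(x - mu) / sd for x in xs]
    probs = [sigmoid(params["w"] * (gamma * z + beta) + params["b"]) for z in zhat]
    return zhat, probs

def batch_loss(params, xs, ys, branch):
    _, probs = forward(params, xs, branch)
    return fmean([-(y * math.log(p) + (1 - y) * math.log(1 - p))
                  for p, y in zip(probs, ys)])

def pgd_attack(params, xs, ys, eps, step, n_iter, h=1e-4):
    """Step 1: attack the network through the auxiliary BN branch."""
    adv = list(xs)
    for _ in range(n_iter):
        grads = []
        for i in range(len(adv)):  # central finite differences on each input
            up = adv[:i] + [adv[i] + h] + adv[i + 1:]
            dn = adv[:i] + [adv[i] - h] + adv[i + 1:]
            grads.append((batch_loss(params, up, ys, "aux")
                          - batch_loss(params, dn, ys, "aux")) / (2 * h))
        adv = [a + step * ((g > 0) - (g < 0)) for a, g in zip(adv, grads)]
        adv = [min(max(a, x - eps), x + eps) for a, x in zip(adv, xs)]
    return adv

def advprop_step(params, xs, ys, lr=0.5, eps=0.3, step=0.1, n_iter=3):
    adv = pgd_attack(params, xs, ys, eps, step, n_iter)
    grads = {"w": 0.0, "b": 0.0, "main": [0.0, 0.0], "aux": [0.0, 0.0]}
    # Step 2: clean batch through the main BN, adversarial batch through the aux BN.
    for batch, branch in ((xs, "main"), (adv, "aux")):
        gamma, beta = params[branch]
        zhat, probs = forward(params, batch, branch)
        err = [p - y for p, y in zip(probs, ys)]  # dLoss/dlogit per example
        n = len(batch)
        grads["w"] += sum(e * (gamma * z + beta) for e, z in zip(err, zhat)) / n
        grads["b"] += sum(err) / n
        grads[branch][0] += params["w"] * sum(e * z for e, z in zip(err, zhat)) / n
        grads[branch][1] += params["w"] * sum(err) / n
    # Step 3: one gradient step on the total (clean + adversarial) loss.
    new_params = {
        "w": params["w"] - lr * grads["w"],
        "b": params["b"] - lr * grads["b"],
        "main": [p - lr * g for p, g in zip(params["main"], grads["main"])],
        "aux": [p - lr * g for p, g in zip(params["aux"], grads["aux"])],
    }
    return new_params, adv

# Hypothetical toy data: negative inputs are class 0, positive inputs are class 1.
xs = [-2.0, -1.0, -0.5, 0.5, 1.0, 2.0]
ys = [0, 0, 0, 1, 1, 1]
params = {"w": 0.1, "b": 0.0, "main": [1.0, 0.0], "aux": [1.0, 0.0]}
for _ in range(20):
    params, adv = advprop_step(params, xs, ys)
print(batch_loss(params, xs, ys, "main"))
```

Note that `w` and `b` receive gradients from both batches, while each BN branch is updated only by the batch that passed through it.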

3. Experimental Results

3.1. ImageNet

**AdvProp boosts model performance over the vanilla training baseline on ImageNet**

EfficientNet is used as the network, and Projected Gradient Descent (PGD) [23] is used as the attacker.

As seen above, the proposed AdvProp substantially outperforms the vanilla training baseline on all networks.

The gain tends to be larger for bigger networks: it is at most 0.4% for networks smaller than EfficientNet-B4, but at least 0.6% for networks larger than EfficientNet-B4.

3.2. Generalization on Distorted ImageNet Datasets

**AdvProp significantly boosts models’ generalization ability on ImageNet-C, ImageNet-A and Stylized-ImageNet**

The proposed AdvProp consistently outperforms the vanilla training baseline for all models on all distorted datasets.

These are the best results so far if models are not allowed to train with corresponding distortions [6] or extra data [24, 46].
(The paper contains further results; please feel free to read it directly.)

Reference

[2020 CVPR] [AdvProp]
Adversarial Examples Improve Image Recognition

Image Classification

1989 … 2020 … [AdvProp] 2021 [Learned Resizer] [Vision Transformer, ViT] [ResNet Strikes Back] [DeiT] [EfficientNetV2] [MLP-Mixer] [T2T-ViT] [Swin Transformer] [CaiT] [ResMLP] [ResNet-RS] [NFNet] [PVT, PVTv1] [CvT] [HaloNet] [TNT] [CoAtNet] [Focal Transformer] [TResNet] [CPVT] [Twins] 2022 [ConvNeXt]
