Review — Adversarial Examples Improve Image Recognition

Training with Both Clean Samples and Adversarial Samples

AdvProp Improves Image Recognition

Adversarial Examples Improve Image Recognition (AdvProp), by Google and Johns Hopkins University
2020 CVPR, Over 280 Citations (Sik-Ho Tsang @ Medium)
Image Classification, Adversarial Attack

  • Adversarial examples are commonly viewed as a threat to ConvNets.
  • In this paper, AdvProp, an enhanced adversarial training scheme, is proposed, which treats adversarial examples as additional examples to improve the classification accuracy.
  • This is the first paper to use adversarial samples to improve the performance.

Outline

  1. Adversarial Attack Preliminaries
  2. AdvProp
  3. Experimental Results

1. Adversarial Attack Preliminaries

Adversarial Attack by Adding Image Noise
  • The figure above shows a typical example of an adversarial attack.
  • Originally, the input image x is correctly classified as a panda.
  • With added noise, the network misclassifies the panda as a gibbon.
  • Approaches proposed to defend against such attacks usually sacrifice accuracy on clean images in exchange for robustness.

However, in this paper, adversarial samples are used to improve the accuracy.

2. AdvProp

2.1. Adversarial Training

  • The vanilla training without adversarial samples is:
      $$\arg\min_{\theta} \; \mathbb{E}_{(x, y) \sim D} \big[ L(\theta, x, y) \big]$$
  • where D is the underlying data distribution, L(·) is the loss function, θ is the network parameter, and x is a training sample with ground-truth label y.
  • Madry’s adversarial training framework [23], instead of training with the original samples, trains networks with maliciously perturbed samples:
      $$\arg\min_{\theta} \; \mathbb{E}_{(x, y) \sim D} \Big[ \max_{\epsilon \in S} L(\theta, x + \epsilon, y) \Big]$$
  • where ε is an adversarial perturbation and S is the allowed perturbation range.
  • Unlike Madry’s adversarial training, the main goal here is to improve network performance on clean images by leveraging the regularization power of adversarial examples.
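  • Therefore, adversarial images are treated as additional training samples, and the networks are trained with a mixture of adversarial examples and clean images:
      $$\arg\min_{\theta} \; \mathbb{E}_{(x, y) \sim D} \Big[ L(\theta, x, y) + \max_{\epsilon \in S} L(\theta, x + \epsilon, y) \Big]$$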
  • However, directly optimizing the above equation generally yields lower performance than the vanilla training setting on clean images.
  • It is hypothesized that the distribution mismatch between adversarial examples and clean images prevents networks from accurately and effectively distilling valuable features from both domains.

Thus, an auxiliary batch norm design is proposed to properly disentangle the two distributions.
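The adversarial perturbation ε within the allowed range S from Section 2.1 is usually found with an iterative attack such as PGD, the attacker named in Section 3.1. The following is a minimal sketch of that idea, not the paper's implementation: it assumes a toy logistic-regression model in place of a ConvNet, an L∞ ball of radius eps as S, and illustrative values for eps, step and n_steps. All function names are hypothetical.

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def loss(w, b, x, y):
    """Binary cross-entropy of a toy logistic model on one sample."""
    p = sigmoid(sum(wi * xi for wi, xi in zip(w, x)) + b)
    return -(y * math.log(p) + (1 - y) * math.log(1 - p))

def input_grad(w, b, x, y):
    """Gradient of the loss with respect to the input x (closed form)."""
    p = sigmoid(sum(wi * xi for wi, xi in zip(w, x)) + b)
    return [(p - y) * wi for wi in w]

def sign(v):
    return (v > 0) - (v < 0)

def pgd_attack(w, b, x, y, eps=0.1, step=0.025, n_steps=5):
    """Approximately maximize the loss over perturbations with |delta_i| <= eps."""
    delta = [0.0] * len(x)
    for _ in range(n_steps):
        x_adv = [xi + di for xi, di in zip(x, delta)]
        g = input_grad(w, b, x_adv, y)
        # Signed ascent step on the loss, then projection back onto the eps-ball.
        delta = [max(-eps, min(eps, di + step * sign(gi)))
                 for di, gi in zip(delta, g)]
    return [xi + di for xi, di in zip(x, delta)]

# Assumed toy weights and sample, for illustration only.
w, b = [2.0, -1.0], 0.5
x, y = [1.0, 0.5], 1
x_adv = pgd_attack(w, b, x, y)
print("clean loss:", loss(w, b, x, y))
print("adversarial loss:", loss(w, b, x_adv, y))
```

The perturbed sample stays within eps of the original in every coordinate, yet its loss is higher, which is what makes it useful as a hard additional training example.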

2.2. Disentangled Learning via An Auxiliary BN

Traditional BN usage
  • Specifically, BN normalizes input features by the mean and variance computed within each mini-batch.

This normalization behavior could be problematic if the mini-batch contains data from different distributions, therefore resulting in inaccurate statistics estimation.
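To make the statistics problem concrete, here is a small sketch with made-up 1-D activations. The numbers are assumptions chosen for illustration (clean features near 0, adversarial features shifted and more spread out), not values from the paper.

```python
from statistics import fmean, pvariance

# Hypothetical activations for one feature channel.
clean = [-0.2, -0.1, 0.0, 0.1, 0.2]
adversarial = [1.5, 2.0, 2.5, 3.0, 3.5]
mixed = clean + adversarial

def batch_stats(values):
    """Mean and biased variance, as BN computes them over a mini-batch."""
    return fmean(values), pvariance(values)

for name, batch in [("clean", clean), ("adversarial", adversarial), ("mixed", mixed)]:
    mean, var = batch_stats(batch)
    print(f"{name:12s} mean={mean:.3f} var={var:.3f}")
```

In this toy case the mixed batch has a mean between the two groups and a variance larger than either, so normalizing with the mixed statistics fits neither the clean nor the adversarial features well.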

Utilization of Auxiliary BN
  • An auxiliary BN is introduced so that its normalization statistics are computed exclusively on the adversarial examples.

This auxiliary BN helps to disentangle the mixed distributions by keeping separate BNs for features that belong to different domains.
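One way to picture the auxiliary BN is a layer that holds two independent sets of statistics and picks one per mini-batch. The sketch below is a simplified assumption: it normalizes a list of scalars, omits the learnable scale and shift, and uses a hypothetical class name and a hypothetical `branch` argument.

```python
import math

class DualBatchNorm1d:
    """Toy BN over a list of scalars with separate main/auxiliary statistics."""

    def __init__(self, momentum=0.1, eps=1e-5):
        self.momentum, self.eps = momentum, eps
        # One [running_mean, running_var] pair per branch.
        self.running = {"main": [0.0, 1.0], "aux": [0.0, 1.0]}

    def __call__(self, batch, branch="main"):
        mean = sum(batch) / len(batch)
        var = sum((v - mean) ** 2 for v in batch) / len(batch)
        # Only the selected branch's running statistics are updated.
        stats = self.running[branch]
        stats[0] = (1 - self.momentum) * stats[0] + self.momentum * mean
        stats[1] = (1 - self.momentum) * stats[1] + self.momentum * var
        return [(v - mean) / math.sqrt(var + self.eps) for v in batch]

bn = DualBatchNorm1d()
clean_out = bn([-0.2, -0.1, 0.0, 0.1, 0.2], branch="main")  # clean mini-batch
adv_out = bn([1.5, 2.0, 2.5, 3.0, 3.5], branch="aux")       # adversarial mini-batch
print(bn.running)
```

Because each branch only ever sees its own kind of input, the statistics of the main branch are never affected by adversarial features. In the paper, the auxiliary BNs are dropped at test time and only the main BNs are used.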

2.3. Overall Approach

AdvProp Algorithm
  1. For each clean mini-batch, the network is first attacked using the auxiliary BNs to generate its adversarial counterpart.
  2. Next, the clean mini-batch and the adversarial mini-batch are fed to the same network but routed through different BNs for loss calculation, i.e., the main BNs are used for the clean mini-batch and the auxiliary BNs are used for the adversarial mini-batch.
  3. Finally, the total loss is minimized.
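The three steps above can be sketched on a toy problem. This is a rough illustration under strong assumptions, not the authors' code: the model is a single scalar feature passed through BN and a linear layer, gradients are estimated by finite differences instead of backpropagation, and the attacker is a one-step sign attack rather than PGD. All names (`forward_loss`, `attack`, `advprop_step`, `MAIN`, `AUX`) are hypothetical.

```python
import math

# theta = [w, b, gamma_main, beta_main, gamma_aux, beta_aux]
MAIN, AUX = (2, 3), (4, 5)  # indices of (gamma, beta) for each BN branch

def forward_loss(theta, xs, ys, bn):
    """Mean BCE of a toy model: BN (branch-specific affine) -> linear -> sigmoid."""
    w, b = theta[0], theta[1]
    gamma, beta = theta[bn[0]], theta[bn[1]]
    mean = sum(xs) / len(xs)
    var = sum((x - mean) ** 2 for x in xs) / len(xs)
    total = 0.0
    for x, y in zip(xs, ys):
        z = gamma * (x - mean) / math.sqrt(var + 1e-5) + beta
        p = 1.0 / (1.0 + math.exp(-(w * z + b)))
        total -= y * math.log(p) + (1 - y) * math.log(1 - p)
    return total / len(xs)

def attack(theta, xs, ys, eps=0.3, h=1e-4):
    """Step 1: one-step sign attack computed through the auxiliary BN."""
    base = forward_loss(theta, xs, ys, AUX)
    adv = []
    for i in range(len(xs)):
        bumped = xs[:i] + [xs[i] + h] + xs[i + 1:]
        g = (forward_loss(theta, bumped, ys, AUX) - base) / h
        adv.append(xs[i] + eps * ((g > 0) - (g < 0)))
    return adv

def total_loss(theta, xs, ys, xs_adv):
    """Step 2: clean loss via the main BN plus adversarial loss via the auxiliary BN."""
    return forward_loss(theta, xs, ys, MAIN) + forward_loss(theta, xs_adv, ys, AUX)

def advprop_step(theta, xs, ys, lr=0.1, h=1e-4):
    xs_adv = attack(theta, xs, ys)
    base = total_loss(theta, xs, ys, xs_adv)
    grad = []
    for j in range(len(theta)):
        bumped = theta[:j] + [theta[j] + h] + theta[j + 1:]
        grad.append((total_loss(bumped, xs, ys, xs_adv) - base) / h)
    # Step 3: gradient descent on the total loss.
    return [t - lr * g for t, g in zip(theta, grad)], base

theta = [0.5, 0.0, 1.0, 0.0, 1.0, 0.0]
xs, ys = [-1.0, -0.5, 0.5, 1.0], [0, 0, 1, 1]
losses = []
for _ in range(50):
    theta, loss_value = advprop_step(theta, xs, ys)
    losses.append(loss_value)
print("first total loss:", losses[0], "last total loss:", losses[-1])
```

The shared parameters w and b receive gradients from both terms, while each BN branch's scale and shift are updated only by the mini-batch routed through it.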

3. Experimental Results

3.1. ImageNet

AdvProp boosts model performance over the vanilla training baseline on ImageNet
  • EfficientNet is used as the network family, and Projected Gradient Descent (PGD) [23] is used as the attacker.

As seen above, the proposed AdvProp substantially outperforms the vanilla training baseline on all networks.

  • The gain is more pronounced for larger networks: it is at most 0.4% for networks smaller than EfficientNet-B4, but at least 0.6% for networks larger than EfficientNet-B4.

3.2. Generalization on Distorted ImageNet Datasets

AdvProp significantly boosts models’ generalization ability on ImageNet-C, ImageNet-A and Stylized-ImageNet

The proposed AdvProp consistently outperforms the vanilla training baseline for all models on all distorted datasets.

  • These are the best results so far if models are not allowed to train with corresponding distortions [6] or extra data [24, 46].
  • (The paper reports further results; please refer to it directly.)
