Brief Review — AC-GAN: Conditional Image Synthesis With Auxiliary Classifier GANs
AC-GAN, Conditioned on Both Noise Vector and Class Label
Conditional Image Synthesis With Auxiliary Classifier GANs
AC-GAN, by Google Brain
2017 ICML, Over 3400 Citations (Sik-Ho Tsang @ Medium)
Generative Adversarial Network (GAN)
Image Synthesis: 2014 … 2019 [SAGAN]
==== My Other Paper Readings Are Also Over Here ====
- Auxiliary Classifier GAN (AC-GAN) is proposed, which employs label conditioning to produce 128×128 resolution image samples exhibiting global coherence.
Outline
- Auxiliary Classifier GAN (AC-GAN)
- Results
1. Auxiliary Classifier GAN (AC-GAN)
1.1. Loss Functions
- In AC-GAN, every generated sample has a corresponding class label c ~ p_c, in addition to the noise z. The generator G uses both to generate images X_fake = G(c, z).
- The discriminator gives both a probability distribution over sources and a probability distribution over the class labels, P(S | X), P(C | X) = D(X).
- The objective function has two parts: the log-likelihood of the correct source, L_S, and the log-likelihood of the correct class, L_C:

L_S = E[log P(S = real | X_real)] + E[log P(S = fake | X_fake)]
L_C = E[log P(C = c | X_real)] + E[log P(C = c | X_fake)]

D is trained to maximize L_S + L_C, while G is trained to maximize L_C - L_S. AC-GAN learns a representation for z that is independent of the class label. (A minimal code sketch of these two losses is given at the end of this subsection.)
- Structurally, this model is not tremendously different from existing models. However, this modification to the standard GAN formulation produces excellent results and appears to stabilize training.
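To make the two objectives concrete, here is a minimal PyTorch-style sketch of L_S and L_C. The function name and the network interfaces (a discriminator returning a source logit and class logits, a generator taking a class label and noise) are assumptions for illustration, not code from the paper.

```python
# Minimal sketch of the AC-GAN objectives (assumed interfaces, not the paper's code):
#   D(x)    -> (source_logit [B, 1], class_logits [B, num_classes])
#   G(c, z) -> fake images X_fake conditioned on class labels c and noise z
import torch
import torch.nn.functional as F

def ac_gan_losses(D, G, x_real, c_real, z, c_fake):
    x_fake = G(c_fake, z)
    src_real, cls_real = D(x_real)
    src_fake, cls_fake = D(x_fake)

    # L_S: log-likelihood of the correct source (real vs. fake),
    # expressed via negative binary cross-entropy.
    l_s = -(F.binary_cross_entropy_with_logits(src_real, torch.ones_like(src_real))
            + F.binary_cross_entropy_with_logits(src_fake, torch.zeros_like(src_fake)))

    # L_C: log-likelihood of the correct class, for both real and fake images.
    l_c = -(F.cross_entropy(cls_real, c_real) + F.cross_entropy(cls_fake, c_fake))

    # D maximizes L_S + L_C; G maximizes L_C - L_S (so we minimize the negatives).
    # In practice, D's update uses x_fake.detach() and G's update a fresh forward pass.
    loss_d = -(l_s + l_c)
    loss_g = -(l_c - l_s)
    return loss_d, loss_g
```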
1.2. Model
- The structure of the AC-GAN model permits separating large datasets into subsets by class and training a generator and discriminator for each subset. All ImageNet experiments are conducted using an ensemble of 100 AC-GANs, each trained on a 10-class split for 50,000 mini-batches of size 100.
- Broadly speaking, the architecture of the generator G is a series of deconvolutional layers that transform the noise z and class c into an image. Two variants of the model architecture are trained for generating images at 128×128 and 64×64 spatial resolutions.
- The discriminator D is a deep convolutional neural network with a Leaky ReLU nonlinearity. A minimal sketch of both networks is given below.
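As a rough illustration of this layout, here is a minimal PyTorch sketch of a 64×64 conditional generator and a two-headed discriminator. The layer counts, filter sizes, and the class-embedding trick are assumptions for readability, not the paper's exact architecture.

```python
# Minimal sketch of a class-conditional generator and a two-headed discriminator
# (illustrative architecture, not the paper's exact one).
import torch
import torch.nn as nn

class Generator(nn.Module):
    def __init__(self, num_classes=10, z_dim=100, embed_dim=10):
        super().__init__()
        self.embed = nn.Embedding(num_classes, embed_dim)
        self.net = nn.Sequential(  # spatial size: 1 -> 4 -> 8 -> 16 -> 32 -> 64
            nn.ConvTranspose2d(z_dim + embed_dim, 512, 4, 1, 0), nn.BatchNorm2d(512), nn.ReLU(True),
            nn.ConvTranspose2d(512, 256, 4, 2, 1), nn.BatchNorm2d(256), nn.ReLU(True),
            nn.ConvTranspose2d(256, 128, 4, 2, 1), nn.BatchNorm2d(128), nn.ReLU(True),
            nn.ConvTranspose2d(128, 64, 4, 2, 1), nn.BatchNorm2d(64), nn.ReLU(True),
            nn.ConvTranspose2d(64, 3, 4, 2, 1), nn.Tanh(),
        )

    def forward(self, c, z):
        # Concatenate the class embedding with the noise and upsample: X_fake = G(c, z).
        h = torch.cat([z, self.embed(c)], dim=1)
        return self.net(h.view(h.size(0), -1, 1, 1))

class Discriminator(nn.Module):
    def __init__(self, num_classes=10):
        super().__init__()
        self.features = nn.Sequential(  # spatial size: 64 -> 32 -> 16 -> 8 -> 4, Leaky ReLU throughout
            nn.Conv2d(3, 64, 4, 2, 1), nn.LeakyReLU(0.2, True),
            nn.Conv2d(64, 128, 4, 2, 1), nn.LeakyReLU(0.2, True),
            nn.Conv2d(128, 256, 4, 2, 1), nn.LeakyReLU(0.2, True),
            nn.Conv2d(256, 512, 4, 2, 1), nn.LeakyReLU(0.2, True),
        )
        self.source_head = nn.Linear(512 * 4 * 4, 1)            # P(S | X): real vs. fake logit
        self.class_head = nn.Linear(512 * 4 * 4, num_classes)   # P(C | X): auxiliary classifier

    def forward(self, x):
        h = self.features(x).flatten(1)
        return self.source_head(h), self.class_head(h)
```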
2. Results
2.1. ImageNet
2.2. Latent Space Interpolation
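Interpolation itself is simple: hold the class label fixed and decode points along the line between two noise vectors. A minimal sketch, assuming the hypothetical Generator interface from the sketch above:

```python
# Linear interpolation in latent space with a fixed class label (illustrative only).
import torch

@torch.no_grad()
def interpolate(G, c, z1, z2, steps=8):
    images = []
    for t in torch.linspace(0.0, 1.0, steps):
        z_t = (1.0 - t) * z1 + t * z2   # blend the two noise vectors
        images.append(G(c, z_t))        # same class c at every step
    return torch.cat(images, dim=0)
```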
- For detailed results, please feel free to read the paper directly.