Review: GAN (Generative Adversarial Nets)
Generator and Discriminator Trained Together, Invented by Ian Goodfellow. With GAN, we can synthesize very realistic samples.
In this story, GAN (Generative Adversarial Nets), by Université de Montréal, is briefly reviewed. This is a very famous paper. The last author is Yoshua Bengio, who won the 2018 Turing Award together with Geoffrey Hinton and Yann LeCun. The Turing Award is generally recognized as the highest distinction in computer science and the “Nobel Prize of computing”. The first author is Ian Goodfellow, who became very well known after inventing GAN. Together with the second-to-last author, Aaron Courville, the three of them have also published a deep learning textbook, which is highly cited as well.
In this paper, by using GAN, we can synthesize very realistic samples from the network by inputting a noise signal. Two models are trained simultaneously:
- Generative model G: captures the data distribution.
- Discriminative model D: estimates the probability that a sample came from the training data rather than G.
This is a 2014 NIPS paper with more than 10000 citations. (Sik-Ho Tsang @ Medium)
Outline
- GAN Value Function
- GAN Conceptual Idea
- GAN Algorithm
- GAN Results
1. GAN Value Function
- Two models are trained: a generative model G and a discriminative model D. The training procedure for G is to maximize the probability of D making a mistake. This framework corresponds to a minimax two-player game.
- These two models can be any kind of model, trained with any optimization algorithm. In this paper, both are multilayer perceptrons.
- D classifies whether the input sample is real data or fake data generated by G.
- D: A multilayer perceptron D(x;θ_d) is defined that outputs a single scalar. D(x) represents the probability that x came from the data rather than p_g.
- D is trained to maximize the probability of assigning the correct label to both training examples and samples from G.
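The value function V(D, G) of this minimax game, as given in the paper:
min_G max_D V(D, G) = E_{x~p_data(x)} [log D(x)] + E_{z~p_z(z)} [log(1 - D(G(z)))]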
Since D(x) outputs a probability, which ranges from 0 to 1, log D(x) ranges from negative infinity to 0.
When D guesses correctly for real data, D(x) is close to 1 and log D(x) is close to 0, which maximizes the first term of the function above.
When D guesses correctly for fake data, D(G(z)) is close to 0 and log(1-D(G(z))) is close to 0, which maximizes the second term of the function above.
- G is to generate samples that look like they come from the real data x.
- G: To learn the generator’s distribution p_g over data x, a prior is defined on input noise variables p_z(z), and a mapping to data space is represented as G(z;θ_g), where G is a differentiable function represented by a multilayer perceptron with parameters θ_g.
- Simultaneously, G is trained to minimize log(1-D(G(z))).
The objective of G is to generate samples such that D cannot distinguish whether they are real or fake, so that, ideally, D can only make a random guess and outputs 1/2.
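As a quick numeric sanity check of these two log terms, here is a minimal Python sketch; the probability values are assumed purely for illustration:

```python
import math

# Assumed outputs of D, chosen only for illustration.
D_real = 0.99   # D(x): D is confident that a real sample is real
D_fake = 0.01   # D(G(z)): D is confident that a fake sample is fake

# Both terms approach their maximum of 0 when D guesses correctly.
print(math.log(D_real))      # ~ -0.01, close to 0
print(math.log(1 - D_fake))  # ~ -0.01, close to 0
```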
2. GAN Conceptual Idea
- In Figure 1 of the paper: Blue: discriminative distribution D, Black: real data distribution p_data, Green: generative distribution p_g.
- (a): At an early stage, p_g is similar to p_data, and D is a partially accurate classifier.
- (b): D is trained to discriminate samples from data.
- (c): G is updated, moving p_g closer to p_data.
- (d): After several steps of training, the ideal case is to have p_g = p_data, such that D(x)=1/2. That means samples generated from p_g look as real as those from p_data.
3. GAN Algorithm
- First, D is updated for k steps. Then, G is updated for 1 step.
- These two procedures are looped until the generator can synthesize samples that look very much like real data, as sketched below.
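Below is a minimal PyTorch sketch of this alternating training loop (Algorithm 1 in the paper). The network sizes, Adam optimizer, learning rate, batch size, and the random stand-in for a real-data minibatch are all my own assumptions for illustration; the paper trains multilayer perceptrons with minibatch SGD. For the G step, the sketch follows the paper’s practical suggestion of maximizing log D(G(z)) rather than minimizing log(1-D(G(z))).

```python
import torch
import torch.nn as nn

noise_dim, data_dim, batch, k = 100, 784, 64, 1  # k = 1 is used in the paper

# Both models are simple MLPs, as in the paper (layer sizes assumed).
G = nn.Sequential(nn.Linear(noise_dim, 256), nn.ReLU(),
                  nn.Linear(256, data_dim), nn.Sigmoid())
D = nn.Sequential(nn.Linear(data_dim, 256), nn.ReLU(),
                  nn.Linear(256, 1), nn.Sigmoid())

opt_D = torch.optim.Adam(D.parameters(), lr=2e-4)
opt_G = torch.optim.Adam(G.parameters(), lr=2e-4)
bce = nn.BCELoss()  # bce(p, y) = -[y*log(p) + (1-y)*log(1-p)]

for step in range(10000):
    # k steps of D: maximize log D(x) + log(1 - D(G(z)))
    for _ in range(k):
        x = torch.rand(batch, data_dim)    # stand-in for a real-data minibatch
        z = torch.randn(batch, noise_dim)  # minibatch of noise samples
        fake = G(z).detach()               # do not backprop into G here
        loss_D = (bce(D(x), torch.ones(batch, 1)) +
                  bce(D(fake), torch.zeros(batch, 1)))
        opt_D.zero_grad()
        loss_D.backward()
        opt_D.step()

    # 1 step of G: maximize log D(G(z)) (non-saturating form)
    z = torch.randn(batch, noise_dim)
    loss_G = bce(D(G(z)), torch.ones(batch, 1))
    opt_G.zero_grad()
    loss_G.backward()
    opt_G.step()
```

In practice, the random stand-in batch would be replaced by minibatches from the real dataset (e.g., MNIST), and training would be monitored by inspecting generated samples.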
4. GAN Results
- In the paper, generated samples are shown for MNIST, the Toronto Face Database (TFD), and CIFAR-10, alongside the nearest training examples, to show that the model does not simply memorize the training set.
- Quantitatively, Gaussian Parzen window based log-likelihood estimates on MNIST and TFD are reported, where GAN is competitive with the other generative models compared.
Reference
[2014 NIPS] [GAN]
Generative Adversarial Nets
My Previous Reviews
Image Classification [LeNet] [AlexNet] [Maxout] [NIN] [ZFNet] [VGGNet] [Highway] [SPPNet] [PReLU-Net] [STN] [DeepImage] [SqueezeNet] [GoogLeNet / Inception-v1] [BN-Inception / Inception-v2] [Inception-v3] [Inception-v4] [Xception] [MobileNetV1] [ResNet] [Pre-Activation ResNet] [RiR] [RoR] [Stochastic Depth] [WRN] [ResNet-38] [Shake-Shake] [FractalNet] [Trimps-Soushen] [PolyNet] [ResNeXt] [DenseNet] [PyramidNet] [DRN] [DPN] [Residual Attention Network] [DMRNet / DFN-MR] [IGCNet / IGCV1] [MSDNet] [ShuffleNet V1] [SENet] [NASNet] [MobileNetV2]
Object Detection [OverFeat] [R-CNN] [Fast R-CNN] [Faster R-CNN] [MR-CNN & S-CNN] [DeepID-Net] [CRAFT] [R-FCN] [ION] [MultiPathNet] [NoC] [Hikvision] [GBD-Net / GBD-v1 & GBD-v2] [G-RMI] [TDM] [SSD] [DSSD] [YOLOv1] [YOLOv2 / YOLO9000] [YOLOv3] [FPN] [RetinaNet] [DCN]
Semantic Segmentation [FCN] [DeconvNet] [DeepLabv1 & DeepLabv2] [CRF-RNN] [SegNet] [ParseNet] [DilatedNet] [DRN] [RefineNet] [GCN] [PSPNet] [DeepLabv3] [ResNet-38] [ResNet-DUC-HDC] [LC] [FC-DenseNet] [IDW-CNN] [DIS] [SDN]
Biomedical Image Segmentation [CUMedVision1] [CUMedVision2 / DCAN] [U-Net] [CFS-FCN] [U-Net+ResNet] [MultiChannel] [V-Net] [3D U-Net] [M²FCN] [SA] [QSA+QNT] [3D U-Net+ResNet] [Cascaded 3D U-Net]
Instance Segmentation [SDS] [Hypercolumn] [DeepMask] [SharpMask] [MultiPathNet] [MNC] [InstanceFCN] [FCIS]
Super Resolution [SRCNN] [FSRCNN] [VDSR] [ESPCN] [RED-Net] [DRCN] [DRRN] [LapSRN & MS-LapSRN] [SRDenseNet]
Human Pose Estimation [DeepPose] [Tompson NIPS’14] [Tompson CVPR’15] [CPM]
Codec Post-Processing [ARCNN] [Lin DCC’16] [IFCNN] [Li ICME’17] [VRCNN] [DCAD] [DS-CNN]
Generative Adversarial Network [GAN]