[Review] Pix2Pix: Image-to-Image Translation with Conditional Adversarial Networks (GAN)

Image-to-Image Translation Using CGAN+U-Net

Conditional adversarial networks (CGANs) are a general-purpose solution that appears to work well on a wide variety of image-to-image translation problems.

Outline

1. Brief Review of CGAN

Conditional GAN (CGAN)
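
In a CGAN, the discriminator observes the input image x together with the output, and the generator maps x and a random noise vector z to the output. The objective, as written in the pix2pix paper, is:

```latex
\mathcal{L}_{cGAN}(G, D) =
    \mathbb{E}_{x,y}\bigl[\log D(x, y)\bigr]
  + \mathbb{E}_{x,z}\bigl[\log\bigl(1 - D(x, G(x, z))\bigr)\bigr]
```

G tries to minimize this objective while the adversarial D tries to maximize it.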

2. Pix2Pix: Loss Function

2.1. Loss Function
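
The final objective adds an L1 reconstruction term to the CGAN loss: L1 keeps the output near the ground truth at low frequencies, while the discriminator enforces realism at high frequencies.

```latex
\mathcal{L}_{L1}(G) = \mathbb{E}_{x,y,z}\bigl[\lVert y - G(x, z)\rVert_{1}\bigr]

G^{*} = \arg\min_{G}\max_{D}\; \mathcal{L}_{cGAN}(G, D) + \lambda\,\mathcal{L}_{L1}(G)
```

The paper uses λ = 100.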

2.2. Training & Inference
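
Training alternates one gradient step on D with one step on G (Adam with learning rate 0.0002 and β1 = 0.5 in the paper), and the objective is divided by 2 while optimizing D to slow it down relative to G. Instead of an explicit noise vector z, stochasticity comes only from dropout, which is kept on at test time. Below is a minimal sketch of one training step in PyTorch; `G`, `D`, and the optimizers are hypothetical objects assumed to be defined elsewhere:

```python
import torch
import torch.nn as nn

bce = nn.BCEWithLogitsLoss()  # adversarial loss on raw discriminator logits
l1 = nn.L1Loss()              # reconstruction term
lam = 100.0                   # lambda from the paper

def train_step(G, D, opt_G, opt_D, x, y):
    # --- update D: real pair (x, y) -> 1, fake pair (x, G(x)) -> 0 ---
    fake = G(x)
    d_real = D(torch.cat([x, y], dim=1))
    d_fake = D(torch.cat([x, fake.detach()], dim=1))
    loss_D = 0.5 * (bce(d_real, torch.ones_like(d_real)) +
                    bce(d_fake, torch.zeros_like(d_fake)))  # divided by 2, as in the paper
    opt_D.zero_grad(); loss_D.backward(); opt_D.step()

    # --- update G: fool D, and stay close to the target in L1 ---
    d_fake = D(torch.cat([x, fake], dim=1))
    loss_G = bce(d_fake, torch.ones_like(d_fake)) + lam * l1(fake, y)
    opt_G.zero_grad(); loss_G.backward(); opt_G.step()
    return loss_D.item(), loss_G.item()
```

Using target 1 for the fake pair in G's loss is the usual non-saturating trick: G maximizes log D(x, G(x)) rather than minimizing log(1 − D(x, G(x))), as the paper also does.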

3. Pix2Pix: Network Architectures

Two generator choices are compared: a plain encoder-decoder, and U-Net, an encoder-decoder with skip connections between mirrored layers.
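
A toy sketch of the skip-connection idea, not the paper's full 8-level generator: each decoder stage concatenates the feature map of the mirrored encoder stage along the channel axis, so low-level structure shared by input and output can bypass the bottleneck. `TinyUNet` is a hypothetical two-level example in PyTorch:

```python
import torch
import torch.nn as nn

class TinyUNet(nn.Module):
    """Two-level U-Net-style generator; pix2pix's real generator has
    eight down/up levels plus dropout, but the skip wiring is the same."""
    def __init__(self, in_ch=3, out_ch=3):
        super().__init__()
        self.down1 = nn.Sequential(nn.Conv2d(in_ch, 64, 4, 2, 1), nn.LeakyReLU(0.2))
        self.down2 = nn.Sequential(nn.Conv2d(64, 128, 4, 2, 1), nn.LeakyReLU(0.2))
        self.up1 = nn.Sequential(nn.ConvTranspose2d(128, 64, 4, 2, 1), nn.ReLU())
        # 64 + 64 input channels: the skip doubles the decoder's input width
        self.up2 = nn.ConvTranspose2d(64 + 64, out_ch, 4, 2, 1)

    def forward(self, x):
        e1 = self.down1(x)               # H/2, 64 channels
        e2 = self.down2(e1)              # H/4 (bottleneck here, for brevity)
        d1 = self.up1(e2)                # back to H/2
        d1 = torch.cat([d1, e1], dim=1)  # skip connection from the encoder
        return torch.tanh(self.up2(d1))  # back to H, outputs in [-1, 1]
```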

4. Pix2Pix: PatchGAN
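
The PatchGAN discriminator only penalizes structure at the scale of local patches: it classifies every N×N patch as real or fake by running fully convolutionally over the image and averaging the responses. A sketch of the default 70×70 variant (C64-C128-C256-C512 in the paper's notation), again in PyTorch:

```python
import torch.nn as nn

def c_block(in_ch, out_ch, stride):
    """Ck block from the paper: 4x4 conv, batch norm, LeakyReLU(0.2)."""
    return nn.Sequential(
        nn.Conv2d(in_ch, out_ch, 4, stride, 1),
        nn.BatchNorm2d(out_ch),
        nn.LeakyReLU(0.2),
    )

class PatchGAN(nn.Module):
    def __init__(self, in_ch=6):  # input and output images concatenated
        super().__init__()
        self.net = nn.Sequential(
            nn.Conv2d(in_ch, 64, 4, 2, 1),  # first layer has no batch norm
            nn.LeakyReLU(0.2),
            c_block(64, 128, 2),
            c_block(128, 256, 2),
            c_block(256, 512, 1),
            nn.Conv2d(512, 1, 4, 1, 1),     # one real/fake logit per patch
        )

    def forward(self, xy):
        return self.net(xy)
```

For a 256×256 input this produces a 30×30 grid of logits, each with a 70×70 receptive field; the adversarial loss is averaged over the grid.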

5. Experimental Results

5.1. Analysis of Objective Functions

Different losses induce different quality of results
FCN-scores for different losses (the metric is sketched below)
Color Distribution Matching Property of CGAN.
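
The FCN-scores in these tables borrow an off-the-shelf semantic segmentation network as a judge: if the synthesized photos are realistic, a segmentation net trained on real photos should recover the input label map from them. The paper reports the per-pixel accuracy, per-class accuracy, and class IoU of a pretrained FCN-8s model on Cityscapes. A rough sketch of the scoring, where `seg_net` is a hypothetical stand-in for that pretrained model:

```python
import torch

def fcn_score(seg_net, fake_photos, gt_labels, n_classes):
    """Per-pixel accuracy and mean class IoU of seg_net's predictions
    on generated photos, against the ground-truth label maps."""
    pred = seg_net(fake_photos).argmax(dim=1)        # (B, H, W) class ids
    per_pixel_acc = (pred == gt_labels).float().mean()
    ious = []
    for c in range(n_classes):
        inter = ((pred == c) & (gt_labels == c)).sum().float()
        union = ((pred == c) | (gt_labels == c)).sum().float()
        if union > 0:                                # skip absent classes
            ious.append(inter / union)
    mean_iou = torch.stack(ious).mean()
    return per_pixel_acc.item(), mean_iou.item()
```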

5.2. Analysis of Generator Architecture

Different architectures induce different quality of results
FCN-scores for different generator architectures

5.3. From PixelGANs to PatchGANs to ImageGANs

Patch size variations
FCN-scores for different receptive field sizes of the discriminator
Example results on Google Maps at 512×512 resolution (the model was trained on images at 256×256 resolution and run convolutionally on the larger images at test time).
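
Running convolutionally at a higher test resolution works because both the generator and the PatchGAN discriminator are fully convolutional, so no layer depends on a fixed spatial size. A quick sanity check of this property, reusing the hypothetical `TinyUNet` sketched in Section 3:

```python
import torch

G = TinyUNet()  # from the Section 3 sketch; any fully convolutional G works
print(G(torch.randn(1, 3, 256, 256)).shape)  # torch.Size([1, 3, 256, 256])
print(G(torch.randn(1, 3, 512, 512)).shape)  # torch.Size([1, 3, 512, 512])
```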

5.4. Perceptual Validation

Real vs Fake Test on Photo to/from Map
Colorization
Real vs Fake Test on Colorization

5.5. Semantic Segmentation

Applying a conditional GAN to semantic segmentation
Performance of photo to labels on Cityscapes

5.6. Community-Driven Research

Example applications developed by the online community based on pix2pix
Learning to see: Gloomy Sunday

Reference
[2017 CVPR] [Pix2Pix] Image-to-Image Translation with Conditional Adversarial Networks (https://arxiv.org/abs/1611.07004)
