Brief Review — Perceptual Losses for Real-Time Style Transfer and Super-Resolution

Perceptual Loss: Feature Reconstruction Loss + Style Reconstruction Loss

Nov 10, 2022

**Comparison with Image Style Transfer [10] and SRCNN [11]**

Perceptual Losses for Real-Time Style Transfer and Super-Resolution
Perceptual Loss, by Stanford University,
2016 ECCV, Over 7000 Citations (Sik-Ho Tsang @ Medium)
Image Style Transfer, Super Resolution

Perceptual loss is introduced for high-quality image style transfer and super-resolution.
It has since been adopted in many other domains. This is a paper from Li Fei-Fei's research group.

Outline

1. Perceptual Loss Network Architecture
2. Perceptual Loss Functions
3. Results

1. Perceptual Loss Network Architecture

The proposed system consists of two components:

1.1. Loss Network

The loss network Φ is used to define several loss functions ℓ_1, …, ℓ_k. Each loss function computes a scalar value ℓ_i(ŷ, y_i) measuring the difference between the output image ŷ and a target image y_i.
The loss network is a VGG-16 pretrained on ImageNet. Its weights stay frozen while the image transformation network is trained.

The loss network is used to define a feature reconstruction loss ℓ^Φ_feat and a style reconstruction loss ℓ^Φ_style that measure differences in content and style between images.

1.2. Image Transformation Network

The image transformation network f_W is a deep residual convolutional neural network parameterized by weights W. It maps an input image x to an output image ŷ = f_W(x).

It is trained using stochastic gradient descent to minimize a weighted combination of the loss functions.
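For reference, the training objective can be written out as below. This is a reconstruction in the notation of the original paper rather than something shown in this review: λ_i (the weight of the i-th loss) is the paper's symbol, and the expectation is taken over training inputs x and their targets y_i.

```latex
W^{*} = \arg\min_{W} \; \mathbb{E}_{x,\{y_i\}} \left[ \sum_{i=1}^{k} \lambda_i \, \ell_i\big(f_W(x),\, y_i\big) \right]
```

Only W is optimized here; the loss network Φ that defines each ℓ_i stays fixed.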

The network contains five residual blocks, with some modifications (see the paper for the details).

2. Perceptual Loss Functions

2.1. Feature Reconstruction Loss

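Rather than forcing the pixels of the output image ŷ to exactly match those of the target image y, this loss encourages the two images to have similar feature representations as computed by the loss network. Let Φ_j(x) be the activations of the j-th layer of the network Φ when processing the image x; for a convolutional layer, Φ_j(x) is a feature map of shape C_j×H_j×W_j.
The feature reconstruction loss is the (squared, normalized) Euclidean distance between the feature representations of ŷ and y.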
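As a minimal sketch of this definition, the function below sums the squared differences between two feature maps and divides by C_j·H_j·W_j, which is the normalization used in the paper. Plain nested lists stand in for the layer activations to keep the example dependency-free; the function name and the toy values are assumptions for illustration, not real VGG-16 activations.

```python
def feature_reconstruction_loss(feat_out, feat_target):
    """Squared Euclidean distance between two feature maps, divided by C*H*W.

    Both arguments are nested lists of shape [C][H][W], standing in for the
    activations of one loss-network layer.
    """
    c, h, w = len(feat_out), len(feat_out[0]), len(feat_out[0][0])
    squared_diff = sum(
        (feat_out[ci][hi][wi] - feat_target[ci][hi][wi]) ** 2
        for ci in range(c)
        for hi in range(h)
        for wi in range(w)
    )
    return squared_diff / (c * h * w)


# Toy 2-channel, 2x2 feature maps (made-up values, not real VGG activations).
phi_out = [[[1.0, 2.0], [3.0, 4.0]], [[0.0, 1.0], [1.0, 0.0]]]
phi_target = [[[1.0, 2.0], [3.0, 2.0]], [[0.0, 1.0], [1.0, 2.0]]]

loss = feature_reconstruction_loss(phi_out, phi_target)
print(loss)  # 1.0: two entries differ by 2, so (4 + 4) / 8
```

In practice the same computation would be done on tensors produced by the frozen loss network, with gradients flowing back only into the image transformation network.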

**Optimization to minimize the feature reconstruction loss**

As images are reconstructed from higher layers, image content and overall spatial structure are preserved but color, texture, and exact shape are not.

2.2. Style Reconstruction Loss

As in Image Style Transfer, the Gram matrix G^Φ_j(x) is defined as the C_j×C_j matrix whose elements are the normalized inner products between pairs of channels of Φ_j(x).
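Written out, and assuming the paper's convention that Φ_j(x) has shape C_j×H_j×W_j, the element at row c and column c' of the Gram matrix would be:

```latex
G^{\Phi}_{j}(x)_{c,c'} = \frac{1}{C_j H_j W_j} \sum_{h=1}^{H_j} \sum_{w=1}^{W_j} \Phi_j(x)_{h,w,c} \, \Phi_j(x)_{h,w,c'}
```

Equivalently, if Φ_j(x) is reshaped into a matrix ψ of shape C_j×(H_j·W_j), then G^Φ_j(x) = ψψᵀ / (C_j·H_j·W_j). The sum over all spatial positions is what discards spatial layout.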

The style reconstruction loss is then the squared Frobenius norm of the difference between the Gram matrices of the output and target images.
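The two definitions above can be sketched in a few lines. This is an illustration under assumptions, not the authors' code: feature maps are plain nested lists of shape [C][H][W], the function names are made up for this example, and the Gram matrix is normalized by C·H·W as in the paper.

```python
def gram_matrix(feat):
    """Normalized Gram matrix (C x C) of a feature map given as [C][H][W] lists."""
    c, h, w = len(feat), len(feat[0]), len(feat[0][0])
    # Flatten each channel into a vector of length H*W.
    flat = [[v for row in channel for v in row] for channel in feat]
    norm = c * h * w
    return [
        [sum(a * b for a, b in zip(flat[i], flat[j])) / norm for j in range(c)]
        for i in range(c)
    ]


def style_reconstruction_loss(feat_out, feat_target):
    """Squared Frobenius norm of the difference between the two Gram matrices."""
    g_out, g_target = gram_matrix(feat_out), gram_matrix(feat_target)
    return sum(
        (a - b) ** 2
        for row_out, row_target in zip(g_out, g_target)
        for a, b in zip(row_out, row_target)
    )


# Toy 2-channel, 2x2 feature map (made-up values).
feat = [[[1.0, 2.0], [3.0, 4.0]], [[0.0, 1.0], [1.0, 0.0]]]
# The same activations with the spatial positions reversed in every channel.
shuffled = [[[4.0, 3.0], [2.0, 1.0]], [[0.0, 1.0], [1.0, 0.0]]]

print(gram_matrix(feat))                          # [[3.75, 0.625], [0.625, 0.25]]
print(style_reconstruction_loss(feat, shuffled))  # 0.0
```

The spatially rearranged feature map yields a loss of zero because the Gram matrix only records which channels activate together, not where they activate.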

**Optimization to minimize the style reconstruction loss**

This loss preserves stylistic features from the target image, but does not preserve its spatial structure.

3. Results

3.1. Style Transfer

**Example results of style transfer using our image transformation networks**

**Example results for style transfer on 512×512 images.**

Feature reconstruction loss is computed at layer relu2_2 and style reconstruction loss is computed at layers relu1_2, relu2_2, relu3_3, and relu4_3 of the VGG-16 loss network.
It is clear that the trained style transfer network is aware of the semantic content of images.
For example, in the beach image in the figure above, the people are clearly recognizable in the transformed image, but the background is warped beyond recognition. Similarly, in the cat image, the cat’s face is clear in the transformed image, but its body is not.
One explanation is that the VGG-16 loss network has features which are selective for people and animals since these objects are present in the classification dataset on which it was trained.
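To make the layer configuration mentioned above concrete, here is a rough sketch of how the per-layer losses could be combined into one training objective. The layer names come from the text; the weights, the loss values, and the function name are hypothetical placeholders, and any additional regularization terms used in the paper are left out.

```python
# Layer names as listed in the text; the weights used below are hypothetical
# placeholders, not values from the paper.
CONTENT_LAYER = "relu2_2"
STYLE_LAYERS = ["relu1_2", "relu2_2", "relu3_3", "relu4_3"]


def total_perceptual_loss(feat_losses, style_losses, content_weight, style_weight):
    """Weighted sum of one feature loss and the style losses of several layers.

    feat_losses and style_losses map layer names to already-computed scalar
    losses for that layer.
    """
    content_term = feat_losses[CONTENT_LAYER]
    style_term = sum(style_losses[name] for name in STYLE_LAYERS)
    return content_weight * content_term + style_weight * style_term


# Made-up per-layer loss values, only to show the bookkeeping.
feat_losses = {"relu2_2": 2.0}
style_losses = {"relu1_2": 0.5, "relu2_2": 0.25, "relu3_3": 0.125, "relu4_3": 0.125}

total = total_perceptual_loss(
    feat_losses, style_losses, content_weight=1.0, style_weight=4.0
)
print(total)  # 6.0 = 1.0 * 2.0 + 4.0 * (0.5 + 0.25 + 0.125 + 0.125)
```

Raising the style weight relative to the content weight would push the output toward the style target at the expense of content fidelity.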

The proposed method is three orders of magnitude faster than Image Style Transfer. It processes images of size 512×512 at 20 FPS, making it feasible to run style transfer in real-time or on video.
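As a back-of-the-envelope check on these figures, the snippet below converts the frame rate into a per-image latency and shows what a speedup of roughly 1000× would imply for the optimization-based baseline. The baseline time is only an implied estimate derived from the two numbers quoted above, not a measurement reported here.

```python
fps = 20                                  # throughput quoted for 512x512 images
ms_per_image = 1000 / fps                 # latency of one feed-forward pass
speedup = 10 ** 3                         # "three orders of magnitude"

# Implied per-image time of the slower, optimization-based method.
implied_baseline_seconds = ms_per_image * speedup / 1000

print(ms_per_image)              # 50.0 (milliseconds)
print(implied_baseline_seconds)  # 50.0 (seconds)
```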

3.2. Single Image Super Resolution

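The proposed method obtains lower PSNR and SSIM, but its outputs are visually more pleasing. This is expected, because the network is trained with the perceptual loss rather than a per-pixel l1/l2 loss, and it is the per-pixel losses that directly favor PSNR. SRGAN follows a similar spirit.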
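The link between per-pixel losses and PSNR is easiest to see from the standard definition PSNR = 10·log10(MAX² / MSE): minimizing the mean squared error is the same as maximizing PSNR. The sketch below uses that standard formula on toy grayscale images; the helper names and pixel values are assumptions for illustration, and it does not reproduce the paper's evaluation protocol.

```python
import math


def mse(img_a, img_b):
    """Mean squared error between two images given as equal-sized 2D lists."""
    diffs = [
        (a - b) ** 2
        for row_a, row_b in zip(img_a, img_b)
        for a, b in zip(row_a, row_b)
    ]
    return sum(diffs) / len(diffs)


def psnr(img_a, img_b, max_value=255.0):
    """Peak signal-to-noise ratio in dB; infinite for identical images."""
    error = mse(img_a, img_b)
    if error == 0:
        return math.inf
    return 10.0 * math.log10(max_value ** 2 / error)


# Toy 2x2 grayscale images with 8-bit intensities (made-up values).
reference = [[100, 120], [140, 160]]
small_error = [[101, 119], [141, 159]]   # every pixel off by 1
large_error = [[110, 110], [150, 150]]   # every pixel off by 10

print(round(psnr(reference, small_error), 2))
print(round(psnr(reference, large_error), 2))
```

The image with the smaller per-pixel error gets the higher PSNR, regardless of how either image looks to a viewer. A network trained on feature-space distances has no incentive to minimize this quantity, which is consistent with its lower scores.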

Reference

[2016 ECCV] [Perceptual Loss]
Perceptual Losses for Real-Time Style Transfer and Super-Resolution

