Reading: IDN — Information Distillation Network (Super Resolution)

With Faster Execution Time, Outperforms MemNet, DRRN, LapSRN, DRCN & VDSR

PSNR Against Execution Time, IDN has higher PSNR with faster processing time
  • By combining an enhancement unit with a compression unit into a distillation block, the local long and short-path features can be effectively extracted.
  • Execution is fast because each layer uses comparatively few filters and group convolution is used.


  1. IDN: Network Architecture
  2. Distillation Block
  3. Experimental Results

1. IDN: Network Architecture

IDN: Network Architecture
  • x and y denote the input and the output of IDN.
  • In FBlock, two 3×3 convolutional layers extract the feature maps from the original LR image.
  • The next part chains multiple information distillation blocks. Each block, DBlock, stacks an enhancement unit and a compression unit.
  • Finally, the RBlock is a transposed convolution without an activation function.
  • Chaining FBlock, the DBlocks and RBlock gives the whole IDN.
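
The chain above can be sketched as simple layer bookkeeping. This is not the authors' code: the class `IDNLayout` and its field names are made up for illustration, and the per-stage counts are only read off this post (two convolutions in FBlock, six convolutions plus one 1×1 convolution per DBlock as described in Section 2, one transposed convolution in RBlock).

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class IDNLayout:
    """Hypothetical bookkeeping for the FBlock -> DBlocks -> RBlock chain."""

    n_dblocks: int               # number of chained distillation blocks
    fblock_convs: int = 2        # two 3x3 convolutions on the LR image
    enhancement_convs: int = 6   # upper three + lower three convolutions
    compression_convs: int = 1   # one 1x1 convolution
    rblock_convs: int = 1        # one transposed convolution, no activation

    def stages(self) -> List[str]:
        """Names of the stages in the order the data flows through them."""
        dblocks = [f"DBlock{k}" for k in range(1, self.n_dblocks + 1)]
        return ["FBlock", *dblocks, "RBlock"]

    def conv_layer_count(self) -> int:
        """Total number of convolutional layers along the chain."""
        per_dblock = self.enhancement_convs + self.compression_convs
        return self.fblock_convs + self.n_dblocks * per_dblock + self.rblock_convs


layout = IDNLayout(n_dblocks=4)
print(" -> ".join(layout.stages()))
print(layout.conv_layer_count())
```

With 4 DBlocks the count is 2 + 4 × 7 + 1 = 31, which agrees with the 31-layer network used in Section 3.1.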

2. Distillation Block

2.1. Enhancement Unit

Distillation Block
  • The enhancement unit can be roughly divided into two modules: the upper three convolutions and the lower three convolutions.
  • The upper module has three 3×3 convolutions, each followed by an LReLU.
  • Let D_i denote the feature map dimension (number of channels) of the i-th layer. The channel dimensions within each module are tied together by the parameters D_3 and d (their values are given in Section 3.1).
  • The output of the third convolution layer, P_1^k, is sliced into two segments along the channel dimension.
  • The segment of P_1^k with D_3/s channels and the input of the first convolutional layer are concatenated in the channel dimension; the remaining channels are passed to the lower module.
  • In the paper's notation, S(P_1^k, 1/s) is the slice holding 1/s of the channels of P_1^k, and it is concatenated with B_{k-1}, the input of the k-th block.
  • Finally, the input information, the reserved local short-path information and the local long-path information are aggregated.
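
To make the slicing and concatenation concrete, here is a rough channel-count trace through one enhancement unit. Treat it as a sketch under assumptions: the function name `enhancement_unit_channels` is hypothetical, the relations D3 − D1 = D1 − D2 = d (upper module) and D6 − D4 = D4 − D5 = d with D4 = D3 (lower module) are how I read the paper and are not spelled out in this post, and I assume the block input has D3 channels.

```python
def enhancement_unit_channels(d3: int, d: int, s: int) -> dict:
    """Trace channel counts through one enhancement unit.

    Assumed relations (my reading of the paper, not stated in this post):
      upper module: D3 - D1 = D1 - D2 = d
      lower module: D6 - D4 = D4 - D5 = d, with D4 = D3
    Also assumed: the block input B_{k-1} has D3 channels.
    """
    if d3 % s != 0:
        raise ValueError("D3 must be divisible by s")

    # Upper module: three 3x3 convolutions.
    d1 = d3 - d
    d2 = d1 - d

    # Slice the upper output into a kept part (1/s) and the rest.
    kept = d3 // s
    to_lower = d3 - kept

    # Lower module: three more 3x3 convolutions on the remaining channels.
    d4 = d3
    d5 = d4 - d
    d6 = d4 + d

    # Short path: kept slice concatenated with the block input.
    block_input = d3
    short_path = kept + block_input
    if short_path != d6:
        raise ValueError("short path and long path cannot be added: "
                         f"{short_path} vs {d6} channels")

    return {
        "upper": [d1, d2, d3],
        "kept": kept,
        "to_lower": to_lower,
        "lower": [d4, d5, d6],
        "short_path": short_path,
        "output": d6,
    }


print(enhancement_unit_channels(d3=64, d=16, s=4))
```

With D3 = 64, d = 16 and s = 4 (the values from Section 3.1), the short path and the long path both end up with 80 channels under these assumptions, so they can be added element-wise.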

2.2. Compression Unit

  • The outputs of the enhancement unit are sent to a 1×1 convolution layer, which performs dimensionality reduction, distilling the relevant information for the later layers of the network.
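
A 1×1 convolution is just a per-pixel weighted sum over channels, which is why it can shrink the channel count without touching the spatial size. The toy function below illustrates that idea in plain Python. It is a minimal sketch, not the layer used in IDN: `conv1x1` and the `FeatureMap` alias are illustrative names, there is no bias or activation, and the tiny 3-to-2 channel example is made up.

```python
from typing import List

# A feature map indexed as [channel][row][col].
FeatureMap = List[List[List[float]]]


def conv1x1(features: FeatureMap, weights: List[List[float]]) -> FeatureMap:
    """Apply a 1x1 convolution without bias.

    weights[o][c] is the weight from input channel c to output channel o.
    Every output pixel is a weighted sum of the input channels at that pixel.
    """
    c_in = len(features)
    height, width = len(features[0]), len(features[0][0])
    if any(len(row) != c_in for row in weights):
        raise ValueError("each weight row needs one entry per input channel")
    return [
        [
            [sum(w[c] * features[c][i][j] for c in range(c_in))
             for j in range(width)]
            for i in range(height)
        ]
        for w in weights
    ]


# Toy example: reduce 3 channels to 2 on a 2x2 feature map.
features = [
    [[1.0, 2.0], [3.0, 4.0]],
    [[10.0, 20.0], [30.0, 40.0]],
    [[100.0, 200.0], [300.0, 400.0]],
]
weights = [
    [1.0, 0.0, 0.0],   # output channel 0 copies input channel 0
    [0.5, 0.5, 0.0],   # output channel 1 averages input channels 0 and 1
]
reduced = conv1x1(features, weights)
print(len(reduced), len(reduced[0]), len(reduced[0][0]))  # 2 2 2
```

The layer needs only c_in × c_out weights, so it is a cheap way to compress the aggregated features before the next block.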

3. Experimental Results

3.1. Training

  • The training set consists of 91 images from Yang et al. and 200 images from the Berkeley Segmentation Dataset (BSD).
  • Data augmentation is done in three ways: (1) Rotate the images by 90°, 180° and 270°. (2) Flip the images horizontally. (3) Downscale the images by factors of 0.9, 0.8, 0.7 and 0.6.
  • For example, LR/HR sub-image pairs of 15²/43² are generated for the training stage, and pairs of 26²/76² are used for the fine-tuning phase.
  • Finally, a 31-layer network is used as IDN.
  • IDN has 4 DBlocks, and the parameters D3, d and s of enhancement unit in each block are set to 64, 16 and 4 respectively.
  • To reduce the number of network parameters, grouped convolution, as used in AlexNet & ResNeXt, is applied with 4 groups in the second and fourth layers of each enhancement unit.
  • In addition, the transposed convolution adopts 17×17 filters for all scaling factors, and the negative slope of LReLU is set to 0.05.
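
The hyperparameters above can be sanity-checked with a few helper functions. This is only a sketch: the helper names are mine, the 64-to-64 channel layer is an illustrative example rather than a specific IDN layer, and the padding and output-padding values for the 17×17 transposed convolution are my own choice to make the output exactly scale × input, not values taken from the paper.

```python
def lrelu(x: float, negative_slope: float = 0.05) -> float:
    """Leaky ReLU with the negative slope quoted above."""
    return x if x >= 0 else negative_slope * x


def conv_weight_count(c_in: int, c_out: int, kernel: int, groups: int = 1) -> int:
    """Number of weights (no bias) in a kernel x kernel convolution.

    With grouped convolution each output channel only sees c_in / groups
    input channels, so the count drops by a factor of `groups`.
    """
    if c_in % groups or c_out % groups:
        raise ValueError("channel counts must be divisible by groups")
    return (c_in // groups) * c_out * kernel * kernel


def transposed_conv_out_size(size_in: int, kernel: int, stride: int,
                             padding: int, output_padding: int = 0) -> int:
    """Standard output-size formula for a transposed convolution (dilation 1)."""
    return (size_in - 1) * stride - 2 * padding + kernel + output_padding


print(lrelu(3.0), lrelu(-2.0))

dense = conv_weight_count(64, 64, kernel=3)
grouped = conv_weight_count(64, 64, kernel=3, groups=4)
print(dense, grouped, dense // grouped)  # 36864 9216 4

# Assumed (padding, output_padding) per scale so that output = scale * input.
assumed_padding = {2: (8, 1), 3: (7, 0), 4: (7, 1)}
for scale, (pad, out_pad) in assumed_padding.items():
    print(scale, transposed_conv_out_size(15, 17, scale, pad, out_pad))
```

Using 4 groups cuts the weights of a 3×3 layer to a quarter, which is where part of the speed-up comes from.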

3.2. Testing

Average PSNR/SSIMs for scale ×2, ×3 and ×4. Red color indicates the best and blue color indicates the second best performance.
  • IDN performs worse than MemNet on the Urban100 dataset at the ×3 and ×4 scale factors, while it achieves slightly better performance on the other benchmark datasets.
  • The authors attribute this to MemNet taking an interpolated LR image as its input, so that more information is fed into the network and the SR process only needs to correct the interpolated image.
  • IDN achieves the best performance and outperforms MemNet by a considerable margin.
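
For reference, PSNR, the metric used in the table above, follows the standard definition 10 · log10(peak² / MSE). The snippet below is a minimal sketch on flat pixel lists with an assumed peak value of 255. Published SR benchmarks often evaluate on the luminance channel and may crop image borders, which is not done here, and the function name `psnr` and the toy pixel values are mine.

```python
import math
from typing import Sequence


def psnr(reference: Sequence[float], estimate: Sequence[float],
         peak: float = 255.0) -> float:
    """Peak signal-to-noise ratio in dB between two equally sized pixel lists."""
    if len(reference) != len(estimate) or len(reference) == 0:
        raise ValueError("inputs must be non-empty and of equal length")
    mse = sum((r - e) ** 2 for r, e in zip(reference, estimate)) / len(reference)
    if mse == 0:
        return math.inf  # identical images
    return 10.0 * math.log10(peak ** 2 / mse)


# Toy example: every pixel is off by exactly 1, so MSE = 1.
hr = [0.0, 255.0, 128.0, 64.0]
sr = [1.0, 254.0, 129.0, 63.0]
print(round(psnr(hr, sr), 2))  # 48.13
```

A higher PSNR means a lower mean squared error against the ground-truth HR image, so small differences between methods in the table correspond to small differences in reconstruction error.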

3.3. Visual Comparisons

3.4. Running Time


