Brief Review — Self-Supervised Learning for Medical Image Analysis Using Image Context Restoration

Learning Representation by Context Restoration (CR)

Sik-Ho Tsang
3 min read · Oct 23, 2022


Context Restoration (CR) for 3 Different Tasks: Classification, Localization, and Segmentation

Self-Supervised Learning for Medical Image Analysis Using Image Context Restoration,
Context Restoration, by Imperial College London, Nagoya University, Aichi Cancer Centre, and Nagoya University Hospital
2019 JMIA, Over 200 Citations (Sik-Ho Tsang @ Medium)
Self-Supervised Learning, Medical Image Analysis, Image Classification, Object Detection, Image Segmentation


  1. Context Restoration (CR)
  2. Results

1. Context Restoration (CR)

Generating training images for self-supervised context disordering: Brain T1 MR image, abdominal CT image, and 2D fetal ultrasound image, respectively. In the second column, red boxes highlight the swapped patches after the first iteration
  • Given an image xi, two isolated small patches in xi are randomly selected and swapped. Repeating this process T times yields the corrupted image x̃i, as shown above.
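The corruption step described above can be sketched in a few lines of NumPy. This is only an illustration, not the authors' code: the function name `swap_patches`, the parameters `patch_size`, `num_swaps` (playing the role of T) and `max_tries`, and the reading of "isolated" as "non-overlapping" are all assumptions made for this example.

```python
import numpy as np

def swap_patches(image, patch_size=8, num_swaps=10, rng=None, max_tries=100):
    """Return a corrupted copy of a 2D image with pairs of patches swapped.

    Hypothetical helper: `num_swaps` plays the role of T in the text.
    """
    rng = np.random.default_rng() if rng is None else rng
    corrupted = image.copy()  # the input image is left untouched
    height, width = image.shape[:2]
    p = patch_size
    for _ in range(num_swaps):
        for _ in range(max_tries):
            # Top-left corners of two candidate patches.
            r1, r2 = rng.integers(0, height - p + 1, size=2)
            c1, c2 = rng.integers(0, width - p + 1, size=2)
            # Assumption: "isolated" means the two patches do not overlap.
            if abs(r1 - r2) >= p or abs(c1 - c2) >= p:
                break
        else:
            continue  # no valid pair found, skip this iteration
        first = corrupted[r1:r1 + p, c1:c1 + p].copy()
        corrupted[r1:r1 + p, c1:c1 + p] = corrupted[r2:r2 + p, c2:c2 + p]
        corrupted[r2:r2 + p, c2:c2 + p] = first
    return corrupted

# Usage: corrupt a synthetic 64x64 image with 10 swaps of 8x8 patches.
rng = np.random.default_rng(0)
image = np.arange(64 * 64, dtype=np.float32).reshape(64, 64)
corrupted = swap_patches(image, patch_size=8, num_swaps=10, rng=rng)
```

Because patches are only moved around, the intensity distribution of the corrupted image is identical to that of the original; only the spatial context changes, which is what the network then has to restore.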

A CNN is then trained to restore the original context from the corrupted image.

General CNN architecture for the context restoration self-supervised learning. The blue, green, and orange strides represent convolutional units, downsampling units, and upsampling units, respectively.
  • In the analysis part, the architecture is similar to that of the VGGNet.
  • In the reconstruction part, CNN structures could vary depending on subsequent task type.
  • For subsequent classification tasks, simple structures such as a few deconvolution layers (2nd row) are preferred.
  • For subsequent segmentation tasks, the reconstruction part is symmetric to the analysis part and uses concatenation connections, similar to a U-Net.
  • An L2 loss between the restored image and the original image is used for training.
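As a rough illustration of the training objective, the L2 loss can be written as the mean squared difference between the restored image and the original one. The function name `l2_restoration_loss` is hypothetical, and averaging over pixels (rather than summing) is an assumption here, since the exact normalisation is not stated above.

```python
import numpy as np

def l2_restoration_loss(restored, original):
    """Mean squared (L2) error between a restored image and the original.

    Hypothetical helper; averaging over pixels is an assumption.
    """
    restored = np.asarray(restored, dtype=np.float64)
    original = np.asarray(original, dtype=np.float64)
    if restored.shape != original.shape:
        raise ValueError("restored and original must have the same shape")
    return float(np.mean((restored - original) ** 2))

# Usage: a perfect restoration gives zero loss.
original = np.ones((4, 4))
perfect = l2_restoration_loss(original, original)
worst = l2_restoration_loss(np.zeros((4, 4)), original)
```

In pretraining, `restored` would be the CNN output for the corrupted image x̃i and `original` would be xi, so no manual labels are needed.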

2. Results

2.1. 2D Ultrasound Image Classification

The classification of standard scan planes of fetal 2D ultrasound images.
Self-supervision using context restoration: For brain MR images, the training is on 2D image patch level. Therefore, the context restoration is also based on patches
  • In practice, SonoNet-64 (Baumgartner et al., 2017) is used.

Context restoration pretraining improves SonoNet performance the most, which suggests that it is the most useful of the compared pretraining strategies for this classification task.

2.2. Abdominal Multi-Organ Localization

The performance of the CNN solving the multi-organ localization problem in different training settings
  • The CNN for the multi-organ localization task is similar to SonoNet (Baumgartner et al., 2017), but it has one more stack of convolution and pooling layers to reduce the output size.

Initialising with pretrained features, particularly those from the context restoration task, improves the CNN performance.

2.3. Brain Tumor Segmentation

The segmentation results of the customised U-Nets

U-Nets initialised by context restoration pretraining achieve the best overall performance.


[2019 JMIA] [Context Restoration]
Self-Supervised Learning for Medical Image Analysis Using Image Context Restoration

1.2. Unsupervised/Self-Supervised Learning

1993 … 2019 [Context Restoration] … 2021 [MoCo v3] [SimSiam] [DINO] [Exemplar-v1, Exemplar-v2] [MICLe] [Barlow Twins] [MoCo-CXR] [W-MSE] [SimSiam+AL] [BYOL+LP] 2022 [BEiT] [BEiT V2]

1.9. Biomedical Image Classification

2017 … 2019 [Context Restoration] … 2021 [MICLe] [MoCo-CXR] [CheXternal] [CheXtransfer] [Ciga JMEDIA’21]

1.10. Biomedical Image Segmentation

2015 … 2019 [Context Restoration] … 2021 [Ciga JMEDIA’21]
