Review — Up-Net: Towards Better Semantic Consistency of 2D Medical Image Segmentation

Up-Net, Adding the SE Module Concept, Which Originated in SENet, to U-Net

Sik-Ho Tsang
4 min read · Mar 19, 2023

Towards Better Semantic Consistency of 2D Medical Image Segmentation,
Up-Net, by University of Electronic Science and Technology of China, and King’s College London
2021 Elsevier J. VCIR (Sik-Ho Tsang @ Medium)
Medical Imaging, Medical Image Analysis, Image Segmentation, U-Net


  • A novel attentional up-concatenation structure to build an auxiliary path for direct access to multi-level features.
  • In addition, a new structural loss is employed to bring better morphological awareness and to reduce segmentation flaws caused by semantic inconsistencies.


  1. Up-Net
  2. Structure Loss
  3. Results

1. Up-Net

Up-Net Model Architecture
  • A U-Net encoder-decoder structure is used, with an ImageNet-pretrained MobileNetV2 as the backbone.
  • An additional up-connection path, namely up-concatenation, is added to bridge the high-level semantics with the low-level details within the decoding path of U-Net.
The attention block for attentional up-concatenation.
  • At the end, the multi-level features are concatenated and convolved using the attentional up-concatenation, so gradients can also be backpropagated to all decoder levels.
  • (Please feel free to read SENet for more details.)
  • Different versions of the proposed network are named Up-Net (N1) to Up-Net (N4). Up-Net (N4) uses all four levels of features at the end, while Up-Net (N1) uses only the last level.
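As a rough illustration of the SE-style channel attention behind the attention block above, here is a minimal NumPy sketch of a squeeze-and-excitation step (the function name and weight shapes are my own, not from the paper; SENet uses learned FC layers where plain matrices appear here):

```python
import numpy as np

def se_attention(features, w1, b1, w2, b2):
    """SE-style channel attention (squeeze-and-excitation) sketch.

    features: (C, H, W) feature map.
    w1, b1: reduction FC weights, shapes (C//r, C) and (C//r,).
    w2, b2: expansion FC weights, shapes (C, C//r) and (C,).
    Returns the feature map rescaled by learned per-channel weights.
    """
    # Squeeze: global average pooling over the spatial dims -> (C,)
    z = features.mean(axis=(1, 2))
    # Excitation: bottleneck FC -> ReLU -> FC -> sigmoid
    s = np.maximum(w1 @ z + b1, 0.0)           # ReLU
    s = 1.0 / (1.0 + np.exp(-(w2 @ s + b2)))   # sigmoid, one weight per channel
    # Scale: reweight each channel of the input feature map
    return features * s[:, None, None]
```

In the actual network, a block like this gates each level's features before they are up-sampled and concatenated along the auxiliary path.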

2. Structure Loss

  • The total loss has two major terms: the mixed Dice loss and the structure loss.

2.1. Mixed Dice Loss

  • Cross-entropy loss is commonly used for such a pixel-wise classification task by optimizing both foreground and background pixels:
  • The imbalance in the number of foreground and background pixels can introduce bias into the model. Therefore, the Dice coefficient loss is used to solve this problem, defined as
  • LDice may sometimes fail to converge. So, LDice is mixed with a minor LCE term to avoid this problem, similar to [41]; the result is named the mixed Dice loss, defined as:
  • 𝜆CE and 𝜆Dice are empirically set to 0.01 and 1.0.
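The mixed Dice loss can be sketched as follows for the binary case, using the standard cross-entropy and soft Dice forms with the weights quoted above (a minimal NumPy sketch; the function name and the smoothing constant eps are my own):

```python
import numpy as np

def mixed_dice_loss(pred, target, lam_ce=0.01, lam_dice=1.0, eps=1e-7):
    """Mixed Dice loss sketch: L = lam_ce * L_CE + lam_dice * L_Dice.

    pred: predicted foreground probabilities in (0, 1), any shape.
    target: binary ground-truth mask of the same shape.
    """
    pred = np.clip(pred, eps, 1.0 - eps)
    # Pixel-wise binary cross-entropy over foreground and background pixels
    l_ce = -np.mean(target * np.log(pred) + (1 - target) * np.log(1 - pred))
    # Soft Dice loss: 1 - 2|X ∩ Y| / (|X| + |Y|); robust to class imbalance
    inter = np.sum(pred * target)
    l_dice = 1.0 - (2.0 * inter + eps) / (np.sum(pred) + np.sum(target) + eps)
    return lam_ce * l_ce + lam_dice * l_dice
```

The small 𝜆CE keeps the well-behaved cross-entropy gradients around to stabilize convergence, while the Dice term dominates and handles the foreground/background imbalance.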

2.2. Structure Loss

  • The edge-aware loss encourages the model to discover distinguished differences between neighboring pixels, defined as:
  • When the ground-truth labels of a central pixel 𝑖 and its neighbor pixel 𝑗 belong to different classes (𝐶𝑖,𝑗 = 1), LEdge urges them to have contrary predictions.
  • Different from LEdge, the connection-aware loss encourages the network to discover homogeneous similarities between neighboring regions, defined as:
  • The structural loss is defined as the sum of 𝐿𝐶𝑜𝑛𝑛𝑒𝑐𝑡𝑖𝑜𝑛 and 𝐿𝐸𝑑𝑔𝑒:
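To make the idea concrete, here is an illustrative NumPy sketch of such a structural loss over 4-neighbour pixel pairs. This is my own simplified formulation of the two terms described above, not the paper's exact equations: where neighbouring labels differ (𝐶𝑖,𝑗 = 1), the edge term penalises similar predictions; where they agree, the connection term penalises differing predictions.

```python
import numpy as np

def structure_loss(pred, target, eps=1e-7):
    """Illustrative structural loss over horizontal/vertical neighbour pairs.

    pred: foreground probabilities, shape (H, W).
    target: binary ground-truth mask, shape (H, W).
    """
    l_edge, l_conn = 0.0, 0.0
    for axis in (0, 1):
        dp = np.abs(np.diff(pred, axis=axis))   # |p_i - p_j| for neighbours
        c = np.diff(target, axis=axis) != 0     # C_ij = 1 on label boundaries
        # Edge-aware term: urge contrary predictions across class boundaries
        if c.any():
            l_edge += -np.sum(np.log(dp[c] + eps))
        # Connection-aware term: urge similar predictions within a region
        l_conn += np.sum(dp[~c] ** 2)
    return l_edge + l_conn
```

A crisp prediction that matches the label map makes both terms vanish, while a blurred boundary or a disconnected region raises the loss, which is the morphological awareness the structure loss is after.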

3. Results

3.1. Optic Disc/Cup Segmentation

Performance comparison of Optic Disc/Cup segmentation on DI(%) and JC(%).

From the comparison, the proposed Up-Net outperforms the state-of-the-art OC/OD segmentation methods on all four datasets.

3.2. Cellular Segmentation

Results (%) of cellular segmentation on TNBC and GlaS datasets.

Up-Net (N4) outperforms the state-of-the-art methods in F1-score, Dice, and accuracy.

3.3. Lung Segmentation

Segmentation results (%) of LUNA dataset.

Up-Net obtains better semantic consistency and successfully avoids the over-filling flaw seen in the result of DeepLabv3+.

3.4. Visual Quality

Comparison of optic cup segmentation results between Up-Net and the state-of-the-art methods.
Visual comparisons of optic disc, cellular and lung segmentation results between Up-Net (N4) and DeeplabV3+ in various datasets, including (a) DRISHTI-GS; (b) RIM-r3; (c) REFUGE; (d) MESSIDOR; (e) GlaS Test A; (f) GlaS Test B; (g) TNBC; (h–i) LUNA.

Up-Net (N4) obtains the most accurate segmentation.

(There are other experimental results, e.g., the ablation study; please feel free to read the paper directly.)


