Review — DALS: Deep Active Lesion Segmentation

DALS, Using Fully Convolutional Network (FCN) and Active Contour Model (ACM)

Sik-Ho Tsang
Feb 2, 2023

Deep Active Lesion Segmentation,
DALS, by University of California, and Stanford University,
2019 MLMI, Over 40 Citations (Sik-Ho Tsang @ Medium)
Medical Imaging, Medical Image Analysis, Image Segmentation

  • Deep Active Lesion Segmentation (DALS), a fully automated segmentation framework, is introduced.
  • DALS leverages the powerful nonlinear feature extraction abilities of fully Convolutional Neural Networks (CNNs) and the precise boundary delineation abilities of Active Contour Models (ACMs).
  • A Multiorgan Lesion Segmentation (MLS) dataset, which contains images of various organs, is used for evaluation.

Outline

  1. Multiorgan Lesion Segmentation (MLS) Dataset
  2. Deep Active Lesion Segmentation (DALS) Framework
  3. Results

1. Multiorgan Lesion Segmentation (MLS) Dataset

Segmentation comparison of (a) medical expert manual with (b) the proposed DALS and (c) U-Net, in (1) Brain MR, (2) Liver MR, (3) Liver CT, and (4) Lung CT images.
MLS dataset statistics. GC: Global Contrast; GH: Global Heterogeneity.
  • The liver component of the dataset consists of 112 contrast-enhanced CT images of liver lesions (43 hemangiomas, 45 cysts, and 24 metastases), and 164 liver lesions from 3T gadoxetic acid enhanced MRI scans.
  • The brain component consists of 369 preoperative and pretherapy perfusion MR images.
  • The lung component consists of 87 CT images.
  • For each component of the MLS dataset, 85% of its images are used for training, 10% for testing, and 5% for validation.
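The 85/10/5 split above can be sketched in a few lines. This is only an illustration: the shuffling, the seed, the rounding rule, and the function name `split_indices` are my assumptions, and the paper does not say whether the split is done per image or per patient.

```python
import random

def split_indices(n_images, train_frac=0.85, test_frac=0.10, seed=0):
    """Shuffle image indices and cut them into train/test/validation lists."""
    indices = list(range(n_images))
    random.Random(seed).shuffle(indices)  # assumption: random split, fixed seed
    n_train = round(n_images * train_frac)
    n_test = round(n_images * test_frac)
    train = indices[:n_train]
    test = indices[n_train:n_train + n_test]
    val = indices[n_train + n_test:]  # the remainder, roughly 5%
    return train, test, val

# Component sizes quoted in the bullets above
for name, n in [("liver CT", 112), ("liver MR", 164), ("brain MR", 369), ("lung CT", 87)]:
    train, test, val = split_indices(n)
    print(name, len(train), len(test), len(val))
```

Because the counts are small, the rounding rule matters: the validation set here simply takes whatever is left after the training and testing sets are cut.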

2. Deep Active Lesion Segmentation (DALS) Framework

The proposed DALS architecture.

2.1. Fully Convolutional Network (FCN)

  • The fully convolutional encoder-decoder architecture is used.
  • A dense block, as introduced in DenseNet, is used for each encoding block. In each dense block of the encoder, a composite function of batch normalization, convolution, and ReLU is applied to the concatenation of all the feature maps [x0, x1, …, xl-1] from layers 0 to l-1 with the feature maps produced by the current block.
  • The output of the last dense block in the encoder is fed into a custom multiscale dilation block, as in DeepLab or DilatedNet, consisting of 4 parallel convolutional layers with dilation rates of 2, 4, 8, and 16.
  • Before being passed to the decoder, the outputs of the dilated convolutions are concatenated to create a multiscale representation of the input image, made possible by the enlarged receptive fields of the dilated convolutions.

This, along with dense connectivity, assists in capturing local and global context for highly accurate lesion localization.
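To get a feel for how much the dilation rates enlarge the receptive field, here is a small arithmetic sketch. The 3×3 kernel size is my assumption (it is not stated here); only the dilation rates come from the description above.

```python
def effective_kernel_size(kernel_size, dilation):
    """Spatial extent covered by a dilated convolution kernel along one axis."""
    return kernel_size + (kernel_size - 1) * (dilation - 1)

DILATION_RATES = (2, 4, 8, 16)  # rates of the four parallel branches
KERNEL_SIZE = 3                 # assumption: kernel size is not given in the text

extents = {d: effective_kernel_size(KERNEL_SIZE, d) for d in DILATION_RATES}
for d, extent in extents.items():
    print(f"dilation {d:2d}: {KERNEL_SIZE}x{KERNEL_SIZE} kernel spans {extent}x{extent} pixels")
```

Each branch uses the same number of weights but looks at a different scale, which is why concatenating the four outputs gives a multiscale representation.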

  • The input image is fed into the encoder-decoder, which localizes the lesion and, after 1×1 convolutional and sigmoid layers, produces the initial segmentation probability map Yprob(x, y).

During training, Yprob and the ground truth map Ygt(x, y) are fed into a Dice loss function.
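A minimal sketch of a soft Dice loss on nested lists is given below. The exact form used by the authors (for example the smoothing term `eps`) is not given, so this is the commonly used formulation and should be read as an assumption.

```python
def soft_dice_loss(y_prob, y_gt, eps=1e-6):
    """1 - Dice overlap between a probability map and a binary mask."""
    intersection = 0.0
    total = 0.0
    for row_p, row_g in zip(y_prob, y_gt):
        for p, g in zip(row_p, row_g):
            intersection += p * g
            total += p + g
    # eps (an assumption) avoids division by zero when both maps are empty
    return 1.0 - (2.0 * intersection + eps) / (total + eps)

y_gt = [[0, 1, 1],
        [0, 1, 1],
        [0, 0, 0]]
y_good = [[0.1, 0.9, 0.8],
          [0.0, 0.9, 0.9],
          [0.0, 0.1, 0.0]]
y_bad = [[0.9, 0.1, 0.0],
         [0.8, 0.2, 0.1],
         [0.9, 0.9, 0.9]]
print(soft_dice_loss(y_good, y_gt), soft_dice_loss(y_bad, y_gt))
```

The loss approaches 0 when the probability map overlaps the ground truth and approaches 1 when it does not, so the prediction in `y_good` gets a much lower loss than the one in `y_bad`.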

2.2. Active Contour Model (ACM)

  • The boundaries of the segmentation map generated by the encoder-decoder are fine-tuned by the level-set ACM.
  • The Transformer converts Yprob to a Signed Distance Map (SDM) Φ(x, y, 0) that initializes the level-set ACM. (The authors do not state clearly whether this is a Transformer layer in the sense I first assumed. For SDM, please feel free to read about SDM.)
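To make the SDM idea concrete, here is a brute-force sketch that turns a probability map into a signed distance map. The 0.5 threshold, the sign convention (positive inside the lesion), and the function name are my assumptions; the paper's Transformer block may work differently.

```python
import math

def signed_distance_map(y_prob, threshold=0.5):
    """Signed Euclidean distance to the nearest pixel of the opposite class.

    Assumed convention: positive inside the lesion, negative outside.
    """
    h, w = len(y_prob), len(y_prob[0])
    inside = [[y_prob[r][c] >= threshold for c in range(w)] for r in range(h)]
    fg = [(r, c) for r in range(h) for c in range(w) if inside[r][c]]
    bg = [(r, c) for r in range(h) for c in range(w) if not inside[r][c]]
    if not fg or not bg:
        raise ValueError("need both lesion and background pixels")
    sdm = [[0.0] * w for _ in range(h)]
    for r in range(h):
        for c in range(w):
            others = bg if inside[r][c] else fg
            dist = min(math.hypot(r - rr, c - cc) for rr, cc in others)
            sdm[r][c] = dist if inside[r][c] else -dist
    return sdm

# 5x5 probability map with a confident 3x3 lesion in the centre
y_prob = [[0.9 if 1 <= r <= 3 and 1 <= c <= 3 else 0.1 for c in range(5)]
          for r in range(5)]
sdm = signed_distance_map(y_prob)
for row in sdm:
    print(" ".join(f"{v:5.2f}" for v in row))
```

The zero level set of this map sits between the lesion and background pixels, which is what gives the level-set ACM its initial contour. The double loop is quadratic in the number of pixels, so it is only meant for tiny examples.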

In brief, ACMs leverage parametric (“snake”) or implicit (level-set) formulations in which the contour evolves by minimizing an associated energy functional, typically using a gradient descent procedure.

ACMs, whose parametric form is known as the "snake", are a classic non-deep-learning approach to segmentation in computer vision.

Smoothed Heaviside Function H and Smoothed Dirac Function δ (Figure from “Active contours with selective local or global segmentation: A new formulation and level set method,” J. IMAVIS, 2009.)
  • Given an image I(x, y), let C(t) = {(x, y) | Φ(x, y, t) = 0} be a closed time-varying contour represented in Ω. The interior and exterior regions of C are specified by the smoothed Heaviside functions HIε(Φ) and HEε(Φ) = 1 - HIε(Φ), respectively. The narrow band near C is specified by the smoothed Dirac function δε(Φ). Let m1 and m2 denote the mean intensities of I(x, y) inside and outside C, respectively, within Ws.
  • (I can only give a brief account of the terms involved in the ACM, since there is quite a lot of math behind ACMs and snakes, and the paper does not provide many details. If you're interested, please feel free to read (1) “Snakes: active contour models,” IJCV 1988, and (2) “Active contours with selective local or global segmentation: A new formulation and level set method,” J. IMAVIS, 2009.)
  • The energy functional associated with C can be written as
  • The energy density is:
Demonstration of Snake (Figure from “Active contours with selective local or global segmentation: A new formulation and level set method,” J. IMAVIS, 2009.)

Intuitively, starting from the initial rectangles shown in the first row, the snake evolves until it settles on the segmentation boundary.
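The smoothed Heaviside and Dirac functions mentioned earlier can be sketched as below. I use the arctan-based regularization that is common in Chan-Vese-style level-set methods; whether DALS uses exactly this form, and the value ε = 1, are my assumptions. I also assume Φ is positive inside the contour, so that the Heaviside of Φ marks the interior.

```python
import math

def heaviside_smooth(phi, eps=1.0):
    """Smoothed step: close to 1 where phi >> 0 (assumed interior), close to 0 where phi << 0."""
    return 0.5 * (1.0 + (2.0 / math.pi) * math.atan(phi / eps))

def dirac_smooth(phi, eps=1.0):
    """Derivative of heaviside_smooth: peaks at phi = 0, i.e. in a narrow band around the contour."""
    return (1.0 / math.pi) * eps / (eps * eps + phi * phi)

for phi in (-3.0, -1.0, 0.0, 1.0, 3.0):
    h_interior = heaviside_smooth(phi)
    h_exterior = 1.0 - h_interior
    print(f"phi={phi:5.1f}  H_I={h_interior:.3f}  H_E={h_exterior:.3f}  delta={dirac_smooth(phi):.3f}")
```

The interior and exterior weights always sum to 1, and the Dirac weight is largest on the contour itself, which is why the contour update is concentrated in a narrow band around Φ = 0.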

  • In this paper, the λs are spatially varying maps derived from Yprob rather than scalar constants:

Then, with the initial contour and the λs in place, the snake is run as a post-processing step to fine-tune the boundary.

  • The entire DALS inference takes 1.5 seconds.

3. Results

SOTA Comparisons
  • (Fig. 3 contains labeling errors. After cross-checking the values, Fig. 3(a) should be the Hausdorff Distance and Fig. 3(b) should be the Dice Score.)
  • DALS is compared against U-Net, a manually initialized level-set ACM with scalar λ constants, and its own backbone CNN.

DALS achieves superior accuracies under all metrics and in all datasets.
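For readers unfamiliar with the two metrics, here is a small pure-Python sketch of the Dice score and the Hausdorff distance on binary masks. It measures the Hausdorff distance over all foreground pixels and in pixel units; the paper may compute it on contours or in physical units, so treat those details as my assumptions.

```python
import math

def foreground_pixels(mask):
    """Coordinates of all non-zero pixels in a binary mask (nested lists)."""
    return {(r, c) for r, row in enumerate(mask) for c, v in enumerate(row) if v}

def dice_score(mask_a, mask_b):
    a, b = foreground_pixels(mask_a), foreground_pixels(mask_b)
    if not a and not b:
        return 1.0  # two empty masks agree perfectly
    return 2.0 * len(a & b) / (len(a) + len(b))

def hausdorff_distance(mask_a, mask_b):
    a, b = foreground_pixels(mask_a), foreground_pixels(mask_b)
    if not a or not b:
        raise ValueError("both masks need at least one foreground pixel")

    def directed(src, dst):
        # worst case over src of the distance to the closest point in dst
        return max(min(math.hypot(p[0] - q[0], p[1] - q[1]) for q in dst) for p in src)

    return max(directed(a, b), directed(b, a))

# 3x3 ground-truth lesion and a prediction shifted one pixel to the right
gt = [[1 if 1 <= r <= 3 and 1 <= c <= 3 else 0 for c in range(6)] for r in range(6)]
pred = [[1 if 1 <= r <= 3 and 2 <= c <= 4 else 0 for c in range(6)] for r in range(6)]
print(dice_score(gt, pred), hausdorff_distance(gt, pred))
```

A higher Dice score is better, while a lower Hausdorff distance is better, which is worth keeping in mind when reading the box plots.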

  • (Indeed, judging from the box plots alone, the backbone CNN on its own is not bad at all.)
Comparison of the output segmentation of DALS (red) against the U-Net (yellow) and manual “ground truth” (green) segmentations

The DALS segmentation contours conform appropriately to the irregular shapes of the lesion boundaries. In most cases, DALS avoided local minima and converged onto the true lesion boundaries, thus enhancing segmentation accuracy.

Visualizations of Different Maps

The learned λs maps serve as an attention mechanism that provides additional degrees of freedom for the contour to adjust itself precisely to regions of interest.
