Brief Review — cU-Net+PE: Simultaneous Segmentation and Classification of Bone Surfaces from Ultrasound Using a Multi-feature Guided CNN

Modified U-Net for Both Classification & Segmentation

Sik-Ho Tsang
Nov 2, 2022


Simultaneous Segmentation and Classification of Bone Surfaces from Ultrasound Using a Multi-feature Guided CNN
cU-Net+PE, by Rutgers University and Rutgers Robert Wood Johnson Medical School, 2018 MICCAI, Over 40 Citations (Sik-Ho Tsang @ Medium)
Medical Image Analysis, Medical Image Classification, Medical Image Segmentation

  • U-Net is modified to perform classification and segmentation at the same time.


  1. cU-Net & cU-Net+PE
  2. Results

1. cU-Net & cU-Net+PE

1.1. Inputs

From left to right: B-mode US scan, LPT, LP, BSE, bone-enhanced US scan.
  • The input is the concatenation of the B-mode ultrasound (US) scan, US(x, y), and three filtered image features:
  • 1.1.1. Local Phase Tensor Image (LPT(x, y)): the LPT(x, y) image is computed by defining odd and even filter responses, following [5].
  • T_even and T_odd represent the symmetric and asymmetric features of US(x, y). H, ∇ and ∇² represent the Hessian, gradient and Laplacian operations, respectively.
  • 1.1.2. Local Phase Bone Image (LP(x, y)): the LP(x, y) image is computed from local phase image features.
  • LPE(x, y) and LwPA(x, y) represent the local phase energy and the local weighted mean phase angle image features, respectively.
  • 1.1.3. Bone Shadow Enhanced Image (BSE(x, y)): the BSE(x, y) image is computed by modeling the interaction of the US signal within the tissue as scattering and attenuation information, following [6].
  • CM_LP(x, y) is the confidence map image obtained by modeling the propagation of the US signal inside the tissue, taking into account the bone features present in the LP(x, y) image [6]. US_A(x, y) maximizes the visibility of high-intensity bone features inside a local region.

Thus, besides the US scan itself, the input also includes the features extracted following [5] and [6], i.e. LPT, LP, and BSE. Together, they form a 4×256×256 input.
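To make the 4×256×256 input concrete, here is a minimal NumPy sketch of the stacking step. The function name stack_inputs, the channel order (US, LPT, LP, BSE) and the float32 dtype are my own assumptions, and random arrays stand in for the real scan and filtered features, which are not computed here.

```python
import numpy as np


def stack_inputs(us, lpt, lp, bse):
    """Stack four same-sized 2-D images into one (4, H, W) array.

    The channel order used here is an assumption for illustration.
    """
    channels = [np.asarray(c, dtype=np.float32) for c in (us, lpt, lp, bse)]
    shape = channels[0].shape
    if any(c.shape != shape for c in channels):
        raise ValueError("all four images must have the same shape")
    return np.stack(channels, axis=0)


# Placeholder data only: random arrays stand in for US, LPT, LP and BSE.
rng = np.random.default_rng(0)
us, lpt, lp, bse = (rng.random((256, 256)) for _ in range(4))

x = stack_inputs(us, lpt, lp, bse)
print(x.shape)  # (4, 256, 256)
```

With a batch dimension in front, this would be the kind of tensor a channels-first CNN framework expects.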

1.2. Model Architecture

Overview of the proposed simultaneous enhancement, segmentation and classification network. Blue: cU-Net; Red: cU-Net+PE
  • Pre-enhancing Network (PE): contains seven convolutional layers with 32 feature maps and one with a single feature map.
  • U-Net: is used, with the following differences:
  1. The MaxPooling layers and the convolutional layers in the contracting path are replaced by convolutional layers with stride two.
  2. The feature maps at the last convolutional layer of the contracting path (left side) are fed to a classifier that consists of one fully-connected layer followed by a final 4-way softmax layer.
  3. Batch normalization (BN) is used before every ReLU layer.
  4. The number of starting feature maps is reduced from 32 to 16.
  • Cross entropy loss is used for both the segmentation and the classification tasks.
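Since cross entropy is used for both tasks, a two-headed loss could be sketched as below. This is only an illustration under my own assumptions: the two losses are simply summed with a hypothetical cls_weight factor, and the segmentation head is assumed to output two classes (bone vs. background). How the two losses are actually combined is not covered in this review.

```python
import numpy as np


def softmax(logits, axis):
    """Numerically stable softmax along `axis`."""
    z = logits - logits.max(axis=axis, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=axis, keepdims=True)


def cross_entropy(logits, target, axis=0):
    """Mean cross entropy; `target` holds integer class indices,
    and `axis` is the class axis of `logits`."""
    logits = np.asarray(logits, dtype=np.float64)
    target = np.asarray(target, dtype=np.int64)
    probs = softmax(logits, axis)
    # Pick the predicted probability of the true class at every position.
    picked = np.take_along_axis(probs, np.expand_dims(target, axis), axis=axis)
    return float(-np.log(picked + 1e-12).mean())


def joint_loss(seg_logits, seg_target, cls_logits, cls_target, cls_weight=1.0):
    """Segmentation CE plus weighted classification CE (assumed combination)."""
    seg = cross_entropy(seg_logits, seg_target, axis=0)
    cls = cross_entropy(cls_logits, cls_target, axis=0)
    return seg + cls_weight * cls


# Dummy outputs: 2-class pixel logits of shape (2, H, W) and 4-way class logits.
seg_logits = np.zeros((2, 256, 256))
seg_target = np.zeros((256, 256), dtype=np.int64)
cls_logits = np.zeros(4)
cls_target = 2

loss = joint_loss(seg_logits, seg_target, cls_logits, cls_target)
print(loss)  # uniform predictions give about ln(2) + ln(4)
```

In a real training loop, the same idea would be written with the framework's own cross entropy function so that gradients flow through both heads.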

2. Results

  • The US images from SonixTouch are randomly split into training (80%) and testing (20%) sets. The training set consists of 415 images, all from SonixTouch. The remaining 104 images from SonixTouch and all 131 images from Clarius C3 are used for testing.
From left to right column: B-mode US scans, PE, cU-Net+PE, U-Net
AED, 95% confidence level (CL), recall, precision, and F-scores for the proposed and state-of-the-art methods
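As a quick sanity check of the split, the image counts quoted above can be added up. The variable names are mine; only the three counts come from the text.

```python
# Image counts quoted in the text above.
train_sonix = 415   # SonixTouch images used for training
test_sonix = 104    # SonixTouch images held out for testing
test_clarius = 131  # Clarius C3 images, used for testing only

sonix_total = train_sonix + test_sonix
train_fraction = train_sonix / sonix_total
test_total = test_sonix + test_clarius

print(f"SonixTouch images: {sonix_total}")                       # 519
print(f"training share of SonixTouch: {train_fraction:.1%}")     # 80.0%
print(f"test images across both machines: {test_total}")         # 235
```

So the 80%/20% split refers to the SonixTouch images only, while the Clarius C3 images are never seen during training.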

The above table shows that the proposed cU-Net+PE outperforms the other methods on test scans obtained from both US machines.
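For readers less familiar with the recall, precision and F-score columns, here is a minimal pixel-wise sketch on binary masks. It uses the standard textbook definitions. The exact evaluation protocol behind the table may differ, and the tiny masks below are made-up toy data.

```python
import numpy as np


def precision_recall_f1(pred, truth):
    """Pixel-wise precision, recall and F1 for two binary masks."""
    pred = np.asarray(pred, dtype=bool)
    truth = np.asarray(truth, dtype=bool)
    tp = int(np.sum(pred & truth))    # predicted bone, actually bone
    fp = int(np.sum(pred & ~truth))   # predicted bone, actually background
    fn = int(np.sum(~pred & truth))   # missed bone pixels
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return precision, recall, f1


# Toy masks (hypothetical data): 1 marks a bone-surface pixel.
truth = np.array([[0, 1, 1, 0],
                  [0, 1, 1, 0]])
pred = np.array([[0, 1, 1, 1],
                 [0, 1, 0, 0]])

precision, recall, f1 = precision_recall_f1(pred, truth)
print(precision, recall, f1)  # 0.75 0.75 0.75
```

Here three bone pixels are found, one is missed and one background pixel is wrongly marked, so all three scores come out at 0.75.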


