Brief Review — PRDNet: Medical Image Segmentation Based on Parallel Residual and Dilated Network
PRDNet, Parallel Residual & Dilation Paths
PRDNet: Medical Image Segmentation Based on Parallel Residual and Dilated Network,
PRDNet, by Hebei University of Technology,
2021 Elsevier J. Measurement, Over 10 Citations (Sik-Ho Tsang @ Medium)
Medical Imaging, Medical Image Analysis, Image Segmentation
4.2. Biomedical Image Segmentation: 2015–2021 [Expanded U-Net] [3-D RU-Net] [nnU-Net] [TransUNet] [CoTr] [TransBTS] [Swin-Unet] 2022 [UNETR]
My Other Previous Paper Readings Are Also Over Here
- PRDNet (Parallel Residual and Dilated Network) is proposed, where ResNet and dilated convolution are used in parallel to extract multi-layer features of medical images.
- In the decoding stage, the multi-layer features are fused according to the structure of feature pyramid network (FPN).
Outline
- PRDNet
- Results
1. PRDNet
1.1. Backbone
- The input image size of PRDNet is 256×256.
- ResNet can be divided into five stages according to the size and the channels of the feature map.
- The first stage of ResNet consists of convolution and pooling, which down-samples the input image to 64 × 64. The stages from the second to the fifth are called 𝐶2, 𝐶3, 𝐶4 and 𝐶5, respectively.
- In ResNet-101, 𝐶2 does not change the size of the image but increases the channels of the feature map. The image is then down-sampled at each stage from 𝐶3 onward, so that the feature map reaches 8×8 at 𝐶5.
- In PRDNet, the ResNet and dilated-convolution branches share the 𝐶2 and 𝐶3 layers; after that, they proceed in parallel.
- Different from ResNet, the last two layers of the dilated-convolution branch are called 𝐶4𝑑 and 𝐶5𝑑, and they are both 32 × 32 in size.
- In total, ResNet and dilated convolution produce six layers of features for subsequent fusion, as in the sketch below.
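As a concrete illustration, here is a minimal PyTorch sketch of such a shared-stem, two-branch backbone. It is a reconstruction under assumptions, not the authors' code: the class name PRDBackbone is hypothetical, and torchvision's replace_stride_with_dilation is used as a stand-in for the paper's dilated layers.

```python
import torch
import torchvision

class PRDBackbone(torch.nn.Module):
    """Shared C2/C3, then parallel strided (C4, C5) and dilated (C4d, C5d) paths."""
    def __init__(self):
        super().__init__()
        plain = torchvision.models.resnet101()            # strided C4/C5
        dilated = torchvision.models.resnet101(
            replace_stride_with_dilation=[False, True, True])  # C4d/C5d stay 32x32
        # Shared stem (conv + pool, 256x256 -> 64x64) plus C2 and C3
        self.stem = torch.nn.Sequential(plain.conv1, plain.bn1, plain.relu, plain.maxpool)
        self.c2, self.c3 = plain.layer1, plain.layer2
        # ResNet branch: down-sampling continues as usual
        self.c4, self.c5 = plain.layer3, plain.layer4
        # Dilated branch: stride replaced by dilation, resolution preserved
        self.c4d, self.c5d = dilated.layer3, dilated.layer4

    def forward(self, x):                  # x: (B, 3, 256, 256)
        x = self.stem(x)                   # 64x64
        c2 = self.c2(x)                    # 64x64, 256 ch
        c3 = self.c3(c2)                   # 32x32, 512 ch
        c4 = self.c4(c3)                   # 16x16, 1024 ch
        c5 = self.c5(c4)                   # 8x8,  2048 ch
        c4d = self.c4d(c3)                 # 32x32, 1024 ch (dilated)
        c5d = self.c5d(c4d)                # 32x32, 2048 ch (dilated)
        return c2, c3, c4, c5, c4d, c5d    # six feature levels
```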
1.2. Fusion
- A feature pyramid is a bottom-up and top-down structure composed of multi-level features.
- In the dilated branch, 𝐶4𝑑 and 𝐶5𝑑 are each added to the up-sampled features from the 𝐶5 layer and convolved to obtain 𝑃4𝑑 and 𝑃5𝑑, respectively.
- In the ResNet branch, the 𝑃2–𝑃5 layers are obtained by up-sampling layer by layer, following the original FPN structure. Each layer is then convolved, up-sampled, and added together.
- To get the final segmentation result, the fused feature map is up-sampled to the same size as the input image and convolved so that the output channels equal the number of labels (see the sketch after this list).
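A hedged PyTorch sketch of this fusion stage follows. The class name PRDFusion, the common channel width (256), the default n_labels, and the exact conv/up-sample ordering are assumptions chosen to match the description above, not the paper's specification.

```python
import torch
import torch.nn.functional as F

class PRDFusion(torch.nn.Module):
    """FPN-style fusion of six levels: C2-C5 plus the dilated C4d, C5d."""
    def __init__(self, in_ch=(256, 512, 1024, 2048, 1024, 2048), mid=256, n_labels=2):
        super().__init__()
        # 1x1 lateral convs map every level to a common width
        self.lateral = torch.nn.ModuleList(torch.nn.Conv2d(c, mid, 1) for c in in_ch)
        # 3x3 convs applied after each merge
        self.smooth = torch.nn.ModuleList(torch.nn.Conv2d(mid, mid, 3, padding=1) for _ in in_ch)
        self.classifier = torch.nn.Conv2d(mid, n_labels, 1)

    def forward(self, c2, c3, c4, c5, c4d, c5d):
        l2, l3, l4, l5, l4d, l5d = (lat(f) for lat, f in
                                    zip(self.lateral, (c2, c3, c4, c5, c4d, c5d)))
        up = lambda t, ref: F.interpolate(t, size=ref.shape[-2:],
                                          mode='bilinear', align_corners=False)
        # ResNet branch: standard top-down FPN pathway (P5 -> P2)
        p5 = self.smooth[3](l5)
        p4 = self.smooth[2](l4 + up(p5, l4))
        p3 = self.smooth[1](l3 + up(p4, l3))
        p2 = self.smooth[0](l2 + up(p3, l2))
        # Dilated branch: add up-sampled C5 features to C4d and C5d
        p4d = self.smooth[4](l4d + up(l5, l4d))
        p5d = self.smooth[5](l5d + up(l5, l5d))
        # Merge all pyramid levels at the P2 (64x64) resolution, then classify
        fused = p2 + up(p3, p2) + up(p4, p2) + up(p5, p2) + up(p4d, p2) + up(p5d, p2)
        out = self.classifier(fused)
        # Up-sample 4x back to the 256x256 input resolution
        return F.interpolate(out, scale_factor=4, mode='bilinear', align_corners=False)
```

With the backbone sketch above, `PRDFusion()(*PRDBackbone()(x))` maps a (B, 3, 256, 256) input to a (B, n_labels, 256, 256) score map.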
1.3. Loss Function
- Cross-entropy loss is used:
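The standard pixel-wise form (assumed here; the paper may differ in normalization) is

$$\mathcal{L}_{CE} = -\frac{1}{N}\sum_{i=1}^{N}\sum_{k=1}^{K} y_{i,k}\,\log \hat{y}_{i,k},$$

where $N$ is the number of pixels, $K$ the number of labels, $y_{i,k}$ the one-hot ground truth, and $\hat{y}_{i,k}$ the predicted probability of class $k$ at pixel $i$.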
2. Results
- PRDNet achieves the best performance on both datasets.
- It is clear that PRDNet, Attention U-Net (A-UNet), DANet, and SENet are better at locating the target regions, while the segmentation result of Attention U-Net (A-UNet) is inferior to PRDNet's in fine detail.