Brief Review — GhostNets on Heterogeneous Devices via Cheap Operations

CPU & GPU Efficient C-GhostNet & G-GhostNet

Sik-Ho Tsang
4 min read · May 19, 2023

GhostNets on Heterogeneous Devices via Cheap Operations,
C-GhostNet & G-GhostNet, by University of Chinese Academy of Sciences, Huawei Noah’s Ark Lab, The University of Sydney, and University of Macau,
2022 IJCV, Over 10 Citations (Sik-Ho Tsang @ Medium)

Image Classification

  • By stacking the proposed CPU-efficient Ghost (C-Ghost) module, C-GhostNet is designed. Similarly, by stacking the proposed GPU-efficient Ghost (G-Ghost) module, G-GhostNet is designed.


  1. C-GhostNet
  2. G-GhostNet
  3. Results

1. C-GhostNet

1.1. Findings

Feature Map Visualizations
  • It is observed that many of the feature maps are similar to one another, i.e., there is considerable redundancy among feature maps.
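One simple way to quantify this kind of redundancy is the pairwise cosine similarity between flattened feature maps. The sketch below is purely illustrative (the `redundancy` helper and the synthetic "ghost-like" maps are assumptions, not from the paper): cheap, scaled copies of one intrinsic map score close to 1, while unrelated random maps do not.

```python
import numpy as np

def redundancy(features):
    """features: (c, h, w) -> max cosine similarity between distinct maps."""
    flat = features.reshape(features.shape[0], -1)
    flat = flat / np.linalg.norm(flat, axis=1, keepdims=True)
    sim = flat @ flat.T
    np.fill_diagonal(sim, -1.0)  # ignore self-similarity
    return sim.max()

rng = np.random.default_rng(0)
base = rng.standard_normal((1, 8, 8))
# Ghost-like maps: cheap (scaled + slightly noisy) copies of one intrinsic map.
ghosts = base * rng.uniform(0.5, 2.0, (4, 1, 1)) \
         + 0.05 * rng.standard_normal((4, 8, 8))
print(round(float(redundancy(ghosts)), 2))  # close to 1: highly redundant
```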

1.2. C-Ghost Module

Standard Convolution (Top), C-GhostNet Module (Bottom)

In brief, in the C-Ghost module, some input feature maps are kept unchanged as output via an identity connection, while the others are passed through cheap operations to generate new feature maps.

  • In practice, there could be several different cheap operations in a C-Ghost module, e.g., 3×3 and 5×5 linear kernels. Finally, the 3×3 depthwise convolution is chosen as the cheap operation.
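A minimal NumPy sketch of a C-Ghost forward pass may make the idea concrete. It assumes a 1×1 primary convolution, a 3×3 depthwise cheap operation, and a ratio of s = 2 (half intrinsic maps, half ghost maps); BN/ReLU are omitted for brevity, and all shapes are illustrative.

```python
import numpy as np

def c_ghost_module(x, w_primary, w_cheap):
    """Minimal C-Ghost module forward pass (sketch, no BN/ReLU).

    x:         input feature maps, shape (c_in, h, w)
    w_primary: 1x1 primary conv weights, shape (m, c_in)
    w_cheap:   3x3 depthwise kernels for the cheap ops, shape (m, 3, 3)
    Returns 2*m output maps: m intrinsic maps + m ghost maps.
    """
    c_in, h, w = x.shape
    # Primary 1x1 convolution: a matmul over the channel dimension.
    intrinsic = np.tensordot(w_primary, x, axes=([1], [0]))  # (m, h, w)
    # Cheap operation: 3x3 depthwise conv on each intrinsic map.
    padded = np.pad(intrinsic, ((0, 0), (1, 1), (1, 1)))
    ghost = np.zeros_like(intrinsic)
    for i in range(3):
        for j in range(3):
            ghost += w_cheap[:, i:i+1, j:j+1] * padded[:, i:i+h, j:j+w]
    # Concatenate intrinsic and ghost maps (ratio s = 2).
    return np.concatenate([intrinsic, ghost], axis=0)

rng = np.random.default_rng(0)
x = rng.standard_normal((16, 8, 8))
out = c_ghost_module(x,
                     rng.standard_normal((8, 16)),
                     rng.standard_normal((8, 3, 3)))
print(out.shape)  # (16, 8, 8): 8 intrinsic + 8 ghost maps
```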

1.3. C-Ghost Bottleneck

C-Ghost Bottleneck

The proposed C-Ghost bottleneck mainly consists of two stacked C-Ghost modules. The first C-Ghost module acts as an expansion layer, increasing the number of channels. The second C-Ghost module reduces the number of channels to match the shortcut path. BN is applied after each module, while ReLU is used only after the first C-Ghost module, following MobileNetV2.

1.4. C-GhostNet


C-GhostNet is formed by stacking C-Ghost bottlenecks.
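The efficiency of stacking these modules can be seen with a back-of-the-envelope FLOPs comparison, following the cost analysis in the GhostNet papers (the layer sizes below are illustrative assumptions, with cheap-op kernel d = 3 and ratio s = 2):

```python
# Theoretical FLOPs of an ordinary convolution vs. a C-Ghost module.

def conv_flops(c_in, c_out, h, w, k):
    """FLOPs of an ordinary k x k convolution producing c_out x h x w maps."""
    return c_out * h * w * c_in * k * k

def ghost_flops(c_in, c_out, h, w, k, d=3, s=2):
    """FLOPs of a C-Ghost module: a primary conv producing c_out/s
    intrinsic maps, plus (s-1) cheap d x d depthwise ops per intrinsic map."""
    m = c_out // s                       # intrinsic feature maps
    primary = m * h * w * c_in * k * k   # ordinary conv on m channels
    cheap = (s - 1) * m * h * w * d * d  # depthwise cheap operations
    return primary + cheap

# Example layer: 112x112 output, 16 -> 64 channels, 3x3 kernel.
ordinary = conv_flops(16, 64, 112, 112, 3)
ghost = ghost_flops(16, 64, 112, 112, 3)
print(f"speed-up ratio: {ordinary / ghost:.2f}x")  # roughly s = 2
```

The cheap depthwise term is small relative to the primary convolution, so the theoretical speed-up approaches the ratio s.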

2. G-GhostNet

2.1. Findings

Feature Map Visualizations

In this paper, it is further found that feature maps at different layers/blocks within a stage are also similar to each other.

2.2. G-Ghost Stage

G-Ghost Stage
  • (a) Vanilla CNN stage.
  • (b) The C-Ghost module concept applied across multiple blocks: part of the stage's output features are generated by cheap operations directly from the first block's features.
  • (c) The mix operation added on top of (b) to bring back information from the intermediate features.

Mix Operation
  • A global average pooling is applied to the intermediate features to obtain the aggregated feature z.
  • A fully connected layer is then applied to transform z into the same domain as Y, and the result is added to the ghost features.
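The two steps above can be sketched in a few lines of NumPy. This is a minimal sketch, assuming intermediate features of shape (c, h, w) and an FC layer mapping c channels to the n channels of Y; the function and parameter names are hypothetical.

```python
import numpy as np

def mix(intermediate, w_fc, b_fc):
    """Mix operation sketch: aggregate intermediate features and map
    them into the same domain as the ghost features Y.

    intermediate: concatenated intermediate features, shape (c, h, w)
    w_fc, b_fc:   fully connected layer parameters, (n, c) and (n,)
    Returns a vector of n values, one per output channel of Y.
    """
    # Global average pooling over spatial dims -> aggregated feature z.
    z = intermediate.mean(axis=(1, 2))  # (c,)
    # FC layer transforms z into the same domain as Y.
    return w_fc @ z + b_fc              # (n,)

rng = np.random.default_rng(0)
feats = rng.standard_normal((32, 7, 7))
out = mix(feats, rng.standard_normal((16, 32)), np.zeros(16))
print(out.shape)  # (16,)
```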

G-GhostNet is formed by stacking G-Ghost stages.

3. Results

3.1. C-GhostNet on ImageNet

C-GhostNet on ImageNet

C-GhostNet consistently outperforms the competitors at various computational complexity levels, since it utilizes computation resources more efficiently for generating feature maps.

3.2. C-GhostNet on MS COCO

C-GhostNet on MS COCO

On MS COCO, with significantly lower computational cost, C-GhostNet achieves similar mAP to MobileNetV2 and MobileNetV3.

3.3. G-GhostNet on ImageNet

G-GhostNet on ImageNet

G-Ghost-RegNet achieves the best accuracy-FLOPs and accuracy-latency trade-offs.

3.4. G-GhostNet on MS COCO

G-GhostNet on MS COCO

G-Ghost-RegNetX-3.2GF surpasses ResNet50 and RegNetX-3.2GF-0.75 by a significant margin, while also achieving a faster inference speed.


