Review — Axial-DeepLab: Stand-Alone Axial-Attention for Panoptic Segmentation

Axial-DeepLab, for Both Image Classification & Segmentation

Sik-Ho Tsang
6 min read · Feb 22, 2023

Axial-DeepLab: Stand-Alone Axial-Attention for Panoptic Segmentation,
Axial-DeepLab, by Johns Hopkins University, and Google Research,
2020 ECCV, Over 400 Citations (Sik-Ho Tsang @ Medium)
Image Classification, Panoptic Segmentation, Instance Segmentation, Semantic Segmentation
==== My Other Paper Readings Are Also Over Here ====

  • Conventional 2D self-attention, which has very high computational complexity, is factorized into two 1D self-attentions.
  • A position-sensitive self-attention design is proposed.
  • Combining both yields the position-sensitive axial-attention layer.
  • By stacking the position-sensitive axial-attention layers, Axial-DeepLab models are formed for image classification and dense prediction.
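To make the factorization concrete, here is a minimal NumPy sketch of axial attention: plain 1D self-attention applied first along the height axis and then along the width axis. This is an illustrative simplification, not the paper's implementation (it omits the positional terms, multi-head structure, and shared projections); the function and weight names are my own.

```python
import numpy as np

def softmax(x, axis=-1):
    # Numerically stable softmax
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def attention_1d(x, Wq, Wk, Wv):
    # Plain 1D self-attention over a sequence x of shape (L, d)
    q, k, v = x @ Wq, x @ Wk, x @ Wv
    return softmax(q @ k.T) @ v

def axial_attention(x, Wq, Wk, Wv):
    # x: feature map of shape (h, w, d).
    # Attend along the height axis (each column independently),
    # then along the width axis (each row independently).
    h, w, d = x.shape
    y = np.stack([attention_1d(x[:, j], Wq, Wk, Wv) for j in range(w)], axis=1)
    y = np.stack([attention_1d(y[i], Wq, Wk, Wv) for i in range(h)], axis=0)
    return y
```

Each 1D attention attends over only h or w positions, so the cost per layer is O(h·w·(h+w)) instead of the O(h²w²) of full 2D self-attention.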

Outline

  1. Position-Sensitive Axial-Attention Layer
  2. Axial-DeepLab
  3. Results

1. Position-Sensitive Axial-Attention Layer

1.1. Conventional Self-Attention

  • Given an input feature map x with height h, width w, and d_in channels, the output y_o at position o=(i, j) is computed by pooling over the projected input:
  • where N is the whole location lattice, and the queries q_o = W_Q x_o, keys k_o = W_K x_o, and values v_o = W_V x_o are all linear projections of the input x_o.
  • However, global self-attention is extremely expensive to compute, with O(h²w²) complexity. In the next section, the position-sensitive axial-attention layer is used to reduce this complexity.
  • Another drawback is that global pooling does not exploit positional information, which is critical for capturing spatial structures or shapes in vision tasks. Position-sensitive self-attention addresses this issue.
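The pooling described in the first bullet, reconstructed from the paper's definitions (the original equation appeared as an image):

```latex
y_o = \sum_{p \in \mathcal{N}} \mathrm{softmax}_p\!\left(q_o^{\top} k_p\right) v_p
```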

1.2. Position-Sensitive Self-Attention

  • SASA proposed to include a relative positional encoding r_{p-o}:
  • where N_{m×m}(o) is the local m×m square region around o=(i, j).
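The SASA formulation with the query-dependent positional bias, reconstructed from the definitions above (the original equation appeared as an image):

```latex
y_o = \sum_{p \in \mathcal{N}_{m \times m}(o)} \mathrm{softmax}_p\!\left(q_o^{\top} k_p + q_o^{\top} r_{p-o}\right) v_p
```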

In this paper, relative positional encodings r^q_{p-o}, r^k_{p-o}, and r^v_{p-o} for the query, key, and value are all added:
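The resulting position-sensitive self-attention, reconstructed from the paper's formulation (the original equation appeared as an image): the key and value now receive their own learned positional terms in addition to the query's.

```latex
y_o = \sum_{p \in \mathcal{N}_{m \times m}(o)}
\mathrm{softmax}_p\!\left(q_o^{\top} k_p + q_o^{\top} r^{q}_{p-o} + k_p^{\top} r^{k}_{p-o}\right)
\left(v_p + r^{v}_{p-o}\right)
```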

Position-sensitive…
