# Brief Review — Representation Learning by Learning to Count

## Self-Supervised Learning By Counting Number of Visual Features

**Representation Learning by Learning to Count**, Counting, by University of Bern, and University of Maryland. 2017 ICCV, Over 300 Citations (Sik-Ho Tsang @ Medium)

Self-Supervised Learning, Image Classification

- Self-supervised learning is achieved by **counting the number of visual features**.

# Outline

1. **Counting**
2. **Results**

# 1. Counting

## 1.1. Conceptual Idea

- To obtain a supervision signal useful to learn to count, **an image is partitioned into non-overlapping regions.** **The number of visual primitives in each region should sum up to the number of primitives in the original image.**

It is hypothesized that the model needs to **disentangle the image into high-level factors of variation**, such that the **complex relation between the original image and its regions is translated to a simple arithmetic operation**.
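The counting constraint can be illustrated with a toy example. In this minimal sketch, the "primitives" are simply the cells equal to 1 in a small grid, and `count_primitives` and `tile_2x2` are hypothetical helpers written for illustration. The actual method learns what to count, so the hand-written counter here is only a stand-in.

```python
# Toy illustration only: "primitives" are the cells equal to 1 in a small grid.
# The real method learns the counting feature; this counter is an assumption.

def count_primitives(grid):
    """Count the cells set to 1 (a stand-in for a learned counting feature)."""
    return sum(sum(row) for row in grid)

def tile_2x2(grid):
    """Split a grid with even height and width into 4 non-overlapping quadrants."""
    half_h, half_w = len(grid) // 2, len(grid[0]) // 2
    return [
        [row[:half_w] for row in grid[:half_h]],  # top-left
        [row[half_w:] for row in grid[:half_h]],  # top-right
        [row[:half_w] for row in grid[half_h:]],  # bottom-left
        [row[half_w:] for row in grid[half_h:]],  # bottom-right
    ]

image = [
    [1, 0, 0, 1],
    [0, 1, 0, 0],
    [0, 0, 1, 1],
    [1, 0, 0, 0],
]

tile_counts = [count_primitives(t) for t in tile_2x2(image)]
whole_count = count_primitives(image)

# The per-region counts add up to the count of the whole image.
print(tile_counts, sum(tile_counts), whole_count)
```

Because the regions do not overlap, every primitive falls into exactly one region, which is why the sum over regions matches the count of the full image.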

## 1.2. Contrastive Loss

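- Assume *x* is the **color image input**, *D* is the **downsampling** operator, and *T* is the **tiling** operator that divides the image *x* into 4 non-overlapping parts *T_j* ∘ *x*, *j* = 1, …, 4. The naïve way to train the network is to minimize:

$$\ell(x) = \left| \Phi(D \circ x) - \sum_{j=1}^{4} \Phi(T_j \circ x) \right|^2$$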

- where *Φ* is the **CNN** to be **learnt to count the visual features**.
- However, this loss has a trivial solution: it is exactly zero whenever *Φ* outputs zero for every input.
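The naïve loss and its trivial solution can be sketched in plain Python. This is a minimal sketch under stated assumptions: `downsample` halves the image by averaging 2x2 blocks so that the downsampled image and each tile have the same size, and `phi_zero` and `phi_toy` are hypothetical hand-written stand-ins for the CNN *Φ*. None of these names come from the paper.

```python
def downsample(x):
    """D: halve height and width by averaging each 2x2 block (assumed method)."""
    return [
        [(x[i][j] + x[i][j + 1] + x[i + 1][j] + x[i + 1][j + 1]) / 4.0
         for j in range(0, len(x[0]), 2)]
        for i in range(0, len(x), 2)
    ]

def tiles(x):
    """T_1..T_4: the four non-overlapping quadrants of x."""
    half_h, half_w = len(x) // 2, len(x[0]) // 2
    return [
        [row[:half_w] for row in x[:half_h]],
        [row[half_w:] for row in x[:half_h]],
        [row[:half_w] for row in x[half_h:]],
        [row[half_w:] for row in x[half_h:]],
    ]

def naive_loss(phi, x):
    """Squared L2 distance between phi(D(x)) and the sum of phi over the tiles."""
    whole = phi(downsample(x))
    parts = [phi(t) for t in tiles(x)]
    summed = [sum(vals) for vals in zip(*parts)]
    return sum((a - b) ** 2 for a, b in zip(whole, summed))

def phi_zero(patch):
    """Degenerate feature: always outputs the zero vector."""
    return [0.0, 0.0]

def phi_toy(patch):
    """Hand-made feature: [sum of pixels, number of pixels above 0.5]."""
    flat = [v for row in patch for v in row]
    return [float(sum(flat)), float(sum(1 for v in flat if v > 0.5))]

x = [
    [1.0, 0.0, 0.0, 1.0],
    [0.0, 1.0, 0.0, 0.0],
    [0.0, 0.0, 1.0, 1.0],
    [1.0, 0.0, 0.0, 0.0],
]

loss_zero = naive_loss(phi_zero, x)  # 0.0: the trivial solution is "perfect"
loss_toy = naive_loss(phi_toy, x)    # positive: this feature violates the constraint
print(loss_zero, loss_toy)
```

The all-zero feature minimizes the loss without encoding anything about the image, which is the failure mode the contrastive term is meant to rule out.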

To avoid such a scenario, a **contrastive loss** is used to **enforce that the counting feature should be different between two randomly chosen different images**.

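- Therefore, for any *x* ≠ *y*, we would like to minimize:

$$\ell_{con}(x, y) = \left| \Phi(D \circ x) - \sum_{j=1}^{4} \Phi(T_j \circ x) \right|^2 + \max\left\{ 0,\; M - \left| \Phi(D \circ y) - \sum_{j=1}^{4} \Phi(T_j \circ x) \right|^2 \right\}$$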

- where the constant scalar *M* = 10.
- The contrastive term introduces a tradeoff that pushes the features towards counting as many primitives as are needed to differentiate images from each other.
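A minimal sketch of the contrastive loss in plain Python shows why the all-zero feature no longer works. The helpers `downsample`, `tiles`, `phi_zero` and `phi_toy` are hypothetical stand-ins (2x2 averaging for *D*, quadrants for *T*, hand-written features for *Φ*); only the margin *M* = 10 is taken from the text.

```python
def downsample(x):
    """D: halve height and width by averaging each 2x2 block (assumed method)."""
    return [
        [(x[i][j] + x[i][j + 1] + x[i + 1][j] + x[i + 1][j + 1]) / 4.0
         for j in range(0, len(x[0]), 2)]
        for i in range(0, len(x), 2)
    ]

def tiles(x):
    """T_1..T_4: the four non-overlapping quadrants of x."""
    half_h, half_w = len(x) // 2, len(x[0]) // 2
    return [
        [row[:half_w] for row in x[:half_h]],
        [row[half_w:] for row in x[:half_h]],
        [row[:half_w] for row in x[half_h:]],
        [row[half_w:] for row in x[half_h:]],
    ]

def sq_dist(a, b):
    """Squared L2 distance between two feature vectors."""
    return sum((p - q) ** 2 for p, q in zip(a, b))

def contrastive_loss(phi, x, y, margin=10.0):
    """Counting term on x plus a hinge that pushes phi(D(y)) away from x's tile sum."""
    parts = [phi(t) for t in tiles(x)]
    summed = [sum(vals) for vals in zip(*parts)]
    count_term = sq_dist(phi(downsample(x)), summed)
    contrast_term = max(0.0, margin - sq_dist(phi(downsample(y)), summed))
    return count_term + contrast_term

def phi_zero(patch):
    """Degenerate feature: always outputs the zero vector."""
    return [0.0, 0.0]

def phi_toy(patch):
    """Hand-made feature: [sum of pixels, number of pixels above 0.5]."""
    flat = [v for row in patch for v in row]
    return [float(sum(flat)), float(sum(1 for v in flat if v > 0.5))]

x = [
    [1.0, 0.0, 0.0, 1.0],
    [0.0, 1.0, 0.0, 0.0],
    [0.0, 0.0, 1.0, 1.0],
    [1.0, 0.0, 0.0, 0.0],
]
y = [[0.0] * 4 for _ in range(4)]  # a second, different image

loss_zero = contrastive_loss(phi_zero, x, y)  # pays the full margin
loss_toy = contrastive_loss(phi_toy, x, y)    # hinge inactive, counting term remains
print(loss_zero, loss_toy)
```

With the zero feature, the counting term vanishes but the hinge contributes the full margin, so the trivial solution is penalized. The hinge becomes inactive once the features of the two images are at least *M* apart in squared distance.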

## 1.3. Network Architecture

# 2. Results

The proposed method **either outperforms previous methods or achieves the second-best performance**.

The proposed method achieves a **performance comparable to the other state-of-the-art methods on the ImageNet dataset** and shows a **significant improvement on the Places dataset**.

## Reference

[2017 ICCV] [Counting]

Representation Learning by Learning to Count
