Brief Review — Density Modeling of Images Using a Generalized Normalization Transformation

GDN, Used in Numerous Deep Image/Video Codec

Sik-Ho Tsang
3 min readJul 25, 2022

Density Modeling of Images Using a Generalized Normalization Transformation, GDN, by New York University
2016 ICLR, Over 200 Citations (Sik-Ho Tsang @ Medium)
Image Denoising, Transformation, Image Compression

  • A more generalized form of normalization technique, Generalized Divisive Normalization (GDN), is proposed.

Outline

  1. Goal
  2. Generalized Divisive Normalization (GDN)
  3. Results

1. Goal

  • Given a parametric family of transformations y=g(x, θ), authors wish to select parameters so as to transform the input vector x into a standard normal random vector:
  • where |.| denotes the absolute value of the matrix determinant. If py is the standard normal distribution (denoted N), the shape of px is determined solely by the transformation. Thus, g induces a density model on x, specified by the parameters θ.

2. Generalized Normalization Transformation (GDN)

2.1. Prior Divisive Normalization

  • Divisive normalization, proposed by (Carandini & Heeger, 2012) has a commonly used form:
  • where θ={α, β, γ} are parameters. Loosely speaking, the transformation adjusts responses to lie within a desired operating range, while maintaining their relative values.
  • A modified form of divisive normalization that uses a weighted L2-norm by (Lyu & Simoncelli, 2008):

2.2. Proposed Generalized Divisive Normalization (GDN)

  • GDN is defined as a composition of a linear transformation followed by a generalized form of divisive normalization:
  • The full parameter vector θ consists of the vectors β and ε, as well as the matrices H, α, and γ, for a total of 2N +3N² parameters (where N is the dimensionality of the input space).
  • Choosing εi=αi,j=γi,j=1 yields the classic form of the divisive normalization transformation (Carandini & Heeger, 2012), with exponents set to 1.
  • Choosing αi,j=2 and setting all elements of β, ε, and γ identical, the transformation assumes a radial form:
  • (Many other forms are mentioned in the paper.)

3. Results

Denoising
  • GDN is a better transformation than GSM, obtains higher PSNR.

Later, GDN is used for numerous deep image/video codec.

--

--

Sik-Ho Tsang

PhD, Researcher. I share what I learn. :) Linktree: https://linktr.ee/shtsang for Twitter, LinkedIn, etc.