Brief Review — Rectifier Nonlinearities Improve Neural Network Acoustic Models

Leaky ReLU Converges Slightly Faster Than ReLU

  • Leaky ReLU outputs small negative values when the input is below 0.
  • This is a paper from Andrew Ng's research group.


  1. Leaky ReLU
  2. Results

1. Leaky ReLU

1.1. Tanh

Tanh (figure)
  • The hyperbolic tangent (tanh) activation is: h(i) = σ(w(i)ᵀ x), with σ(z) = tanh(z).
  • Here, σ(·) is the tanh function, w(i) is the weight vector for the i-th hidden unit, and x is the input; a small sketch follows below.
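As a quick illustration (my own sketch, not from the paper), the tanh unit above can be written in NumPy; `tanh_unit`, `w`, and `x` are illustrative names, with `w` and `x` assumed to be 1-D arrays of equal length:

```python
import numpy as np

# Sketch of a single tanh hidden unit: h(i) = tanh(w(i)^T x).
def tanh_unit(w, x):
    return np.tanh(np.dot(w, x))

# Example with a random weight vector and input.
rng = np.random.default_rng(0)
w, x = rng.standard_normal(5), rng.standard_normal(5)
print(tanh_unit(w, x))  # always lies in (-1, 1)
```

Note that tanh saturates at ±1, so its gradient shrinks toward 0 for large |w(i)ᵀ x|.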

1.2. ReLU

ReLU (figure)
  • Rectified Linear Unit (ReLU) is shown above and defined as: h(i) = max(w(i)ᵀ x, 0).
  • When the pre-activation w(i)ᵀ x is above 0, the partial derivative is exactly 1, so gradients do not vanish through active units (see the sketch below).
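To make the gradient behavior concrete, here is a minimal NumPy sketch (mine, not the paper's code) of ReLU and its derivative:

```python
import numpy as np

def relu(z):
    # Elementwise max(z, 0).
    return np.maximum(z, 0.0)

def relu_grad(z):
    # 1 where z > 0, 0 elsewhere: active units pass gradients through
    # unchanged, while inactive units receive no error signal at all.
    return (z > 0).astype(float)

z = np.array([-2.0, -0.5, 0.5, 2.0])
print(relu(z))       # [0.  0.  0.5 2. ]
print(relu_grad(z))  # [0. 0. 1. 1.]
```

The zero gradient on the negative side is exactly what Leaky ReLU addresses next.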

1.3. Leaky ReLU

Leaky ReLU (figure)
  • Leaky ReLU allows for a small, non-zero gradient when the unit is saturated and not active: h(i) = w(i)ᵀ x if w(i)ᵀ x > 0, and 0.01 · w(i)ᵀ x otherwise (sketched below).
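A matching NumPy sketch (again my own, with `alpha` as an illustrative name for the paper's 0.01 negative-side slope):

```python
import numpy as np

def leaky_relu(z, alpha=0.01):
    # z where z > 0, otherwise a small leak of alpha * z.
    return np.where(z > 0, z, alpha * z)

def leaky_relu_grad(z, alpha=0.01):
    # The negative side keeps a small, non-zero gradient alpha,
    # so saturated units still receive an error signal.
    return np.where(z > 0, 1.0, alpha)

z = np.array([-2.0, -0.5, 0.5, 2.0])
print(leaky_relu(z))       # [-0.02  -0.005  0.5    2.   ]
print(leaky_relu_grad(z))  # [0.01 0.01 1.   1.  ]
```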

2. Results

Results for the DNN systems are reported as frame-wise error metrics on the development set and word error rates (%) on the Hub5 2000 evaluation sets.
  • LVCSR experiments are performed on the 300-hour Switchboard conversational telephone speech corpus (LDC97S62).
  • DNNs with 2, 3, and 4 hidden layers are trained for all nonlinearity types.
  • The output layer is a standard softmax classifier, and cross entropy with no regularization serves as the loss function; a minimal sketch of this setup follows below.
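As a hypothetical PyTorch sketch of this setup (not the authors' code; the dimensions 440, 2048, and 3000 are placeholders I chose, not values from the paper):

```python
import torch
import torch.nn as nn

def make_dnn(n_in=440, n_hidden=2048, n_layers=3, n_out=3000):
    # Fully connected hidden layers with Leaky ReLU (slope 0.01).
    layers = []
    for i in range(n_layers):
        layers.append(nn.Linear(n_in if i == 0 else n_hidden, n_hidden))
        layers.append(nn.LeakyReLU(0.01))
    # Output layer; the softmax is folded into the loss below.
    layers.append(nn.Linear(n_hidden, n_out))
    return nn.Sequential(*layers)

model = make_dnn()
loss_fn = nn.CrossEntropyLoss()  # softmax + cross entropy, no regularization

# Dummy batch: 8 feature frames with integer state labels.
x = torch.randn(8, 440)
y = torch.randint(0, 3000, (8,))
loss_fn(model(x), y).backward()
```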

Leaky ReLU has since been used in many other domains.


[2013 ICML] [Leaky ReLU]
Rectifier Nonlinearities Improve Neural Network Acoustic Models

2.1. Language Model / Sequence Model

(Some are not related to NLP, but I just group them here)

1991 2013 [Leaky ReLU] … 2020 [ALBERT] [GPT-3] [T5] [Pre-LN Transformer] [MobileBERT] [TinyBERT] [BART] [Longformer] [ELECTRA] [Megatron-LM]



