Brief Review — Fast and Accurate Deep Network Learning by Exponential Linear Units (ELUs)

Exponential Linear Unit (ELU), Outperforms ReLUs, LReLUs, and SReLUs

Sik-Ho Tsang
3 min read · Nov 15, 2022

Fast and Accurate Deep Network Learning by Exponential Linear Units (ELUs), ELU, by Johannes Kepler University,
2016 ICLR, Over 5000 Citations (Sik-Ho Tsang @ Medium)
Image Classification, Autoencoder, Activation Function, ReLU, Leaky ReLU

  • Exponential Linear Units (ELUs) are proposed as an activation function.


  1. Exponential Linear Unit (ELU)
  2. Results

1. Exponential Linear Unit (ELU)

The rectified linear unit (ReLU), the leaky ReLU (LReLU, α = 0.1), the shifted ReLU (SReLU), and the exponential linear unit (ELU, α = 1.0).
  • The ELU is the identity for positive inputs and saturates exponentially for negative ones: f(x) = x for x > 0, and f(x) = α(exp(x) − 1) for x ≤ 0.
  • The ELU hyperparameter α controls the value to which an ELU saturates for negative net inputs.
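The four activation functions compared in the paper can be sketched in a few lines of NumPy (α values follow the paper: 0.1 for LReLU, 1.0 for ELU; the SReLU shift of −1 is an illustrative choice):

```python
import numpy as np

def relu(x):
    return np.maximum(0.0, x)

def lrelu(x, alpha=0.1):
    # Leaky ReLU: small slope alpha for negative inputs
    return np.where(x > 0, x, alpha * x)

def srelu(x, shift=-1.0):
    # Shifted ReLU: max(shift, x)
    return np.maximum(shift, x)

def elu(x, alpha=1.0):
    # ELU: identity for positive inputs, smoothly saturates
    # to -alpha for very negative inputs
    return np.where(x > 0, x, alpha * (np.exp(x) - 1.0))

# ELU passes positives through unchanged and saturates near -1
print(elu(np.array([-10.0, 0.0, 2.0])))
```

Unlike LReLU, the ELU saturates for large negative inputs, which makes the units more robust to noise; unlike ReLU, it produces negative outputs that push mean activations toward zero.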

2. Results

2.1. MNIST

(a): median of the average unit activation for different activation functions. (b): Training cross entropy loss.
  • The network had eight hidden layers of 128 units each.

ELU units maintain a smaller median activation throughout training. The training error of ELU networks also decreases much more rapidly than for the other networks.

2.2. Autoencoder

Autoencoder training on MNIST: Reconstruction error for the test and training data set over epochs, using different activation functions and learning rates.
  • The encoder part consisted of four fully connected hidden layers with sizes 1000, 500, 250 and 30, respectively. The decoder part was symmetrical to the encoder.
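The symmetric architecture above can be sketched as a chain of fully connected layers (layer sizes are from the paper; the initialization, ELU placement, and linear output layer are illustrative assumptions, not the authors' code):

```python
import numpy as np

rng = np.random.default_rng(0)

def elu(x, alpha=1.0):
    return np.where(x > 0, x, alpha * (np.exp(x) - 1.0))

sizes = [784, 1000, 500, 250, 30]   # encoder: MNIST input -> 30-dim code
sizes = sizes + sizes[-2::-1]       # decoder mirrors the encoder back to 784
weights = [rng.normal(0, 0.01, (m, n)) for m, n in zip(sizes[:-1], sizes[1:])]

def forward(x):
    for w in weights[:-1]:
        x = elu(x @ w)              # ELU on every hidden layer
    return x @ weights[-1]          # linear reconstruction layer

x = rng.normal(size=(4, 784))       # a toy batch of 4 "images"
print(forward(x).shape)             # reconstruction has the input shape
```

Training would minimize the reconstruction error between `forward(x)` and `x`, which is the quantity plotted in the figure above.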

ELUs outperform the competing activation functions in terms of training / test set reconstruction error for all learning rates.

2.3. CIFAR-100

Comparison of ReLUs, LReLUs, and SReLUs on CIFAR-100. (a-c) show the training loss, (d-f) the test classification error.
  • The CNN for these CIFAR-100 experiments consists of 11 convolutional layers.

ELU networks achieved the lowest test error and training loss.

2.4. CIFAR-10 & CIFAR-100

Comparison of ELU networks and other CNNs on CIFAR-10 and CIFAR-100.
  • The CNN architecture is more sophisticated than in the previous subsection and consists of 18 convolutional layers.
  • ELU networks are second best on CIFAR-10 with a test error of 6.55%, which is among the top 10 results reported for CIFAR-10. On CIFAR-100, ELU networks performed best with a test error of 24.28%, the best published result at the time.

2.5. ImageNet

ELU networks applied to ImageNet
  • A 15-layer CNN with an SPP layer, originating from SPPNet, is used.

The ELU network already reaches 20% top-5 error after 160k iterations, while the ReLU network needs 200k iterations to reach the same error rate.




