Brief Review — Tiny Models are the Computational Saver for Large Models

TinySaver, Adaptive Computational Reduction for Image Classification

Sik-Ho Tsang
4 min read · Sep 24, 2024

Tiny Models are the Computational Saver for Large Models
TinySaver
, by University College Dublin, and Polytechnique Hauts-de-France
2024 arXiv v3 (Sik-Ho Tsang @ Medium)

Image Classification
1989 … 2023
[Vision Permutator (ViP)] [ConvMixer] [CrossFormer++] [FastViT] [EfficientFormerV2] [MobileViTv2] [ConvNeXt V2] [SwiftFormer] 2024 [FasterViT] [CAS-ViT]
==== My Other Paper Readings Are Also Over Here ====

  • TinySaver, an early-exit-like dynamic model compression approach, is proposed.
  • It allows certain inputs to complete their inference processes early, thereby conserving computational resources.

Outline

  1. TinySaver
  2. Results

1. TinySaver

(a) Early Exit (EE), (b) Proposed TinySaver, (c) Mixture of Experts (MoE)

1.1. (a) Early Exit (EE)

  • (a) Early Exit (EE), e.g. MSDNet, enables input samples to traverse different data paths.
  • An EE-enabled system employs multiple exits, each attempting to produce outputs before reaching the final model head.
  • The primary drawback of EE models is the requirement for additional structural components; as a result, the model size is not reduced.
  • These exit branches from the initial layers often struggle to perform satisfactorily.

Consequently, these limitations hinder the advancement and effectiveness of EE based methods.
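To make the structure concrete, here is a minimal PyTorch sketch (my own illustration, not MSDNet's actual architecture) of a classifier with auxiliary exit branches. The stages, channel sizes, and threshold are hypothetical, and batch size 1 is assumed at inference.

```python
# Minimal sketch of an early-exit classifier with a hypothetical 3-stage backbone.
# The extra exit branches are the "additional structural components" EE requires.
import torch
import torch.nn as nn

class EarlyExitNet(nn.Module):
    def __init__(self, num_classes=1000):
        super().__init__()
        # Backbone split into stages so exits can branch off intermediate features.
        self.stage1 = nn.Sequential(nn.Conv2d(3, 64, 3, 2, 1), nn.ReLU())
        self.stage2 = nn.Sequential(nn.Conv2d(64, 128, 3, 2, 1), nn.ReLU())
        self.stage3 = nn.Sequential(nn.Conv2d(128, 256, 3, 2, 1), nn.ReLU())
        # Auxiliary exit heads attached to early layers.
        self.exit1 = nn.Sequential(nn.AdaptiveAvgPool2d(1), nn.Flatten(), nn.Linear(64, num_classes))
        self.exit2 = nn.Sequential(nn.AdaptiveAvgPool2d(1), nn.Flatten(), nn.Linear(128, num_classes))
        self.final_head = nn.Sequential(nn.AdaptiveAvgPool2d(1), nn.Flatten(), nn.Linear(256, num_classes))

    def forward(self, x, threshold=0.8):
        # Assumes batch size 1 at inference, so .max() is the sample's top-1 probability.
        x = self.stage1(x)
        logits = self.exit1(x)
        if logits.softmax(-1).max() >= threshold:   # confident: exit from the first branch
            return logits
        x = self.stage2(x)
        logits = self.exit2(x)
        if logits.softmax(-1).max() >= threshold:   # confident: exit from the second branch
            return logits
        return self.final_head(self.stage3(x))      # otherwise reach the final model head
```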

1.2. (b) Proposed TinySaver

  • TinySaver is a system that leverages pretrained efficient models as a component of the EE framework.
  • The computational process initiates with a tiny saver model, which acts as the primary early exit point.
  • If the result from the saver model meets the acceptance criteria (step 1), the process halts.
  • Otherwise, the larger base model will be invoked (step 2).
  • All models are pre-trained using their original methods, eliminating the need for additional training.
  • It also effectively decouples early exits from specific applications, which keeps the development cost low.

In simple terms, the probability the tiny model assigns to its predicted class can be used as its confidence level, which serves as a prediction of its accuracy. If this confidence is smaller than a threshold, the larger base model is invoked (a minimal sketch of this gating is given at the end of this subsection).

  • (In the paper, there are equations to estimate the computation saving ratio of the exiting system. Please feel free to read the paper directly if interested.)
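Below is a minimal sketch (my own, not code from the paper) of this confidence-gated inference. The timm model names, the 0.7 threshold, and the use of the top-1 softmax probability as the confidence are illustrative assumptions, not the paper's exact configuration.

```python
# Minimal sketch of TinySaver-style gating: tiny saver first, large model only when needed.
import torch
import timm

saver = timm.create_model("efficientvit_b0", pretrained=True).eval()  # tiny saver (illustrative choice)
base  = timm.create_model("convnext_large", pretrained=True).eval()   # large base model (illustrative choice)

@torch.no_grad()
def predict(image, threshold=0.7):
    """image: (1, 3, H, W) preprocessed tensor; returns class probabilities."""
    probs = saver(image).softmax(dim=-1)
    if probs.max().item() >= threshold:   # step 1: tiny model is confident, accept and stop
        return probs
    return base(image).softmax(dim=-1)    # step 2: otherwise invoke the larger base model
```

As a rough first-order estimate (my own simplification, not the paper's exact formulation), if a fraction r of samples exits at the saver, the expected cost per sample is about FLOPs_saver + (1 − r)·FLOPs_base, so the saving grows with both the exit rate and the size gap between the two models.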

1.3. (c) Mixture of Experts (MoE)

  • The authors claim that TinySaver can also be interpreted as a specialized Mixture of Experts (MoE), where non-uniform pre-trained models are used as experts and the smaller expert also works as the router.

1.4. TinySaver Extended With an Early Exit (EE) Sequence

  • Similar to other EE-based models, the heads built on the ESN do not necessarily perform the same task as the original model, providing rich flexibility.
  • However, ESN uniquely includes the tiny model, ensuring a lower bound on performance (see the sketch below).
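Here is a minimal sketch (my own, under the assumption that each exit maps an image directly to class probabilities) of such an exit sequence; in the paper, the ESN heads share computation with the saver and base model, which this sketch ignores for simplicity.

```python
# Minimal sketch of an exit-sequence cascade: the pretrained tiny saver answers first,
# later exits are tried in order, and the base model handles whatever stays unconfident.
import torch

@torch.no_grad()
def cascade_predict(image, exits, thresholds, base):
    """exits: [tiny saver, ESN head 1, ESN head 2, ...]; thresholds: one per exit."""
    for model, t in zip(exits, thresholds):
        probs = model(image).softmax(dim=-1)
        if probs.max().item() >= t:       # first sufficiently confident exit answers
            return probs
    return base(image).softmax(dim=-1)    # otherwise fall back to the full base model
```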

2. Results

2.1. Ablation Studies

TinySaver is effective in significantly reducing overall FLOPs, particularly for larger models. With certain threshold settings, TinySaver even makes some models surpass their original performance.

Table 2: Simulations are conducted by applying various savers to the same base model. A suitable tiny model is needed for each larger base model.

Table 3: The results reveal no substantial improvements using ESN compared to the plain TinySaver, suggesting that a tiny model is already highly efficient and effective.

2.2. SOTA Comparisons

As shown in Fig. 5(a), TinySaver significantly outperforms EE-based methods thanks to its model-agnostic design.

As shown in Fig. 5(b), TinySaver surpasses directly scaling down model architectures.

2.3. Object Detection

  • The scores of detected boxes are used as the confidence metric. Boxes with scores below 0.05 are filtered out, and the average score of the remaining boxes is used as the model’s confidence for each sample.

Real-time YOLOv8 detectors are used as savers. Images not confidently predicted by YOLOv8 are passed to a larger model.
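Below is a minimal sketch (my own, assuming detections arrive as an (N, 6) tensor of [x1, y1, x2, y2, score, class] and an illustrative 0.5 exit threshold) of this confidence rule and the resulting gating.

```python
# Minimal sketch of detection-based gating: average score of surviving boxes as confidence.
import torch

def detection_confidence(detections, min_score=0.05):
    """detections: (N, 6) tensor [x1, y1, x2, y2, score, class]; returns a float confidence."""
    scores = detections[:, 4]
    kept = scores[scores >= min_score]            # drop very low-scoring boxes
    return kept.mean().item() if kept.numel() else 0.0

@torch.no_grad()
def detect(image, yolo_saver, large_detector, threshold=0.5):
    dets = yolo_saver(image)                      # tiny saver: a real-time YOLOv8 detector
    if detection_confidence(dets) >= threshold:   # confident: keep the cheap result
        return dets
    return large_detector(image)                  # otherwise invoke the larger detector
```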
