Brief Review — UL2R, U-PaLM: Transcending Scaling Laws with 0.1% Extra Compute

UL2R: Fine-Tuning PaLM With the UL2 Objective Forms U-PaLM

Sik-Ho Tsang
3 min read · Aug 31, 2023


U-PaLM, fine-tuned using the UL2 objective, is more compute-efficient than PaLM: compute (training FLOPs) versus quality (average of 20+ NLP zero- and few-shot tasks)


UL2R, U-PaLM, by Google
2022 arXiv v2 (Sik-Ho Tsang @ Medium)


  • UL2Restore (UL2R) is proposed: a method that continues training a SOTA LLM (e.g., PaLM) for a few more steps with UL2's mixture-of-denoisers objective. UL2R substantially improves existing language models and their scaling curves with almost negligible extra computational cost and no new sources of data.
  • The U-PaLM model family, at 8B, 62B, and 540B scales, is established by training PaLM with UL2R. An approximately 2× computational savings rate is achieved.
  • Later, U-PaLM is further instruction-finetuned as Flan-U-PaLM.

Outline

  1. UL2Restore (UL2R)
  2. U-PaLM
  3. Results

1. UL2Restore (UL2R)

The key idea of UL2R (UL2Restore) is to continue training an existing causal language model with a mixture of new objectives, specifically the UL2 mixture-of-denoisers training objective, as sketched in the code below.

  • This restoration is expected to only cost roughly 0.1% to 1% of the original training FLOPs and requires no new data sources, making it highly efficient and convenient.
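As a concrete illustration, here is a minimal sketch of the restoration loop, assuming a PyTorch-style model whose forward call returns a loss when given labels (as in Hugging Face Transformers). The function and argument names are hypothetical placeholders, not the paper's actual code.

```python
# Minimal sketch of the UL2R recipe: resume an already-trained causal LM and
# train it for a small number of extra steps on the UL2 mixture-of-denoisers
# objective, reusing the original pretraining corpus. All names here are
# hypothetical placeholders.
def ul2r_continue_training(model, optimizer, corpus_iter, make_ul2_batch,
                           extra_steps=20_000):
    """Continue training an existing checkpoint with a new objective mixture.

    `make_ul2_batch` converts raw text from the *original* pretraining corpus
    into UL2-style (inputs, targets) pairs, so no new data source is needed.
    """
    model.train()
    for _ in range(extra_steps):
        raw_text = next(corpus_iter)                # same data as pretraining
        inputs, targets = make_ul2_batch(raw_text)
        loss = model(inputs, labels=targets).loss   # standard cross-entropy
        loss.backward()
        optimizer.step()
        optimizer.zero_grad()
    return model  # the restored model is what the paper calls U-PaLM
```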

The UL2 objective combines prefix language modeling and long-short span corruption (i.e., infilling) tasks, which teach the LLM to leverage bidirectional attention over the prefix (PrefixLM) and to perform infilling-style prediction.
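Below is a rough sketch of how such a mixture of denoising examples could be constructed. The span lengths, corruption rates, and mode-token strings only approximate UL2's actual settings and should be read as placeholders.

```python
# Illustrative construction of UL2-style mixture-of-denoisers examples.
# Numbers and mode-token strings are placeholders, not UL2's exact settings.
import random


def prefix_lm_example(tokens):
    """S-denoiser: split a sequence into a bidirectional prefix and a causal target."""
    cut = random.randint(1, len(tokens) - 1)
    return {"mode": "[S2S]", "inputs": tokens[:cut], "targets": tokens[cut:]}


def span_corruption_example(tokens, mean_span=3, corrupt_rate=0.15, mode="[NLU]"):
    """R/X-denoisers: mask contiguous spans and ask the model to infill them."""
    inputs, targets, i, sentinel = [], [], 0, 0
    while i < len(tokens):
        if random.random() < corrupt_rate / mean_span:
            span = tokens[i:i + mean_span]
            inputs.append(f"<extra_id_{sentinel}>")          # sentinel in the input
            targets += [f"<extra_id_{sentinel}>"] + span      # span moved to the target
            sentinel += 1
            i += mean_span
        else:
            inputs.append(tokens[i])
            i += 1
    return {"mode": mode, "inputs": inputs, "targets": targets}


def sample_mixture(tokens):
    """Mix short-span (R), long/aggressive-span (X), and prefix-LM (S) denoisers."""
    kind = random.choice(["R", "X", "S"])
    if kind == "S":
        return prefix_lm_example(tokens)
    if kind == "R":
        return span_corruption_example(tokens, mean_span=3, corrupt_rate=0.15, mode="[NLU]")
    return span_corruption_example(tokens, mean_span=12, corrupt_rate=0.5, mode="[NLG]")
```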

2. U-PaLM

2.1. Improved Scaling Properties on Few-shot Learning

Compute cost (training FLOPs) versus quality (average of 20+ NLP zero- and few-shot tasks).

U-PaLM substantially outperforms the original PaLM models at both 8B and 540B scales. Note that the dotted lines represent a pathway before and after UL2R training.

  • UL2R training improves the scaling curve of PaLM substantially, i.e., UL2R provides a more compute-efficient performance improvement than training the original PaLM models for longer with the standard causal language modeling objective.
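To make the compute-savings claim concrete, here is a back-of-the-envelope comparison using the common "training FLOPs ≈ 6 × parameters × tokens" approximation. The 780B pretraining token count is PaLM's reported figure, while the 0.1% extra fraction is taken from the paper's title rather than its exact UL2R training schedule.

```python
# Back-of-the-envelope illustration of why UL2R is cheap, using the common
# "training FLOPs ~= 6 * parameters * tokens" approximation. Token counts are
# illustrative, not the exact figures from the paper.
PARAMS = 540e9                          # 540B-parameter model
PRETRAIN_TOKENS = 780e9                 # PaLM's reported pretraining tokens
UL2R_TOKENS = 0.001 * PRETRAIN_TOKENS   # ~0.1% extra tokens for UL2R

pretrain_flops = 6 * PARAMS * PRETRAIN_TOKENS
ul2r_flops = 6 * PARAMS * UL2R_TOKENS

print(f"Pretraining FLOPs : {pretrain_flops:.2e}")
print(f"UL2R extra FLOPs  : {ul2r_flops:.2e} ({ul2r_flops / pretrain_flops:.1%} extra)")
# If this small addition matches the quality of a model trained on roughly 2x
# the tokens, the effective compute saving is about a factor of two.
```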

2.2. BigBench

BigBench

U-PaLM outperforms PaLM on 19 out of the 21 tasks at 540B scale.

2.3. Reasoning

Reasoning

U-PaLM 540B outperforms both PaLM 540B and Minerva 540B.

2.4. Infilling Ability

Infilling Ability

With UL2R training, which uses the UL2 objective and hence includes infilling, the second and third examples in the figure demonstrate U-PaLM's ability to infill multiple slots.
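For intuition, here is a toy example of a multi-slot infilling prompt in the T5/UL2 sentinel style. The sentinel spellings and the generate call are assumptions for illustration, not U-PaLM's actual interface.

```python
# Toy illustration of multi-slot infilling enabled by span-corruption training.
# Sentinel spellings and the `generate` call are assumptions for this sketch;
# U-PaLM's real serving interface is not described in this review.
prompt = (
    "To make pancakes, first <extra_id_0> the dry ingredients, "
    "then whisk in <extra_id_1>, and cook the batter on a hot griddle."
)

# A model trained with an infilling objective can return the missing spans,
# keyed by their sentinels, e.g.:
#   "<extra_id_0> mix <extra_id_1> the eggs and milk"
# completion = model.generate(prompt)   # hypothetical call
```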
