Brief Review — Inflection-1

Pi.AI, Empowered by Inflection-1

Sik-Ho Tsang
Inflection.AI (Image from https://voicebot.ai/2022/03/11/deepmind-and-linkedin-co-founders-unveil-new-conversational-ai-startup-inflection-ai/)

Inflection-1, by Inflection AI
2023 Technical Memo (@ Medium)

Large Language Model (LLM)
2020 … 2023
[GPT-4] [LLaMA] [Koala] [BloombergGPT] [GLM-130B] [UL2] [PaLM 2] [Llama 2] [MultiMedQA, HealthSearchQA, Med-PaLM] [Med-PaLM 2] [Flan 2022, Flan-T5] [AlphaCode 2] [Mistral 7B]
==== My Other Paper Readings Are Also Over Here ====

  • Last week, one of my colleagues introduced me to Pi.AI, which led me to read this Inflection-1 LLM technical memo.
  • In 2022, DeepMind and LinkedIn co-founders unveiled the new conversational AI startup Inflection AI, with the mission of creating a personal AI for everyone. In 2023, Inflection AI developed the Inflection-1 LLM and released Pi.AI, a freely usable assistant powered by Inflection-1.
  • Since then, DeepMind co-founder Mustafa Suleyman has recently become the CEO of Microsoft AI.

Outline

  1. Inflection-1
  2. Results

1. Inflection-1 Benchmarking Results

  • To offer a fair comparison amongst models of varying sizes and training methods, foundation models are segmented into those pretrained using at most the FLOPs of Google’s PaLM-540B (approximately 10x GPT-3) and those that used more.
  • First compute class: Models in the former category are usually faster to serve and can be deployed more widely.
  • Second compute class: Models in the latter category tend to have the highest performance.
  • GPT-3.5 belongs to the first compute class and GPT-4 to the second.

Inflection-1 was trained on a large dataset using thousands of NVIDIA H100 GPUs, and is a model within the first compute class.

  • For Inflection-1, results without instruction tuning or RLHF are reported.
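To get a feel for the compute-class cut-off, here is a minimal sketch using the standard ~6 × parameters × tokens approximation for dense-transformer training FLOPs (Kaplan et al.); the parameter and token counts below are published figures for PaLM and GPT-3, used only for illustration, and the exact budgets Inflection used are not disclosed in the memo.

```python
# Rough training-compute estimate: FLOPs ~ 6 * params * tokens
# (a common approximation, not Inflection's own accounting).

def train_flops(params: float, tokens: float) -> float:
    """Approximate total training FLOPs for a dense transformer."""
    return 6.0 * params * tokens

palm_flops = train_flops(540e9, 780e9)   # PaLM-540B: 540B params, 780B tokens
gpt3_flops = train_flops(175e9, 300e9)   # GPT-3: 175B params, 300B tokens

print(f"PaLM-540B ~ {palm_flops:.2e} FLOPs")
print(f"GPT-3     ~ {gpt3_flops:.2e} FLOPs")
print(f"ratio     ~ {palm_flops / gpt3_flops:.1f}x")  # broadly consistent with the memo's ~10x
```

Under this approximation, any model trained with at most the PaLM-540B budget falls in the first compute class.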

2. Results

2.1. Overview

Overview of Inflection-1’s performance relative to LLaMA and GPT-3.5

Inflection-1 outperforms GPT-3.5 and LLaMA-65B on the above 5 benchmarks.

2.2. Multitask Language Understanding (MMLU)

Multitask Language Understanding (MMLU)
  • The proposed model outperforms all models in the first compute class, including both GPT-3.5 and LLaMA.
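Multiple-choice benchmarks like MMLU are typically scored by appending each answer option to the prompt, asking the model for a likelihood of that continuation, and picking the highest-scoring option. The sketch below illustrates this loop; `loglik` is a toy word-overlap stand-in (an assumption, not Inflection's actual scoring API) so the example runs without a real model.

```python
# Toy sketch of multiple-choice scoring as used for MMLU-style benchmarks.

def loglik(prompt: str, continuation: str) -> float:
    # Placeholder scorer: fraction of continuation words found in the prompt.
    # A real evaluation would use the LLM's log-likelihood instead.
    prompt_words = set(prompt.lower().split())
    words = continuation.lower().split()
    return sum(w in prompt_words for w in words) / max(len(words), 1)

def predict(question: str, options: dict[str, str]) -> str:
    # Score every option and return the label with the highest likelihood.
    scores = {label: loglik(question, text) for label, text in options.items()}
    return max(scores, key=scores.get)

question = "The capital of France is"
options = {
    "A": "Berlin",
    "B": "Paris is the capital of France",
    "C": "Madrid",
    "D": "Rome",
}
print(predict(question, options))  # -> B with this toy scorer
```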

2.3. Closed Book Question Answering

Closed Book Question Answering

Inflection-1 is significantly better at answering trivia questions.
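Closed-book QA benchmarks of this kind are commonly scored with a normalized exact-match metric: the model's answer is lowercased, stripped of punctuation and articles, and compared against a set of gold aliases. A minimal sketch of that metric (a common convention, not necessarily the memo's exact implementation):

```python
import re
import string

# Normalized exact-match scoring, as commonly used for closed-book QA.

def normalize(text: str) -> str:
    """Lowercase, drop punctuation and articles, collapse whitespace."""
    text = text.lower()
    text = "".join(ch for ch in text if ch not in set(string.punctuation))
    text = re.sub(r"\b(a|an|the)\b", " ", text)
    return " ".join(text.split())

def exact_match(prediction: str, gold_answers: list[str]) -> bool:
    """True if the normalized prediction matches any normalized gold alias."""
    return normalize(prediction) in {normalize(g) for g in gold_answers}

print(exact_match("The Eiffel Tower.", ["Eiffel Tower", "Tour Eiffel"]))  # True
```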

2.4. Others

  • The above results are shown on Inflection’s website, while the results below appear only in the technical memo.
0-shot results on common sense benchmarks.
  • Similar to OpenAI’s GPT-4, the authors do not disclose the model size, as can be seen in all tables.
Common sense benchmarks with comparison to GPT-4 and PaLM 2.
BIG-Bench hard with Chain of Thought prompting.
Reading comprehension benchmark RACE along with LAMBADA.
Mathematical reasoning datasets.
Code generation tasks.
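The BIG-Bench Hard results use Chain-of-Thought (CoT) prompting, where few-shot exemplars show worked reasoning before the final answer, encouraging the model to reason step by step. A minimal sketch of how such a prompt is assembled; the exemplar text is illustrative, not taken from the memo.

```python
# Sketch of Chain-of-Thought prompt construction: each exemplar contains
# intermediate reasoning followed by "The answer is ...".

EXEMPLAR = (
    "Q: A shop has 3 boxes with 4 pens each. How many pens in total?\n"
    "A: Each box has 4 pens and there are 3 boxes, so 3 * 4 = 12. "
    "The answer is 12.\n\n"
)

def cot_prompt(question: str, exemplars: str = EXEMPLAR) -> str:
    """Prepend worked exemplars, then leave 'A:' open for the model to continue."""
    return f"{exemplars}Q: {question}\nA:"

print(cot_prompt("If a train has 5 cars with 20 seats each, how many seats?"))
```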

Inflection-1 outperforms or performs on par with first-compute-class LLMs on most tasks, and only underperforms second-compute-class LLMs such as GPT-4 and PaLM 2-L.


Sik-Ho Tsang

PhD, Researcher. I share what I learn. :) Linktree: https://linktr.ee/shtsang for Twitter, LinkedIn, etc.