Brief Review — Andrew Ng, AI Minimalist: The Machine-Learning Pioneer Says Small is the New Big

Not Just About Data-Centric AI

Sik-Ho Tsang
2 min read · Sep 15, 2022
An interview with Andrew Ng in IEEE Spectrum
  • I’ve taken many of his AI courses on deeplearning.ai and was a mentor for one of his courses on Coursera. His courses have greatly strengthened my deep learning knowledge.
  • This time, I would like to share an article I read recently in the April 2022 issue of IEEE Spectrum Magazine: “Andrew Ng, AI Minimalist: The Machine-Learning Pioneer Says Small is the New Big”.
  • IEEE Spectrum Magazine is a monthly magazine covering technology of all kinds, with an impact factor of 3.578.
  • In this article, Andrew Ng shares many of his valuable visions and broad views about AI, covering NLP, CV, semiconductor manufacturers, and his own company, spanning from his first NeurIPS workshop paper to the recent NeurIPS Data-Centric AI workshop!

1. Foundation Model for Video

  • He mentioned that there are many research efforts developing foundation models in NLP, such as GPT-3.

It is time to build a foundation model for video, despite the practical difficulties involved.


2. Data-Centric AI

  • In recent years, Andrew has been promoting Data-Centric AI.
  • “For example, if you have 10,000 images where 30 images are of one class, and those 30 images are labeled inconsistently, … So you can very quickly relabel those images to be more consistent, and this leads to improvement in performance,” says Andrew Ng.

“Data-Centric AI is the discipline of systematically engineering the data needed to successfully build an AI system.”
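The relabeling step Ng describes can be sketched in a few lines. This is a minimal illustration, not anything from the article: it assumes a hypothetical layout where each image has labels from several annotators, flags the images whose annotators disagree, and suggests the majority label as the consistent one.

```python
from collections import Counter

def find_inconsistent(labels_per_image):
    """Flag images whose annotators disagree, so they can be relabeled.

    labels_per_image: dict mapping image id -> list of labels from
    different annotators (a hypothetical data layout for illustration).
    """
    flagged = {}
    for image_id, labels in labels_per_image.items():
        top_label, top_count = Counter(labels).most_common(1)[0]
        if top_count < len(labels):        # annotators disagree
            flagged[image_id] = top_label  # suggested consistent label
    return flagged

# Toy example: three annotators label two images.
annotations = {
    "img_001": ["cat", "cat", "cat"],  # consistent, left alone
    "img_002": ["cat", "dog", "cat"],  # inconsistent -> relabel as "cat"
}
print(find_inconsistent(annotations))  # {'img_002': 'cat'}
```

In a real pipeline the flagged images would go back to a human reviewer rather than being auto-relabeled, but the point stands: a quick, targeted pass over a few dozen noisy labels can improve performance without touching the model.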

3. High-Quality Data Helps With Bias

  • He also mentioned an example: suppose a model’s performance is okay for most of the data set, but biased on just a subset of the data.
  • Trying to change the whole neural-network architecture to improve performance on just that subset is quite difficult.

We can engineer a subset of the data to address the problem in a much more targeted way.

Sometimes it is crucial to IMPROVE THE DATA, rather than the model architecture, to boost system performance.
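A first step toward engineering a subset of the data is finding which subset underperforms. The sketch below is a toy illustration (not from the article): it computes per-slice accuracy over a hypothetical labeled evaluation set, so that data work can be focused on the weak slice.

```python
def slice_accuracy(examples):
    """Compute accuracy per data slice to find where the model underperforms.

    examples: list of (slice_name, is_correct) pairs, a hypothetical
    record of model predictions over a labeled evaluation set.
    """
    totals, correct = {}, {}
    for name, ok in examples:
        totals[name] = totals.get(name, 0) + 1
        correct[name] = correct.get(name, 0) + int(ok)
    return {name: correct[name] / totals[name] for name in totals}

# Toy evaluation: the model is fine on "day" images but weak on "night",
# so data effort (relabeling, collecting more night examples) can be
# targeted there instead of redesigning the architecture.
results = [("day", True)] * 9 + [("day", False)] + \
          [("night", True)] * 3 + [("night", False)] * 2
print(slice_accuracy(results))  # {'day': 0.9, 'night': 0.6}
```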

(I’ve just shared a little bit of it. Please feel free to read the article directly if interested.)

Reference

[2022 IEEE Spectrum] [Small is the New Big]
Andrew Ng, AI Minimalist: The Machine-Learning Pioneer Says Small is the New Big

Data-Centric AI

2021 [CheXternal] [CheXtransfer] 2022 [Small is the New Big]

My Other Previous Paper Readings

