Sik-Ho Tsang
May 15, 2022

Vision Transformer mainly classifies the main object within an image, whereas the task you mentioned is OCR, where text appears at different positions in the image. Of course, a Transformer can be modified for OCR tasks. I hope this answers your question. :)

Written by Sik-Ho Tsang

PhD, Researcher. I share what I learn. :) Linktree: https://linktr.ee/shtsang for Twitter, LinkedIn, etc.
