HyperAI

Abstract

Typical text recognition methods rely on an encoder-decoder structure, in which the encoder extracts features from an image, and the decoder produces recognized text from these features. In this study, we propose a simpler and more effective method for text recognition, known as the Decoder-only Transformer for Optical Character Recognition (DTrOCR). This method uses a decoder-only Transformer to take advantage of a generative language model that is pre-trained on a large corpus. We examined whether a generative language model that has been successful in natural language processing can also be effective for text recognition in computer vision. Our experiments demonstrated that DTrOCR outperforms current state-of-the-art methods by a large margin in the recognition of printed, handwritten, and scene text in both English and Chinese.
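To make the decoder-only idea concrete, the following is a minimal sketch of how an image and its text might share one autoregressive sequence. The abstract does not give these details, so everything here is an assumption for illustration: the non-overlapping patch split, the `[SEP]` and `[EOS]` markers, greedy decoding, and the names `image_to_patches`, `greedy_decode`, and `toy_next_token` are all hypothetical. The toy model stands in for a pre-trained generative language model and simply spells a fixed string.

```python
from typing import Callable, List

SEP = "[SEP]"  # assumed marker separating image patches from text tokens
EOS = "[EOS]"  # assumed end-of-sequence marker


def image_to_patches(image: List[List[int]], patch_h: int, patch_w: int) -> List[List[int]]:
    """Split a 2D pixel grid into flattened, non-overlapping patches (row-major order)."""
    height, width = len(image), len(image[0])
    if height % patch_h or width % patch_w:
        raise ValueError("image size must be divisible by the patch size")
    patches = []
    for top in range(0, height, patch_h):
        for left in range(0, width, patch_w):
            patches.append([
                image[r][c]
                for r in range(top, top + patch_h)
                for c in range(left, left + patch_w)
            ])
    return patches


def greedy_decode(patches: List[List[int]],
                  next_token: Callable[[List[object]], str],
                  max_len: int = 32) -> str:
    """Run one model over [patches..., SEP, generated tokens...] until EOS or max_len."""
    sequence: List[object] = list(patches) + [SEP]
    output: List[str] = []
    for _ in range(max_len):
        token = next_token(sequence)  # the single decoder sees image and text together
        if token == EOS:
            break
        output.append(token)
        sequence.append(token)
    return "".join(output)


def toy_next_token(sequence: List[object]) -> str:
    """Hypothetical stand-in for a language model: spells 'OCR', then emits EOS."""
    target = "OCR"
    generated = len(sequence) - 1 - sequence.index(SEP)
    return target[generated] if generated < len(target) else EOS


image = [[0, 1, 2, 3], [4, 5, 6, 7], [8, 9, 10, 11], [12, 13, 14, 15]]
patches = image_to_patches(image, 2, 2)
text = greedy_decode(patches, toy_next_token)
```

The point of the sketch is structural: there is no separate encoder, and the same loop that continues a text prompt in a language model continues a sequence that begins with image patches.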

Fujitake Masato

DTrOCR: Decoder-only Transformer for Optical Character Recognition
