Attention-based Extraction of Structured Information from Street View Imagery

Zbigniew Wojna; Alex Gorban; Dar-Shyang Lee; Kevin Murphy; Qian Yu; Yeqing Li; Julian Ibarz

Abstract

We present a neural network model - based on CNNs, RNNs and a novel attention mechanism - which achieves 84.2% accuracy on the challenging French Street Name Signs (FSNS) dataset, significantly outperforming the previous state of the art (Smith'16), which achieved 72.46%. Furthermore, our new method is much simpler and more general than the previous approach. To demonstrate the generality of our model, we show that it also performs well on an even more challenging dataset derived from Google Street View, in which the goal is to extract business names from store fronts. Finally, we study the speed/accuracy tradeoff that results from using CNN feature extractors of different depths. Surprisingly, we find that deeper is not always better (in terms of accuracy, as well as speed). Our resulting model is simple, accurate and fast, allowing it to be used at scale on a variety of challenging real-world text extraction problems.

Benchmarks

Benchmark: optical-character-recognition-on-fsns-test
Methodology: AttentionOCR_Inception-resnet-v2_Location
Metrics: Sequence error: 15.8
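
The sequence error of 15.8 listed here is consistent with the 84.2% accuracy quoted in the abstract if the metric counts a prediction as correct only when the whole transcribed string matches the reference, since 100 - 84.2 = 15.8. The following is a minimal sketch under that assumption. The function names and the toy street names are illustrative and do not come from the paper or its code.

```python
def sequence_error(predictions, references):
    """Percentage of examples whose predicted string differs from the reference.

    Assumption: an example counts as correct only on an exact, full-string match.
    """
    if len(predictions) != len(references):
        raise ValueError("predictions and references must have the same length")
    if not references:
        raise ValueError("need at least one example")
    wrong = sum(p != r for p, r in zip(predictions, references))
    return 100.0 * wrong / len(references)


def sequence_accuracy(predictions, references):
    """Complement of the sequence error, in percent."""
    return 100.0 - sequence_error(predictions, references)


# Made-up toy data, not taken from FSNS.
refs = ["Rue de la Paix", "Avenue Foch", "Rue Cler", "Place Monge"]
preds = ["Rue de la Paix", "Avenue Foch", "Rue Clerc", "Place Monge"]

print(sequence_error(preds, refs))     # 25.0 (one of four strings is wrong)
print(sequence_accuracy(preds, refs))  # 75.0
```

Under this definition a single wrong character makes the whole sign count as an error, which makes the metric much stricter than per-character accuracy.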
