Lipreading using Temporal Convolutional Networks

Brais Martinez Pingchuan Ma Stavros Petridis Maja Pantic

Abstract

Lip-reading has attracted considerable research attention recently thanks to advances in deep learning. The current state-of-the-art model for recognition of isolated words in the wild consists of a residual network and Bidirectional Gated Recurrent Unit (BGRU) layers. In this work, we address the limitations of this model and propose changes that further improve its performance. Firstly, the BGRU layers are replaced with Temporal Convolutional Networks (TCN). Secondly, we greatly simplify the training procedure, which allows us to train the model in a single stage. Thirdly, we show that the current state-of-the-art methodology produces models that do not generalize well to variations in sequence length, and we address this issue by proposing a variable-length augmentation. We present results on the largest publicly available datasets for isolated word recognition in English and Mandarin, LRW and LRW1000, respectively. Our proposed model yields absolute improvements of 1.2% and 3.2%, respectively, on these datasets, setting a new state of the art.
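The two technical ideas named in the abstract, temporal convolution in place of recurrence and variable-length augmentation, can be illustrated with a small standard-library Python sketch. It is not the authors' implementation: the function names (`dilated_conv1d`, `receptive_field`, `random_temporal_crop`), the single-channel input, the zero padding, and the choice of a random contiguous crop with a `min_len` floor are all assumptions made for illustration. The abstract does not specify kernel sizes, dilation rates, or the exact cropping rule.

```python
import random


def dilated_conv1d(seq, kernel, dilation=1):
    """Zero-padded, length-preserving 1-D temporal convolution.

    Hypothetical helper: `seq` is one feature channel over T frames and
    `kernel` has odd length; taps are spaced `dilation` frames apart.
    (As in most deep learning libraries, this is a cross-correlation.)
    """
    half = (len(kernel) // 2) * dilation
    out = []
    for t in range(len(seq)):
        acc = 0.0
        for i, w in enumerate(kernel):
            idx = t - half + i * dilation
            if 0 <= idx < len(seq):  # frames outside the clip count as zero
                acc += w * seq[idx]
        out.append(acc)
    return out


def receptive_field(kernel_size, dilations):
    """Frames seen by one output after stacking one conv layer per dilation."""
    return 1 + sum((kernel_size - 1) * d for d in dilations)


def random_temporal_crop(frames, min_len, rng=random):
    """Assumed form of variable-length augmentation: keep a random
    contiguous span of at least `min_len` frames from the clip."""
    total = len(frames)
    if total <= min_len:
        return list(frames)
    length = rng.randint(min_len, total)      # inclusive on both ends
    start = rng.randint(0, total - length)
    return list(frames[start:start + length])


if __name__ == "__main__":
    signal = [1.0, 2.0, 3.0, 4.0, 5.0]
    print(dilated_conv1d(signal, [1.0, 1.0, 1.0], dilation=2))
    print(receptive_field(3, [1, 2, 4]))
    print(len(random_temporal_crop(list(range(29)), min_len=20)))
```

Stacking layers with growing dilation widens the temporal context without recurrence, which is the general appeal of a TCN; training on crops of varying length exposes the classifier to clips shorter than the fixed-length training samples.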

Code Repositories

Yondijr/FlowerPower
pytorch
Mentioned in GitHub

Benchmarks

Benchmark | Methodology | Metrics
lipreading-on-lip-reading-in-the-wild | 3D Conv + ResNet-18 + MS-TCN | Top-1 Accuracy: 85.30%
lipreading-on-lrw-1000-1 | 3D Conv + ResNet-18 + MS-TCN | Top-1 Accuracy: 41.4%
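
For readers unfamiliar with the metric, top-1 accuracy is the percentage of clips whose highest-scoring class matches the ground-truth word. The following is a minimal sketch; the function name `top1_accuracy` and the toy scores are illustrative assumptions and are unrelated to the benchmark figures listed here.

```python
def top1_accuracy(scores, labels):
    """Percentage of samples whose arg-max class equals the true label.

    `scores` is a list of per-class score lists, one per sample;
    `labels` holds the matching ground-truth class indices.
    """
    if not labels:
        raise ValueError("at least one sample is required")
    correct = 0
    for row, label in zip(scores, labels):
        predicted = max(range(len(row)), key=row.__getitem__)
        if predicted == label:
            correct += 1
    return 100.0 * correct / len(labels)


if __name__ == "__main__":
    toy_scores = [[0.1, 0.9], [0.8, 0.2], [0.3, 0.7], [0.6, 0.4]]
    toy_labels = [1, 0, 0, 0]
    print(top1_accuracy(toy_scores, toy_labels))  # 3 of 4 correct -> 75.0
```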
