Lip Sync Matters: A Novel Multimodal Forgery Detector

Hsin-Min Wang, Yu Tsao, Yan-Tsung Peng, Sarwar Khan, Ammarah Hashmi, Sahibzada Adil Shahzad

Abstract

Deepfake technology has advanced considerably, but it is a double-edged sword for the community. One can use it for beneficial purposes, such as restoring vintage content in old movies, or for nefarious purposes, such as creating fake footage to manipulate the public and distributing non-consensual pornography. Much work has been done to combat its improper use, and fake footage can now be detected with good performance thanks to the availability of numerous public datasets and unimodal deep learning-based models. However, these methods are insufficient to detect multimodal manipulations, in which both the visual and acoustic streams are altered. This work proposes a novel lip-reading-based multimodal Deepfake detection method called “Lip Sync Matters.” It targets high-level semantic features, exploiting the mismatch between the lip sequence extracted from the video and the synthetic lip sequence generated from the audio by the Wav2Lip model to detect forged videos. Experimental results show that the proposed method outperforms several existing unimodal, ensemble, and multimodal methods on the publicly available multimodal FakeAVCeleb dataset.
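
The mismatch idea in the abstract can be illustrated with a minimal sketch. This is not the authors' model, which the abstract describes only at a high level. It assumes that some feature extractor (not shown here) has already turned both the real lip sequence and the audio-driven synthetic lip sequence into frame-aligned feature vectors. The function names, the cosine-distance measure, and the 0.5 threshold are all hypothetical choices made for illustration.

```python
import math
from typing import List, Sequence

Vector = Sequence[float]


def cosine_distance(a: Vector, b: Vector) -> float:
    """Return 1 - cosine similarity (0 = same direction, 2 = opposite)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 1.0  # assumption: treat an all-zero feature as uninformative
    return 1.0 - dot / (norm_a * norm_b)


def mismatch_score(real_lips: List[Vector], synth_lips: List[Vector]) -> float:
    """Mean per-frame distance between real and audio-driven lip features."""
    if not real_lips or len(real_lips) != len(synth_lips):
        raise ValueError("sequences must be non-empty and frame-aligned")
    total = sum(cosine_distance(r, s) for r, s in zip(real_lips, synth_lips))
    return total / len(real_lips)


def is_forged(real_lips: List[Vector], synth_lips: List[Vector],
              threshold: float = 0.5) -> bool:
    """Flag a clip when the average mismatch exceeds a hypothetical threshold."""
    return mismatch_score(real_lips, synth_lips) > threshold
```

In this sketch, a clip whose visible lip motion agrees with the lip motion implied by its audio track yields a score near 0, while a clip with swapped or synthesized audio would tend to score higher. A learned classifier over the two sequences would normally replace the fixed threshold.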

Benchmarks

Benchmark | Methodology | Metrics
deepfake-detection-on-fakeavceleb-1 | Multimodal Ensemble Model | Accuracy (%): 89
deepfake-detection-on-fakeavceleb-1 | AV-Lip-Sync Model | Accuracy (%): 94
