Self-supervised Sparse Representation for Video Anomaly Detection

Tyng-Luh Liu, Chiou-Shann Fuh, Ding-Jie Chen, He-Yen Hsieh*, Jhih-Ciang Wu*

Abstract

Video anomaly detection (VAD) aims to localize unexpected actions or activities in a video sequence. Existing mainstream VAD techniques are based on either the one-class formulation, which assumes all training data are normal, or the weakly-supervised formulation, which requires only video-level normal/anomaly labels. To establish a unified approach to the two VAD settings, we introduce a self-supervised sparse representation (S3R) framework that models the concept of anomaly at the feature level by exploring the synergy between dictionary-based representation and self-supervised learning. With the learned dictionary, S3R facilitates two coupled modules, en-Normal and de-Normal, to reconstruct snippet-level features and filter out normal-event features. The self-supervised techniques also enable generating pseudo normal/anomaly samples to train the anomaly detector. We demonstrate with extensive experiments that S3R achieves new state-of-the-art performance on popular benchmark datasets for both one-class and weakly-supervised VAD tasks. Our code is publicly available at https://github.com/louisYen/S3R.
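
The idea of reconstructing snippet-level features from a dictionary and then filtering out the normal-event part can be illustrated with a minimal sketch. This is not the S3R implementation: the abstract does not specify how en-Normal and de-Normal are built, so the sketch swaps in a plain least-squares projection onto the dictionary atoms in place of sparse coding, and the function names `en_normal`, `de_normal`, and `anomaly_scores`, the array shapes, and the synthetic data are all assumptions made for illustration.

```python
import numpy as np

def en_normal(features, dictionary):
    """Reconstruct each snippet feature from the dictionary atoms.

    Assumption: a least-squares fit stands in for sparse coding.
    features: (T, d) snippet features; dictionary: (K, d) atoms.
    """
    # coeffs has shape (T, K): one weight per atom for each snippet.
    coeffs = features @ np.linalg.pinv(dictionary)
    return coeffs @ dictionary

def de_normal(features, reconstruction):
    """Filter out the reconstructed normal-event part, keeping the residual."""
    return features - reconstruction

def anomaly_scores(features, dictionary):
    """Score each snippet by the norm of what the dictionary cannot explain."""
    residual = de_normal(features, en_normal(features, dictionary))
    return np.linalg.norm(residual, axis=1)

# Synthetic data (hypothetical): 4 atoms in a 16-dimensional feature space.
rng = np.random.default_rng(0)
dictionary = rng.normal(size=(4, 16))
# Normal snippets are exact combinations of the atoms.
normal = rng.normal(size=(5, 4)) @ dictionary
# The anomalous snippet adds a component outside the dictionary's span.
anomalous = normal[:1] + 3.0 * rng.normal(size=(1, 16))
snippets = np.vstack([normal, anomalous])

scores = anomaly_scores(snippets, dictionary)
print(np.round(scores, 3))
```

In this toy setting the five normal snippets are reconstructed almost exactly, so their scores are near zero, while the last snippet keeps a large residual. A learned dictionary and a trained detector, as described in the abstract, would replace both the random atoms and the residual-norm score.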

Benchmarks

Benchmark	Methodology	Metrics
anomaly-detection-in-surveillance-videos-on	S3R	AUC-ROC: 85.99
anomaly-detection-in-surveillance-videos-on-1	S3R	AUC-ROC: 97.48
anomaly-detection-in-surveillance-videos-on-2	S3R (without audio information)	AP: 80.26
weakly-supervised-video-anomaly-detection-on	S3R	AUC-ROC: 97.48
