Spatial-Temporal Correlation and Topology Learning for Person Re-Identification in Videos
Jiawei Liu; Zheng-Jun Zha; Wei Wu; Kecheng Zheng; Qibin Sun

Abstract
Video-based person re-identification aims to match pedestrians from video sequences across non-overlapping camera views. The key factor for video person re-identification is to effectively exploit both spatial and temporal clues from video sequences. In this work, we propose a novel Spatial-Temporal Correlation and Topology Learning framework (CTL) to pursue discriminative and robust representations by modeling cross-scale spatial-temporal correlation. Specifically, CTL utilizes a CNN backbone and a key-points estimator to extract semantic local features from the human body at multiple granularities as graph nodes. It explores a context-reinforced topology to construct multi-scale graphs by considering both global contextual information and the physical connections of the human body. Moreover, a 3D graph convolution and a cross-scale graph convolution are designed, which facilitate direct cross-spacetime and cross-scale information propagation for capturing hierarchical spatial-temporal dependencies and structural information. By jointly performing the two convolutions, CTL effectively mines comprehensive clues that are complementary to appearance information to enhance representational capacity. Extensive experiments on two video benchmarks demonstrate the effectiveness of the proposed method and its state-of-the-art performance.
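To make the idea of propagating information across space and time on a body-part graph more concrete, here is a minimal NumPy sketch. It is not the authors' implementation: it uses a generic normalized graph convolution, and the way the adjacency is assembled (physical edges within a frame, same-part edges between adjacent frames, plus a cosine-similarity term standing in for global context) is an assumption made for illustration. All function names and shapes are hypothetical.

```python
import numpy as np

def normalize_adjacency(adj):
    """Symmetric normalization with self-loops: D^-1/2 (A + I) D^-1/2."""
    a_hat = adj + np.eye(adj.shape[0])
    d_inv_sqrt = 1.0 / np.sqrt(a_hat.sum(axis=1))  # degrees are >= 1 due to self-loops
    return a_hat * d_inv_sqrt[:, None] * d_inv_sqrt[None, :]

def build_spacetime_adjacency(physical_adj, features, num_frames):
    """Assemble one graph over all body-part nodes of all frames.

    physical_adj: (N, N) skeleton connections between N body parts (assumed given).
    features:     (num_frames * N, C) node features, ordered frame by frame.
    """
    n = physical_adj.shape[0]
    # Physical edges between body parts inside each frame (block diagonal).
    spatial = np.kron(np.eye(num_frames), physical_adj)
    # Temporal edges linking the same body part in adjacent frames.
    shift = np.eye(num_frames, k=1) + np.eye(num_frames, k=-1)
    temporal = np.kron(shift, np.eye(n))
    # Assumed stand-in for global context: non-negative cosine similarity.
    unit = features / (np.linalg.norm(features, axis=1, keepdims=True) + 1e-8)
    context = np.clip(unit @ unit.T, 0.0, None)
    np.fill_diagonal(context, 0.0)
    return spatial + temporal + context

def graph_conv(features, adj, weight):
    """One propagation step: ReLU(A_norm @ X @ W)."""
    return np.maximum(normalize_adjacency(adj) @ features @ weight, 0.0)

# Toy usage: 2 frames, 3 body parts in a chain (e.g. head-torso-legs), 4-dim features.
rng = np.random.default_rng(0)
physical = np.array([[0, 1, 0], [1, 0, 1], [0, 1, 0]], dtype=float)
feats = rng.normal(size=(6, 4))
adj = build_spacetime_adjacency(physical, feats, num_frames=2)
out = graph_conv(feats, adj, rng.normal(size=(4, 2)))
```

Because the spatial, temporal, and context edges live in a single adjacency matrix, one propagation step lets a node receive information from other body parts and other frames at once. A cross-scale variant would additionally connect nodes of graphs built at different granularities, which this sketch does not cover.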
Benchmarks
| Benchmark | Methodology | Metrics |
|---|---|---|
| video-deinterlacing-on-msu-deinterlacer | ST-Deint | FPS on CPU: 2.7; PSNR: 40.869; SSIM: 0.964; Subjective: 0.550; VMAF: 94.36 |