Command Palette
Search for a command to run...
Xinshun Wang Qiongjie Cui Chen Chen Shen Zhao Mengyuan Liu

Abstract
In recent years, Graph Convolutional Networks (GCNs) have been widely used in human motion prediction, but their performance remains unsatisfactory. Recently, MLP-Mixer, initially developed for vision tasks, has been leveraged into human motion prediction as a promising alternative to GCNs, which achieves both better performance and better efficiency than GCNs. Unlike GCNs, which can explicitly capture human skeleton's bone-joint structure by representing it as a graph with edges and nodes, MLP-Mixer relies on fully connected layers and thus cannot explicitly model such graph-like structure of human's. To break this limitation of MLP-Mixer's, we propose \textit{Graph-Guided Mixer}, a novel approach that equips the original MLP-Mixer architecture with the capability to model graph structure. By incorporating graph guidance, our \textit{Graph-Guided Mixer} can effectively capture and utilize the specific connectivity patterns within human skeleton's graph representation. In this paper, first we uncover a theoretical connection between MLP-Mixer and GCN that is unexplored in existing research. Building on this theoretical connection, next we present our proposed \textit{Graph-Guided Mixer}, explaining how the original MLP-Mixer architecture is reinvented to incorporate guidance from graph structure. Then we conduct an extensive evaluation on the Human3.6M, AMASS, and 3DPW datasets, which shows that our method achieves state-of-the-art performance.
Benchmarks
| Benchmark | Methodology | Metrics |
|---|---|---|
| human-pose-forecasting-on-human36m | GraphMixer | Average MPJPE (mm) @ 1000 ms: 108.6 Average MPJPE (mm) @ 400ms: 56.7 |
Build AI with AI
From idea to launch — accelerate your AI development with free AI co-coding, out-of-the-box environment and best price of GPUs.