| Ours + ResNeXt101 BERT | 84.53 | Pose and Joint-Aware Action Recognition | |
| OmniSource (SlowOnly-8x8-R101-RGB + I3D Flow) | 83.8 | Omni-sourced Webly-supervised Learning for Video Recognition | |
| PERF-Net (distilled S3D-G) | 83.2 | PERF-Net: Pose Empowered RGB-Flow Net | - |
| CCS + TSN (ImageNet+Kinetics pretrained) | 81.9 | Cooperative Cross-Stream Network for Discriminative Action Representation | - |
| RepFlow-50 ([2+1]D CNN, FcF, Non-local block) | 81.1 | Representation Flow for Action Recognition | |
| MARS+RGB+Flow (64 frames, Kinetics pretrained) | 80.9 | MARS: Motion-Augmented RGB Stream for Action Recognition | - |
| Two-Stream I3D (ImageNet+Kinetics pretrained) | 80.7 | Quo Vadis, Action Recognition? A New Model and the Kinetics Dataset | |