
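Abstract
We present SlowFast networks for video recognition. The model involves (i) a Slow pathway, operating at a low frame rate, to capture spatial semantics, and (ii) a Fast pathway, operating at a high frame rate, to capture motion at fine temporal resolution. The Fast pathway can be made very lightweight by reducing its channel capacity, yet it can still learn temporal information that is useful for video recognition. Our models achieve strong performance on both action classification and action detection in video, and large improvements are pinpointed as contributions of the SlowFast concept. We report state-of-the-art accuracy on the major video recognition benchmarks Kinetics, Charades, and AVA. Code is available at: https://github.com/facebookresearch/SlowFast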
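The two-pathway idea in the abstract can be illustrated with a small sketch of how each pathway might pick frames from the same clip and how the Fast pathway's width might be reduced. This is a minimal sketch under stated assumptions, not the released implementation: the function names and the parameters `slow_stride`, `alpha` (how many times denser the Fast pathway samples), and `beta` (Fast-to-Slow channel ratio) are illustrative names chosen here, and the example values are arbitrary.

```python
def sample_pathway_indices(num_frames, slow_stride, alpha):
    """Return (slow_indices, fast_indices) for a clip of num_frames raw frames.

    slow_stride: temporal stride of the Slow pathway (hypothetical name).
    alpha: how many times denser the Fast pathway samples (hypothetical name).
    """
    if slow_stride % alpha != 0:
        raise ValueError("slow_stride must be divisible by alpha")
    fast_stride = slow_stride // alpha
    # Slow pathway: few frames, low frame rate.
    slow = list(range(0, num_frames, slow_stride))
    # Fast pathway: alpha times more frames from the same clip.
    fast = list(range(0, num_frames, fast_stride))
    return slow, fast


def fast_pathway_channels(slow_channels, beta):
    """Channel width of the Fast pathway as a fraction beta of the Slow width."""
    return max(1, int(slow_channels * beta))


# Illustrative values only (assumptions, not taken from the text above).
slow_idx, fast_idx = sample_pathway_indices(num_frames=64, slow_stride=16, alpha=8)
print(len(slow_idx), len(fast_idx))
print(fast_pathway_channels(slow_channels=64, beta=1 / 8))
```

With these illustrative values the Fast pathway sees many more frames than the Slow pathway but uses far fewer channels, which is the sense in which it can stay lightweight.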
Code repositories
youngjun0627/movie-rating
pytorch
Mentioned on GitHub
facebookresearch/SlowFast
Official
pytorch
Mentioned on GitHub
LukasHedegaard/co3d
pytorch
Mentioned on GitHub
chaitanyadwivedii/3D-Attention-is-All-You-Need
pytorch
Mentioned on GitHub
jihun-kr/SlowFast
pytorch
Mentioned on GitHub
wangxiang1230/SSTAP
pytorch
Mentioned on GitHub
Ysp9714/SlowFastNet-keras
tf
Mentioned on GitHub
tianfr/mononerf
pytorch
Mentioned on GitHub
tonysy/PyAction
pytorch
Mentioned on GitHub
open-mmlab/mmaction2
pytorch
tianfr/semantic-flow
pytorch
Mentioned on GitHub
Benchmarks
| Benchmark | Method | Metrics |
|---|---|---|
| action-classification-on-charades | SlowFast (Kinetics-600 pretraining, NL) | mAP: 45.2 |
| action-classification-on-charades | SlowFast (Kinetics-600 pretraining) | mAP: 42.1 |
| action-classification-on-charades | SlowFast (Kinetics-400 pretraining, NL) | mAP: 42.5 |
| action-classification-on-kinetics-400 | SlowFast 16x8 (ResNet-101) | Acc@1: 78.9 Acc@5: 93.5 |
| action-classification-on-kinetics-400 | SlowFast 16x8 (ResNet-101 + NL) | Acc@1: 79.8 Acc@5: 93.9 |
| action-classification-on-kinetics-400 | SlowFast 4x16 (ResNet-50) | Acc@1: 75.6 Acc@5: 92.1 |
| action-classification-on-kinetics-400 | SlowFast 8x8 (ResNet-101) | Acc@1: 77.9 Acc@5: 93.2 |
| action-classification-on-kinetics-400 | SlowFast 8x8 (ResNet-50) | Acc@1: 77 Acc@5: 92.6 |
| action-classification-on-kinetics-600 | SlowFast 8x8 (ResNet-50) | Top-1 Accuracy: 79.9 Top-5 Accuracy: 94.5 |
| action-classification-on-kinetics-600 | SlowFast 16x8 (ResNet-101 + NL) | Top-1 Accuracy: 81.8 Top-5 Accuracy: 95.1 |
| action-classification-on-kinetics-600 | SlowFast 16x8 (ResNet-101) | Top-1 Accuracy: 81.1 Top-5 Accuracy: 95.1 |
| action-classification-on-kinetics-600 | SlowFast 8x8 (ResNet-101) | Top-1 Accuracy: 80.4 Top-5 Accuracy: 94.8 |
| action-classification-on-kinetics-600 | SlowFast 4x16 (ResNet-50) | Top-1 Accuracy: 78.8 Top-5 Accuracy: 94 |
| action-recognition-in-videos-on-ava-v21 | SlowFast (Kinetics-400 pretraining) | mAP (Val): 26.3 |
| action-recognition-in-videos-on-ava-v21 | SlowFast++ (Kinetics-600 pretraining, NL) | mAP (Val): 28.3 |
| action-recognition-in-videos-on-ava-v21 | SlowFast (Kinetics-600 pretraining, NL) | mAP (Val): 27.3 |
| action-recognition-in-videos-on-ava-v21 | SlowFast (Kinetics-600 pretraining) | mAP (Val): 26.8 |
| action-recognition-in-videos-on-something | SlowFast | Top-1 Accuracy: 61.7 |
| action-recognition-on-ava-v2-2 | SlowFast, 4x16, R50 (Kinetics-400 pretraining) | mAP: 21.9 |
| action-recognition-on-ava-v2-2 | SlowFast, 8x8, R101 (Kinetics-400 pretraining) | mAP: 23.8 |
| action-recognition-on-ava-v2-2 | SlowFast, 16x8 R101+NL (Kinetics-600 pretraining) | mAP: 27.5 |
| action-recognition-on-ava-v2-2 | SlowFast, 8x8 R101+NL (Kinetics-600 pretraining) | mAP: 27.1 |
| action-recognition-on-diving-48 | SlowFast | Accuracy: 77.6 |
| action-recognition-on-h2o-2-hands-and-objects | SlowFast | Actions Top-1: 77.69; Hand Pose: No; Object Label: No; Object Pose: No; RGB: Yes |
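
The Acc@1 / Acc@5 (Top-1 / Top-5 accuracy) figures in the table are top-k accuracies: the percentage of test videos whose true label is among the k highest-scoring predicted classes. The following is a minimal sketch of that computation on toy data. The function name `top_k_accuracy`, the scores, and the labels are all made up for illustration and are unrelated to the reported results.

```python
def top_k_accuracy(scores, labels, k):
    """Percentage of samples whose true label is among the k highest-scoring classes."""
    hits = 0
    for row, label in zip(scores, labels):
        # Class indices sorted from highest to lowest score.
        ranked = sorted(range(len(row)), key=lambda c: row[c], reverse=True)
        if label in ranked[:k]:
            hits += 1
    return 100.0 * hits / len(labels)


# Toy per-class scores for four samples and three classes (illustrative only).
scores = [
    [0.1, 0.7, 0.2],
    [0.5, 0.3, 0.2],
    [0.2, 0.3, 0.5],
    [0.6, 0.3, 0.1],
]
labels = [1, 1, 0, 0]

print(top_k_accuracy(scores, labels, k=1))
print(top_k_accuracy(scores, labels, k=2))
```

Top-k accuracy cannot decrease as k grows, which is why each Acc@5 value in the table is higher than the Acc@1 value in the same row.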