
SlowFast Networks for Video Recognition

Abstract

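We present SlowFast networks for video recognition. The model involves (i) a Slow pathway, operating at a low frame rate, to capture spatial semantics, and (ii) a Fast pathway, operating at a high frame rate, to capture motion at fine temporal resolution. By reducing its channel capacity, the Fast pathway can be made very lightweight while still learning temporal information that is useful for video recognition. Our models achieve strong performance on both action classification and action detection in video, and the large improvements are pinpointed as contributions of the SlowFast concept. We report state-of-the-art accuracy on the major video recognition benchmarks Kinetics, Charades, and AVA. Code is available at: https://github.com/facebookresearch/SlowFast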
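
The two-pathway idea in the abstract can be illustrated with a minimal frame-sampling sketch. It is not taken from the released code: the function names and the parameters `tau` (temporal stride of the Slow pathway), `alpha` (frame-rate ratio between the Fast and Slow pathways), and `beta` (channel ratio of the Fast pathway) are assumptions chosen for illustration, and the default values are only example settings.

```python
def slowfast_frame_indices(num_frames, tau=16, alpha=8):
    """Return (slow_indices, fast_indices) for a clip of num_frames raw frames.

    tau   -- assumed temporal stride of the Slow pathway (low frame rate)
    alpha -- assumed frame-rate ratio; the Fast pathway samples alpha times denser
    """
    if tau % alpha != 0:
        raise ValueError("this sketch requires tau to be divisible by alpha")
    slow = list(range(0, num_frames, tau))           # sparse sampling
    fast = list(range(0, num_frames, tau // alpha))  # dense sampling
    return slow, fast


def fast_channels(slow_channels, beta=1 / 8):
    """Channel width of the lightweight Fast pathway as a fraction of the Slow width."""
    return max(1, int(slow_channels * beta))


# Hypothetical 64-frame clip.
slow_idx, fast_idx = slowfast_frame_indices(num_frames=64, tau=16, alpha=8)
print(len(slow_idx), len(fast_idx))  # number of frames seen by each pathway
print(fast_channels(64))             # Fast-pathway width for a 64-channel Slow layer
```

Under these assumed settings the Fast pathway sees several times more frames than the Slow pathway but uses only a fraction of the channels, which is how it can stay lightweight.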

Benchmarks

| Benchmark | Method | Metrics |
|---|---|---|
| action-classification-on-charades | SlowFast (Kinetics-600 pretraining, NL) | mAP: 45.2 |
| action-classification-on-charades | SlowFast (Kinetics-600 pretraining) | mAP: 42.1 |
| action-classification-on-charades | SlowFast (Kinetics-400 pretraining, NL) | mAP: 42.5 |
| action-classification-on-kinetics-400 | SlowFast 16x8 (ResNet-101) | Acc@1: 78.9; Acc@5: 93.5 |
| action-classification-on-kinetics-400 | SlowFast 16x8 (ResNet-101 + NL) | Acc@1: 79.8; Acc@5: 93.9 |
| action-classification-on-kinetics-400 | SlowFast 4x16 (ResNet-50) | Acc@1: 75.6; Acc@5: 92.1 |
| action-classification-on-kinetics-400 | SlowFast 8x8 (ResNet-101) | Acc@1: 77.9; Acc@5: 93.2 |
| action-classification-on-kinetics-400 | SlowFast 8x8 (ResNet-50) | Acc@1: 77; Acc@5: 92.6 |
| action-classification-on-kinetics-600 | SlowFast 8x8 (ResNet-50) | Top-1 Accuracy: 79.9; Top-5 Accuracy: 94.5 |
| action-classification-on-kinetics-600 | SlowFast 16x8 (ResNet-101 + NL) | Top-1 Accuracy: 81.8; Top-5 Accuracy: 95.1 |
| action-classification-on-kinetics-600 | SlowFast 16x8 (ResNet-101) | Top-1 Accuracy: 81.1; Top-5 Accuracy: 95.1 |
| action-classification-on-kinetics-600 | SlowFast 8x8 (ResNet-101) | Top-1 Accuracy: 80.4; Top-5 Accuracy: 94.8 |
| action-classification-on-kinetics-600 | SlowFast 4x16 (ResNet-50) | Top-1 Accuracy: 78.8; Top-5 Accuracy: 94 |
| action-recognition-in-videos-on-ava-v21 | SlowFast (Kinetics-400 pretraining) | mAP (Val): 26.3 |
| action-recognition-in-videos-on-ava-v21 | SlowFast++ (Kinetics-600 pretraining, NL) | mAP (Val): 28.3 |
| action-recognition-in-videos-on-ava-v21 | SlowFast (Kinetics-600 pretraining, NL) | mAP (Val): 27.3 |
| action-recognition-in-videos-on-ava-v21 | SlowFast (Kinetics-600 pretraining) | mAP (Val): 26.8 |
| action-recognition-in-videos-on-something | SlowFast | Top-1 Accuracy: 61.7 |
| action-recognition-on-ava-v2-2 | SlowFast, 4x16, R50 (Kinetics-400 pretraining) | mAP: 21.9 |
| action-recognition-on-ava-v2-2 | SlowFast, 8x8, R101 (Kinetics-400 pretraining) | mAP: 23.8 |
| action-recognition-on-ava-v2-2 | SlowFast, 16x8 R101+NL (Kinetics-600 pretraining) | mAP: 27.5 |
| action-recognition-on-ava-v2-2 | SlowFast, 8x8 R101+NL (Kinetics-600 pretraining) | mAP: 27.1 |
| action-recognition-on-diving-48 | SlowFast | Accuracy: 77.6 |
| action-recognition-on-h2o-2-hands-and-objects | SlowFast | Actions Top-1: 77.69; Hand Pose: No; Object Label: No; Object Pose: No; RGB: Yes |
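
For readers unfamiliar with the Acc@1 / Acc@5 (top-1 / top-5) figures reported above, the following is a minimal sketch of how a top-k accuracy is typically computed. The helper `top_k_accuracy` and the toy scores are hypothetical and are not benchmark data; k = 2 is used only because the toy example has four classes.

```python
def top_k_accuracy(scores, labels, k):
    """Percentage of samples whose true label is among the k highest-scoring classes."""
    hits = 0
    for row, label in zip(scores, labels):
        # Class indices sorted from highest to lowest score.
        ranked = sorted(range(len(row)), key=lambda c: row[c], reverse=True)
        if label in ranked[:k]:
            hits += 1
    return 100.0 * hits / len(labels)


# Made-up class scores for 3 clips over 4 classes (illustrative only).
scores = [
    [0.1, 0.6, 0.2, 0.1],  # true class 1: ranked first
    [0.5, 0.1, 0.3, 0.1],  # true class 2: ranked second
    [0.4, 0.3, 0.2, 0.1],  # true class 3: ranked last
]
labels = [1, 2, 3]

acc1 = top_k_accuracy(scores, labels, k=1)
acc2 = top_k_accuracy(scores, labels, k=2)
print(f"Acc@1: {acc1:.1f}  Acc@2: {acc2:.1f}")
```

Top-k accuracy can never be lower than top-1 accuracy on the same predictions, which is consistent with every Acc@5 value in the table exceeding its Acc@1 counterpart.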