Zero Shot Action Recognition On Kinetics

Evaluation Metrics

Top-1 Accuracy
Top-5 Accuracy
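
As a rough illustration of the two metrics above, the sketch below computes top-k accuracy from per-class scores. The function name `top_k_accuracy`, the toy scores, and the labels are all hypothetical and are not taken from any paper listed on this page; each paper's own evaluation protocol (class splits, number of sampled clips) may differ.

```python
def top_k_accuracy(scores, labels, k=1):
    """Percentage of samples whose true label is among the k highest-scoring classes."""
    if len(scores) != len(labels):
        raise ValueError("scores and labels must have the same length")
    if not scores:
        raise ValueError("at least one sample is required")
    hits = 0
    for row, label in zip(scores, labels):
        # Indices of the k classes with the highest scores for this sample.
        top_k = sorted(range(len(row)), key=lambda c: row[c], reverse=True)[:k]
        hits += label in top_k
    return 100.0 * hits / len(labels)


# Made-up scores for 4 video clips over 3 action classes (illustration only).
scores = [
    [0.7, 0.2, 0.1],
    [0.1, 0.5, 0.4],
    [0.6, 0.3, 0.1],
    [0.2, 0.1, 0.7],
]
labels = [0, 2, 2, 2]

top1 = top_k_accuracy(scores, labels, k=1)
top2 = top_k_accuracy(scores, labels, k=2)
print(f"top-1: {top1:.1f}%  top-2: {top2:.1f}%")
```

Top-5 accuracy is the same computation with `k=5`; the toy example uses `k=2` only because it has three classes.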

Evaluation Results

Performance of each model on this benchmark

| Model | Top-1 Accuracy (%) | Top-5 Accuracy (%) | Paper Title | Repository |
| --- | --- | --- | --- | --- |
| TC-CLIP | 78.1 | 95.7 | Leveraging Temporal Contextualization for Video Action Recognition | |
| IMP-MoE-L | 76.8 | - | Alternating Gradient Descent and Mixture-of-Experts for Integrated Multimodal Perception | - |
| OST | 75.1 | 94.6 | OST: Refining Text Knowledge with Optimal Spatio-Temporal Descriptor for General Video Recognition | |
| MAXI | 71.6 | - | MAtch, eXpand and Improve: Unsupervised Finetuning for Zero-Shot Action Recognition with Language Knowledge | |
| OTI(ViT-L/14) | 70.6 | - | Orthogonal Temporal Interpolation for Zero-Shot Video Recognition | |
| VideoCoCa | 70.1 | 88.9 | VideoCoCa: Video-Text Modeling with Zero-Shot Transfer from Contrastive Captioners | - |
| Text4Vis | 68.9 | 90.3 | Revisiting Classifier: Transferring Vision-Language Models for Video Recognition | |
| BIKE | 68.5 | 91.1 | Bidirectional Cross-Modal Knowledge Exploration for Video Recognition with Pre-trained Vision-Language Models | |
| X-CLIP | 65.2 | 86.1 | Expanding Language-Image Pretrained Models for General Video Recognition | |
| LanguageBind | 64.1 | 85.7 | LanguageBind: Extending Video-Language Pretraining to N-modality by Language-based Semantic Alignment | |
| LoCATe-GAT | 58.7 | - | LoCATe-GAT: Modeling Multi-Scale Local Context and Action Relationships for Zero-Shot Action Recognition | - |
| JigsawNet | 45.9 | 78.8 | Rethinking Zero-shot Action Recognition: Learning from Latent Atomic Actions | - |
| ER-ZSAR (ST+Obj) | 42.1 | 73.1 | Elaborative Rehearsal for Zero-shot Action Recognition | |
| ER-ZSAR (ST) | 37.1 | 69.3 | Elaborative Rehearsal for Zero-shot Action Recognition | |
| DEVISE | 23.8 | 51.0 | DeViSE: A Deep Visual-Semantic Embedding Model | - |
| DEM | 23.6 | 49.5 | Learning a Deep Embedding Model for Zero-Shot Learning | |
| ALE | 23.4 | 50.3 | Label-Embedding for Image Classification | |
| ESZSL | 22.9 | 48.3 | An embarrassingly simple approach to zero-shot learning | - |
| GCN | 22.3 | 49.7 | All About Knowledge Graphs for Actions | - |
| SJE(Word Embedding) | 22.3 | 48.2 | Evaluation of Output Embeddings for Fine-Grained Image Classification | |