Audio Classification on ESC-50

Evaluation Metrics

Top-1 Accuracy
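
Top-1 accuracy is the share of evaluated clips for which the model's highest-scoring class matches the ground-truth label, expressed here as a percentage. The following is a minimal sketch of that computation, not the evaluation code of any listed paper. The function name `top1_accuracy` and the toy scores are illustrative assumptions, and ties are resolved by taking the first maximal index.

```python
from typing import Sequence


def top1_accuracy(scores: Sequence[Sequence[float]], labels: Sequence[int]) -> float:
    """Return the percentage of examples whose highest-scoring class equals the label."""
    if len(scores) != len(labels):
        raise ValueError("scores and labels must have the same length")
    if not labels:
        raise ValueError("at least one example is required")
    correct = 0
    for row, label in zip(scores, labels):
        # argmax over class scores; the first index wins on ties
        predicted = max(range(len(row)), key=row.__getitem__)
        correct += int(predicted == label)
    return 100.0 * correct / len(labels)


# Toy example: 4 clips, 3 classes (made-up numbers, not ESC-50 data).
toy_scores = [
    [0.1, 0.7, 0.2],  # predicted class 1
    [0.8, 0.1, 0.1],  # predicted class 0
    [0.3, 0.3, 0.4],  # predicted class 2
    [0.2, 0.5, 0.3],  # predicted class 1
]
toy_labels = [1, 0, 2, 2]
accuracy = top1_accuracy(toy_scores, toy_labels)
print(f"{accuracy:.2f}")  # prints 75.00 (3 of 4 correct)
```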

Evaluation Results

Performance of each model on this benchmark

| Model | Top-1 Accuracy | Paper Title | Repository |
| --- | --- | --- | --- |
| OmniVec2 | 99.1 | OmniVec2 - A Novel Transformer based Network for Large Scale Multimodal and Multitask Learning | - |
| InternVideo2 | 98.6 | InternVideo2: Scaling Foundation Models for Multimodal Video Understanding | |
| OmniVec | 98.4 | OmniVec: Learning robust representations with cross modal sharing | - |
| BEATs | 98.1 | BEATs: Audio Pre-Training with Acoustic Tokenizers | |
| mn40_as | 97.45 | Efficient Large-scale Audio Tagging via Transformer-to-CNN Knowledge Distillation | |
| M2D-CLAP/0.7 | 97.4 | M2D-CLAP: Masked Modeling Duo Meets CLAP for Learning General-purpose Audio-Language Representation | |
| DyMN-L | 97.4 | Dynamic Convolutional Neural Networks as Efficient Pre-trained Audio Models | |
| M2D-AS/0.7 | 97.2 | Masked Modeling Duo: Towards a Universal Audio Pre-training Framework | |
| HTS-AT | 97.0 | HTS-AT: A Hierarchical Token-Semantic Audio Transformer for Sound Classification and Detection | |
| EAT-M | 96.3 | End-to-End Audio Strikes Back: Boosting Augmentations Towards An Efficient Audio Classification Network | |
| LHGNN | 96.2 | LHGNN: Local-Higher Order Graph Neural Networks For Audio Classification and Tagging | - |
| ERANN-2-5 | 96.1 | ERANNs: Efficient Residual Audio Neural Networks for Audio Pattern Recognition | - |
| M2D/0.7 | 96.0 | Masked Modeling Duo: Towards a Universal Audio Pre-training Framework | |
| EAT | 96.0 | EAT: Self-Supervised Pre-Training with Efficient Audio Transformer | |
| Audio Spectrogram Transformer | 95.7 | AST: Audio Spectrogram Transformer | |
| EAT-S | 95.25 | End-to-End Audio Strikes Back: Boosting Augmentations Towards An Efficient Audio Classification Network | |
| EAT-S (scratch) | 92.15 | End-to-End Audio Strikes Back: Boosting Augmentations Towards An Efficient Audio Classification Network | |
| SepTr + LeRaC | 91.58 | Learning Rate Curriculum | |
| SepTr | 91.13 | SepTr: Separable Transformer for Audio Spectrogram Processing | |
| Multi-Format Contrastive | 90.5 | Multi-Format Contrastive Learning of Audio Representations | - |