HyperAI
HyperAI超神经
Audio Classification on ESC-50
Evaluation metric
Top-1 Accuracy
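As an illustration of the metric, here is a minimal sketch of how a top-1 accuracy percentage can be computed from per-class scores. The function name `top1_accuracy` and the toy scores are hypothetical, and this page does not describe the evaluation protocol used by any listed model, so treat this as a generic definition rather than the benchmark's official scoring code.

```python
def top1_accuracy(scores, labels):
    """Return the percentage of samples whose highest-scoring class is the true label.

    scores: list of per-sample score lists, one score per class (hypothetical input format).
    labels: list of integer class indices, one per sample.
    """
    if len(scores) != len(labels) or not labels:
        raise ValueError("scores and labels must be non-empty and of equal length")
    correct = 0
    for row, label in zip(scores, labels):
        # The top-1 prediction is the index of the largest score in the row.
        predicted = max(range(len(row)), key=row.__getitem__)
        correct += int(predicted == label)
    return 100.0 * correct / len(labels)


# Toy example with 4 clips and 3 classes (made-up numbers, not benchmark data).
scores = [
    [0.1, 0.7, 0.2],
    [0.6, 0.3, 0.1],
    [0.2, 0.2, 0.6],
    [0.5, 0.4, 0.1],
]
labels = [1, 0, 2, 1]
print(top1_accuracy(scores, labels))  # 75.0 (3 of 4 clips predicted correctly)
```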
Results
Performance of each model on this benchmark
| Model | Top-1 Accuracy (%) | Paper Title | Repository |
| --- | --- | --- | --- |
| OmniVec2 | 99.1 | OmniVec2 - A Novel Transformer based Network for Large Scale Multimodal and Multitask Learning | - |
| InternVideo2 | 98.6 | InternVideo2: Scaling Foundation Models for Multimodal Video Understanding | |
| OmniVec | 98.4 | OmniVec: Learning robust representations with cross modal sharing | - |
| BEATs | 98.1 | BEATs: Audio Pre-Training with Acoustic Tokenizers | |
| mn40_as | 97.45 | Efficient Large-scale Audio Tagging via Transformer-to-CNN Knowledge Distillation | |
| M2D-CLAP/0.7 | 97.4 | M2D-CLAP: Masked Modeling Duo Meets CLAP for Learning General-purpose Audio-Language Representation | |
| DyMN-L | 97.4 | Dynamic Convolutional Neural Networks as Efficient Pre-trained Audio Models | |
| M2D-AS/0.7 | 97.2 | Masked Modeling Duo: Towards a Universal Audio Pre-training Framework | |
| HTS-AT | 97.0 | HTS-AT: A Hierarchical Token-Semantic Audio Transformer for Sound Classification and Detection | |
| EAT-M | 96.3 | End-to-End Audio Strikes Back: Boosting Augmentations Towards An Efficient Audio Classification Network | |
| LHGNN | 96.2 | LHGNN: Local-Higher Order Graph Neural Networks For Audio Classification and Tagging | - |
| ERANN-2-5 | 96.1 | ERANNs: Efficient Residual Audio Neural Networks for Audio Pattern Recognition | - |
| M2D/0.7 | 96.0 | Masked Modeling Duo: Towards a Universal Audio Pre-training Framework | |
| EAT | 96.0 | EAT: Self-Supervised Pre-Training with Efficient Audio Transformer | |
| Audio Spectrogram Transformer | 95.7 | AST: Audio Spectrogram Transformer | |
| EAT-S | 95.25 | End-to-End Audio Strikes Back: Boosting Augmentations Towards An Efficient Audio Classification Network | |
| EAT-S (scratch) | 92.15 | End-to-End Audio Strikes Back: Boosting Augmentations Towards An Efficient Audio Classification Network | |
| SepTr + LeRaC | 91.58 | Learning Rate Curriculum | |
| SepTr | 91.13 | SepTr: Separable Transformer for Audio Spectrogram Processing | |
| Multi-Format Contrastive | 90.5 | Multi-Format Contrastive Learning of Audio Representations | - |
Showing the first 20 of 27 entries.