Audio-Visual Active Speaker Detection on AVA
Evaluation metric
validation mean average precision
Evaluation results
Performance of each model on this benchmark
| Model name | Validation mean average precision | Paper title | Repository |
| --- | --- | --- | --- |
| LoCoNet+TalkNCE | 95.5% | TalkNCE: Improving Active Speaker Detection with Talk-Aware Contrastive Learning | |
| LoCoNet + Laser | 95.3% | LASER: Lip Landmark Assisted Speaker Detection for Robustness | |
| LoCoNet | 95.2% | LoCoNet: Long-Short Context Network for Active Speaker Detection | |
| SPELL+ | 94.9% | Learning Long-Term Spatial-Temporal Graphs for Active Speaker Detection | |
| UniCon+ | 94.5% | UniCon+: ICTCAS-UCAS Submission to the AVA-ActiveSpeaker Task at ActivityNet Challenge 2022 | - |
| SPELL | 94.2% | Learning Long-Term Spatial-Temporal Graphs for Active Speaker Detection | |
| EASEE-50 | 94.1% | End-to-End Active Speaker Detection | |
| Light-ASD | 94.1% | A Light Weight Model for Active Speaker Detection | |
| Extended UniCon | 93.6% | ICTCAS-UCAS-TAL Submission to the AVA-ActiveSpeaker Task at ActivityNet Challenge 2021 | - |
| ASDNet | 93.5% | How to Design a Three-Stage Architecture for Audio-Visual Active Speaker Detection in the Wild | |
| GSCMIA | 92.86% | Audio-Visual Activity Guided Cross-Modal Identity Association for Active Speaker Detection | |
| TalkNet | 92.3% | NUS-HLT Report for ActivityNet Challenge 2021 AVA (Speaker) | - |
| UniCon | 92.0% | UniCon: Unified Context Network for Robust Active Speaker Detection | - |
| SA-uncertainty Fusion | 91.9% | Active Speaker Detection as a Multi-Objective Optimization with Uncertainty-based Multimodal Fusion | - |
| VTP (visual only) | 89.2% | Sub-word Level Lip Reading With Visual Attention | - |
| MAAS-TAN | 88.8% | MAAS: Multi-modal Assignation for Active Speaker Detection | |
| VGG-{LSTM+TCN} (ensemble) | 87.8% | Naver at ActivityNet Challenge 2019 -- Task B Active Speaker Detection (AVA) | - |
| Active Speakers in Context | 87.1% | Active Speakers in Context | |
| MAAS-LAN | 85.1% | MAAS: Multi-modal Assignation for Active Speaker Detection | |
| 3D-ResNet-GRU | 84.0% | Multi-Task Learning for Audio Visual Active Speaker Detection | - |