Video Captioning on TVC
Evaluation Metrics
BLEU-4
CIDEr
Evaluation Results
Performance of each model on this benchmark
| Model | BLEU-4 | CIDEr | Paper Title | Repository |
|---|---|---|---|---|
| VAST | 19.9 | 74.1 | VAST: A Vision-Audio-Subtitle-Text Omni-Modality Foundation Model and Dataset | |
| COSA | 18.8 | 70.7 | COSA: Concatenated Sample Pretrained Vision-Language Foundation Model | |