Text-to-Video Generation on MSR-VTT
Evaluation Metrics
CLIPSIM
FID
FVD
Results
Performance of each model on this benchmark
| Model | CLIPSIM | FID | FVD | Paper Title | Repository |
|---|---|---|---|---|---|
| PixelDance | 0.3125 | - | 381 | Make Pixels Dance: High-Dynamic Video Generation | - |
| VideoPoet | 0.3123 | - | 213 | VideoPoet: A Large Language Model for Zero-Shot Video Generation | - |
| Show-1 | 0.3072 | 13.08 | 538 | Show-1: Marrying Pixel and Latent Diffusion Models for Text-to-Video Generation | |
| Make-A-Video | 0.3049 | 13.17 | - | Make-A-Video: Text-to-Video Generation without Text-Video Data | |
| Video-LaVIT | 0.3012 | 11.27 | 188.36 | Video-LaVIT: Unified Video-Language Pre-training with Decoupled Visual-Motional Tokenization | |
| TF-T2V | 0.2991 | 8.19 | 441 | A Recipe for Scaling up Text-to-Video Generation with Text-free Videos | |
| HiGen | 0.2947 | 8.60 | 406 | Hierarchical Spatio-temporal Decoupling for Text-to-Video Generation | |
| VideoComposer | 0.2932 | - | 580 | VideoComposer: Compositional Video Synthesis with Motion Controllability | |
| ModelScopeT2V | 0.2930 | 11.09 | 550 | ModelScope Text-to-Video Technical Report | |
| Video LDM | 0.2929 | - | - | Align your Latents: High-Resolution Video Synthesis with Latent Diffusion Models | |
| Snap Video (512×288) | 0.2793 | - | 104.0 | Snap Video: Scaled Spatiotemporal Transformers for Text-to-Video Synthesis | - |
| Snap Video (288×288) | 0.2793 | - | 110.4 | Snap Video: Scaled Spatiotemporal Transformers for Text-to-Video Synthesis | - |
| MMVG | 0.2644 | 23.4 | - | Tell Me What Happened: Unifying Text-guided Video Completion via Multimodal Masked Video Generation | |
| CogVideo (English) | 0.2631 | 23.59 | - | Make-A-Video: Text-to-Video Generation without Text-Video Data | |
| CogVideo (Chinese) | 0.2614 | - | - | Align your Latents: High-Resolution Video Synthesis with Latent Diffusion Models | |
| NUWA | 0.2439 | 47.68 | - | NÜWA: Visual Synthesis Pre-training for Neural visUal World creAtion | |
| GODIVA | 0.2402 | - | - | GODIVA: Generating Open-DomaIn Videos from nAtural Descriptions | |
| MagicVideo | - | 36.5 | 998 | MagicVideo: Efficient Video Generation With Latent Diffusion Models | - |
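The three metrics in the results above point in different directions: by their standard definitions, a higher CLIPSIM is better, while lower FID and FVD are better. The sketch below shows one way to rank models on a single metric while skipping entries the leaderboard reports as "-". It is only an illustration: `ROWS`, `METRICS`, and `rank_by` are hypothetical names, and `ROWS` holds a subset of the leaderboard values rather than the full list.

```python
from typing import Optional

# (model, CLIPSIM, FID, FVD); None marks a value the leaderboard shows as "-".
# Only a subset of the leaderboard rows is copied here, for illustration.
ROWS: list[tuple[str, Optional[float], Optional[float], Optional[float]]] = [
    ("PixelDance", 0.3125, None, 381.0),
    ("VideoPoet", 0.3123, None, 213.0),
    ("Show-1", 0.3072, 13.08, 538.0),
    ("Video-LaVIT", 0.3012, 11.27, 188.36),
    ("TF-T2V", 0.2991, 8.19, 441.0),
    ("Snap Video (512×288)", 0.2793, None, 104.0),
    ("MagicVideo", None, 36.5, 998.0),
]

# Column index in ROWS and whether a larger value is better for each metric.
METRICS = {"CLIPSIM": (1, True), "FID": (2, False), "FVD": (3, False)}


def rank_by(metric: str) -> list[str]:
    """Return model names ordered best-first on one metric, skipping missing values."""
    col, higher_is_better = METRICS[metric]
    scored = [(row[col], row[0]) for row in ROWS if row[col] is not None]
    scored.sort(key=lambda pair: pair[0], reverse=higher_is_better)
    return [name for _, name in scored]


if __name__ == "__main__":
    for metric in METRICS:
        print(metric, rank_by(metric))
```

Ranking per metric rather than combining them avoids inventing a weighting the leaderboard does not define, and it makes visible that no single model leads on all three columns.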