Text-to-Image Generation on COCO
Evaluation metrics
FID
Inception score
Evaluation results
Performance of each model on this benchmark
| Model name | FID | Inception score | Paper Title | Repository |
|---|---|---|---|---|
| StackGAN-v1 | 74.05 | 8.45 | StackGAN++: Realistic Image Synthesis with Stacked Generative Adversarial Networks | |
| StackGAN + OP | 55.30 | 12.12 | Generating Multiple Objects at Spatially Distinct Locations | |
| L-Verse | 45.8 | - | L-Verse: Bidirectional Generation Between Image and Text | |
| L-Verse-CC | 37.2 | - | L-Verse: Bidirectional Generation Between Image and Text | |
| AttnGAN (256 x 256) | 35.2 | 23.3 | NÜWA: Visual Synthesis Pre-training for Neural visUal World creAtion | |
| AttnGAN + OP | 33.35 | 24.76 | Generating Multiple Objects at Spatially Distinct Locations | |
| DM-GAN | 32.64 | 30.49 | DM-GAN: Dynamic Memory Generative Adversarial Networks for Text-to-Image Synthesis | |
| DM-GAN + VICTR | 32.37 | 32.37 | VICTR: Visual Information Captured Text Representation for Text-to-Image Multimodal Tasks | |
| Vanilla CM3 | 29.5 | - | Retrieval-Augmented Multimodal Language Modeling | - |
| AttnGAN + VICTR | 29.26 | 28.18 | VICTR: Visual Information Captured Text Representation for Text-to-Image Multimodal Tasks | |
| DALL-E (12B) | 28 | - | Retrieval-Augmented Multimodal Language Modeling | - |
| DALL-E (256 x 256) | 27.5 | 17.9 | NÜWA: Visual Synthesis Pre-training for Neural visUal World creAtion | |
| CogView | 27.1 | 18.2 | CogView: Mastering Text-to-Image Generation via Transformers | |
| CogView (256 x 256) | 27.1 | 18.2 | NÜWA: Visual Synthesis Pre-training for Neural visUal World creAtion | |
| Lafite (zero-shot) | 26.94 | 26.02 | LAFITE: Towards Language-Free Training for Text-to-Image Generation | |
| DM-GAN (256 x 256) | 26.0 | 32.2 | NÜWA: Visual Synthesis Pre-training for Neural visUal World creAtion | |
| OP-GAN | 24.70 | 27.88 | Semantic Object Accuracy for Generative Text-to-Image Synthesis | |
| CogView2 (6B, Finetuned) | 24 | - | CogView2: Faster and Better Text-to-Image Generation via Hierarchical Transformers | |
| AttnGAN+CL | 23.93 | 25.70 | Improving Text-to-Image Synthesis Using Contrastive Learning | |
| FuseDream (k=10, 256) | 21.89 | 34.67 | FuseDream: Training-Free Text-to-Image Generation with Improved CLIP+GAN Space Optimization | |
This page lists 20 of the benchmark's 69 entries.