Text-to-Image Generation on COCO

Evaluation Metrics

FID (Fréchet Inception Distance; lower is better)
Inception score (higher is better)

Results

Performance of each model on this benchmark

| Model | FID | Inception score | Paper Title |
| --- | --- | --- | --- |
| StackGAN-v1 | 74.05 | 8.45 | StackGAN++: Realistic Image Synthesis with Stacked Generative Adversarial Networks |
| StackGAN + OP | 55.30 | 12.12 | Generating Multiple Objects at Spatially Distinct Locations |
| L-Verse | 45.8 | - | L-Verse: Bidirectional Generation Between Image and Text |
| L-Verse-CC | 37.2 | - | L-Verse: Bidirectional Generation Between Image and Text |
| AttnGAN (256 x 256) | 35.2 | 23.3 | NÜWA: Visual Synthesis Pre-training for Neural visUal World creAtion |
| AttnGAN + OP | 33.35 | 24.76 | Generating Multiple Objects at Spatially Distinct Locations |
| DM-GAN | 32.64 | 30.49 | DM-GAN: Dynamic Memory Generative Adversarial Networks for Text-to-Image Synthesis |
| DM-GAN + VICTR | 32.37 | 32.37 | VICTR: Visual Information Captured Text Representation for Text-to-Image Multimodal Tasks |
| Vanilla CM3 | 29.5 | - | Retrieval-Augmented Multimodal Language Modeling |
| AttnGAN + VICTR | 29.26 | 28.18 | VICTR: Visual Information Captured Text Representation for Text-to-Image Multimodal Tasks |
| DALL-E (12B) | 28 | - | Retrieval-Augmented Multimodal Language Modeling |
| DALL-E (256 x 256) | 27.5 | 17.9 | NÜWA: Visual Synthesis Pre-training for Neural visUal World creAtion |
| CogView | 27.1 | 18.2 | CogView: Mastering Text-to-Image Generation via Transformers |
| CogView (256 x 256) | 27.1 | 18.2 | NÜWA: Visual Synthesis Pre-training for Neural visUal World creAtion |
| Lafite (zero-shot) | 26.94 | 26.02 | LAFITE: Towards Language-Free Training for Text-to-Image Generation |
| DM-GAN (256 x 256) | 26.0 | 32.2 | NÜWA: Visual Synthesis Pre-training for Neural visUal World creAtion |
| OP-GAN | 24.70 | 27.88 | Semantic Object Accuracy for Generative Text-to-Image Synthesis |
| CogView2 (6B, Finetuned) | 24 | - | CogView2: Faster and Better Text-to-Image Generation via Hierarchical Transformers |
| AttnGAN+CL | 23.93 | 25.70 | Improving Text-to-Image Synthesis Using Contrastive Learning |
| FuseDream (k=10, 256) | 21.89 | 34.67 | FuseDream: Training-Free Text-to-Image Generation with Improved CLIP+GAN Space Optimization |