FLAVA: A Foundational Language And Vision Alignment Model

Amanpreet Singh, Ronghang Hu, Vedanuj Goswami, Guillaume Couairon, Wojciech Galuba, Marcus Rohrbach, Douwe Kiela

Abstract
State-of-the-art vision and vision-and-language models rely on large-scale visio-linguistic pretraining to obtain good performance on a variety of downstream tasks. Such models are generally either cross-modal (contrastive) or multi-modal (with earlier fusion), but not both, and they often target only specific modalities or tasks. A promising direction is a single holistic universal model, as a "foundation", that targets all modalities at once: a true vision and language foundation model should be good at vision tasks, language tasks, and cross- and multi-modal vision and language tasks. We introduce FLAVA as such a model and demonstrate impressive performance on a wide range of 35 tasks spanning these target modalities.
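To make the cross-modal versus multi-modal distinction concrete, the PyTorch sketch below pairs unimodal image and text encoders, whose pooled and projected embeddings can be compared contrastively (cross-modal), with a fusion encoder over the concatenated unimodal sequences (multi-modal, early fusion). All module names, dimensions, and the pooling choice are illustrative assumptions, not FLAVA's actual implementation.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F


def make_encoder(dim: int, num_heads: int, num_layers: int) -> nn.TransformerEncoder:
    # Plain transformer encoder; FLAVA's actual encoders differ in detail.
    layer = nn.TransformerEncoderLayer(dim, num_heads, batch_first=True)
    return nn.TransformerEncoder(layer, num_layers)


class UnifiedVisionLanguageModel(nn.Module):
    """Illustrative sketch of the architecture style the abstract describes:
    unimodal encoders usable contrastively plus a fusion encoder on top.
    Inputs are assumed to be pre-embedded patch/token sequences whose
    first position acts as a [CLS]-style summary token."""

    def __init__(self, dim: int = 768, proj_dim: int = 256,
                 num_heads: int = 8, num_layers: int = 4):
        super().__init__()
        self.image_encoder = make_encoder(dim, num_heads, num_layers)
        self.text_encoder = make_encoder(dim, num_heads, num_layers)
        self.fusion_encoder = make_encoder(dim, num_heads, num_layers)
        # Projections into a shared space for CLIP-style contrastive training.
        self.image_proj = nn.Linear(dim, proj_dim)
        self.text_proj = nn.Linear(dim, proj_dim)
        self.logit_scale = nn.Parameter(torch.tensor(2.659))  # ~ log(1/0.07)

    def forward(self, image_tokens: torch.Tensor, text_tokens: torch.Tensor):
        img = self.image_encoder(image_tokens)  # (B, Ni, D) patch states
        txt = self.text_encoder(text_tokens)    # (B, Nt, D) token states

        # Cross-modal path: pooled, projected, L2-normalized embeddings
        # compared across the batch with a learned temperature.
        img_emb = F.normalize(self.image_proj(img[:, 0]), dim=-1)
        txt_emb = F.normalize(self.text_proj(txt[:, 0]), dim=-1)
        contrastive_logits = self.logit_scale.exp() * img_emb @ txt_emb.t()

        # Multi-modal path: early fusion of both sequences, e.g. for
        # VQA-style heads that need joint reasoning over image and text.
        fused = self.fusion_encoder(torch.cat([img, txt], dim=1))
        return contrastive_logits, fused
```

The contrastive logits support retrieval-style (cross-modal) tasks, while the fused sequence feeds task heads that need both modalities at once; the sketch only shows this data flow, not FLAVA's pretraining objectives.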
Benchmarks
| Benchmark | Model | Recall@1 | Recall@5 |
|---|---|---|---|
| Image retrieval on COCO | FLAVA (zero-shot) | 38.38 | 67.47 |
| Image retrieval on COCO | CLIP (zero-shot) | 33.29 | 62.47 |
| Image-to-text retrieval on COCO | FLAVA (ViT-B, zero-shot) | 42.74 | 76.76 |
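The table reports zero-shot retrieval quality as Recall@K: the fraction of queries whose correct match appears among the top K retrieved items. Below is a minimal sketch of that metric, assuming one correct gallery item per query (COCO in practice pairs each image with several captions) and using random embeddings purely as placeholders; the function name and setup are illustrative, not the paper's evaluation code.

```python
import numpy as np


def recall_at_k(similarity: np.ndarray, k: int) -> float:
    """Given an (N_queries, N_gallery) similarity matrix where the correct
    match for query i is gallery item i, return Recall@k in percent.
    Generic retrieval-metric sketch, not the paper's evaluation code."""
    topk = np.argsort(-similarity, axis=1)[:, :k]          # top-k gallery ids per query
    targets = np.arange(similarity.shape[0])[:, None]       # ground-truth id per query
    return 100.0 * (topk == targets).any(axis=1).mean()


# Example: text-to-image retrieval from unit-normalized placeholder embeddings.
rng = np.random.default_rng(0)
text_emb = rng.standard_normal((1000, 256))
image_emb = rng.standard_normal((1000, 256))
text_emb /= np.linalg.norm(text_emb, axis=1, keepdims=True)
image_emb /= np.linalg.norm(image_emb, axis=1, keepdims=True)
sim = text_emb @ image_emb.T
print(recall_at_k(sim, 1), recall_at_k(sim, 5))
```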