14 days ago

FineVision: Open Data Is All You Need

Luis Wiedmann Orr Zohar Amir Mahla Xiaohan Wang Rui Li Thibaud Frere Leandro von Werra Aritra Roy Gosthipaty Andrés Marafioti

Abstract

The advancement of vision-language models (VLMs) is hampered by a fragmented landscape of inconsistent and contaminated public datasets. We introduce FineVision, a meticulously collected, curated, and unified corpus of 24 million samples, the largest open resource of its kind. We unify more than 200 sources into 185 subsets via a semi-automated, human-in-the-loop pipeline: automation performs bulk ingestion and schema mapping, while reviewers audit mappings and spot-check outputs to verify faithful consumption of annotations, appropriate formatting and diversity, and safety; issues trigger targeted fixes and re-runs. The workflow further applies rigorous de-duplication within and across sources and decontamination against 66 public benchmarks. FineVision also encompasses agentic/GUI tasks with a unified action space; reviewers validate schemas and inspect a sample of trajectories to confirm executable fidelity. Models trained on FineVision consistently outperform those trained on existing open mixtures across a broad evaluation suite, underscoring the benefits of scale, data hygiene, and balanced automation with human oversight. We release the corpus and curation tools to accelerate data-centric VLM research.
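
To make the de-duplication and decontamination step concrete, here is a minimal Python sketch. It is not the authors' pipeline: the abstract does not say how duplicates are detected, so this example assumes exact matching on a SHA-256 hash of the raw image bytes, which would miss resized or re-encoded copies. The function names and the sample layout (a dict with `source` and `image` keys) are hypothetical.

```python
import hashlib


def fingerprint(image_bytes: bytes) -> str:
    """Return a SHA-256 hex digest of the raw image bytes (assumed matching rule)."""
    return hashlib.sha256(image_bytes).hexdigest()


def dedup_and_decontaminate(samples, benchmark_images):
    """Drop samples whose image appears in a benchmark or earlier in the corpus.

    samples: iterable of dicts with hypothetical keys "source" and "image" (bytes).
    benchmark_images: iterable of raw image bytes from evaluation benchmarks.
    """
    # Fingerprints of benchmark images that must never appear in training data.
    blocked = {fingerprint(img) for img in benchmark_images}
    # One shared set covers duplicates both within a source and across sources.
    seen = set()
    kept = []
    for sample in samples:
        fp = fingerprint(sample["image"])
        if fp in blocked or fp in seen:
            continue
        seen.add(fp)
        kept.append(sample)
    return kept


samples = [
    {"source": "a", "image": b"cat"},
    {"source": "b", "image": b"cat"},    # cross-source duplicate
    {"source": "a", "image": b"dog"},
    {"source": "b", "image": b"bench"},  # overlaps with a benchmark image
]
kept = dedup_and_decontaminate(samples, benchmark_images=[b"bench"])
```

Because a single `seen` set is shared across all sources, the first occurrence of each image is kept and later copies are dropped regardless of which source they come from. A near-duplicate detector would replace `fingerprint` with a similarity measure and a threshold.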
