3 months ago

Franca: Nested Matryoshka Clustering for Scalable Visual Representation Learning

Shashanka Venkataramanan, Valentinos Pariza, Mohammadreza Salehi, Lukas Knobel, Spyros Gidaris, Elias Ramzi, Andrei Bursuc, Yuki M. Asano

Abstract

We present Franca (pronounced Fran-ka): free one; the first fully open-source (data, code, weights) vision foundation model that matches and in many cases surpasses the performance of state-of-the-art proprietary models, e.g., DINOv2, CLIP, SigLIPv2, etc. Our approach is grounded in a transparent training pipeline inspired by Web-SSL and uses publicly available data: ImageNet-21K and a subset of ReLAION-2B. Beyond model release, we tackle critical limitations in SSL clustering methods. While modern models rely on assigning image features to large codebooks via clustering algorithms like Sinkhorn-Knopp, they fail to account for the inherent ambiguity in clustering semantics. To address this, we introduce a parameter-efficient, multi-head clustering projector based on nested Matryoshka representations. This design progressively refines features into increasingly fine-grained clusters without increasing the model size, enabling both performance and memory efficiency. Additionally, we propose a novel positional disentanglement strategy that explicitly removes positional biases from dense representations, thereby improving the encoding of semantic content. This leads to consistent gains on several downstream benchmarks, demonstrating the utility of cleaner feature spaces. Our contributions establish a new standard for transparent, high-performance vision models and open a path toward more reproducible and generalizable foundation models for the broader AI community. The code and model checkpoints are available at https://github.com/valeoai/Franca.
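
To make the idea of nested clustering heads concrete, the following is a minimal illustrative sketch, not the authors' implementation. It assumes that each head reads only the first `dim` coordinates of a feature vector (the Matryoshka-style nesting), scores that slice against its own set of prototypes, and balances the resulting assignments with a few Sinkhorn-Knopp iterations. The function names, the temperature `eps`, the iteration count, and the pairing of slice sizes with codebook sizes are all assumptions chosen for illustration.

```python
import math
import random


def normalize(v):
    """L2-normalize a vector so dot products stay in [-1, 1]."""
    norm = math.sqrt(sum(x * x for x in v)) or 1.0
    return [x / norm for x in v]


def sinkhorn_knopp(scores, eps=0.05, n_iters=3):
    """Turn an N x K score matrix into soft assignments with balanced cluster usage."""
    n, k = len(scores), len(scores[0])
    top = max(max(row) for row in scores)  # subtract the max for numerical stability
    q = [[math.exp((s - top) / eps) for s in row] for row in scores]
    for _ in range(n_iters):
        # Column step: rescale so every cluster receives total mass n / k.
        col = [sum(q[i][j] for i in range(n)) for j in range(k)]
        q = [[q[i][j] / col[j] * (n / k) for j in range(k)] for i in range(n)]
        # Row step: rescale so every sample's assignment sums to 1.
        q = [[v / sum(row) for v in row] for row in q]
    return q


def nested_assignments(features, heads):
    """Compute one assignment matrix per head.

    heads is a list of (dim, prototypes) pairs (hypothetical layout):
    each head sees only the first `dim` feature coordinates.
    """
    out = []
    for dim, prototypes in heads:
        sliced = [normalize(f[:dim]) for f in features]
        protos = [normalize(p) for p in prototypes]
        scores = [[sum(a * b for a, b in zip(f, p)) for p in protos] for f in sliced]
        out.append(sinkhorn_knopp(scores))
    return out


if __name__ == "__main__":
    random.seed(0)
    feats = [[random.gauss(0, 1) for _ in range(8)] for _ in range(6)]
    # Assumed pairing: larger slices get larger codebooks (finer clusters).
    heads = [
        (d, [[random.gauss(0, 1) for _ in range(d)] for _ in range(k)])
        for d, k in [(2, 2), (4, 3), (8, 4)]
    ]
    for (d, _), q in zip(heads, nested_assignments(feats, heads)):
        print(f"dim={d}: {len(q)} samples x {len(q[0])} clusters")
```

Because every head slices the same feature vector, the heads add only their prototype matrices on top of the shared backbone, which is one way to read the abstract's claim of refining clusters without increasing the model size.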
