
Abstract
We introduce EmbeddingGemma, a new lightweight, open text embedding model based on the Gemma 3 language model family. Our innovative training recipe strategically captures knowledge from larger models via encoder-decoder initialization and geometric embedding distillation. We improve model robustness and expressiveness with a spread-out regularizer, and ensure generalizability by merging checkpoints from varied, optimized mixtures. Evaluated on the Massive Text Embedding Benchmark (MTEB) across multilingual, English, and code domains, EmbeddingGemma (300M) achieves state-of-the-art results. Notably, it outperforms prior top models, both proprietary and open, with fewer than 500M parameters, and provides performance comparable to models double its size, offering an exceptional performance-to-cost ratio. Remarkably, this lead persists when quantizing model weights or truncating embedding outputs. This makes EmbeddingGemma particularly well-suited for low-latency and high-throughput use cases such as on-device applications. We provide ablation studies exploring our key design choices. We release EmbeddingGemma to the community to promote further research.