TreeOfLife-10M Biological Image Dataset

Date: a year ago

Organization: Microsoft Research

Paper URL: arxiv.org


With over 10 million images covering 454,000 taxa in the Tree of Life, TreeOfLife-10M is the largest dataset of ML-ready biological organism images and their associated taxonomic labels to date. It builds on existing high-quality datasets such as iNat21 and BIOSCAN-1M, and adds newly curated images from the Encyclopedia of Life (eol.org), which provide the majority of the data diversity in TreeOfLife-10M. Each image in TreeOfLife-10M is labeled to the most specific taxonomic level available, as well as the higher taxonomic levels in the Tree of Life (see the Text Type section). TreeOfLife-10M was created to train BioCLIP and future biology-based models.
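To illustrate what labeling an image at its most specific taxonomic level plus all higher levels can look like, here is a minimal Python sketch. It is not the dataset's actual schema or loading code: the field names (`kingdom` through `species`), the `RANKS` list, and the helper functions are assumptions made for illustration only.

```python
# Hypothetical rank names, ordered from most general to most specific.
# The real dataset's column names may differ.
RANKS = ["kingdom", "phylum", "class", "order", "family", "genus", "species"]


def most_specific_rank(record):
    """Return the name of the deepest rank that has a value, or None."""
    for rank in reversed(RANKS):
        if record.get(rank):
            return rank
    return None


def taxonomic_label(record):
    """Join all available rank values, top-down, into one label string.

    Missing ranks are skipped, so a record identified only to genus
    yields a label that ends at the genus.
    """
    return " ".join(record[rank] for rank in RANKS if record.get(rank))


# Illustrative record identified to species level.
magpie = {
    "kingdom": "Animalia",
    "phylum": "Chordata",
    "class": "Aves",
    "order": "Passeriformes",
    "family": "Corvidae",
    "genus": "Pica",
    "species": "hudsonia",
}

# Illustrative record identified only to genus level.
genus_only = {k: v for k, v in magpie.items() if k != "species"}

print(most_specific_rank(magpie), "|", taxonomic_label(magpie))
print(most_specific_rank(genus_only), "|", taxonomic_label(genus_only))
```

A flattened string of this kind is one plausible way to pair an image with text for contrastive training, while keeping the individual ranks available for evaluation at coarser levels.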

The dataset can be used in multiple fields, including biodiversity research, species identification, natural language processing tasks, machine learning, and computer vision research.

This dataset was released in 2024 by Ohio State University, Microsoft Research, and other institutions. The associated paper, "BioCLIP: A Vision Foundation Model for the Tree of Life", received a Best Student Paper award at CVPR 2024.
