GroundingME Complex Scene Understanding Evaluation Dataset

Date

6 hours ago

Organization

The University of Hong Kong
Tsinghua University
Xiaomi

Paper URL

2512.17495

License

Other

GroundingME is a visual grounding evaluation dataset for multimodal large language models (MLLMs), released in 2025 by Tsinghua University in collaboration with Xiaomi, the University of Hong Kong, and other institutions. The accompanying paper is "GroundingME: Exposing the Visual Grounding Gap in MLLMs through Multi-Dimensional Evaluation". The dataset aims to systematically evaluate how accurately a model maps natural language to visual targets in complex real-world scenes, with particular attention to comprehension and safety in cases involving ambiguous references, complex spatial relationships, small targets, occlusion, and descriptions that have no valid referent in the image.

This dataset contains 1,005 evaluation samples. The images are sourced from two high-quality datasets, SA-1B and HR-Bench, and only the original images were used to construct the tasks, to avoid data contamination. The samples cover four primary task categories: discriminative reference (204 samples, 20.3%), spatial relationship understanding (300 samples, 29.9%), restricted visibility scenes (300 samples, 29.9%), and non-referential rejection (201 samples, 20.0%). These are further subdivided into 12 secondary sub-tasks with a balanced overall distribution. The dataset covers 241 real-world object classes. A single image contains a large number of objects of the same class, and object instances usually occupy a small proportion of the image. The language descriptions are also considerably longer than those in existing referring-expression datasets. Together, these properties increase the difficulty of the visual grounding task along multiple dimensions.
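As a quick check on the category breakdown, the short Python sketch below recomputes each category's share of the 1,005 samples from the counts quoted above. The dictionary keys and the `share` helper are illustrative names chosen for this example; they are not part of the dataset or of any official tooling.

```python
# Sample counts per primary task category, as quoted in the description.
# The key names are illustrative labels, not official identifiers.
counts = {
    "discriminative reference": 204,
    "spatial relationship understanding": 300,
    "restricted visibility scenes": 300,
    "non-referential rejection": 201,
}

total = sum(counts.values())


def share(n: int, total: int) -> float:
    """Return n as a percentage of total, rounded to one decimal place."""
    return round(100 * n / total, 1)


shares = {name: share(n, total) for name, n in counts.items()}

for name, pct in shares.items():
    print(f"{name}: {counts[name]} samples ({pct}%)")
print(f"total: {total} samples")
```

The counts sum to 1,005, and the shares round to 20.3%, 29.9%, 29.9%, and 20.0%.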

Dataset Example
