FoMER Bench Multimodal Evaluation Dataset

FoMER Bench is a Foundational Model Embodied Reasoning (FoMER) benchmark released in 2025 by Mohamed bin Zayed University of Artificial Intelligence, Linköping University, and Australian National University. It accompanies the paper "How Good are Foundation Models in Step-by-Step Embodied Reasoning?" and aims to evaluate the reasoning ability of large multimodal models (LMMs) in complex embodied decision-making scenarios.

This dataset contains over 1,100 examples with detailed step-by-step reasoning, covering 10 tasks and 8 embodiments. It spans three different robot types and multiple robot modes, enabling evaluation of LMM capabilities across tasks such as next-step action prediction, action affordance, physical common sense, temporal reasoning, tool use and manipulation, risk assessment, and robot navigation. The data includes multiple-choice questions (MCQs), true/false (TF) questions, and open-ended questions. Each example is accompanied by an input observation (a video or image frame plus a text prompt), multiple candidate actions, and the corresponding step-by-step reasoning trace.
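To make the per-example structure described above concrete, here is a minimal Python sketch of how one record might be represented and how closed-form answers could be scored. The class name, field names, and the exact-match scoring rule are all assumptions made for illustration; they are not the dataset's published schema or the benchmark's official evaluation protocol.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class FomerExample:
    """Hypothetical record layout; field names are illustrative assumptions."""
    task: str                      # e.g. "risk assessment"
    question_type: str             # "mcq", "tf", or "open"
    observation_path: str          # path to a video or image frame
    prompt: str                    # text prompt shown with the observation
    candidates: List[str] = field(default_factory=list)       # candidate actions
    answer: str = ""                                           # reference answer
    reasoning_steps: List[str] = field(default_factory=list)  # reference trace


def closed_form_accuracy(examples: List[FomerExample],
                         predictions: List[str]) -> float:
    """Exact-match accuracy over MCQ and TF items; open-ended items are skipped."""
    scored = [
        (ex, pred)
        for ex, pred in zip(examples, predictions)
        if ex.question_type in ("mcq", "tf")
    ]
    if not scored:
        return 0.0
    correct = sum(
        1 for ex, pred in scored
        if pred.strip().lower() == ex.answer.strip().lower()
    )
    return correct / len(scored)
```

Open-ended answers and the reasoning traces themselves would need a different scoring approach (for example, a human or model-based judge), which this sketch deliberately leaves out.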
