3 months ago

MM-OR: A Large Multimodal Operating Room Dataset for Semantic Understanding of High-Intensity Surgical Environments

Özsoy Ege, Pellegrini Chantal, Czempiel Tobias, Tristram Felix, Yuan

Abstract

Operating rooms (ORs) are complex, high-stakes environments requiring precise understanding of interactions among medical staff, tools, and equipment for enhancing surgical assistance, situational awareness, and patient safety. Current datasets fall short in scale and realism and do not capture the multimodal nature of OR scenes, limiting progress in OR modeling. To this end, we introduce MM-OR, a realistic and large-scale multimodal spatiotemporal OR dataset, and the first dataset to enable multimodal scene graph generation. MM-OR captures comprehensive OR scenes containing RGB-D data, detail views, audio, speech transcripts, robotic logs, and tracking data, and is annotated with panoptic segmentations, semantic scene graphs, and downstream task labels. Further, we propose MM2SG, the first multimodal large vision-language model for scene graph generation, and through extensive experiments demonstrate its ability to effectively leverage multimodal inputs. Together, MM-OR and MM2SG establish a new benchmark for holistic OR understanding and open the path towards multimodal scene analysis in complex, high-stakes environments. Our code and data are available at https://github.com/egeozsoy/MM-OR.

Code Repositories

egeozsoy/MM-OR

Official

pytorch

Mentioned in GitHub

Benchmarks

Benchmark	Methodology	Metrics
scene-graph-generation-on-4d-or	MM2SG	F1: 0.901
scene-graph-generation-on-mm-or	MM2SG	Macro F1: 0.529
video-panoptic-segmentation-on-4d-or	MM-OR-VPQ4	VPQ: 69.8
video-panoptic-segmentation-on-4d-or	MM-OR-VPQ8	VPQ: 69.2
video-panoptic-segmentation-on-mm-or	MM-OR-VPQ4	VPQ: 67.0
video-panoptic-segmentation-on-mm-or	MM-OR-VPQ8	VPQ: 66.4
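
The scene graph rows above report F1 and Macro F1. As a reminder of what the macro variant measures, here is a minimal, generic sketch: it computes F1 per class and averages the classes with equal weight, so rare relation types count as much as frequent ones. This is not the evaluation code from the MM-OR repository; the function name `macro_f1`, the one-label-per-relation setup, and the predicate labels are assumptions made purely for illustration.

```python
from collections import Counter


def macro_f1(gold, pred):
    """Unweighted mean of per-class F1 over every class seen in gold or pred."""
    tp, fp, fn = Counter(), Counter(), Counter()
    for g, p in zip(gold, pred):
        if g == p:
            tp[g] += 1          # correct prediction for class g
        else:
            fp[p] += 1          # p was predicted but is wrong
            fn[g] += 1          # g was missed
    classes = set(gold) | set(pred)
    scores = []
    for c in classes:
        denom = 2 * tp[c] + fp[c] + fn[c]
        scores.append(2 * tp[c] / denom if denom else 0.0)
    return sum(scores) / len(scores) if scores else 0.0


# Hypothetical predicate labels, one per subject-object pair (illustration only).
gold = ["holding", "holding", "close_to", "cutting"]
pred = ["holding", "close_to", "close_to", "holding"]
score = macro_f1(gold, pred)
print(f"macro F1 = {score:.3f}")
```

In this toy example the class "cutting" is never predicted correctly, so its F1 of 0 pulls the average down even though it appears only once. That sensitivity to rare classes is the usual reason to report a macro average.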
