5 months ago

Surgical Triplet Recognition via Diffusion Model

Liu Daochang ; Hu Axel ; Shah Mubarak ; Xu Chang

Abstract

Surgical triplet recognition is an essential building block to enable next-generation context-aware operating rooms. The goal is to identify the combinations of instruments, verbs, and targets presented in surgical video frames. In this paper, we propose DiffTriplet, a new generative framework for surgical triplet recognition employing the diffusion model, which predicts surgical triplets via iterative denoising. To handle the challenge of triplet association, two unique designs are proposed in our diffusion framework, i.e., association learning and association guidance. During training, we optimize the model in the joint space of triplets and individual components to capture the dependencies among them. At inference, we integrate association constraints into each update of the iterative denoising process, which refines the triplet prediction using the information of individual components. Experiments on the CholecT45 and CholecT50 datasets show the superiority of the proposed method in achieving a new state-of-the-art performance for surgical triplet recognition. Our code will be released.
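
The abstract does not give the update rule, so the following is only a minimal sketch of what guidance of this kind could look like, not the authors' implementation. It assumes a toy triplet vocabulary (`TRIPLETS`), a hypothetical `denoiser(x, t)` callable that returns a clean triplet estimate, a product-of-components prior, and a linear blending weight; all of these names and choices are illustrative assumptions.

```python
import numpy as np

# Hypothetical toy vocabulary: each triplet class is an
# (instrument, verb, target) index tuple. Not taken from the paper.
TRIPLETS = [(0, 0, 0), (0, 1, 1), (1, 0, 1)]


def association_prior(p_inst, p_verb, p_targ, triplets=TRIPLETS):
    """Score each triplet by the product of its component probabilities."""
    return np.array([p_inst[i] * p_verb[v] * p_targ[t] for i, v, t in triplets])


def guided_denoise(x_T, denoiser, p_inst, p_verb, p_targ,
                   steps=10, weight=0.5, seed=0):
    """Iteratively denoise triplet scores, nudging every update toward
    the triplets that agree with the component predictions.

    `denoiser(x, t)` is an assumed interface returning a clean estimate;
    the blending rule and noise schedule below are illustrative only.
    """
    rng = np.random.default_rng(seed)
    prior = association_prior(p_inst, p_verb, p_targ)
    x = np.asarray(x_T, dtype=float).copy()
    for t in reversed(range(steps)):
        x0_hat = denoiser(x, t)                          # current clean estimate
        x0_hat = (1 - weight) * x0_hat + weight * prior  # association guidance
        sigma = 0.1 * t / steps                          # noise shrinks to 0 at t = 0
        x = np.clip(x0_hat + sigma * rng.standard_normal(x.shape), 0.0, 1.0)
    return x


# Usage: a stand-in denoiser that is undecided between the three triplets.
undecided = lambda x, t: np.full(3, 0.5)
scores = guided_denoise(np.random.default_rng(1).standard_normal(3), undecided,
                        p_inst=[0.9, 0.1], p_verb=[0.2, 0.8], p_targ=[0.3, 0.7])
print(scores.argmax())
```

In this toy setup the stand-in denoiser cannot separate the triplets on its own, so the final ranking is driven entirely by the component predictions, which is the intuition behind refining triplets with component information.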

Benchmarks

Benchmark | Methodology | Metrics
action-triplet-recognition-on-cholect45-cross | DiffTriplet | mAP: 40.2±1.9
action-triplet-recognition-on-cholect50-cross-1 | DiffTriplet | mAP: 40.3±2.5
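
For readers unfamiliar with the metric, here is a minimal sketch of how a mean average precision (mAP) score can be computed for multi-label predictions. It uses the common ranking-based definition of average precision; the official evaluation protocol for these benchmarks may differ in details such as interpolation or per-video averaging, and the function names are illustrative.

```python
import numpy as np


def average_precision(scores, labels):
    """Average of the precision values at the rank of each positive sample."""
    scores = np.asarray(scores, dtype=float)
    labels = np.asarray(labels, dtype=int)
    if labels.sum() == 0:
        return float("nan")                      # class absent: AP undefined
    order = np.argsort(-scores, kind="stable")   # rank by descending score
    hits = labels[order]
    precision = np.cumsum(hits) / (np.arange(len(hits)) + 1)
    return float((precision * hits).sum() / hits.sum())


def mean_average_precision(score_matrix, label_matrix):
    """Mean of per-class AP, skipping classes with no positive sample."""
    score_matrix = np.asarray(score_matrix, dtype=float)
    label_matrix = np.asarray(label_matrix, dtype=int)
    aps = [average_precision(score_matrix[:, c], label_matrix[:, c])
           for c in range(score_matrix.shape[1])]
    return float(np.nanmean(aps))


# Usage: 4 frames, 2 triplet classes.
scores = [[0.9, 0.1], [0.8, 0.9], [0.7, 0.2], [0.6, 0.3]]
labels = [[1, 0], [0, 1], [1, 0], [0, 0]]
print(mean_average_precision(scores, labels))
```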
