Dissecting Self-Supervised Learning Methods for Surgical Computer Vision

Abstract

The field of surgical computer vision has undergone considerable breakthroughs in recent years with the rising popularity of deep neural network-based methods. However, standard fully-supervised approaches for training such models require vast amounts of annotated data, imposing a prohibitively high cost, especially in the clinical domain. Self-Supervised Learning (SSL) methods, which have begun to gain traction in the general computer vision community, represent a potential solution to these annotation costs, allowing useful representations to be learned from only unlabeled data. Still, the effectiveness of SSL methods in more complex and impactful domains, such as medicine and surgery, remains limited and unexplored. In this work, we address this critical need by investigating four state-of-the-art SSL methods (MoCo v2, SimCLR, DINO, SwAV) in the context of surgical computer vision. We present an extensive analysis of the performance of these methods on the Cholec80 dataset for two fundamental and popular tasks in surgical context understanding: phase recognition and tool presence detection. We examine their parameterization, then their behavior with respect to training data quantities in semi-supervised settings. Correct transfer of these methods to surgery, as described and conducted in this work, leads to substantial performance gains over generic uses of SSL - up to 7.4% on phase recognition and 20% on tool presence detection - as well as state-of-the-art semi-supervised phase recognition approaches by up to 14%. Further results obtained on a highly diverse selection of surgical datasets exhibit strong generalization properties. The code is available at https://github.com/CAMMA-public/SelfSupSurg.
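To make concrete what learning from only unlabeled data can look like, here is a minimal sketch of an InfoNCE-style contrastive loss, the family of objective used by SimCLR and MoCo v2 (DINO and SwAV rely on self-distillation and clustering-based objectives instead). It is an illustration only, not the authors' implementation: the function names `l2_normalize` and `info_nce_loss`, the temperature value, and the use of plain Python lists in place of network embeddings are all assumptions made for this example. Each image contributes two augmented views; the loss is low when the two views of the same image are more similar to each other than to the views of other images in the batch.

```python
import math


def l2_normalize(v):
    """Scale a vector to unit length (assumes the vector is non-zero)."""
    norm = math.sqrt(sum(x * x for x in v))
    return [x / norm for x in v]


def info_nce_loss(queries, keys, temperature=0.1):
    """Mean InfoNCE loss over a batch (hypothetical helper for illustration).

    queries[i] and keys[i] are embeddings of two augmented views of image i.
    For query i, keys[i] is the positive and all other keys are negatives.
    """
    q = [l2_normalize(v) for v in queries]
    k = [l2_normalize(v) for v in keys]
    total = 0.0
    for i, qi in enumerate(q):
        # Cosine similarity of query i with every key, scaled by temperature.
        logits = [sum(a * b for a, b in zip(qi, kj)) / temperature for kj in k]
        # Numerically stable log-sum-exp over all keys.
        m = max(logits)
        log_sum_exp = m + math.log(sum(math.exp(l - m) for l in logits))
        # Cross-entropy with the positive key as the target class.
        total += log_sum_exp - logits[i]
    return total / len(q)


# Two toy "images": matching views give a low loss, mismatched views a high one.
views_a = [[1.0, 0.0], [0.0, 1.0]]
views_b_matched = [[1.0, 0.0], [0.0, 1.0]]
views_b_swapped = [[0.0, 1.0], [1.0, 0.0]]
loss_matched = info_nce_loss(views_a, views_b_matched)
loss_swapped = info_nce_loss(views_a, views_b_swapped)
```

In practice the embeddings would come from an encoder applied to augmented surgical video frames, and no phase or tool labels enter the loss; labels are needed only later, when a task-specific head is trained on the learned representation.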

Code Repositories

camma-public/selfsupsurg (official implementation, PyTorch)

Benchmarks

Benchmark | Methodology | Metrics
action-triplet-recognition-on-cholect50-1 | MoCo V2 Surg SSL - Rendezvous head | mAP: 35.7
semantic-segmentation-on-endoscapes | MoCo V2 Surg SSL - DeepLabv3+ head | Mean F1: 73.2
surgical-phase-recognition-on-cholec80-1 | MoCo V2 Surg SSL - TCN head | F1: 81.6
surgical-phase-recognition-on-heichole | MoCo V2 Surg SSL - TCN head | F1: 64.7
surgical-tool-detection-on-cholec80 | MoCo V2 Surg SSL - FCN head | mAP: 93.5
surgical-tool-detection-on-heichole-benchmark | MoCo V2 Surg SSL - FCN head | mAP: 66.9
