5 months ago

A Tale of Two Features: Stable Diffusion Complements DINO for Zero-Shot Semantic Correspondence

Zhang Junyi ; Herrmann Charles ; Hur Junhwa ; Cabrera Luisa Polania ; Jampani Varun ; Sun Deqing ; Yang Ming-Hsuan

Abstract

Text-to-image diffusion models have made significant advances in generating and editing high-quality images. As a result, numerous approaches have explored the ability of diffusion model features to understand and process single images for downstream tasks, e.g., classification, semantic segmentation, and stylization. However, significantly less is known about what these features reveal across multiple, different images and objects. In this work, we exploit Stable Diffusion (SD) features for semantic and dense correspondence and discover that with simple post-processing, SD features can perform quantitatively similar to SOTA representations. Interestingly, the qualitative analysis reveals that SD features have very different properties compared to existing representation learning features, such as the recently released DINOv2: while DINOv2 provides sparse but accurate matches, SD features provide high-quality spatial information but sometimes inaccurate semantic matches. We demonstrate that a simple fusion of these two features works surprisingly well, and a zero-shot evaluation using nearest neighbors on these fused features provides a significant performance gain over state-of-the-art methods on benchmark datasets, e.g., SPair-71k, PF-Pascal, and TSS. We also show that these correspondences can enable interesting applications such as instance swapping in two images.
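The abstract describes fusing SD and DINOv2 features and matching them with nearest neighbors. The snippet below is a minimal NumPy sketch of that idea, not the authors' implementation: it assumes each image has already been reduced to one descriptor per spatial location, and the weight `alpha`, the helper names, and the random stand-in features are all hypothetical choices made for illustration.

```python
import numpy as np

def l2_normalize(feats, eps=1e-8):
    """Scale each row (one descriptor per spatial location) to unit length."""
    norms = np.linalg.norm(feats, axis=1, keepdims=True)
    return feats / np.maximum(norms, eps)

def fuse_features(sd_feats, dino_feats, alpha=0.5):
    """Concatenate normalized SD and DINO descriptors for each location.

    alpha is an assumed balancing weight (not taken from the abstract).
    With sqrt weights, the dot product of two fused descriptors equals
    alpha * cos_sd + (1 - alpha) * cos_dino.
    """
    sd = l2_normalize(sd_feats) * np.sqrt(alpha)
    dino = l2_normalize(dino_feats) * np.sqrt(1.0 - alpha)
    return np.concatenate([sd, dino], axis=1)

def nearest_neighbor_matches(src, tgt):
    """For each source descriptor, return the index of the most similar
    target descriptor under cosine similarity."""
    sim = l2_normalize(src) @ l2_normalize(tgt).T
    return sim.argmax(axis=1)

# Toy check: the "target image" is a shuffled copy of the "source image",
# so the matches should undo the shuffle.
rng = np.random.default_rng(0)
sd_src = rng.normal(size=(16, 32))    # stand-in for SD features
dino_src = rng.normal(size=(16, 24))  # stand-in for DINOv2 features
perm = rng.permutation(16)

fused_src = fuse_features(sd_src, dino_src)
fused_tgt = fuse_features(sd_src[perm], dino_src[perm])
matches = nearest_neighbor_matches(fused_src, fused_tgt)
```

In this toy setup, `matches` recovers the inverse of `perm`. With real features, the two sources would be extracted from the respective networks and resampled to a common spatial resolution before fusing.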

Code Repositories

Junyi42/sd-dino (official, PyTorch)

Benchmarks

| Benchmark | Methodology | Metrics |
| --- | --- | --- |
| Dense pixel correspondence estimation on TSS | SD+DINO (Zero-shot) | Average PCK@0.05: 79.7 |
| Semantic correspondence on PF-Pascal | SD+DINO (Supervised) | PCK: 93.6 |
| Semantic correspondence on SPair-71k | SD+DINO (Supervised) | PCK: 74.6 |
| Semantic correspondence on SPair-71k | SD+DINO (Zero-shot) | PCK: 64.0 |
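The benchmarks above are reported as PCK (percentage of correct keypoints). As a rough illustration of how such a number is commonly computed, the sketch below counts a predicted keypoint as correct when it lies within `alpha * ref_size` pixels of the ground truth. The function name and the toy coordinates are hypothetical, and the choice of `ref_size` (larger side of the image or of the object bounding box) varies by benchmark, so this is an assumption rather than the exact protocol behind the listed scores.

```python
import numpy as np

def pck(pred, gt, ref_size, alpha=0.05):
    """Fraction of predicted keypoints within alpha * ref_size of the ground truth.

    pred, gt: arrays of shape (N, 2) holding (x, y) pixel coordinates.
    ref_size: reference length in pixels (assumed here to be supplied by the caller).
    """
    pred = np.asarray(pred, dtype=float)
    gt = np.asarray(gt, dtype=float)
    dist = np.linalg.norm(pred - gt, axis=1)  # Euclidean error per keypoint
    return float(np.mean(dist <= alpha * ref_size))

# Toy example with made-up coordinates; per-keypoint errors are 1, 8, 0 and 9 px.
gt_points = [[10, 10], [50, 50], [90, 20], [30, 70]]
pred_points = [[11, 10], [50, 58], [90, 20], [39, 70]]

score_strict = pck(pred_points, gt_points, ref_size=100, alpha=0.05)  # threshold 5 px
score_loose = pck(pred_points, gt_points, ref_size=100, alpha=0.10)   # threshold 10 px
```

The function returns a fraction in [0, 1]; multiplying by 100 gives the percentage form used in the table.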
