5 months ago

Pix2Next: Leveraging Vision Foundation Models for RGB to NIR Image Translation

Jin Youngwan ; Park Incheol ; Song Hanbin ; Ju Hyeongjin ; Nalcakan Yagiz ; Kim Shiho


Abstract

This paper proposes Pix2Next, a novel image-to-image translation framework designed to address the challenge of generating high-quality Near-Infrared (NIR) images from RGB inputs. Our approach leverages a state-of-the-art Vision Foundation Model (VFM) within an encoder-decoder architecture, incorporating cross-attention mechanisms to enhance feature integration. This design captures detailed global representations and preserves essential spectral characteristics, treating RGB-to-NIR translation as more than a simple domain transfer problem. A multi-scale PatchGAN discriminator ensures realistic image generation at various detail levels, while carefully designed loss functions couple global context understanding with local feature preservation. We performed experiments on the RANUS dataset to demonstrate Pix2Next's advantages in quantitative metrics and visual quality, improving the FID score by 34.81% compared to existing methods. Furthermore, we demonstrate the practical utility of Pix2Next by showing improved performance on a downstream object detection task using generated NIR data to augment limited real NIR datasets. The proposed approach enables the scaling up of NIR datasets without additional data acquisition or annotation efforts, potentially accelerating advancements in NIR-based computer vision applications.
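The cross-attention mechanism mentioned in the abstract can be illustrated with a generic scaled dot-product sketch. This is not the authors' implementation: the function name `cross_attention`, the token counts, the feature widths, and the random projection matrices are all hypothetical, chosen only to show how decoder features (queries) could attend to VFM features (keys and values).

```python
import numpy as np


def softmax(x, axis=-1):
    """Numerically stable softmax along the given axis."""
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)


def cross_attention(queries, context, w_q, w_k, w_v):
    """Generic single-head cross-attention (illustrative, not the paper's code).

    queries: (n_q, d_model) tokens that ask the questions (e.g. decoder features)
    context: (n_c, d_ctx)   tokens that are attended to (e.g. VFM features)
    w_q, w_k, w_v: projection matrices mapping both inputs to a shared width
    """
    q = queries @ w_q                      # (n_q, d_head)
    k = context @ w_k                      # (n_c, d_head)
    v = context @ w_v                      # (n_c, d_head)
    scores = q @ k.T / np.sqrt(q.shape[-1])  # (n_q, n_c) similarity logits
    weights = softmax(scores, axis=-1)     # each query row sums to 1
    return weights @ v, weights


# Hypothetical sizes: 16 decoder tokens of width 32, 64 VFM tokens of width 48.
rng = np.random.default_rng(0)
decoder_tokens = rng.standard_normal((16, 32))
vfm_tokens = rng.standard_normal((64, 48))
w_q = rng.standard_normal((32, 24)) * 0.1
w_k = rng.standard_normal((48, 24)) * 0.1
w_v = rng.standard_normal((48, 24)) * 0.1

out, attn = cross_attention(decoder_tokens, vfm_tokens, w_q, w_k, w_v)
print(out.shape, attn.shape)
```

Each output row is a weighted mix of the projected context tokens, so the query side can draw on global information from the context side regardless of spatial position.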

Code Repositories

Yonsei-STL/pix2next (official, PyTorch, mentioned in GitHub)

Benchmarks

Benchmark | Methodology | Metrics
image-to-image-translation-on-flir | Pix2Next | PSNR: 23.45, SSIM: 0.66
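For readers unfamiliar with the PSNR metric listed above, here is a minimal sketch of the standard definition, PSNR = 10 * log10(MAX^2 / MSE). It assumes 8-bit images with a peak value of 255; the evaluation protocol behind the listed score (value range, channel handling, resolution) is not stated on this page, so the function `psnr` and the toy arrays are illustrative only.

```python
import numpy as np


def psnr(reference, generated, max_value=255.0):
    """Peak signal-to-noise ratio in dB between two same-shaped images."""
    # Cast to float first so uint8 subtraction cannot wrap around.
    reference = np.asarray(reference, dtype=np.float64)
    generated = np.asarray(generated, dtype=np.float64)
    if reference.shape != generated.shape:
        raise ValueError("images must have the same shape")
    mse = np.mean((reference - generated) ** 2)
    if mse == 0:
        return float("inf")  # identical images
    return float(10.0 * np.log10(max_value ** 2 / mse))


# Toy 4x4 "NIR" images (made-up values): one pixel differs by 16 grey levels,
# giving a mean squared error of 16**2 / 16 pixels = 16.
real_nir = np.full((4, 4), 100, dtype=np.uint8)
fake_nir = real_nir.copy()
fake_nir[0, 0] = 116

print(f"PSNR: {psnr(real_nir, fake_nir):.2f} dB")
```

Higher PSNR means lower pixel-wise error; SSIM complements it by comparing local structure rather than raw intensity differences.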
