YOLO-Former: YOLO Shakes Hand With ViT

Javad Khoramdel, Ahmad Moori, Yasamin Borhani, Armin Ghanbarzadeh, Esmaeil Najafi

Abstract

The proposed YOLO-Former method seamlessly integrates the ideas of transformer and YOLOv4 to create a highly accurate and efficient object detection system. The method leverages the fast inference speed of YOLOv4 and incorporates the advantages of the transformer architecture through the integration of convolutional attention and transformer modules. The results demonstrate the effectiveness of the proposed approach, with a mean average precision (mAP) of 85.76% on the Pascal VOC dataset, while maintaining high prediction speed with a frame rate of 10.85 frames per second. The contribution of this work lies in the demonstration of how the innovative combination of these two state-of-the-art techniques can lead to further improvements in the field of object detection.

Benchmarks

Benchmark                              Methodology   Metrics
object-detection-on-pascal-voc-2007    YOLO-Former   MAP: 86.01%
object-detection-on-pascal-voc-2012    YOLO-Former   MAP: 86.01%
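
For readers unfamiliar with the metric reported above, the following is a minimal sketch of how a mean average precision figure is typically computed: a per-class average precision (area under the precision-recall curve) averaged over all classes. It is an illustration only, not the authors' evaluation code. The function names `average_precision` and `mean_average_precision` and the input layout are hypothetical, and the sketch assumes each detection has already been matched against ground truth (for example by an IoU threshold). It uses all-point interpolation, as in the Pascal VOC protocol from 2010 onward; the original VOC 2007 protocol uses an 11-point approximation, and this page does not state which variant the paper used.

```python
from typing import Dict, List, Tuple

# One detection: (confidence score, whether it matched a ground-truth box).
Detection = Tuple[float, bool]


def average_precision(detections: List[Detection], num_gt: int) -> float:
    """Area under the precision-recall curve using all-point interpolation.

    detections: all detections of one class, already matched to ground truth.
    num_gt: number of ground-truth boxes of that class.
    """
    if num_gt == 0:
        return 0.0
    # Rank detections from most to least confident.
    ranked = sorted(detections, key=lambda d: d[0], reverse=True)
    true_positives = 0
    recalls: List[float] = []
    precisions: List[float] = []
    for rank, (_, is_tp) in enumerate(ranked, start=1):
        true_positives += int(is_tp)
        recalls.append(true_positives / num_gt)
        precisions.append(true_positives / rank)
    # Replace each precision by the maximum precision at any higher recall,
    # which makes the curve monotonically non-increasing.
    for i in range(len(precisions) - 2, -1, -1):
        precisions[i] = max(precisions[i], precisions[i + 1])
    # Sum the rectangle areas between consecutive recall values.
    ap = 0.0
    prev_recall = 0.0
    for recall, precision in zip(recalls, precisions):
        ap += (recall - prev_recall) * precision
        prev_recall = recall
    return ap


def mean_average_precision(
    per_class: Dict[str, Tuple[List[Detection], int]]
) -> float:
    """Mean of the per-class average precisions."""
    aps = [average_precision(dets, num_gt) for dets, num_gt in per_class.values()]
    return sum(aps) / len(aps)


# Toy, made-up data: two classes with a handful of detections each.
toy = {
    "cat": ([(0.9, True), (0.8, False), (0.7, True)], 2),
    "dog": ([(0.95, True)], 1),
}
toy_map = mean_average_precision(toy)
```

A reported figure such as 86.01% corresponds to this mean multiplied by 100, computed over the dataset's classes rather than the toy data used here.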
