HyperAIHyperAI

Command Palette

Search for a command to run...

3 months ago

Alignment Enhancement Network for Fine-grained Visual Categorization

{Yutao Hu}

Abstract

Fine-grained visual categorization (FGVC) aims to automatically recognize objects from different sub-ordinate categories. Despite attracting considerable attention from both academia and industry, it remains a challenging task due to subtle visual differences among different classes. Cross-layer feature aggregation and crossimage pairwise learning become prevailing in improving the performance of FGVC by extracting discriminative class-specific features. However, they are still inefficient to fully use the cross-layer information based on the simple aggregation strategy, while existing pairwise learning methods also fail to explore long-range interactions between different images. To address these problems, we propose a novel Alignment Enhancement Network (AENet), including two-level alignments, Cross-layer Alignment (CLA) and Cross-image Alignment (CIA). The CLA module exploits the cross-layer relationship between low-level spatial information and highlevel semantic information, which contributes to cross-layer feature aggregation to improve the capacity of feature representation for input images. The new CIA module is further introduced to produce the aligned feature map, which can enhance the relevant information as well as suppress the irrelevant information across the whole spatial region. Our method is based on an underlying assumption that the aligned feature map should be closer to the inputs of CIA when they belong to the same category. Accordingly, we establish Semantic Affinity Loss to supervise the feature alignment within each CIA block. Experimental results on four challenging datasets show that the proposed AENet achieves the state-of-the-art results over prior arts.

Benchmarks

BenchmarkMethodologyMetrics
fine-grained-image-classification-on-cub-200AENet
Accuracy: 90.0%
fine-grained-image-classification-on-fgvcAENet
Accuracy: 94.5%
fine-grained-image-classification-on-stanfordAENet
Accuracy: 94.0%

Build AI with AI

From idea to launch — accelerate your AI development with free AI co-coding, out-of-the-box environment and best price of GPUs.

AI Co-coding
Ready-to-use GPUs
Best Pricing
Get Started

Hyper Newsletters

Subscribe to our latest updates
We will deliver the latest updates of the week to your inbox at nine o'clock every Monday morning
Powered by MailChimp
Alignment Enhancement Network for Fine-grained Visual Categorization | Papers | HyperAI