Zeynep Akata; Scott Reed; Daniel Walter; Honglak Lee; Bernt Schiele

Abstract
Image classification has advanced significantly in recent years with the availability of large-scale image sets. However, fine-grained classification remains a major challenge due to the annotation cost of large numbers of fine-grained categories. This project shows that compelling classification performance can be achieved on such categories even without labeled training data. Given image and class embeddings, we learn a compatibility function such that matching embeddings are assigned a higher score than mismatching ones; zero-shot classification of an image proceeds by finding the label yielding the highest joint compatibility score. We use state-of-the-art image features and focus on different supervised attributes and unsupervised output embeddings either derived from hierarchies or learned from unlabeled text corpora. We establish a substantially improved state-of-the-art on the Animals with Attributes and Caltech-UCSD Birds datasets. Most encouragingly, we demonstrate that purely unsupervised output embeddings (learned from Wikipedia and improved with fine-grained text) achieve compelling results, even outperforming the previous supervised state-of-the-art. By combining different output embeddings, we further improve results.
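The compatibility-based prediction described in the abstract can be sketched in a few lines. This is a minimal illustration, not the authors' released code. The abstract does not state the form of the compatibility function or the training objective, so three things here are assumptions: the bilinear form F(x, y) = θ(x)ᵀ W φ(y), the 0/1 margin, and the single ranking-style SGD update. The function names (`compatibility`, `predict`, `sgd_step`) and the learning rate are hypothetical.

```python
import numpy as np


def compatibility(theta_x, W, phi):
    """Score one image against every class.

    Assumed bilinear form: F(x, y) = theta(x)^T W phi(y).
    theta_x: image embedding, shape (d_img,)
    W:       learned matrix, shape (d_img, d_cls)
    phi:     class (output) embeddings, shape (n_classes, d_cls)
    Returns an array of n_classes scores.
    """
    return phi @ (W.T @ theta_x)


def predict(theta_x, W, phi):
    """Zero-shot prediction: index of the class with the highest score.

    At test time phi can hold embeddings of classes never seen in training.
    """
    return int(np.argmax(compatibility(theta_x, W, phi)))


def sgd_step(W, theta_x, y_true, phi, lr=0.1):
    """One ranking-style update (an assumed objective, for illustration).

    Finds the class that most violates a unit margin against the true class
    and, if it is not the true class, moves W so that the true class scores
    higher and the violating class scores lower.
    """
    margins = compatibility(theta_x, W, phi) + 1.0
    margins[y_true] -= 1.0  # no margin is required against the true class
    y_hat = int(np.argmax(margins))
    if y_hat != y_true:
        W = W + lr * np.outer(theta_x, phi[y_true] - phi[y_hat])
    return W
```

In use, `sgd_step` would be called repeatedly over labeled images from the training classes, and `predict` would then be called with `phi` replaced by the embeddings of the unseen test classes (attributes, hierarchy-derived vectors, or word vectors), which is what makes the setup zero-shot.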
Benchmarks
| Benchmark | Methodology | Metrics |
|---|---|---|
| few-shot-image-classification-on-cub-200-0 | SJE | Accuracy: 50.1% |
| few-shot-image-classification-on-cub-200-2011-1 | SJE | Top-1 Accuracy: 50.1% |
| few-shot-image-classification-on-cub-200-50 | SJE (Akata et al., 2015) | Accuracy: 50.1 |
| zero-shot-action-recognition-on-hmdb51 | SJE (Word Embedding) | Top-1 Accuracy: 13.3 |
| zero-shot-action-recognition-on-kinetics | SJE (Word Embedding) | Top-1 Accuracy: 22.3; Top-5 Accuracy: 48.2 |
| zero-shot-action-recognition-on-olympics | SJE (Attribute) | Top-1 Accuracy: 47.5 |
| zero-shot-action-recognition-on-olympics | SJE (Word Embedding) | Top-1 Accuracy: 28.6 |
| zero-shot-action-recognition-on-ucf101 | SJE (Attribute) | Top-1 Accuracy: 12.0 |
| zero-shot-action-recognition-on-ucf101 | SJE (Word Embedding) | Top-1 Accuracy: 9.9 |