Zhang Yang, Gong Boqing, Shah Mubarak

Abstract
The well-known word analogy experiments show that recent word vectors capture fine-grained linguistic regularities in words by linear vector offsets, but it is unclear how well these simple vector offsets can encode visual regularities over words. We study a particular image-word relevance relation in this paper. Our results show that the word vectors of relevant tags for a given image rank ahead of the irrelevant tags along a principal direction in the word vector space. Inspired by this observation, we propose to solve image tagging by estimating the principal direction for an image. In particular, we exploit linear mappings and nonlinear deep neural networks to approximate the principal direction from an input image. We arrive at a quite versatile tagging model. It runs fast given a test image, in constant time with respect to the training set size. It not only gives superior performance for the conventional tagging task on the NUS-WIDE dataset, but also outperforms competitive baselines on annotating images with previously unseen tags.
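The ranking idea in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the function names (`principal_direction`, `rank_tags`), the two-dimensional tag vectors, the mapping matrix, and the image features are all made-up assumptions chosen only to show the mechanics. The sketch assumes a linear mapping from image features to a direction in word vector space, then ranks tags by the dot product of their word vectors with that direction.

```python
from typing import Dict, List, Sequence

Vector = Sequence[float]


def dot(u: Vector, v: Vector) -> float:
    """Plain dot product of two equal-length vectors."""
    return sum(a * b for a, b in zip(u, v))


def principal_direction(image_features: Vector, mapping: Sequence[Vector]) -> List[float]:
    """Hypothetical linear mapping: each row of `mapping` produces one
    coordinate of the direction in word vector space."""
    return [dot(row, image_features) for row in mapping]


def rank_tags(direction: Vector, tag_vectors: Dict[str, Vector]) -> List[str]:
    """Sort tags so that those whose word vectors project furthest along
    `direction` come first."""
    return sorted(
        tag_vectors,
        key=lambda tag: dot(direction, tag_vectors[tag]),
        reverse=True,
    )


# Toy data (assumed for illustration, not taken from the paper).
tag_vectors = {
    "beach": [0.9, 0.1],
    "ocean": [0.8, 0.3],
    "car": [-0.7, 0.2],
    "office": [-0.5, -0.8],
}
mapping = [[1.0, 0.0, 0.5], [0.0, 1.0, 0.0]]  # 2 x 3 matrix, made up
image_features = [0.6, 0.2, 0.4]              # made-up image features

direction = principal_direction(image_features, mapping)
ranking = rank_tags(direction, tag_vectors)
print(ranking)
```

In this sketch, scoring a test image costs one mapping evaluation plus one dot product per candidate tag, so the work does not depend on how many training images were used. A tag absent at training time can be ranked the same way as long as a word vector for it is available.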
Benchmarks
| Benchmark | Methodology | Metrics |
|---|---|---|
| multi-label-zero-shot-learning-on-nus-wide | Fast0Tag | mAP: 15.1 |
| multi-label-zero-shot-learning-on-open-images | Fast0Tag | mAP: 41.2 |