Command Palette
Search for a command to run...
{Ricardo Ñanculef Francisco Mena}

Abstract
Searching a large dataset to find elements that are similar to a sample object is a fundamental problem in computer science. Hashing algorithms deal with this problem by representing data with similarity-preserving binary codes that can be used as indices into a hash table. Recently, it has been shown that variational autoencoders (VAEs) can be successfully trained to learn such codes in unsupervised and semi-supervised scenarios. In this paper, we show that a variational autoencoder with binary latent variables leads to a more natural and effective hashing algorithm that its continuous counterpart. The model reduces the quantization error introduced by continuous formulations but is still trainable with standard back-propagation. Experiments on text retrieval tasks illustrate the advantages of our model with respect to previous art.
Benchmarks
| Benchmark | Methodology | Metrics |
|---|---|---|
| text-retrieval-on-20-newsgroups | VDSH | Precision@100: 0.319 |
| text-retrieval-on-20-newsgroups | B-VAE | Precision@100: 0.441 |
| text-retrieval-on-reuters-21578 | B-VAE | Precision@100: 0.698 |
| text-retrieval-on-reuters-21578 | VDSH | Precision@100: 0.556 |
Build AI with AI
From idea to launch — accelerate your AI development with free AI co-coding, out-of-the-box environment and best price of GPUs.