Dzmitry Bahdanau; Kyunghyun Cho; Yoshua Bengio

Abstract
Neural machine translation is a recently proposed approach to machine translation. Unlike traditional statistical machine translation, neural machine translation aims at building a single neural network that can be jointly tuned to maximize translation performance. The models proposed recently for neural machine translation often belong to a family of encoder-decoders and consist of an encoder that encodes a source sentence into a fixed-length vector from which a decoder generates a translation. In this paper, we conjecture that the use of a fixed-length vector is a bottleneck in improving the performance of this basic encoder-decoder architecture, and propose to extend it by allowing a model to automatically (soft-)search for parts of a source sentence that are relevant to predicting a target word, without having to form these parts as a hard segment explicitly. With this new approach, we achieve translation performance comparable to the existing state-of-the-art phrase-based system on the task of English-to-French translation. Furthermore, qualitative analysis reveals that the (soft-)alignments found by the model agree well with our intuition.
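The (soft-)search described in the abstract can be illustrated with a minimal sketch of one decoding step. The abstract does not spell out the scoring function, so this sketch assumes an additive form: each source annotation is scored against the previous decoder state, the scores are normalized with a softmax, and the context vector is the weighted sum of the annotations. The function name `soft_attention`, the parameter names `W`, `U`, `v`, and the toy dimensions are hypothetical choices for illustration, and the parameters are random rather than trained.

```python
import math
import random


def soft_attention(decoder_state, annotations, W, U, v):
    """Return (weights, context) for one decoding step.

    decoder_state: previous decoder hidden state, length n
    annotations:   one encoder vector of length m per source word
    W (d x n), U (d x m), v (length d): hypothetical alignment parameters
    """
    def matvec(matrix, vec):
        return [sum(a * b for a, b in zip(row, vec)) for row in matrix]

    # The decoder-state projection is shared across all source positions.
    projected_state = matvec(W, decoder_state)

    # Assumed additive score: v . tanh(W s + U h_j) for each annotation h_j.
    scores = []
    for h in annotations:
        projected_h = matvec(U, h)
        hidden = [math.tanh(a + b) for a, b in zip(projected_state, projected_h)]
        scores.append(sum(a * b for a, b in zip(v, hidden)))

    # Softmax over source positions (max subtracted for numerical stability).
    top = max(scores)
    exps = [math.exp(s - top) for s in scores]
    total = sum(exps)
    weights = [e / total for e in exps]

    # Context vector: attention-weighted sum of the annotations.
    dim = len(annotations[0])
    context = [sum(w * h[k] for w, h in zip(weights, annotations))
               for k in range(dim)]
    return weights, context


def random_matrix(rows, cols, rng):
    return [[rng.uniform(-1.0, 1.0) for _ in range(cols)] for _ in range(rows)]


# Toy sizes, chosen only for illustration.
rng = random.Random(0)
n, m, d, src_len = 4, 6, 5, 3
W = random_matrix(d, n, rng)
U = random_matrix(d, m, rng)
v = [rng.uniform(-1.0, 1.0) for _ in range(d)]
state = [rng.uniform(-1.0, 1.0) for _ in range(n)]
annotations = random_matrix(src_len, m, rng)

weights, context = soft_attention(state, annotations, W, U, v)
```

Because the weights are positive and sum to one, the context vector varies with each decoding step instead of being a single fixed-length summary of the whole sentence, which is the bottleneck the abstract conjectures. Plotting `weights` per target word would give the kind of soft alignment the abstract mentions.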
Benchmarks
| Benchmark | Methodology | Metrics |
|---|---|---|
| bangla-spelling-error-correction-on-dpcspell | GRUSeq2Seq | Exact Match Accuracy: 75.56 |
| dialogue-generation-on-persona-chat-1 | Seq2Seq + Attention | Avg F1: 16.18 |
| machine-translation-on-iwslt2015-german | Bi-GRU (MLE+SLE) | BLEU score: 28.53 |
| machine-translation-on-wmt2014-english-french | RNN-search50* | BLEU score: 36.2 |