Mikel Artetxe; Gorka Labaka; Eneko Agirre

Abstract
While modern machine translation has relied on large parallel corpora, a recent line of work has managed to train Neural Machine Translation (NMT) systems from monolingual corpora only (Artetxe et al., 2018c; Lample et al., 2018). Despite the potential of this approach for low-resource settings, existing systems are far behind their supervised counterparts, limiting their practical interest. In this paper, we propose an alternative approach based on phrase-based Statistical Machine Translation (SMT) that significantly closes the gap with supervised systems. Our method profits from the modular architecture of SMT: we first induce a phrase table from monolingual corpora through cross-lingual embedding mappings, combine it with an n-gram language model, and fine-tune hyperparameters through an unsupervised MERT variant. In addition, iterative backtranslation improves results further, yielding, for instance, 14.08 and 26.22 BLEU points in WMT 2014 English-German and English-French, respectively, an improvement of more than 7-10 BLEU points over previous unsupervised systems, and closing the gap with supervised SMT (Moses trained on Europarl) down to 2-5 BLEU points. Our implementation is available at https://github.com/artetxem/monoses
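The phrase-table induction step mentioned in the abstract can be illustrated with a small sketch. It assumes the source and target embeddings have already been mapped into a shared space, and it scores candidate translations with a softmax over cosine similarities. The toy vectors, the `temperature` and `top_k` values, and the function names are all hypothetical choices made for illustration; this is not the monoses implementation, and the abstract itself does not specify the scoring function.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def induce_phrase_table(src_emb, tgt_emb, temperature=0.1, top_k=2):
    """Build a toy phrase table from embeddings in a shared space.

    For each source entry, keep the top_k most similar target entries and
    turn their cosine similarities into scores with a temperature softmax.
    (Assumed scoring scheme, for illustration only.)
    """
    table = {}
    for src, u in src_emb.items():
        sims = {tgt: cosine(u, v) for tgt, v in tgt_emb.items()}
        best = sorted(sims, key=sims.get, reverse=True)[:top_k]
        # Subtract the maximum before exponentiating for numerical stability.
        top = max(sims[t] for t in best)
        exps = {t: math.exp((sims[t] - top) / temperature) for t in best}
        total = sum(exps.values())
        table[src] = {t: exps[t] / total for t in best}
    return table


# Hypothetical toy embeddings, assumed to be already cross-lingually mapped.
SRC = {"house": [0.9, 0.1, 0.0], "dog": [0.0, 0.2, 0.9]}
TGT = {
    "maison": [0.8, 0.2, 0.1],
    "chien": [0.1, 0.1, 0.95],
    "chat": [0.2, 0.3, 0.8],
}

PHRASE_TABLE = induce_phrase_table(SRC, TGT)

if __name__ == "__main__":
    for src, candidates in PHRASE_TABLE.items():
        print(src, candidates)
```

In a full SMT pipeline, scores like these would be only one feature among several: the abstract describes combining the induced phrase table with an n-gram language model and tuning the weights with an unsupervised MERT variant.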
Benchmarks
| Benchmark | Methodology | Metrics |
|---|---|---|
| machine-translation-on-wmt2014-english-french | SMT + iterative backtranslation (unsupervised) | BLEU score: 26.22 |
| machine-translation-on-wmt2014-english-german | SMT + iterative backtranslation (unsupervised) | BLEU score: 14.08 |
| machine-translation-on-wmt2014-french-english | SMT + iterative backtranslation (unsupervised) | BLEU score: 25.87 |
| machine-translation-on-wmt2014-german-english | SMT + iterative backtranslation (unsupervised) | BLEU score: 17.43 |
| machine-translation-on-wmt2016-english-german | SMT + iterative backtranslation (unsupervised) | BLEU score: 18.23 |
| machine-translation-on-wmt2016-german-english | SMT + iterative backtranslation (unsupervised) | BLEU score: 23.05 |
| unsupervised-machine-translation-on-wmt2014-1 | SMT | BLEU: 25.9 |