
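Abstract
Most sequence-to-sequence (seq2seq) models are autoregressive: they generate each token by conditioning on the tokens generated before it. Non-autoregressive seq2seq models, in contrast, generate all tokens in a single forward pass, which allows parallel processing on hardware such as GPUs and makes generation considerably more efficient. However, directly modeling the joint distribution of all tokens at once is challenging, and even with increasingly complex model structures, accuracy still lags significantly behind autoregressive models. This paper proposes a simple, efficient, and effective method for non-autoregressive sequence generation based on latent variable models. Specifically, we turn to generative flow, an elegant technique for modeling complex distributions with neural networks, and design several layers of flow tailored to modeling the conditional density of sequential latent variables. We evaluate the model on three neural machine translation (NMT) benchmark datasets. It achieves performance comparable to state-of-the-art non-autoregressive NMT models, with decoding time that is almost constant with respect to sequence length.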
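To make the idea of generative flow concrete, the following is a minimal NumPy sketch of a single affine coupling layer, one common kind of flow layer. It is an illustration under stated assumptions, not code from XuezheMax/flowseq: the abstract does not say which flow layers the paper uses, and the names `affine_coupling_forward`, `affine_coupling_inverse`, `w_s`, and `w_t` are hypothetical. The sketch shows the two properties a flow relies on: the transformation is exactly invertible, and the log-density of its output follows from the change-of-variables formula, log p(y) = log p(z) - log|det(dy/dz)|. All sequence positions are transformed at once, with no loop over time steps.

```python
import numpy as np

def affine_coupling_forward(z, w_s, w_t):
    """Map z -> y for a latent sequence z of shape (T, D).

    The first half of the features is left unchanged and parameterizes an
    elementwise scale and shift applied to the second half.
    Returns y and log|det(dy/dz)| for the whole sequence.
    """
    d = z.shape[-1] // 2
    z_a, z_b = z[:, :d], z[:, d:]
    log_s = np.tanh(z_a @ w_s)   # bounded log-scale (hypothetical linear "network")
    t = z_a @ w_t                # shift
    y_b = z_b * np.exp(log_s) + t
    y = np.concatenate([z_a, y_b], axis=-1)
    return y, log_s.sum()        # Jacobian is triangular, so log-det is a plain sum

def affine_coupling_inverse(y, w_s, w_t):
    """Exact inverse of affine_coupling_forward."""
    d = y.shape[-1] // 2
    y_a, y_b = y[:, :d], y[:, d:]
    log_s = np.tanh(y_a @ w_s)
    t = y_a @ w_t
    z_b = (y_b - t) * np.exp(-log_s)
    return np.concatenate([y_a, z_b], axis=-1)

def standard_normal_logpdf(x):
    """Log-density of x under an isotropic standard normal, summed over all entries."""
    return -0.5 * (x ** 2 + np.log(2 * np.pi)).sum()

rng = np.random.default_rng(0)
T, D = 5, 4                                   # toy sequence length and latent size
w_s = rng.normal(scale=0.1, size=(D // 2, D // 2))
w_t = rng.normal(scale=0.1, size=(D // 2, D // 2))

eps = rng.normal(size=(T, D))                 # base sample for all T positions at once
y, log_det = affine_coupling_forward(eps, w_s, w_t)

# Change of variables: log p(y) = log p(eps) - log|det(dy/deps)|
log_p_y = standard_normal_logpdf(eps) - log_det

recovered = affine_coupling_inverse(y, w_s, w_t)
```

Stacking several such layers, with the split between the two halves varied from layer to layer, is the usual way to build a more expressive density while keeping both sampling and density evaluation parallel across positions.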
Code Repositories
XuezheMax/flowseq (official, PyTorch)
Mentioned on GitHub
keonlee9420/VAENAR-TTS (PyTorch)
Benchmarks
| Benchmark | Method | Metric |
|---|---|---|
| machine-translation-on-iwslt2015-german | FlowSeq-base | BLEU score: 24.75 | 
| machine-translation-on-wmt2014-english-german | FlowSeq-large (IWD n = 15) | BLEU score: 22.94 |
| machine-translation-on-wmt2014-english-german | FlowSeq-base | BLEU score: 18.55 |
| machine-translation-on-wmt2014-english-german | FlowSeq-large (NPD n = 15) | BLEU score: 23.14 |
| machine-translation-on-wmt2014-english-german | FlowSeq-large (NPD n = 30) | BLEU score: 23.64 |
| machine-translation-on-wmt2014-english-german | FlowSeq-large | BLEU score: 20.85 |
| machine-translation-on-wmt2014-german-english | FlowSeq-large (NPD n = 15) | BLEU score: 27.71 | 
| machine-translation-on-wmt2014-german-english | FlowSeq-large | BLEU score: 25.4 | 
| machine-translation-on-wmt2014-german-english | FlowSeq-base | BLEU score: 23.36 | 
| machine-translation-on-wmt2014-german-english | FlowSeq-large (IWD n = 15) | BLEU score: 27.16 |
| machine-translation-on-wmt2014-german-english | FlowSeq-large (NPD n = 30) | BLEU score: 28.29 | 
| machine-translation-on-wmt2016-english-1 | FlowSeq-large (NPD n = 15) | BLEU score: 31.97 |
| machine-translation-on-wmt2016-english-1 | FlowSeq-base | BLEU score: 29.26 | 
| machine-translation-on-wmt2016-english-1 | FlowSeq-large | BLEU score: 29.86 | 
| machine-translation-on-wmt2016-english-1 | FlowSeq-large (NPD n = 30) | BLEU score: 32.35 | 
| machine-translation-on-wmt2016-english-1 | FlowSeq-large (IWD n = 15) | BLEU score: 31.08 | 
| machine-translation-on-wmt2016-romanian | FlowSeq-large (IWD n = 15) | BLEU score: 32.03 | 
| machine-translation-on-wmt2016-romanian | FlowSeq-large (NPD n = 30) | BLEU score: 32.91 | 
| machine-translation-on-wmt2016-romanian | FlowSeq-large (NPD n = 15) | BLEU score: 32.46 | 
| machine-translation-on-wmt2016-romanian | FlowSeq-large | BLEU score: 30.69 | 
| machine-translation-on-wmt2016-romanian | FlowSeq-base | BLEU score: 30.16 |