Sequence-to-Sequence Learning as Beam-Search Optimization

Sam Wiseman; Alexander M. Rush


Abstract

Sequence-to-Sequence (seq2seq) modeling has rapidly become an important general-purpose NLP tool that has proven effective for many text-generation and sequence-labeling tasks. Seq2seq builds on deep neural language modeling and inherits its remarkable accuracy in estimating local, next-word distributions. In this work, we introduce a model and beam-search training scheme, based on the work of Daumé III and Marcu (2005), that extends seq2seq to learn global sequence scores. This structured approach avoids classical biases associated with local training and unifies the training loss with the test-time usage, while preserving the proven model architecture of seq2seq and its efficient training approach. We show that our system outperforms a highly-optimized attention-based seq2seq system and other baselines on three different sequence-to-sequence tasks: word ordering, parsing, and machine translation.
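The training scheme described in the abstract is built around beam search, the standard inference procedure for seq2seq models. As context, here is a minimal sketch of plain beam-search decoding over a generic scoring function; the function names, the toy vocabulary, and the scoring interface are illustrative, not the paper's actual implementation (which additionally backpropagates a margin loss through the search).

```python
from typing import Callable, List, Tuple


def beam_search(
    score_fn: Callable[[List[int], int], float],
    vocab_size: int,
    beam_size: int,
    max_len: int,
    eos: int,
) -> Tuple[List[int], float]:
    """Return the highest-scoring sequence found by beam search.

    score_fn(prefix, token) is the (log-domain) score of appending
    `token` to `prefix`; a sequence's score is the sum over its steps.
    """
    beam = [([], 0.0)]  # each hypothesis: (token sequence, cumulative score)
    finished = []
    for _ in range(max_len):
        # Expand every live hypothesis by every vocabulary token.
        candidates = [
            (seq + [tok], score + score_fn(seq, tok))
            for seq, score in beam
            for tok in range(vocab_size)
        ]
        # Keep the top-k candidates; set completed hypotheses aside.
        candidates.sort(key=lambda h: h[1], reverse=True)
        beam = []
        for seq, score in candidates:
            if seq[-1] == eos:
                finished.append((seq, score))
            else:
                beam.append((seq, score))
            if len(beam) == beam_size:
                break
        if not beam:
            break
    finished.extend(beam)  # include hypotheses cut off at max_len
    return max(finished, key=lambda h: h[1])


# Toy usage: 3-token vocabulary {0, 1, 2} with eos = 2 and per-token
# scores that ignore the prefix, so the best sequence repeats token 1.
seq, score = beam_search(
    score_fn=lambda prefix, tok: [0.1, 0.5, 0.3][tok],
    vocab_size=3,
    beam_size=2,
    max_len=3,
    eos=2,
)
# seq == [1, 1, 1], score == 1.5
```

The key property the paper exploits is that this procedure scores whole sequences, not individual next-word choices, which is what lets a margin-based training loss compare the gold sequence against the beam's contents.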

Code Repositories

facebookresearch/fairseq (PyTorch)
juliakreutzer/joeynmt (PyTorch)
joeynmt/joeynmt (PyTorch)

Benchmarks

Benchmark | Methodology | Metrics
machine-translation-on-iwslt2015-german | Word-level CNN w/attn, input feeding | BLEU score: 24.0
