Momentum Calibration for Text Generation

Xingxing Zhang, Yiran Liu, Xun Wang, Pengcheng He, Yang Yu, Si-Qing Chen, Wayne Xiong, Furu Wei

Abstract

The input and output of most text generation tasks can be transformed into two sequences of tokens, which can be modeled with sequence-to-sequence learning tools such as Transformers. These models are usually trained by maximizing the likelihood of the output text sequence, assuming the input sequence and all gold preceding tokens are given during training; during inference, however, the model suffers from the exposure bias problem (i.e., during beam search it only has access to its previously predicted tokens rather than gold tokens). In this paper, we propose MoCa (Momentum Calibration) for text generation. MoCa is an online method that dynamically generates slowly evolving (but consistent) samples using a momentum moving average generator with beam search, and MoCa learns to align its model scores of these samples with their actual qualities. Experiments on four text generation datasets (i.e., CNN/DailyMail, XSum, SAMSum and Gigaword) show that MoCa consistently improves strong pre-trained Transformers trained with vanilla fine-tuning, and we achieve state-of-the-art results on the CNN/DailyMail and SAMSum datasets.
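
The abstract names two ingredients: a momentum moving average generator that produces slowly evolving candidate samples, and a calibration objective that aligns model scores with candidate quality. The sketch below illustrates one plausible reading of these ingredients in PyTorch; the EMA decay value, the pairwise margin form of the loss, and all function names here are illustrative assumptions rather than the paper's exact formulation.

```python
# Minimal sketch of the two ingredients described in the abstract, under
# the assumptions stated above:
# (1) a momentum (exponential-moving-average) copy of the model that
#     generates slowly evolving candidate samples via beam search, and
# (2) a calibration loss that aligns model scores of those candidates
#     with their actual quality (e.g., ROUGE against the reference).

import torch


@torch.no_grad()
def momentum_update(model, momentum_model, decay=0.999):
    """EMA update: the momentum generator drifts slowly toward the
    online model, so the candidate distribution evolves consistently.
    The decay value 0.999 is an illustrative assumption."""
    for p_online, p_ema in zip(model.parameters(),
                               momentum_model.parameters()):
        p_ema.mul_(decay).add_(p_online, alpha=1.0 - decay)


def calibration_loss(scores, qualities, margin=0.01):
    """Pairwise margin loss, one plausible instantiation of "aligning
    model scores with actual qualities": whenever candidate i has higher
    quality than candidate j, its model score should exceed j's by a
    rank-dependent margin.

    scores:    (num_candidates,) model log-probabilities of candidates
    qualities: (num_candidates,) quality of each candidate, e.g. ROUGE
    """
    order = torch.argsort(qualities, descending=True)
    scores = scores[order]  # model scores sorted by true quality
    loss = scores.new_zeros(())
    n = scores.size(0)
    for i in range(n):
        for j in range(i + 1, n):
            # Candidate i outranks j in quality, so penalize the model
            # if its score gap falls below the rank-dependent margin.
            loss = loss + torch.clamp(
                margin * (j - i) - (scores[i] - scores[j]), min=0.0)
    return loss
```

In a training loop, one would periodically decode candidates with the momentum generator via beam search, score them with the online model, measure their quality against the reference, minimize this calibration loss (typically alongside the usual likelihood objective), and then call momentum_update.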

Benchmarks

Benchmark: text-summarization-on-samsum-corpus
Methodology: MoCa
Metrics:
  ROUGE-1: 55.13
  ROUGE-2: 30.57
  ROUGE-L: 50.88
