Ankur P. Parikh, Xuezhi Wang, Sebastian Gehrmann, Manaal Faruqui, Bhuwan Dhingra, Diyi Yang, Dipanjan Das

Abstract
We present ToTTo, an open-domain English table-to-text dataset with over 120,000 training examples that proposes a controlled generation task: given a Wikipedia table and a set of highlighted table cells, produce a one-sentence description. To obtain generated targets that are natural but also faithful to the source table, we introduce a dataset construction process where annotators directly revise existing candidate sentences from Wikipedia. We present systematic analyses of our dataset and annotation process as well as results achieved by several state-of-the-art baselines. While usually fluent, existing methods often hallucinate phrases that are not supported by the table, suggesting that this dataset can serve as a useful research benchmark for high-precision conditional text generation.
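To make the task setup concrete, here is a minimal Python sketch of one task input: a table, a set of highlighted cells, and a simple way to flatten the highlighted cells into a source string for a text generator. The class name, field names, sample table, and linearization format are all illustrative assumptions, not the dataset's actual schema or the preprocessing used by the baselines.

```python
from dataclasses import dataclass


@dataclass
class TableExample:
    """Hypothetical container for one task input (not the official schema)."""
    page_title: str
    table: list[list[str]]                    # row-major grid; row 0 holds column headers
    highlighted_cells: list[tuple[int, int]]  # (row, column) indices into `table`


def linearize_highlighted(example: TableExample) -> str:
    """Flatten only the highlighted cells into 'header: value' pairs."""
    headers = example.table[0]
    parts = [f"page: {example.page_title}"]
    for row, col in example.highlighted_cells:
        parts.append(f"{headers[col]}: {example.table[row][col]}")
    return " | ".join(parts)


# Invented sample data, for illustration only.
example = TableExample(
    page_title="Example Athlete",
    table=[
        ["Year", "Competition", "Position"],
        ["2010", "World Championships", "3rd"],
        ["2012", "Olympic Games", "1st"],
    ],
    highlighted_cells=[(2, 0), (2, 1), (2, 2)],
)

source_text = linearize_highlighted(example)
print(source_text)
# page: Example Athlete | Year: 2012 | Competition: Olympic Games | Position: 1st
```

A model would then be asked to turn `source_text` into a single sentence that states only what the highlighted cells support, which is the faithfulness property the abstract emphasizes.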
Benchmarks
| Benchmark | Method | BLEU | PARENT |
|---|---|---|---|
| data-to-text-generation-on-totto | NCP+CC (Puduppully et al., 2019) | 19.2 | 29.2 |
| data-to-text-generation-on-totto | BERT-to-BERT | 44 | 52.6 |
| data-to-text-generation-on-totto | Pointer Generator | 41.6 | 51.6 |