HIBERT: Document Level Pre-training of Hierarchical Bidirectional Transformers for Document Summarization

Xingxing Zhang; Furu Wei; Ming Zhou

Abstract

Neural extractive summarization models usually employ a hierarchical encoder for document encoding, and they are trained with sentence-level labels that are created heuristically using rule-based methods. Training the hierarchical encoder with these inaccurate labels is challenging. Inspired by recent work on pre-training Transformer sentence encoders (Devlin et al., 2018), we propose HIBERT (short for HIerarchical Bidirectional Encoder Representations from Transformers) for document encoding, together with a method to pre-train it on unlabeled data. We apply the pre-trained HIBERT to our summarization model, and it outperforms its randomly initialized counterpart by 1.25 ROUGE on the CNN/DailyMail dataset and by 2.0 ROUGE on a version of the New York Times dataset. We also achieve state-of-the-art performance on these two datasets.
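The abstract describes a two-level encoder: a sentence-level Transformer turns each sentence's tokens into a sentence vector, and a document-level Transformer contextualizes those sentence vectors across the document. The PyTorch sketch below illustrates this hierarchical structure only; the class name, dimensions, and the choice of the first-token state as the sentence vector are illustrative assumptions, not the paper's released implementation.

```python
# Illustrative sketch of a hierarchical (sentence-level + document-level)
# Transformer encoder. Hyperparameters and pooling choice are assumptions.
import torch
import torch.nn as nn


class HierarchicalDocumentEncoder(nn.Module):
    """Tokens -> per-sentence Transformer -> sentence vectors -> document Transformer."""

    def __init__(self, vocab_size=30000, d_model=512, nhead=8,
                 num_sent_layers=6, num_doc_layers=6):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, d_model, padding_idx=0)
        sent_layer = nn.TransformerEncoderLayer(d_model, nhead, batch_first=True)
        doc_layer = nn.TransformerEncoderLayer(d_model, nhead, batch_first=True)
        self.sent_encoder = nn.TransformerEncoder(sent_layer, num_sent_layers)
        self.doc_encoder = nn.TransformerEncoder(doc_layer, num_doc_layers)

    def forward(self, token_ids):
        # token_ids: (batch, num_sentences, num_tokens)
        b, s, t = token_ids.shape
        x = self.embed(token_ids.view(b * s, t))   # encode every sentence independently
        x = self.sent_encoder(x)                   # contextual token states within a sentence
        sent_vecs = x[:, 0, :].view(b, s, -1)      # first-token state used as sentence vector
        return self.doc_encoder(sent_vecs)         # document-aware sentence representations


if __name__ == "__main__":
    encoder = HierarchicalDocumentEncoder()
    doc = torch.randint(1, 30000, (2, 5, 20))  # 2 documents, 5 sentences, 20 tokens each
    print(encoder(doc).shape)                  # torch.Size([2, 5, 512])
```

In an extractive summarizer, a classification layer on top of these document-aware sentence representations would score each sentence for inclusion in the summary.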

Benchmarks

Benchmark: Extractive Document Summarization on CNN/DailyMail
Methodology: HIBERT
Metrics:
  ROUGE-1: 42.37
  ROUGE-2: 19.95
  ROUGE-L: 38.83
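The reported numbers are standard ROUGE F1 scores. The snippet below is a minimal sketch of how ROUGE-1/2/L are commonly computed with the `rouge-score` package; it is not necessarily the exact evaluation toolkit used in the paper, and the texts are hypothetical.

```python
# Minimal ROUGE-1/2/L example using Google's rouge-score package
# (pip install rouge-score). Texts below are hypothetical placeholders.
from rouge_score import rouge_scorer

reference = "the cat sat on the mat and watched the birds outside"
extracted = "the cat sat on the mat"  # hypothetical extracted summary

scorer = rouge_scorer.RougeScorer(["rouge1", "rouge2", "rougeL"], use_stemmer=True)
scores = scorer.score(reference, extracted)
for name, score in scores.items():
    print(f"{name}: F1 = {score.fmeasure:.4f}")
```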
