Paragraph-based Transformer Pre-training for Multi-Sentence Inference

Luca Di Liello, Siddhant Garg, Luca Soldaini, Alessandro Moschitti

Abstract

Inference tasks such as answer sentence selection (AS2) or fact verification are typically solved by fine-tuning transformer-based models as individual sentence-pair classifiers. Recent studies show that these tasks benefit from jointly modeling dependencies across multiple candidate sentences. In this paper, we first show that popular pre-trained transformers perform poorly when fine-tuned on multi-candidate inference tasks. We then propose a new pre-training objective that models paragraph-level semantics across multiple input sentences. Our evaluation on three AS2 datasets and one fact verification dataset demonstrates the superiority of our pre-training technique over traditional objectives, both for transformers used as joint models over multiple candidates and for transformers used as cross-encoders on sentence-pair formulations of these tasks. Our code and pre-trained models are released at https://github.com/amazon-research/wqa-multi-sentence-inference.
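
A minimal sketch of the joint formulation mentioned in the abstract: a question and k candidate sentences are packed into a single transformer input, so self-attention can model dependencies across candidates, and each candidate is scored from the hidden state at its boundary token. This is an illustration under assumptions, not the authors' exact architecture; the class name `JointCandidateScorer` and the separator-token pooling are illustrative choices.

```python
# Hypothetical sketch of a joint multi-candidate scorer (not the paper's
# exact model). A question and its candidates are encoded in one pass so
# that candidates can attend to each other.
import torch
import torch.nn as nn
from transformers import AutoModel, AutoTokenizer

class JointCandidateScorer(nn.Module):
    def __init__(self, model_name="roberta-base"):
        super().__init__()
        self.encoder = AutoModel.from_pretrained(model_name)
        self.classifier = nn.Linear(self.encoder.config.hidden_size, 1)

    def forward(self, input_ids, attention_mask, candidate_positions):
        # Single forward pass over "question </s> cand_1 </s> ... cand_k".
        hidden = self.encoder(input_ids=input_ids,
                              attention_mask=attention_mask).last_hidden_state
        # Score each candidate from the hidden state at its boundary token
        # (batch size 1 assumed to keep the sketch short).
        cand_states = hidden[0, candidate_positions]      # (k, hidden_size)
        return self.classifier(cand_states).squeeze(-1)   # (k,) relevance logits

tokenizer = AutoTokenizer.from_pretrained("roberta-base")
question = "Who wrote Hamlet?"
candidates = [
    "Hamlet was written by William Shakespeare.",
    "Hamlet is set in Denmark.",
    "Macbeth is another tragedy.",
]

# Pack question and candidates into one sequence, separated by </s>.
text = question + "".join(tokenizer.sep_token + c for c in candidates)
enc = tokenizer(text, return_tensors="pt", truncation=True)

# The separator preceding each candidate serves as its boundary position.
sep_positions = (enc.input_ids[0] == tokenizer.sep_token_id).nonzero(as_tuple=True)[0]
positions = sep_positions[: len(candidates)]

model = JointCandidateScorer()
with torch.no_grad():
    scores = model(enc.input_ids, enc.attention_mask, positions)
print(scores)  # one (untrained) relevance logit per candidate
```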

Code Repositories

amazon-research/wqa-multi-sentence-inference (official implementation, PyTorch)
https://github.com/amazon-research/wqa-multi-sentence-inference

Benchmarks

| Benchmark | Methodology | Metrics |
| --- | --- | --- |
| answer-selection-on-asnq | RoBERTa-Base Joint MSPP | MAP: 0.673, MRR: 0.737 |
| fact-verification-on-fever | RoBERTa-Base Joint MSPP | Accuracy: 74.39 |
| fact-verification-on-fever | RoBERTa-Base Joint MSPP Flexible | Accuracy: 75.36 |
| question-answering-on-trecqa | RoBERTa-Base Joint + MSPP | MAP: 0.911, MRR: 0.952 |
| question-answering-on-wikiqa | RoBERTa-Base Joint MSPP | MAP: 0.887, MRR: 0.900 |
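
The MAP and MRR values above are standard ranking metrics for AS2: for each question, candidates are sorted by model score; MRR takes the reciprocal rank of the first correct candidate, MAP averages the precision at every correct position, and both are then averaged over all questions. A minimal reference implementation of the per-question quantities (textbook definitions, not necessarily the exact evaluation script behind these leaderboards):

```python
# Per-question ranking metrics. Inputs are 0/1 relevance labels of the
# candidates, already sorted by descending model score. MAP and MRR are
# the means of these values over all questions.

def reciprocal_rank(relevance):
    """1/rank of the first correct candidate, 0 if none is correct."""
    for rank, rel in enumerate(relevance, start=1):
        if rel:
            return 1.0 / rank
    return 0.0

def average_precision(relevance):
    """Mean of precision@k over the ranks k holding a correct candidate."""
    hits, total = 0, 0.0
    for rank, rel in enumerate(relevance, start=1):
        if rel:
            hits += 1
            total += hits / rank
    return total / hits if hits else 0.0

labels = [0, 1, 0, 1]             # 1 = correct answer sentence
print(reciprocal_rank(labels))    # 0.5 -> first hit at rank 2
print(average_precision(labels))  # 0.5 -> (1/2 + 2/4) / 2
```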
