3 months ago

Prompting for explanations improves Adversarial NLI. Is this true? {Yes} it is {true} because {it weakens superficial cues}

{Kentaro Inui Benjamin Heinzerling Ana Brassard Pride Kavumba}

Abstract

Explanation prompts ask language models to not only assign a particular label to a giveninput, such as true, entailment, or contradiction in the case of natural language inference but also to generate a free-text explanation that supports this label. For example: “This is label because explanation.” While this type of prompt was originally introduced with the aim of improving model interpretability, we showhere that explanation prompts also improve robustness to adversarial perturbations in naturallanguage inference benchmarks. Compared to prompting for labels only, explanation prompting consistently yields stronger performance on adversarial benchmarks, outperforming the state of the art on Adversarial Natural Language Inference, Counterfactually-Augmented Natural Language Inference, and SNLI-Hard datasets. We argue that the increase in robustness is due to the fact that prompting for explanations weakens superficial cues. Specifically, single tokens that are highly predictive of the correct answer in the label-only setting become uninformative when the model also has to generate explanations.

Benchmarks

Benchmark	Methodology	Metrics
natural-language-inference-on-anli-test	T0-11B (explanation prompting)	A1: 75.6 A2: 60.6 A3: 59.9
natural-language-inference-on-anli-test	T5-3B (explanation prompting)	A1: 81.8 A2: 72.5 A3: 74.8

Build AI with AI

From idea to launch — accelerate your AI development with free AI co-coding, out-of-the-box environment and best price of GPUs.

AI Co-coding

Ready-to-use GPUs

Best Pricing

Get Started

Hyper Newsletters

Subscribe to our latest updates

We will deliver the latest updates of the week to your inbox at nine o'clock every Monday morning