InfoBERT: Improving Robustness of Language Models from An Information Theoretic Perspective

Boxin Wang, Shuohang Wang, Yu Cheng, Zhe Gan, Ruoxi Jia, Bo Li, Jingjing Liu

Abstract

Large-scale language models such as BERT have achieved state-of-the-art performance across a wide range of NLP tasks. Recent studies, however, show that such BERT-based models are vulnerable to textual adversarial attacks. We aim to address this problem from an information-theoretic perspective and propose InfoBERT, a novel learning framework for robust fine-tuning of pre-trained language models. InfoBERT contains two mutual-information-based regularizers for model training: (i) an Information Bottleneck regularizer, which suppresses noisy mutual information between the input and the feature representation; and (ii) a Robust Feature regularizer, which increases the mutual information between local robust features and global features. We provide a principled way to theoretically analyze and improve the robustness of representation learning for language models in both standard and adversarial training. Extensive experiments demonstrate that InfoBERT achieves state-of-the-art robust accuracy on several adversarial datasets for Natural Language Inference (NLI) and Question Answering (QA) tasks. Our code is available at https://github.com/AI-secure/InfoBERT.

Code Repositories

facebookresearch/anli (PyTorch, mentioned in GitHub)
AI-secure/InfoBERT (PyTorch, official)

Benchmarks

Benchmark                                    Methodology          Metrics
natural-language-inference-on-anli-test      InfoBERT (RoBERTa)   A1: 75, A2: 50.5, A3: 47.7
