
Toward Efficient Language Model Pretraining and Downstream Adaptation via Self-Evolution: A Case Study on SuperGLUE

Qihuang Zhong Liang Ding Yibing Zhan Yu Qiao Yonggang Wen Li Shen Juhua Liu Baosheng Yu Bo Du Yixin Chen

Abstract

This technical report briefly describes our JDExplore d-team's Vega v2 submission on the SuperGLUE leaderboard. SuperGLUE is more challenging than the widely used general language understanding evaluation (GLUE) benchmark, containing eight difficult language understanding tasks, including question answering, natural language inference, word sense disambiguation, coreference resolution, and reasoning. [Method] Instead of arbitrarily increasing the size of a pretrained language model (PLM), our aim is to 1) fully extract knowledge from the input pretraining data given a certain parameter budget, e.g., 6B, and 2) effectively transfer this knowledge to downstream tasks. To achieve goal 1), we propose self-evolution learning for PLMs to wisely predict the informative tokens that should be masked, and supervise the masked language modeling (MLM) process with rectified smooth labels. For goal 2), we leverage the prompt transfer technique to improve the low-resource tasks by transferring the knowledge from the foundation model and related downstream tasks to the target task. [Results] According to our submission record (Oct. 2022), with our optimized pretraining and fine-tuning strategies, our 6B Vega method achieved new state-of-the-art performance on 4/8 tasks, sitting atop the SuperGLUE leaderboard on Oct. 8, 2022, with an average score of 91.3.
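The abstract names two ingredients for goal 1) without giving formulas: choosing informative tokens to mask, and training against rectified smooth labels. The following is a minimal sketch of one possible reading, not the authors' implementation. It assumes that "informative" means "the model assigns low probability to the true token" and that a rectified smooth label mixes the one-hot target with the model's own predicted distribution. The function names, the mixing weight `alpha`, and the toy logits are all hypothetical.

```python
import math

def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]

def select_tokens_to_mask(per_token_logits, token_ids, mask_ratio=0.15):
    """Assumption: mask the positions where the model is least confident
    in the true token, instead of sampling positions uniformly."""
    confidences = [softmax(logits)[tok]
                   for logits, tok in zip(per_token_logits, token_ids)]
    k = max(1, round(mask_ratio * len(token_ids)))
    ranked = sorted(range(len(token_ids)), key=lambda i: confidences[i])
    return sorted(ranked[:k])

def rectified_smooth_label(logits, true_id, alpha=0.1):
    """Assumption: the soft target is (1 - alpha) * one-hot + alpha * model prediction."""
    target = [alpha * p for p in softmax(logits)]
    target[true_id] += 1.0 - alpha
    return target

def soft_cross_entropy(logits, target):
    """Cross-entropy between a soft target and the model's distribution."""
    probs = softmax(logits)
    return -sum(t * math.log(p) for t, p in zip(target, probs) if t > 0)

# Toy example: 3 positions, vocabulary of 4 tokens (made-up numbers).
toy_logits = [[4.0, 0.0, 0.0, 0.0],
              [0.1, 0.0, 0.2, 0.0],
              [0.0, 0.0, 3.0, 0.0]]
toy_token_ids = [0, 1, 2]

masked = select_tokens_to_mask(toy_logits, toy_token_ids, mask_ratio=0.34)
target = rectified_smooth_label(toy_logits[masked[0]], toy_token_ids[masked[0]])
loss = soft_cross_entropy(toy_logits[masked[0]], target)
print(masked, [round(t, 3) for t in target], round(loss, 3))
```

In this toy input the model is confident at the first and third positions and nearly uniform at the second, so only the second position is selected for masking.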

Benchmarks

Benchmark	Methodology	Metrics
common-sense-reasoning-on-record	Vega v2 6B (fine-tuned)	EM: 93.9 F1: 94.4
common-sense-reasoning-on-record	Turing NLR v5 XXL 5.4B (fine-tuned)	EM: 95.9 F1: 96.4
coreference-resolution-on-winograd-schema	Turing NLR v5 XXL 5.4B (fine-tuned)	Accuracy: 97.3
coreference-resolution-on-winograd-schema	Vega v2 6B (KD-based prompt transfer)	Accuracy: 98.6
natural-language-inference-on-commitmentbank	Turing NLR v5 XXL 5.4B (fine-tuned)	Accuracy: 97.6 F1: 95.9
natural-language-inference-on-commitmentbank	Vega v2 6B (KD-based prompt transfer)	Accuracy: 99.2 F1: 98.6
natural-language-inference-on-rte	Vega v2 6B (KD-based prompt transfer)	Accuracy: 96
natural-language-inference-on-rte	Turing NLR v5 XXL 5.4B (fine-tuned)	Accuracy: 94.1
question-answering-on-boolq	Vega v2 6B (fine-tuned)	Accuracy: 90.5
question-answering-on-boolq	Turing NLR v5 XXL 5.4B (fine-tuned)	Accuracy: 92
question-answering-on-copa	Vega v2 6B (KD-based prompt transfer)	Accuracy: 99.4
question-answering-on-copa	Turing NLR v5 XXL 5.4B (fine-tuned)	Accuracy: 98.2
question-answering-on-multirc	Turing NLR v5 XXL 5.4B (fine-tuned)	EM: 63 F1: 88.4
question-answering-on-multirc	Vega v2 6B (fine-tuned)	EM: 62.4 F1: 88.2
word-sense-disambiguation-on-words-in-context	Vega v2 6B (fine-tuned)	Accuracy: 77.4
word-sense-disambiguation-on-words-in-context	Turing NLR v5 XXL 5.4B (fine-tuned)	Accuracy: 77.1
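The average score of 91.3 quoted in the abstract can be checked against the Vega v2 rows in the table. The sketch below assumes the usual SuperGLUE convention: tasks that report two metrics are first averaged within the task, and the eight task scores are then averaged without weighting. The short task names and the helper function are illustrative, and the numbers are copied from the table.

```python
# Vega v2 scores from the benchmark table; two-metric tasks list both values.
vega_v2_scores = {
    "BoolQ": [90.5],
    "CB": [99.2, 98.6],       # Accuracy, F1
    "COPA": [99.4],
    "MultiRC": [62.4, 88.2],  # EM, F1
    "ReCoRD": [93.9, 94.4],   # EM, F1
    "RTE": [96.0],
    "WiC": [77.4],
    "WSC": [98.6],
}

def superglue_style_average(scores):
    """Average metrics within each task, then average across tasks."""
    per_task = [sum(metrics) / len(metrics) for metrics in scores.values()]
    return sum(per_task) / len(per_task)

average = superglue_style_average(vega_v2_scores)
print(f"{average:.1f}")  # prints 91.3
```

The unrounded value is about 91.28, which rounds to the reported 91.3.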
