ZeroGen: Efficient Zero-shot Learning via Dataset Generation

Jiacheng Ye Jiahui Gao Qintong Li Hang Xu Jiangtao Feng Zhiyong Wu Tao Yu Lingpeng Kong

Abstract

There is growing interest in dataset generation due to the superior generative capacity of large pre-trained language models (PLMs). In this paper, we study a flexible and efficient zero-shot learning method, ZeroGen. Given a zero-shot task, we first generate a dataset from scratch using PLMs in an unsupervised manner. Then, we train a tiny task model (e.g., an LSTM) under the supervision of the synthesized dataset. This approach allows highly efficient inference, as the final task model has orders of magnitude fewer parameters than the PLM (e.g., GPT2-XL). Apart from being annotation-free and efficient, we argue that ZeroGen can also provide useful insights from the perspectives of data-free, model-agnostic knowledge distillation and unreferenced text generation evaluation. Experiments and analysis on different NLP tasks, namely text classification, question answering, and natural language inference, show the effectiveness of ZeroGen.
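The two-stage pipeline the abstract describes (prompt a PLM per label to synthesize a dataset, then train a small task model on it) can be sketched as below. This is a minimal illustration, not the paper's implementation: `sample_from_plm` is a hypothetical stand-in for decoding from a real PLM (e.g., GPT2-XL via a text-generation API) and here just returns canned text so the sketch runs without model weights, and the tiny task model is a naive Bayes classifier rather than the LSTM used in the paper.

```python
import math
from collections import Counter, defaultdict

# Stage 1 (sketch): label-conditioned prompts drive the PLM to synthesize
# labeled examples without any human annotation.
PROMPTS = {
    "positive": "The movie review in positive sentiment is:",
    "negative": "The movie review in negative sentiment is:",
}

# Canned completions standing in for PLM samples (assumption for this sketch).
CANNED = {
    "positive": ["a wonderful heartfelt film", "great acting and a brilliant story"],
    "negative": ["a dull boring mess", "terrible pacing and awful dialogue"],
}

def sample_from_plm(prompt, label, i):
    # Placeholder: a real ZeroGen run would sample a continuation of `prompt`
    # from a large PLM; here we cycle through canned text deterministically.
    return CANNED[label][i % len(CANNED[label])]

def synthesize_dataset(n_per_label=8):
    data = []
    for label, prompt in PROMPTS.items():
        for i in range(n_per_label):
            data.append((sample_from_plm(prompt, label, i), label))
    return data

class TinyTaskModel:
    # Stage 2: a deliberately small model (naive Bayes with add-one smoothing),
    # echoing the point that the task model can be orders of magnitude
    # smaller than the PLM that generated its training data.
    def fit(self, data):
        self.word_counts = defaultdict(Counter)
        for text, label in data:
            self.word_counts[label].update(text.split())
        return self

    def predict(self, text):
        def log_score(label):
            counts = self.word_counts[label]
            total = sum(counts.values()) + len(counts)
            return sum(math.log((counts[w] + 1) / total) for w in text.split())
        return max(self.word_counts, key=log_score)

model = TinyTaskModel().fit(synthesize_dataset())
print(model.predict("a brilliant heartfelt story"))  # -> positive
print(model.predict("dull awful dialogue"))          # -> negative
```

The key design point mirrored here is that supervision comes entirely from the PLM's generations: the small model never sees human-labeled data, so inference cost depends only on the tiny model.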

Code Repositories

HKUNLP/zerogen (official, PyTorch)
sumilergao/sungen (PyTorch)
hkunlp/symgen

Benchmarks

Benchmark | Methodology | Metric
data-free-knowledge-distillation-on-qnli | ZeroGen (T5-base) | Accuracy: 88.5
data-free-knowledge-distillation-on-squad | ZeroGen (T5-base) | Exact Match: 69.4
