5 months ago

DeepStruct: Pretraining of Language Models for Structure Prediction

Chenguang Wang; Xiao Liu; Zui Chen; Haoyun Hong; Jie Tang; Dawn Song

Abstract

We introduce a method for improving the structural understanding abilities of language models. Unlike previous approaches that finetune the models with task-specific augmentation, we pretrain language models on a collection of task-agnostic corpora to generate structures from text. Our structure pretraining enables zero-shot transfer of the learned knowledge that models have about the structure tasks. We study the performance of this approach on 28 datasets, spanning 10 structure prediction tasks including open information extraction, joint entity and relation extraction, named entity recognition, relation classification, semantic role labeling, event extraction, coreference resolution, factual probe, intent detection, and dialogue state tracking. We further enhance the pretraining with the task-specific training sets. We show that a 10B parameter language model transfers non-trivially to most tasks and obtains state-of-the-art performance on 21 of 28 datasets that we evaluate.
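Most of the benchmarks listed below are scored with F1 over predicted structures. As a rough illustration of what "generating structures from text" can look like in practice, the sketch below parses a linearized model output into (head, relation, tail) triples and scores it against gold triples. The `(head; relation; tail)` output format and the names `parse_triples` and `triple_f1` are assumptions made for this example only; they are not taken from the paper or from the official repository.

```python
from typing import Set, Tuple

Triple = Tuple[str, str, str]


def parse_triples(generated: str) -> Set[Triple]:
    """Parse a hypothetical linearized output such as
    '(Obama; born in; Hawaii) (Obama; instance of; person)'.
    Chunks that do not contain exactly three non-empty fields are skipped."""
    triples: Set[Triple] = set()
    for chunk in generated.split(")"):
        chunk = chunk.strip().lstrip("(")
        parts = [p.strip() for p in chunk.split(";")]
        if len(parts) == 3 and all(parts):
            triples.add((parts[0], parts[1], parts[2]))
    return triples


def triple_f1(predicted: Set[Triple], gold: Set[Triple]) -> float:
    """Exact-match F1 between predicted and gold triple sets."""
    if not predicted or not gold:
        return 0.0
    true_positives = len(predicted & gold)
    precision = true_positives / len(predicted)
    recall = true_positives / len(gold)
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Illustrative strings only; not real model output.
predicted = parse_triples("(Obama; born in; Hawaii) (Obama; instance of; city)")
gold = {("Obama", "born in", "Hawaii"), ("Obama", "instance of", "person")}
score = triple_f1(predicted, gold)
```

Here one of the two predicted triples matches the gold set, so precision and recall are both 0.5. Real evaluations for each task use their own matching rules (for example span boundaries for named entity recognition), which this sketch does not attempt to reproduce.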

Code Repositories

cgraywang/deepstruct (official implementation, PyTorch)

Benchmarks

| Benchmark | Methodology | Metrics |
| --- | --- | --- |
| coreference-resolution-on-conll12 | DeepStruct multi-task | Average F1: 60.6; B3: 57.7; CEAFϕ4: 60.2; MUC: 63.9 |
| coreference-resolution-on-conll12 | DeepStruct multi-task w/ finetune | Average F1: 73.1; B3: 71.3; CEAFϕ4: 73.1; MUC: 74.9 |
| dialogue-state-tracking-on-multiwoz-2-1 | DeepStruct multi-task w/ finetune | Joint Acc: 54.2 |
| dialogue-state-tracking-on-multiwoz-2-1 | DeepStruct multi-task | Joint Acc: 53.5 |
| event-extraction-on-ace2005 | DeepStruct multi-task | Argument Cl: 63.9; Argument Id: 67.5; Trigger Cl: 69.2; Trigger Id: 72.7 |
| event-extraction-on-ace2005 | DeepStruct multi-task w/ finetune | Argument Cl: 56.2; Argument Id: 59.4; Trigger Cl: 69.8; Trigger Id: 73.5 |
| joint-entity-and-relation-extraction-on-2 | DeepStruct multi-task w/ finetune | Entity F1: 90.7; Relation F1: 78.3 |
| joint-entity-and-relation-extraction-on-2 | DeepStruct zero-shot | Entity F1: 48.3; Relation F1: 25.8 |
| joint-entity-and-relation-extraction-on-2 | DeepStruct multi-task | Entity F1: 88.4; Relation F1: 72.8 |
| joint-entity-and-relation-extraction-on-7 | DeepStruct multi-task | Entity F1: 90.2; Relation F1: 58.9 |
| joint-entity-and-relation-extraction-on-7 | DeepStruct multi-task w/ finetune | Entity F1: 90.0; Relation F1: 66.8 |
| joint-entity-and-relation-extraction-on-7 | DeepStruct zero-shot | Entity F1: 31.8; Relation F1: 5.3 |
| joint-entity-and-relation-extraction-on-ade-1 | DeepStruct zero-shot | Entity F1: 60.7; Relation F1: 10.6 |
| joint-entity-and-relation-extraction-on-ade-1 | DeepStruct multi-task | Entity F1: 90.5; Relation F1: 83.6 |
| joint-entity-and-relation-extraction-on-ade-1 | DeepStruct multi-task w/ finetune | Entity F1: 91.1; Relation F1: 83.8 |
| joint-entity-and-relation-extraction-on-nyt | DeepStruct multi-task w/ finetune | Entity F1: 95.9; Relation F1: 93.3 |
| joint-entity-and-relation-extraction-on-nyt | DeepStruct multi-task | Entity F1: 95.4; Relation F1: 93.7 |
| joint-entity-and-relation-extraction-on-nyt | DeepStruct zero-shot | Entity F1: 60.5; Relation F1: 28.6 |
| named-entity-recognition-on-ace2005 | DeepStruct zero-shot | F1: 28.1 |
| named-entity-recognition-on-ace2005 | DeepStruct multi-task w/ finetune | F1: 86.9 |
| named-entity-recognition-on-conll03 | DeepStruct multi-task | F1: 93.1 |
| named-entity-recognition-on-conll03 | DeepStruct zero-shot | F1: 44.4 |
| named-entity-recognition-on-conll03 | DeepStruct multi-task w/ finetune | F1: 93.0 |
| named-entity-recognition-on-genia | DeepStruct multi-task | F1: 80.2 |
| named-entity-recognition-on-genia | DeepStruct multi-task w/ finetune | F1: 80.8 |
| named-entity-recognition-on-genia | DeepStruct zero-shot | F1: 47.2 |
| named-entity-recognition-on-ontonotes | DeepStruct zero-shot | F1: 2.5 |
| named-entity-recognition-on-ontonotes | DeepStruct multi-task | F1: 87.6 |
| named-entity-recognition-on-ontonotes | DeepStruct multi-task w/ finetune | F1: 87.8 |
| open-information-extraction-on-nyt | DeepStruct multi-task | F1: 43.6 |
| open-information-extraction-on-nyt | DeepStruct zero-shot | F1: 28.9 |
| open-information-extraction-on-nyt | DeepStruct multi-task w/ finetune | F1: 45.0 |
| open-information-extraction-on-oie2016 | DeepStruct zero-shot | F1: 28.1 |
| open-information-extraction-on-oie2016 | DeepStruct multi-task w/ finetune | F1: 71.3 |
| open-information-extraction-on-oie2016 | DeepStruct multi-task | F1: 71.2 |
| open-information-extraction-on-penn-treebank | DeepStruct multi-task w/ finetune | F1: 45.1 |
| open-information-extraction-on-penn-treebank | DeepStruct multi-task | F1: 54.5 |
| open-information-extraction-on-penn-treebank | DeepStruct zero-shot | F1: 51 |
| open-information-extraction-on-web | DeepStruct multi-task | F1: 50.8 |
| open-information-extraction-on-web | DeepStruct multi-task w/ finetune | F1: 49.1 |
| open-information-extraction-on-web | DeepStruct zero-shot | F1: 43.8 |
| relation-classification-on-fewrel-1 | DeepStruct zero-shot | F1 (10-way 1-shot): 67.6; F1 (10-way 5-shot): 66.4; F1 (5-way 1-shot): 72.4; F1 (5-way 5-shot): 70.8 |
| relation-classification-on-fewrel-1 | DeepStruct multi-task w/ finetune | F1 (10-way 1-shot): 97.8; F1 (10-way 5-shot): 99.8; F1 (5-way 1-shot): 98.4; F1 (5-way 5-shot): 100 |
| relation-classification-on-fewrel-1 | DeepStruct multi-task | F1 (10-way 1-shot): 92.2; F1 (10-way 5-shot): 94.6; F1 (5-way 1-shot): 93.6; F1 (5-way 5-shot): 96.4 |
| relation-classification-on-tacred-1 | DeepStruct zero-shot | F1: 36.1 |
| relation-classification-on-tacred-1 | DeepStruct multi-task w/ finetune | F1: 76.8 |
| relation-classification-on-tacred-1 | DeepStruct multi-task | F1: 74.9 |
| relation-extraction-on-tacred | DeepStruct multi-task w/ finetune | F1: 76.8 |
| semantic-role-labeling-on-conll05-brown | DeepStruct multi-task w/ finetune | F1: 92.1 |
| semantic-role-labeling-on-conll05-brown | DeepStruct multi-task | F1: 92.0 |
| semantic-role-labeling-on-conll05-wsj | DeepStruct multi-task w/ finetune | F1: 95.2 |
| semantic-role-labeling-on-conll05-wsj | DeepStruct multi-task | F1: 95.5 |
| semantic-role-labeling-on-conll12 | DeepStruct multi-task | F1: 97.2 |
| semantic-role-labeling-on-conll12 | DeepStruct multi-task w/ finetune | F1: 96.0 |
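
For the coreference rows, the "Average F1" column is consistent with the usual CoNLL convention of taking the unweighted mean of the MUC, B3, and CEAFϕ4 F1 scores. The short check below recomputes it from the numbers listed above; the function name `conll_average_f1` is a hypothetical helper introduced only for this example.

```python
def conll_average_f1(muc: float, b_cubed: float, ceaf_phi4: float) -> float:
    """Unweighted mean of the three coreference F1 scores."""
    return (muc + b_cubed + ceaf_phi4) / 3


# Values copied from the coreference-resolution-on-conll12 rows.
multi_task = conll_average_f1(muc=63.9, b_cubed=57.7, ceaf_phi4=60.2)
finetuned = conll_average_f1(muc=74.9, b_cubed=71.3, ceaf_phi4=73.1)

print(round(multi_task, 1))  # 60.6
print(round(finetuned, 1))   # 73.1
```

Both results match the reported Average F1 values (60.6 and 73.1).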
