
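Abstract
We introduce a method for improving the structural understanding abilities of language models. Unlike previous approaches that finetune models with task-specific augmentation, we pretrain language models on a collection of task-agnostic corpora to generate structures from text. This structure pretraining enables zero-shot transfer of the structural knowledge the model has learned. We study the performance of this approach on 28 datasets spanning 10 structure prediction tasks: open information extraction, joint entity and relation extraction, named entity recognition, relation classification, semantic role labeling, event extraction, coreference resolution, factual probing, intent detection, and dialogue state tracking. We further enhance the pretraining with task-specific training sets. We show that a 10B-parameter language model transfers non-trivially to most tasks and obtains state-of-the-art performance on 21 of the 28 datasets we evaluate.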
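The idea of "generating structures from text" can be illustrated by treating every task's output as a flat string of triples that a sequence-to-sequence model is trained to emit. The sketch below is only an illustration under stated assumptions: the `(head; relation; tail)` surface format, the example sentence and labels, and the helper names `linearize` and `parse` are hypothetical and are not taken from the DeepStruct repository.

```python
import re
from typing import List, Tuple

Triple = Tuple[str, str, str]

# Assumed surface format: "(head; relation; tail)" triples separated by spaces.
_TRIPLE_RE = re.compile(r"\(([^;()]+);([^;()]+);([^;()]+)\)")


def linearize(triples: List[Triple]) -> str:
    """Turn a list of triples into a single target string for generation."""
    return " ".join(f"({h}; {r}; {t})" for h, r, t in triples)


def parse(text: str) -> List[Triple]:
    """Recover triples from generated text, ignoring anything malformed."""
    return [
        tuple(part.strip() for part in match.groups())
        for match in _TRIPLE_RE.finditer(text)
    ]


# Hypothetical example: entity typing and a relation expressed in one format.
gold = [("Iago", "instance of", "person"), ("Iago", "character in", "Othello")]
target = linearize(gold)
recovered = parse(target)
```

Because every task shares one output format, the same parser can score named entity recognition, relation extraction, or event extraction outputs by comparing recovered triples against gold triples.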
Code repository
cgraywang/deepstruct
Official
pytorch
Mentioned in GitHub
Benchmarks
| Benchmark | Method | Metrics |
|---|---|---|
| coreference-resolution-on-conll12 | DeepStruct multi-task | Average F1: 60.6 B3: 57.7 CEAFϕ4: 60.2 MUC: 63.9 |
| coreference-resolution-on-conll12 | DeepStruct multi-task w/ finetune | Average F1: 73.1 B3: 71.3 CEAFϕ4: 73.1 MUC: 74.9 |
| dialogue-state-tracking-on-multiwoz-2-1 | DeepStruct multi-task w/ finetune | Joint Acc: 54.2 |
| dialogue-state-tracking-on-multiwoz-2-1 | DeepStruct multi-task | Joint Acc: 53.5 |
| event-extraction-on-ace2005 | DeepStruct multi-task | Argument Cl: 63.9 Argument Id: 67.5 Trigger Cl: 69.2 Trigger Id: 72.7 |
| event-extraction-on-ace2005 | DeepStruct multi-task w/ finetune | Argument Cl: 56.2 Argument Id: 59.4 Trigger Cl: 69.8 Trigger Id: 73.5 |
| joint-entity-and-relation-extraction-on-2 | DeepStruct multi-task w/ finetune | Entity F1: 90.7 Relation F1: 78.3 |
| joint-entity-and-relation-extraction-on-2 | DeepStruct zero-shot | Entity F1: 48.3 Relation F1: 25.8 |
| joint-entity-and-relation-extraction-on-2 | DeepStruct multi-task | Entity F1: 88.4 Relation F1: 72.8 |
| joint-entity-and-relation-extraction-on-7 | DeepStruct multi-task | Entity F1: 90.2 Relation F1: 58.9 |
| joint-entity-and-relation-extraction-on-7 | DeepStruct multi-task w/ finetune | Entity F1: 90.0 Relation F1: 66.8 |
| joint-entity-and-relation-extraction-on-7 | DeepStruct zero-shot | Entity F1: 31.8 Relation F1: 5.3 |
| joint-entity-and-relation-extraction-on-ade-1 | DeepStruct zero-shot | Entity F1: 60.7 Relation F1: 10.6 |
| joint-entity-and-relation-extraction-on-ade-1 | DeepStruct multi-task | Entity F1: 90.5 Relation F1: 83.6 |
| joint-entity-and-relation-extraction-on-ade-1 | DeepStruct multi-task w/ finetune | Entity F1: 91.1 Relation F1: 83.8 |
| joint-entity-and-relation-extraction-on-nyt | DeepStruct multi-task w/ finetune | Entity F1: 95.9 Relation F1: 93.3 |
| joint-entity-and-relation-extraction-on-nyt | DeepStruct multi-task | Entity F1: 95.4 Relation F1: 93.7 |
| joint-entity-and-relation-extraction-on-nyt | DeepStruct zero-shot | Entity F1: 60.5 Relation F1: 28.6 |
| named-entity-recognition-on-ace2005 | DeepStruct zero-shot | F1: 28.1 |
| named-entity-recognition-on-ace2005 | DeepStruct multi-task w/ finetune | F1: 86.9 |
| named-entity-recognition-on-conll03 | DeepStruct multi-task | F1: 93.1 |
| named-entity-recognition-on-conll03 | DeepStruct zero-shot | F1: 44.4 |
| named-entity-recognition-on-conll03 | DeepStruct multi-task w/ finetune | F1: 93.0 |
| named-entity-recognition-on-genia | DeepStruct multi-task | F1: 80.2 |
| named-entity-recognition-on-genia | DeepStruct multi-task w/ finetune | F1: 80.8 |
| named-entity-recognition-on-genia | DeepStruct zero-shot | F1: 47.2 |
| named-entity-recognition-on-ontonotes | DeepStruct zero-shot | F1: 2.5 |
| named-entity-recognition-on-ontonotes | DeepStruct multi-task | F1: 87.6 |
| named-entity-recognition-on-ontonotes | DeepStruct multi-task w/ finetune | F1: 87.8 |
| open-information-extraction-on-nyt | DeepStruct multi-task | F1: 43.6 |
| open-information-extraction-on-nyt | DeepStruct zero-shot | F1: 28.9 |
| open-information-extraction-on-nyt | DeepStruct multi-task w/ finetune | F1: 45.0 |
| open-information-extraction-on-oie2016 | DeepStruct zero-shot | F1: 28.1 |
| open-information-extraction-on-oie2016 | DeepStruct multi-task w/ finetune | F1: 71.3 |
| open-information-extraction-on-oie2016 | DeepStruct multi-task | F1: 71.2 |
| open-information-extraction-on-penn-treebank | DeepStruct multi-task w/ finetune | F1: 45.1 |
| open-information-extraction-on-penn-treebank | DeepStruct multi-task | F1: 54.5 |
| open-information-extraction-on-penn-treebank | DeepStruct zero-shot | F1: 51 |
| open-information-extraction-on-web | DeepStruct multi-task | F1: 50.8 |
| open-information-extraction-on-web | DeepStruct multi-task w/ finetune | F1: 49.1 |
| open-information-extraction-on-web | DeepStruct zero-shot | F1: 43.8 |
| relation-classification-on-fewrel-1 | DeepStruct zero-shot | F1 (10-way 1-shot): 67.6 F1 (10-way 5-shot): 66.4 F1 (5-way 1-shot): 72.4 F1 (5-way 5-shot): 70.8 |
| relation-classification-on-fewrel-1 | DeepStruct multi-task w/ finetune | F1 (10-way 1-shot): 97.8 F1 (10-way 5-shot): 99.8 F1 (5-way 1-shot): 98.4 F1 (5-way 5-shot): 100 |
| relation-classification-on-fewrel-1 | DeepStruct multi-task | F1 (10-way 1-shot): 92.2 F1 (10-way 5-shot): 94.6 F1 (5-way 1-shot): 93.6 F1 (5-way 5-shot): 96.4 |
| relation-classification-on-tacred-1 | DeepStruct zero-shot | F1: 36.1 |
| relation-classification-on-tacred-1 | DeepStruct multi-task w/ finetune | F1: 76.8 |
| relation-classification-on-tacred-1 | DeepStruct multi-task | F1: 74.9 |
| relation-extraction-on-tacred | DeepStruct multi-task w/ finetune | F1: 76.8 |
| semantic-role-labeling-on-conll05-brown | DeepStruct multi-task w/ finetune | F1: 92.1 |
| semantic-role-labeling-on-conll05-brown | DeepStruct multi-task | F1: 92.0 |
| semantic-role-labeling-on-conll05-wsj | DeepStruct multi-task w/ finetune | F1: 95.2 |
| semantic-role-labeling-on-conll05-wsj | DeepStruct multi-task | F1: 95.5 |
| semantic-role-labeling-on-conll12 | DeepStruct multi-task | F1: 97.2 |
| semantic-role-labeling-on-conll12 | DeepStruct multi-task w/ finetune | F1: 96.0 |
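
The coreference rows can be sanity-checked, since the CoNLL-2012 average F1 is conventionally the unweighted mean of the MUC, B3, and CEAFϕ4 F1 scores. The following is a minimal sketch of that check; the scores are copied from the two coreference rows of the table, while the dictionary layout and the helper name `conll_average_f1` are assumptions made for illustration.

```python
from statistics import mean
from typing import Dict

# Scores copied from the coreference-resolution-on-conll12 rows above.
coref_rows: Dict[str, Dict[str, float]] = {
    "DeepStruct multi-task": {"MUC": 63.9, "B3": 57.7, "CEAFphi4": 60.2},
    "DeepStruct multi-task w/ finetune": {"MUC": 74.9, "B3": 71.3, "CEAFphi4": 73.1},
}


def conll_average_f1(scores: Dict[str, float]) -> float:
    """Unweighted mean of the three coreference F1 scores, to one decimal."""
    return round(mean(scores.values()), 1)


averages = {method: conll_average_f1(s) for method, s in coref_rows.items()}
# averages == {"DeepStruct multi-task": 60.6,
#              "DeepStruct multi-task w/ finetune": 73.1}
```

Both computed values match the "Average F1" entries reported in the table.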