Ilias Chalkidis, Abhik Jana, Dirk Hartung, Michael Bommarito, Ion Androutsopoulos, Daniel Martin Katz, Nikolaos Aletras

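Abstract
Laws and their interpretations, legal arguments, and agreements are typically expressed in writing, which produces vast corpora of legal text. Analyzing these texts is at the center of legal practice, and it becomes increasingly elaborate as the collections grow. Natural language understanding (NLU) technologies can be a valuable tool to support legal practitioners in this work. Their usefulness, however, largely depends on whether current state-of-the-art models can generalize across the various tasks of the legal domain. To answer this currently open question, we introduce the Legal General Language Understanding Evaluation (LexGLUE) benchmark, a collection of datasets for evaluating model performance across a diverse set of legal NLU tasks in a standardized way. We also provide an evaluation and analysis of several generic and legal-oriented models, showing that the latter consistently offer performance improvements across multiple tasks.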
Code repository
coastalcph/lex-glue (official, PyTorch)
Benchmarks
Benchmark: natural-language-understanding-on-lexglue. Where a cell holds two values, they are micro-F1 / macro-F1.

| Method | CaseHOLD | ECtHR Task A | ECtHR Task B | EUR-LEX | LEDGAR | SCOTUS | UNFAIR-ToS |
|---|---|---|---|---|---|---|---|
| CaseLaw-BERT | 75.6 | 71.2 / 64.2 | 88.0 / 77.5 | 71.0 / 55.9 | 88.0 / 82.3 | 76.4 / 66.2 | 88.3 / 81.0 |
| RoBERTa | 71.7 | 69.5 / 60.7 | 87.2 / 77.3 | 71.8 / 57.5 | 87.9 / 82.1 | 70.8 / 61.2 | 87.7 / 81.5 |
| BERT | 70.7 | 71.4 / 64.0 | 87.6 / 77.8 | 71.6 / 55.6 | 87.7 / 82.2 | 70.5 / 60.9 | 87.5 / 81.0 |
| DeBERTa | 72.1 | 69.1 / 61.2 | 87.4 / 77.3 | 72.3 / 57.2 | 87.9 / 82.0 | 70.0 / 60.0 | 87.2 / 78.8 |
| Longformer | 72.0 | 69.6 / 62.4 | 88.0 / 77.8 | 71.9 / 56.7 | 87.7 / 82.3 | 72.2 / 62.5 | 87.7 / 80.1 |
| Legal-BERT | 75.1 | 71.2 / 64.6 | 88.0 / 77.2 | 72.2 / 56.2 | 88.1 / 82.7 | 76.2 / 65.8 | 88.6 / 82.3 |
| BigBird | 70.4 | 70.5 / 63.8 | 88.1 / 76.6 | 71.8 / 56.6 | 87.7 / 82.1 | 71.7 / 61.4 | 87.7 / 80.2 |
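
The paired scores in the table follow the micro-F1 / macro-F1 convention that LexGLUE uses for reporting. The gap between the two numbers (for example on EUR-LEX) reflects how each one averages over classes. Micro-F1 pools all decisions, so frequent classes dominate. Macro-F1 averages per-class F1, so rare classes count equally. The sketch below illustrates the difference on a toy single-label example. The function name `f1_scores` and the labels are hypothetical and are not taken from the LexGLUE code base, and the multi-label tasks in the benchmark would need a per-label variant of this computation.

```python
from collections import Counter


def f1_scores(y_true, y_pred):
    """Return (micro_f1, macro_f1) for single-label multi-class predictions.

    Illustrative helper only; not the evaluation code used by LexGLUE.
    """
    labels = sorted(set(y_true) | set(y_pred))
    if not labels:
        return 0.0, 0.0

    # Per-class true positives, false positives, and false negatives.
    tp, fp, fn = Counter(), Counter(), Counter()
    for t, p in zip(y_true, y_pred):
        if t == p:
            tp[t] += 1
        else:
            fp[p] += 1  # predicted class gains a false positive
            fn[t] += 1  # true class gains a false negative

    def f1(tp_, fp_, fn_):
        denom = 2 * tp_ + fp_ + fn_
        return 2 * tp_ / denom if denom else 0.0

    # Micro: pool the counts over all classes, then compute one F1.
    micro = f1(sum(tp.values()), sum(fp.values()), sum(fn.values()))
    # Macro: compute F1 per class, then take the unweighted mean.
    macro = sum(f1(tp[c], fp[c], fn[c]) for c in labels) / len(labels)
    return micro, macro


# Toy, made-up labels (not LexGLUE data): class "a" is frequent, "b" and "c" are rare.
y_true = ["a", "a", "a", "a", "b", "b", "c", "c"]
y_pred = ["a", "a", "a", "a", "b", "a", "a", "c"]
micro, macro = f1_scores(y_true, y_pred)
print(f"micro-F1 = {micro:.3f}, macro-F1 = {macro:.3f}")
# prints: micro-F1 = 0.750, macro-F1 = 0.711
```

Here the frequent class "a" is always recovered while the rare classes are each missed once, so macro-F1 falls below micro-F1, the same direction as the gaps in the table.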