
Abstract
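When an entity name contains other names within it, identifying all combinations of names can become difficult and expensive. We propose a new method that recognizes not only the outermost named entities but also the nested entities inside them. To do so, we design an objective function for training a neural model that treats the tag sequence of nested entities as the second-best path within the span of their parent entity. We also provide a decoding method for inference that extracts entities iteratively in an outside-to-inside manner, starting from the outermost entities and moving step by step to the inner ones. Our method adds no hyperparameters to the Conditional Random Field (CRF) based model that is widely used for flat named entity recognition tasks. Experiments show that our method performs better than, or at least as well as, existing methods capable of handling nested entities, achieving F1 scores of 85.82%, 84.34%, and 77.36% on the ACE-2004, ACE-2005, and GENIA datasets, respectively.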
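The outside-to-inside decoding described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes a single entity type with an `O`/`B`/`I` tag set, reuses the same scores at every nesting level, and works on hand-made toy scores. The function names `k_best_paths`, `spans_from_tags`, and `decode_nested` are hypothetical and chosen for this illustration only.

```python
import heapq

TAGS = ["O", "B", "I"]  # single-type BIO tag set (an assumption of this sketch)


def k_best_paths(emissions, transitions, k=2):
    """Return up to k highest-scoring (score, path) pairs of a tag lattice.

    emissions[t][j] is the score of tag j at position t;
    transitions[i][j] is the score of moving from tag i to tag j.
    """
    n_tags = len(emissions[0])
    # beams[j] holds the k best (score, path) candidates ending in tag j.
    beams = [[(emissions[0][j], (j,))] for j in range(n_tags)]
    for t in range(1, len(emissions)):
        new_beams = []
        for j in range(n_tags):
            candidates = [
                (score + transitions[path[-1]][j] + emissions[t][j], path + (j,))
                for beam in beams
                for score, path in beam
            ]
            new_beams.append(heapq.nlargest(k, candidates))
        beams = new_beams
    return heapq.nlargest(k, [cand for beam in beams for cand in beam])


def spans_from_tags(path, offset=0):
    """Convert a path of tag indices into (start, end) spans, end exclusive.

    A stray "I" with no open span is leniently treated as a span start.
    """
    spans, start = [], None
    for pos, tag in enumerate(path):
        if TAGS[tag] == "B" or (TAGS[tag] == "I" and start is None):
            if start is not None:
                spans.append((start + offset, pos + offset))
            start = pos
        elif TAGS[tag] == "O" and start is not None:
            spans.append((start + offset, pos + offset))
            start = None
    if start is not None:
        spans.append((start + offset, len(path) + offset))
    return spans


def decode_nested(emissions, transitions):
    """Best path on the whole sentence, then second-best path inside each span."""
    _, best = k_best_paths(emissions, transitions, k=1)[0]
    found, queue = [], spans_from_tags(best)
    while queue:
        start, end = queue.pop()
        found.append((start, end))
        if end - start < 2:
            continue  # a single token cannot contain a smaller span
        paths = k_best_paths(emissions[start:end], transitions, k=2)
        if len(paths) < 2:
            continue
        _, second = paths[1]
        # Keep only spans strictly inside the parent; a second-best path that
        # merely reproduces the parent span is discarded, so the loop ends.
        queue.extend(
            span for span in spans_from_tags(second, offset=start)
            if span != (start, end)
        )
    return sorted(found)


# Toy scores for a three-token sentence; columns follow TAGS (O, B, I).
emissions = [[0.5, 4.0, 2.0], [0.2, 0.3, 3.0], [2.0, 0.1, 3.0]]
# Only the O -> I transition is penalised in this toy example.
transitions = [[0.0, 0.0, -10.0], [0.0, 0.0, 0.0], [0.0, 0.0, 0.0]]
print(decode_nested(emissions, transitions))  # [(0, 2), (0, 3)]
```

With these toy scores the best path tags all three tokens as one entity, and the second-best path inside that span yields the nested two-token entity. A trained model would supply the emission and transition scores, and the paper's training objective is what makes the second-best path meaningful; this sketch covers only the inference loop.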
Code repositories
yahshibu/nested-ner-tacl2020
Official
pytorch
Mentioned in GitHub
yahshibu/nested-ner-tacl2020-transformers
Official
pytorch
Mentioned in GitHub
yahshibu/nested-ner-tacl2020-flair
Official
pytorch
Mentioned in GitHub
Benchmarks
| Benchmark | Method | Metric |
|---|---|---|
| named-entity-recognition-on-ace-2004 | Second-best learning and decoding | F1: 85.82; Multi-Task Supervision: n |
| named-entity-recognition-on-ace-2005 | Second-best learning and decoding | F1: 84.34 |
| named-entity-recognition-on-genia | Second-best learning and decoding + BERT + Flair | F1: 77.36 |
| named-entity-recognition-on-genia | Second-best learning and decoding | F1: 77.19 |
| nested-mention-recognition-on-ace-2004 | Second-best learning and decoding | F1: 85.82 |
| nested-mention-recognition-on-ace-2005 | Second-best learning and decoding | F1: 84.34 |
| nested-named-entity-recognition-on-ace-2004 | Second-best learning and decoding + BERT + Flair | F1: 85.82 |
| nested-named-entity-recognition-on-ace-2004 | Second-best learning and decoding + BERT | F1: 84.97 |
| nested-named-entity-recognition-on-ace-2004 | Second-best learning and decoding | F1: 77.44 |
| nested-named-entity-recognition-on-ace-2005 | Second-best learning and decoding + BERT + Flair | F1: 84.34 |
| nested-named-entity-recognition-on-ace-2005 | Second-best learning and decoding | F1: 76.83 |
| nested-named-entity-recognition-on-ace-2005 | Second-best learning and decoding + BERT | F1: 83.99 |
| nested-named-entity-recognition-on-genia | Second-best learning and decoding + BERT | F1: 77.05 |
| nested-named-entity-recognition-on-genia | Second-best learning and decoding | F1: 77.19 |
| nested-named-entity-recognition-on-genia | Second-best learning and decoding + BERT + Flair | F1: 77.36 |
| nested-named-entity-recognition-on-nne | Second-best learning and decoding | Micro F1: 93.19 |