
Abstract
We propose a top-down approach to discourse parsing that is conceptually simpler than previous methods (Kobayashi et al., 2020; Zhang et al., 2020). By framing the task as a sequence labelling problem, where the goal is to iteratively segment a document into individual discourse units, we eliminate the decoder module and significantly reduce the search space of split points. We explore both traditional recurrent models and modern pre-trained Transformer models for this task, and additionally introduce a novel dynamic oracle for top-down parsing. On the Full metric, our proposed LSTM model sets a new state of the art for RST (Rhetorical Structure Theory) parsing.
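The iterative splitting idea in the abstract can be sketched as a simple recursion: score every candidate boundary in the current span, split at the best one, and recurse on both halves. This is only an illustrative toy; the paper's actual model scores split points with an LSTM or Transformer encoder, and the names below (`parse`, `score_split`) are hypothetical.

```python
# Toy sketch of top-down parsing as iterative splitting.
# The real model uses a neural scorer over encoded EDUs; here the
# scorer is a plain function passed in by the caller.

def parse(edus, score_split):
    """Recursively split a span of EDUs into a binary tree.

    edus        -- list of elementary discourse units (any objects)
    score_split -- function(edus, k) -> float, scoring a split before index k
    """
    if len(edus) == 1:
        return edus[0]  # a single EDU is a leaf
    # The search space is just the len(edus) - 1 boundaries of this span,
    # which is what makes the top-down formulation compact.
    best_k = max(range(1, len(edus)), key=lambda k: score_split(edus, k))
    left = parse(edus[:best_k], score_split)
    right = parse(edus[best_k:], score_split)
    return (left, right)

# Illustrative scorer that simply prefers balanced splits.
toy_score = lambda edus, k: -abs(len(edus) - 2 * k)
tree = parse(["e1", "e2", "e3", "e4"], toy_score)
print(tree)  # (('e1', 'e2'), ('e3', 'e4'))
```

A trained scorer would replace `toy_score`; the recursion itself is what removes the need for a separate decoder module.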
Code Repositories
fajri91/NeuralRST-TopDown
Official
pytorch
Benchmarks
All scores are Standard Parseval F1 on RST-DT (a missing cell indicates the score was not reported).

| Benchmark | Method | Full | Nuclearity | Relation | Span |
|---|---|---|---|---|---|
| discourse-parsing-on-rst-dt | LSTM Dynamic | 50.3 | 62.3 | 51.5 | 73.1 |
| discourse-parsing-on-rst-dt | Transformer (dynamic) | 49.2 | 60.1 | — | 70.2 |
| discourse-parsing-on-rst-dt | Transformer (static) | 49.0 | 59.9 | 50.6 | 70.6 |
| discourse-parsing-on-rst-dt | LSTM Static | 49.4 | 61.7 | 50.5 | 72.7 |