PEneo: Unifying Line Extraction, Line Grouping, and Entity Linking for End-to-end Document Pair Extraction
Lin Zening ; Wang Jiapeng ; Li Teng ; Liao Wenhui ; Huang Dayi ; Xiong Longfei ; Jin Lianwen

Abstract
Document pair extraction aims to identify key and value entities as well as their relationships from visually-rich documents. Most existing methods divide it into two separate tasks: semantic entity recognition (SER) and relation extraction (RE). However, simply concatenating SER and RE serially can lead to severe error propagation, and it fails to handle cases like multi-line entities in real scenarios. To address these issues, this paper introduces a novel framework, PEneo (Pair Extraction new decoder option), which performs document pair extraction in a unified pipeline, incorporating three concurrent sub-tasks: line extraction, line grouping, and entity linking. This approach alleviates the error accumulation problem and can handle the case of multi-line entities. Furthermore, to better evaluate the model's performance and to facilitate future research on pair extraction, we introduce RFUND, a re-annotated version of the commonly used FUNSD and XFUND datasets, to make them more accurate and cover realistic situations. Experiments on various benchmarks demonstrate PEneo's superiority over previous pipelines, boosting the performance by a large margin (e.g., 19.89%-22.91% F1 score on RFUND-EN) when combined with various backbones like LiLT and LayoutLMv3, showing its effectiveness and generality. Codes and the new annotations are available at https://github.com/ZeningLin/PEneo.
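To make the three sub-tasks concrete, the following is a minimal sketch of how their outputs could be combined into key-value pairs. It is not the decoder from the PEneo repository: the function name `decode_pairs`, the input layout (line IDs in reading order, grouping links between lines of the same entity, entity links from a key line to a value line), and the union-find merge are all assumptions made for illustration.

```python
from collections import defaultdict


def decode_pairs(lines, group_links, entity_links):
    """Combine hypothetical sub-task outputs into (key_text, value_text) pairs.

    lines:        dict mapping line_id -> text (line extraction output);
                  IDs are assumed to follow reading order.
    group_links:  iterable of (line_id, line_id) saying both lines belong
                  to the same entity (line grouping output).
    entity_links: iterable of (key_line_id, value_line_id)
                  (entity linking output).
    """
    parent = {i: i for i in lines}

    def find(i):
        # Union-find lookup with path halving.
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    # Merge lines that the grouping step assigned to the same entity.
    for a, b in group_links:
        if a in parent and b in parent:
            ra, rb = find(a), find(b)
            if ra != rb:
                parent[max(ra, rb)] = min(ra, rb)

    # Collect the member lines of each entity in reading order.
    members = defaultdict(list)
    for i in sorted(lines):
        members[find(i)].append(i)

    def text_of(root):
        return " ".join(lines[i] for i in members[root])

    # Lift line-level links to entity-level pairs; skip links that point
    # at lines the extraction step did not produce.
    pairs = set()
    for k, v in entity_links:
        if k in parent and v in parent:
            pairs.add((text_of(find(k)), text_of(find(v))))
    return sorted(pairs)


# Toy input: the address value spans two lines.
lines = {0: "Name:", 1: "John", 2: "Address:", 3: "12 Main Street,", 4: "Springfield"}
group_links = [(3, 4)]
entity_links = [(0, 1), (2, 3)]
print(decode_pairs(lines, group_links, entity_links))
```

On this toy input the two address lines are merged before linking, so the result contains `("Address:", "12 Main Street, Springfield")` as one pair. This is the multi-line case that, according to the abstract, a serial SER + RE pipeline fails to handle.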
Benchmarks
| Benchmark | Model | Key-value pair F1 (%) |
|---|---|---|
| Key-value pair extraction on RFUND-EN | PEneo (LayoutLMv2_base) | 71.97 |
| Key-value pair extraction on RFUND-EN | PEneo (LayoutLMv3_base) | 79.27 |
| Key-value pair extraction on RFUND-EN | PEneo (LiLT[EN-R]_base) | 74.22 |
| Key-value pair extraction on RFUND-EN | PEneo (LiLT[InfoXLM]_base) | 74.29 |
| Key-value pair extraction on RFUND-EN | PEneo (LayoutXLM_base) | 74.25 |
| Key-value pair extraction on SIBR | PEneo (LiLT[InfoXLM]_base) | 82.36 |
| Key-value pair extraction on SIBR | PEneo (LayoutLMv3_base_chinese) | 82.52 |
| Key-value pair extraction on SIBR | PEneo (LayoutXLM_base) | 82.23 |
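
For readers unfamiliar with the metric, the sketch below shows one common way to compute a pair-level F1 score. It assumes a predicted pair counts as correct only when both its key text and value text exactly match a gold pair; the page does not state the matching rule used for these benchmarks, and the function name `pair_f1` is hypothetical.

```python
def pair_f1(predicted, gold):
    """Return (precision, recall, f1) over sets of (key_text, value_text) pairs.

    Assumption: exact string match on both key and value.
    """
    pred, ref = set(predicted), set(gold)
    true_positives = len(pred & ref)
    precision = true_positives / len(pred) if pred else 0.0
    recall = true_positives / len(ref) if ref else 0.0
    if precision + recall == 0:
        return 0.0, 0.0, 0.0
    f1 = 2 * precision * recall / (precision + recall)
    return precision, recall, f1


gold = [("Name:", "John"), ("Date:", "2024-01-01"), ("Total:", "42.00")]
predicted = [("Name:", "John"), ("Date:", "2024-01-02")]
precision, recall, f1 = pair_f1(predicted, gold)
print(f"P={precision:.2f} R={recall:.2f} F1={f1:.2f}")  # P=0.50 R=0.33 F1=0.40
```

Under exact matching, an error in either the key or the value makes the whole pair wrong, which is why mistakes in entity recognition propagate directly into the pair score.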