Kostiantyn Omelianchuk, Vitaliy Atrasevych, Artem Chernodub, Oleksandr Skurzhanskyi

Abstract
In this paper, we present a simple and efficient grammatical error correction (GEC) sequence tagger using a Transformer encoder. Our system is pre-trained on synthetic data and then fine-tuned in two stages: first on errorful corpora, and second on a combination of errorful and error-free parallel corpora. We design custom token-level transformations to map input tokens to target corrections. Our best single-model/ensemble GEC tagger achieves an $F_{0.5}$ of 65.3/66.5 on CoNLL-2014 (test) and $F_{0.5}$ of 72.4/73.6 on BEA-2019 (test). Its inference speed is up to 10 times as fast as that of a Transformer-based seq2seq GEC system. The code and trained models are publicly available.
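To make the token-level transformations concrete, the sketch below applies per-token edit tags of the kind described in the paper ($KEEP, $DELETE, $APPEND_t, $REPLACE_t) to a tokenized sentence. The `apply_edits` helper, the exact tag spellings, and the example sentence are illustrative assumptions rather than the authors' implementation; the paper's full tag set also includes grammar-specific transformations, and the tagger is applied iteratively.

```python
# Illustrative sketch only: the tag inventory, the apply_edits helper, and the
# example below are assumptions for exposition, not the authors' released code.

def apply_edits(tokens, tags):
    """Apply one token-level edit tag per source token; return corrected tokens."""
    out = []
    for token, tag in zip(tokens, tags):
        if tag == "$KEEP":
            out.append(token)                     # leave the token as-is
        elif tag == "$DELETE":
            continue                              # drop the token
        elif tag.startswith("$APPEND_"):
            out.append(token)                     # keep the token ...
            out.append(tag[len("$APPEND_"):])     # ... and insert a new one after it
        elif tag.startswith("$REPLACE_"):
            out.append(tag[len("$REPLACE_"):])    # substitute the token
        else:
            out.append(token)                     # unknown tag: no-op
    return out


tokens = ["There", "is", "a", "a", "apple", "on", "table"]
tags = ["$KEEP", "$KEEP", "$DELETE", "$REPLACE_an", "$KEEP", "$APPEND_the", "$KEEP"]
print(" ".join(apply_edits(tokens, tags)))
# There is an apple on the table
```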
Benchmarks
| Benchmark | Methodology | $F_{0.5}$ | Precision | Recall |
|---|---|---|---|---|
| BEA-2019 (test) | Sequence tagging + token-level transformations + two-stage fine-tuning (+RoBERTa, XLNet) | 73.7 | – | – |
| BEA-2019 (test) | Sequence tagging + token-level transformations + two-stage fine-tuning (+XLNet) | 72.4 | – | – |
| CoNLL-2014 (test) | Sequence tagging + token-level transformations + two-stage fine-tuning (+BERT, RoBERTa, XLNet) | 66.5 | 78.2 | 41.5 |
| CoNLL-2014 (test) | Sequence tagging + token-level transformations + two-stage fine-tuning (+XLNet) | 65.3 | 77.5 | 40.1 |
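For reference, $F_{0.5}$ weights precision more heavily than recall; plugging the single-model CoNLL-2014 precision and recall from the table into the standard $F_\beta$ formula recovers the reported score:

$$F_{0.5} = \frac{(1 + 0.5^2)\,P\,R}{0.5^2\,P + R} = \frac{1.25 \times 77.5 \times 40.1}{0.25 \times 77.5 + 40.1} \approx 65.3$$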