Robust Multilingual Part-of-Speech Tagging via Adversarial Training

Michihiro Yasunaga; Jungo Kasai; Dragomir Radev

Abstract

Adversarial training (AT) is a powerful regularization method for neural networks, aiming to achieve robustness to input perturbations. Yet, the specific effects of the robustness obtained from AT are still unclear in the context of natural language processing. In this paper, we propose and analyze a neural POS tagging model that exploits AT. In our experiments on the Penn Treebank WSJ corpus and the Universal Dependencies (UD) dataset (27 languages), we find that AT not only improves the overall tagging accuracy, but also 1) prevents over-fitting in low-resource languages and 2) boosts tagging accuracy for rare/unseen words. We also demonstrate that 3) the improved tagging performance from AT contributes to the downstream task of dependency parsing, 4) AT helps the model learn cleaner word representations, and 5) the proposed AT model is generally effective across different sequence labeling tasks. These positive results motivate further use of AT for natural language tasks.
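
To make the idea of "robustness to input perturbations" concrete, the following is a minimal sketch of one common AT formulation, not the paper's implementation. It assumes a toy linear softmax tagger over a single embedding vector, a perturbation of fixed L2 norm taken along the loss gradient with respect to the input, and an equal-weight mix of clean and adversarial loss. The names `loss_and_input_grad`, `adversarial_perturbation`, `at_loss`, and the parameters `epsilon` and `alpha` are hypothetical and chosen for illustration only; the abstract does not specify any of them.

```python
import math


def softmax(z):
    # Numerically stable softmax over a list of logits.
    m = max(z)
    exps = [math.exp(v - m) for v in z]
    s = sum(exps)
    return [e / s for e in exps]


def loss_and_input_grad(W, x, y):
    """Cross-entropy of a toy linear softmax tagger and its gradient w.r.t. input x."""
    logits = [sum(w * v for w, v in zip(row, x)) for row in W]
    p = softmax(logits)
    loss = -math.log(p[y])
    # dL/dlogit_k = p_k - 1[k == y]; chain rule through logits = W x.
    dlogits = [p_k - (1.0 if k == y else 0.0) for k, p_k in enumerate(p)]
    grad = [sum(dlogits[k] * W[k][j] for k in range(len(W))) for j in range(len(x))]
    return loss, grad


def adversarial_perturbation(grad, epsilon):
    """Scale the gradient to L2 norm epsilon (assumed perturbation form)."""
    norm = math.sqrt(sum(g * g for g in grad))
    if norm == 0.0:
        return [0.0] * len(grad)
    return [epsilon * g / norm for g in grad]


def at_loss(W, x, y, epsilon=0.1, alpha=0.5):
    """Mix of clean and adversarial loss; epsilon and alpha are illustrative values."""
    clean, grad = loss_and_input_grad(W, x, y)
    r = adversarial_perturbation(grad, epsilon)
    x_adv = [a + b for a, b in zip(x, r)]
    adv, _ = loss_and_input_grad(W, x_adv, y)
    return alpha * clean + (1.0 - alpha) * adv, clean, adv


# Toy example: 3 tags, 3-dimensional "embedding", gold tag index 1.
W = [[0.5, -0.2, 0.1], [-0.3, 0.8, 0.0], [0.1, 0.1, -0.6]]
x = [0.2, -0.1, 0.4]
total, clean, adv = at_loss(W, x, y=1)
print(f"clean={clean:.4f} adversarial={adv:.4f} combined={total:.4f}")
```

In this toy setting the adversarial loss is never lower than the clean loss, because cross-entropy of a linear model is convex in the input and the perturbation follows the gradient. Training on the combined loss is what pushes a model toward predictions that stay stable under small input changes.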

Code Repositories

michiyasunaga/pos_adv
Official
Mentioned in GitHub

Benchmarks

Benchmark | Methodology | Metrics
chunking-on-conll-2000 | BiLSTM-CRF | Exact Span F1: 95.18
chunking-on-conll-2000 | Adversarial Training | Exact Span F1: 95.25
named-entity-recognition-ner-on-conll-2003 | Adversarial Bi-LSTM | F1: 91.56
part-of-speech-tagging-on-penn-treebank | Adversarial Bi-LSTM | Accuracy: 97.59
part-of-speech-tagging-on-ud | Adversarial Bi-LSTM | Avg accuracy: 96.65
