Universal NER: A Gold-Standard Multilingual Named Entity Recognition Benchmark

Stephen Mayhew; Terra Blevins; Shuheng Liu; Marek Šuppa; Hila Gonen; Joseph Marvin Imperial; Börje F. Karlsson; Peiqin Lin; Nikola Ljubešić; LJ Miranda; Barbara Plank; Arij Riabi; Yuval Pinter

Abstract

We introduce Universal NER (UNER), an open, community-driven project to develop gold-standard NER benchmarks in many languages. The overarching goal of UNER is to provide high-quality, cross-lingually consistent annotations to facilitate and standardize multilingual NER research. UNER v1 contains 18 datasets annotated with named entities in a cross-lingually consistent schema across 12 diverse languages. In this paper, we detail the dataset creation and composition of UNER; we also provide initial modeling baselines in both in-language and cross-lingual learning settings. We release the data, code, and fitted models to the public.

Code Repositories

opennlg/openba-v2 (PyTorch; mentioned in GitHub)

Benchmarks

Benchmark | Methodology | F1 (micro)
--- | --- | ---
cross-lingual-ner-on-uner-v1-cebuano | UNER XLM-R (all) | 69.6
cross-lingual-ner-on-uner-v1-chinese | UNER XLM-R (all) | 88.2
cross-lingual-ner-on-uner-v1-chinese-1 | UNER XLM-R (all) | 87.7
cross-lingual-ner-on-uner-v1-croatian | UNER XLM-R (all) | 90.9
cross-lingual-ner-on-uner-v1-danish | UNER XLM-R (all) | 83.0
cross-lingual-ner-on-uner-v1-english | UNER XLM-R (all) | 82.8
cross-lingual-ner-on-uner-v1-portuguese | UNER XLM-R (all) | 82.3
cross-lingual-ner-on-uner-v1-pud-chinese | UNER XLM-R (all) | 86.0
cross-lingual-ner-on-uner-v1-pud-english | UNER XLM-R (all) | 79.5
cross-lingual-ner-on-uner-v1-pud-german | UNER XLM-R (all) | 78.9
cross-lingual-ner-on-uner-v1-pud-portuguese | UNER XLM-R (all) | 85.1
cross-lingual-ner-on-uner-v1-pud-russian | UNER XLM-R (all) | 70.6
cross-lingual-ner-on-uner-v1-pud-swedish | UNER XLM-R (all) | 85.3
cross-lingual-ner-on-uner-v1-serbian | UNER XLM-R (all) | 95.2
cross-lingual-ner-on-uner-v1-slovak | UNER XLM-R (all) | 81.6
cross-lingual-ner-on-uner-v1-swedish | UNER XLM-R (all) | 88.2
cross-lingual-ner-on-uner-v1-tagalog-t | UNER XLM-R (all) | 91.3
cross-lingual-ner-on-uner-v1-tagalog-u | UNER XLM-R (all) | 63.8
named-entity-recognition-ner-on-uner-v1 | UNER XLM-R | 82.70
named-entity-recognition-ner-on-uner-v1-1 | UNER XLM-R | 86.00
named-entity-recognition-ner-on-uner-v1-2 | UNER XLM-R | 93.60
named-entity-recognition-ner-on-uner-v1-3 | UNER XLM-R | 90.4
named-entity-recognition-ner-on-uner-v1-4 | UNER XLM-R | 85.50
named-entity-recognition-ner-on-uner-v1-5 | UNER XLM-R | 94.70
named-entity-recognition-ner-on-uner-v1-6 | UNER XLM-R | 88.30
named-entity-recognition-ner-on-uner-v1-7 | UNER XLM-R | 89.50
named-entity-recognition-ner-on-uner-v1-8 | UNER XLM-R | 89.40
named-entity-recognition-ner-on-uner-v1-pud | UNER XLM-R | 80.10
named-entity-recognition-ner-on-uner-v1-pud-1 | UNER XLM-R | 88.80
named-entity-recognition-ner-on-uner-v1-pud-2 | UNER XLM-R | 82.20
named-entity-recognition-ner-on-uner-v1-pud-3 | UNER XLM-R | 87.10
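
The benchmark entries report micro-averaged F1, but this page does not say how the score is computed. The sketch below shows one common convention, offered as an assumption and not as the authors' evaluation script: entities are read from BIO tag sequences, a predicted entity counts as correct only if its span and label both match a gold entity exactly, and true positives, false positives, and false negatives are pooled over all sentences before F1 is taken. The function names `bio_to_spans` and `micro_f1` and the example tags are hypothetical.

```python
from typing import List, Set, Tuple

Span = Tuple[int, int, str]  # (start, end_exclusive, label)


def bio_to_spans(tags: List[str]) -> Set[Span]:
    """Collect entity spans from one BIO-tagged sentence.

    Assumption: an I- tag that does not continue the current entity
    starts a new one (a lenient reading of malformed sequences).
    """
    spans: Set[Span] = set()
    start, label = None, None
    for i, tag in enumerate(tags + ["O"]):  # trailing "O" closes the last span
        continues = tag.startswith("I-") and label == tag[2:]
        if not continues:
            if label is not None:
                spans.add((start, i, label))
                start, label = None, None
            if tag.startswith("B-") or tag.startswith("I-"):
                start, label = i, tag[2:]
    return spans


def micro_f1(gold: List[List[str]], pred: List[List[str]]) -> float:
    """Micro-averaged F1: pool counts over all sentences, then compute F1."""
    tp = fp = fn = 0
    for gold_tags, pred_tags in zip(gold, pred):
        g, p = bio_to_spans(gold_tags), bio_to_spans(pred_tags)
        tp += len(g & p)  # exact span and label match
        fp += len(p - g)  # predicted but not in gold
        fn += len(g - p)  # in gold but missed
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom else 0.0


# Illustrative tags only; not drawn from the UNER data.
gold = [["B-PER", "I-PER", "O", "B-LOC"], ["B-ORG", "O"]]
pred = [["B-PER", "I-PER", "O", "O"], ["B-ORG", "O"]]
print(round(micro_f1(gold, pred) * 100, 1))
```

Here two of the three gold entities are recovered with no spurious predictions, so precision is 1.0, recall is 2/3, and the score scales to the same 0 to 100 range used in the table.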
