Sense Vocabulary Compression through the Semantic Knowledge of WordNet for Neural Word Sense Disambiguation

Loïc Vial; Benjamin Lecouteux; Didier Schwab

Abstract

In this article, we tackle the limited quantity of manually sense-annotated corpora available for word sense disambiguation (WSD). We exploit the semantic relationships between senses, such as synonymy, hypernymy and hyponymy, to compress the sense vocabulary of Princeton WordNet, and thus reduce the number of different sense tags that must be observed to disambiguate all words of the lexical database. We propose two different methods that greatly reduce the size of neural WSD models, with the benefit of improving their coverage without additional training data, and without impacting their precision. In addition to our methods, we present a WSD system which relies on pre-trained BERT word vectors in order to achieve results that significantly outperform the state of the art on all WSD evaluation tasks.
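The hypernymy-based compression mentioned in the abstract can be illustrated with a small sketch. This is not the code from getalp/disambiguate: the hierarchy, the lexicon, and the function names below are hypothetical, the hierarchy is assumed to be a single-parent tree (WordNet allows multiple hypernyms), and the fallback for senses with no marked ancestor is a choice made for this sketch. The idea shown is to map each sense to an ancestor that is as general as possible while still keeping the senses of every word distinct.

```python
from itertools import combinations

# Hypothetical toy hypernymy tree: child synset -> parent synset.
HYPERNYM = {
    "mouse.animal": "rodent",
    "rat.animal": "rodent",
    "rodent": "mammal",
    "mammal": "animal",
    "animal": "living_thing",
    "living_thing": "entity",
    "mouse.device": "electronic_device",
    "keyboard.device": "electronic_device",
    "electronic_device": "device",
    "device": "artifact",
    "artifact": "entity",
}

# Hypothetical lexicon: word -> candidate synsets.
LEXICON = {
    "mouse": ["mouse.animal", "mouse.device"],
    "rat": ["rat.animal"],
    "keyboard": ["keyboard.device"],
}


def path_to_root(synset):
    """Return the chain [synset, parent, ..., root]."""
    path = [synset]
    while path[-1] in HYPERNYM:
        path.append(HYPERNYM[path[-1]])
    return path


def necessary_synsets(lexicon):
    """Mark the nodes needed to tell apart the senses of each word.

    For every pair of senses of a word, the two nodes just below their
    lowest common ancestor are marked as necessary.
    """
    necessary = set()
    for senses in lexicon.values():
        for a, b in combinations(senses, 2):
            down_a = path_to_root(a)[::-1]  # root first
            down_b = path_to_root(b)[::-1]
            i = 0
            while i < len(down_a) and i < len(down_b) and down_a[i] == down_b[i]:
                i += 1
            if i < len(down_a):
                necessary.add(down_a[i])
            if i < len(down_b):
                necessary.add(down_b[i])
    return necessary


def compress(lexicon):
    """Map every sense to its closest necessary ancestor (or itself).

    Senses with no necessary node on their path fall back to the root;
    this fallback is an assumption of the sketch.
    """
    necessary = necessary_synsets(lexicon)
    mapping = {}
    for senses in lexicon.values():
        for sense in senses:
            path = path_to_root(sense)
            mapping[sense] = next((n for n in path if n in necessary), path[-1])
    return mapping


mapping = compress(LEXICON)
for sense, tag in sorted(mapping.items()):
    print(f"{sense} -> {tag}")
```

In this toy example the four original sense tags collapse into two, yet the two senses of "mouse" still receive different tags. A model trained on an annotated example of "rat" would also have seen the tag needed for the animal sense of "mouse", which is the coverage benefit the abstract describes.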

Code Repositories

getalp/disambiguate (official, PyTorch; mentioned in GitHub)

Gozzo18/WSD-Final-Homework---NLP (mentioned in GitHub)

Benchmarks

Benchmark	Methodology	Metrics
word-sense-disambiguation-on-semeval-2007	SemCor+WNGC, hypernyms	F1: 73.4
word-sense-disambiguation-on-semeval-2007-1	SemCor+WNGC, hypernyms	F1: 90.4
word-sense-disambiguation-on-semeval-2013	SemCor+WNGC, hypernyms	F1: 78.7
word-sense-disambiguation-on-semeval-2015	SemCor+WNGC, hypernyms	F1: 82.6
word-sense-disambiguation-on-senseval-2	SemCor+WNGC, hypernyms	F1: 79.7
word-sense-disambiguation-on-senseval-3-task	SemCor+WNGC, hypernyms	F1: 77.8
word-sense-disambiguation-on-supervised	SemCor+WNGC, hypernyms	SemEval 2007: 73.4, SemEval 2013: 78.7, SemEval 2015: 82.6, Senseval 2: 79.7, Senseval 3: 77.8
