Learning from Noisy Labels for Entity-Centric Information Extraction

Wenxuan Zhou Muhao Chen

Abstract

Recent information extraction approaches have relied on training deep neural models. However, such models can easily overfit noisy labels and suffer from performance degradation. While it is very costly to filter noisy labels in large learning resources, recent studies show that such labels take more training steps to be memorized and are more frequently forgotten than clean labels, and are therefore identifiable during training. Motivated by these properties, we propose a simple co-regularization framework for entity-centric information extraction, which consists of several neural models with identical structures but different parameter initializations. These models are jointly optimized with the task-specific losses and are regularized by an agreement loss to generate similar predictions, which prevents overfitting on noisy labels. Extensive experiments on two widely used but noisy benchmarks for information extraction, TACRED and CoNLL03, demonstrate the effectiveness of our framework. We release our code to the community for future research.
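The joint objective described in the abstract can be illustrated with a minimal sketch. The abstract does not give the exact form of the agreement loss, so the version below is an assumption: it averages the models' predicted distributions into a soft target and penalizes each model's KL divergence from that target. The function names, the `alpha` weight, and the pure-Python implementation are hypothetical and are not taken from the released code.

```python
import math


def softmax(logits):
    """Convert raw scores to a probability distribution (numerically stable)."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def cross_entropy(probs, label):
    """Task-specific loss for one example: negative log-likelihood of the label."""
    return -math.log(probs[label] + 1e-12)


def kl_divergence(q, p):
    """KL(q || p) between two discrete distributions."""
    return sum(qi * math.log((qi + 1e-12) / (pi + 1e-12)) for qi, pi in zip(q, p))


def co_regularized_loss(logits_per_model, label, alpha=1.0):
    """Combine per-model task losses with an agreement term (assumed form).

    logits_per_model: one list of class scores per model, for the same example.
    label: the (possibly noisy) gold class index.
    alpha: hypothetical weight on the agreement term.
    Returns (total_loss, task_loss, agreement_loss).
    """
    probs = [softmax(logits) for logits in logits_per_model]
    n_models = len(probs)
    n_classes = len(probs[0])

    # Average of the task-specific losses across models.
    task_loss = sum(cross_entropy(p, label) for p in probs) / n_models

    # Soft target: the mean of all models' predicted distributions.
    q = [sum(p[c] for p in probs) / n_models for c in range(n_classes)]

    # Agreement term: how far each model is from the shared soft target.
    agreement_loss = sum(kl_divergence(q, p) for p in probs) / n_models

    return task_loss + alpha * agreement_loss, task_loss, agreement_loss


if __name__ == "__main__":
    # Two models that disagree on the same example produce a positive agreement term.
    total, task, agree = co_regularized_loss(
        [[2.0, 0.5, -1.0], [-1.0, 0.5, 2.0]], label=0
    )
    print(f"task={task:.4f} agreement={agree:.4f} total={total:.4f}")
```

In this sketch the agreement term is zero when all models predict the same distribution and grows as they diverge, so a label that only some models have memorized is pulled back toward the consensus prediction.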

Code Repositories

wzhouad/NLL-IE (official implementation, PyTorch)

Benchmarks

Benchmark | Methodology | Metrics
named-entity-recognition-ner-on-conll-2003 | Co-regularized LUKE | F1: 94.22
named-entity-recognition-on-conll | Noise-robust Co-regularization + LUKE | F1: 95.60
named-entity-recognition-on-conll | Noise-robust Co-regularization + BERT-large | F1: 94.04
relation-extraction-on-tacred | Noise-robust Co-regularization + BERT-large | F1: 73.0
