XLM-T: Multilingual Language Models in Twitter for Sentiment Analysis and Beyond

Francesco Barbieri Luis Espinosa Anke Jose Camacho-Collados

Abstract

Language models are ubiquitous in current NLP, and their multilingual capacity has recently attracted considerable attention. However, current analyses have focused almost exclusively on (multilingual variants of) standard benchmarks, and have relied on clean pre-training and task-specific corpora as multilingual signals. In this paper, we introduce XLM-T, a model for training and evaluating multilingual language models in Twitter. We provide: (1) a new strong multilingual baseline consisting of an XLM-R (Conneau et al. 2020) model pre-trained on millions of tweets in over thirty languages, alongside starter code for subsequently fine-tuning it on a target task; and (2) a set of unified Twitter sentiment analysis datasets in eight different languages, together with an XLM-T model fine-tuned on them.
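As a rough illustration of how the fine-tuned sentiment model described in the abstract might be used, the sketch below loads a checkpoint through the Hugging Face `transformers` pipeline. Several things here are assumptions rather than details from the text above: the checkpoint name `cardiffnlp/twitter-xlm-roberta-base-sentiment`, the helper names `normalize_tweet` and `classify`, and the preprocessing step that masks user mentions and links (a convention commonly shown alongside Twitter checkpoints, not something the abstract specifies).

```python
from typing import Dict, List

# Assumption: name of the publicly hosted sentiment checkpoint on the
# Hugging Face Hub; the abstract itself does not give a model identifier.
MODEL_NAME = "cardiffnlp/twitter-xlm-roberta-base-sentiment"


def normalize_tweet(text: str) -> str:
    """Mask user mentions as '@user' and links as 'http'.

    Assumption: this mirrors a preprocessing convention often used with
    Twitter language models; it is not described in the abstract.
    """
    tokens = []
    for tok in text.split(" "):
        if tok.startswith("@") and len(tok) > 1:
            tok = "@user"
        elif tok.startswith("http"):
            tok = "http"
        tokens.append(tok)
    return " ".join(tokens)


def classify(tweets: List[str]) -> List[Dict[str, object]]:
    """Classify tweets with the (assumed) XLM-T sentiment checkpoint.

    Requires `transformers`, `torch`, and `sentencepiece` to be installed,
    and downloads the model weights on first use.
    """
    # Imported lazily so that normalize_tweet works without these packages.
    from transformers import pipeline

    clf = pipeline("sentiment-analysis", model=MODEL_NAME, tokenizer=MODEL_NAME)
    return clf([normalize_tweet(t) for t in tweets])
```

Calling `classify(["@someone me encanta esto! https://example.com"])` would return one dictionary per tweet with `label` and `score` keys. The lazy import keeps the normalization step usable on its own, for example when preparing data for fine-tuning.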

Code Repositories

cardiffnlp/xlm-t
Official
Mentioned in GitHub

Benchmarks

Benchmark: sentiment-analysis-on-tweeteval
Methodology: RoB-RT
Metrics:
ALL: 65.2
Emoji: 31.4
Emotion: 79.5
Hate: 52.3
Irony: 61.7
Offensive: 80.5
Sentiment: 72.6
Stance: 69.3
