RuSentiment: An Enriched Sentiment Analysis Dataset for Social Media in Russian

Mikhail Gronas, Anna Rumshisky, Anna Rogers, Alex Gribov, Alexey Romanov, Svitlana Volkova

Abstract

This paper presents RuSentiment, a new dataset for sentiment analysis of social media posts in Russian, and a new set of comprehensive annotation guidelines that are extensible to other languages. RuSentiment is currently the largest in its class for Russian, with 31,185 posts annotated with Fleiss' kappa of 0.58 (3 annotations per post). To diversify the dataset, 6,950 posts were pre-selected with an active learning-style strategy. We report baseline classification results, and we also release the best-performing embeddings trained on 3.2B tokens of Russian VKontakte posts.
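
For readers unfamiliar with the agreement statistic quoted in the abstract, the following is a minimal sketch of how Fleiss' kappa can be computed when every post receives the same number of annotations (3 in RuSentiment). The function name `fleiss_kappa`, the label names, and the toy data are illustrative assumptions and are not taken from the paper or its released code.

```python
from collections import Counter


def fleiss_kappa(ratings, categories):
    """Fleiss' kappa for a list of items, each rated by the same number of raters.

    ratings    -- list of per-item label lists, e.g. [["pos", "pos", "neg"], ...]
    categories -- all labels that raters may assign (hypothetical names here)
    """
    n_items = len(ratings)
    n_raters = len(ratings[0])
    if n_raters < 2 or any(len(r) != n_raters for r in ratings):
        raise ValueError("every item needs the same number (>= 2) of ratings")

    counts = [Counter(r) for r in ratings]

    # Observed agreement: share of agreeing rater pairs, averaged over items.
    pairs = n_raters * (n_raters - 1)
    p_items = [
        (sum(c[k] ** 2 for k in categories) - n_raters) / pairs for c in counts
    ]
    p_bar = sum(p_items) / n_items

    # Chance agreement: sum of squared overall category proportions.
    total = n_items * n_raters
    p_cat = [sum(c[k] for c in counts) / total for k in categories]
    p_e = sum(p * p for p in p_cat)

    if p_e == 1.0:
        # Only one category was ever used; kappa is undefined in that case.
        return float("nan")
    return (p_bar - p_e) / (1.0 - p_e)


# Toy example with hypothetical labels and 3 annotations per post.
toy = [
    ["pos", "pos", "pos"],
    ["neg", "neg", "neg"],
    ["pos", "neg", "neu"],
    ["neu", "neu", "pos"],
]
print(round(fleiss_kappa(toy, ["pos", "neg", "neu"]), 3))
```

Kappa is 1 for perfect agreement and close to 0 when agreement is no better than chance, so the reported 0.58 sits between the two.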

Benchmarks

Benchmark | Methodology | Metrics
sentiment-analysis-on-rusentiment | NNC+VK | Weighted F1: 72.8
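
The listing does not say how the weighted F1 score was computed. As a point of reference, the sketch below uses the common definition: per-class F1 averaged with weights proportional to each class's share of the true labels. The function name `weighted_f1` and the toy labels are assumptions made for illustration; the result is a fraction in [0, 1], whereas the benchmark figure appears to be on a 0-100 scale.

```python
from collections import Counter


def weighted_f1(y_true, y_pred):
    """Support-weighted average of per-class F1 scores (value in [0, 1])."""
    if len(y_true) != len(y_pred) or not y_true:
        raise ValueError("y_true and y_pred must be non-empty and equally long")

    support = Counter(y_true)  # number of true instances per class
    total = len(y_true)
    score = 0.0
    for label, n_true in support.items():
        tp = sum(1 for t, p in zip(y_true, y_pred) if t == label and p == label)
        fp = sum(1 for t, p in zip(y_true, y_pred) if t != label and p == label)
        fn = n_true - tp
        denom = 2 * tp + fp + fn
        f1 = 2 * tp / denom if denom else 0.0
        # Each class contributes in proportion to its share of the true labels.
        score += (n_true / total) * f1
    return score


# Toy example with hypothetical labels.
gold = ["pos", "pos", "neg", "neu"]
pred = ["pos", "neg", "neg", "neu"]
print(weighted_f1(gold, pred))
```

Weighting by support means frequent classes dominate the score, which matters for social media data where class sizes are typically uneven.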