RobeCzech: Czech RoBERTa, a monolingual contextualized language representation model

Milan Straka, Jakub Náplava, Jana Straková, David Samuel

Abstract

We present RobeCzech, a monolingual RoBERTa language representation model trained on Czech data. RoBERTa is a robustly optimized Transformer-based pretraining approach. We show that RobeCzech considerably outperforms equally sized multilingual and Czech-trained contextualized language representation models, surpasses the current state of the art in all five evaluated NLP tasks, and reaches state-of-the-art results in four of them. The RobeCzech model is released publicly at https://hdl.handle.net/11234/1-3691 and https://huggingface.co/ufal/robeczech-base.
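
The released checkpoint can be loaded with the standard Hugging Face transformers API. Below is a minimal sketch, assuming a recent transformers installation; the Czech example sentence is our own illustration, not taken from the paper.

```python
from transformers import pipeline

# Load RobeCzech from the Hugging Face Hub (the second release URL in the abstract).
fill_mask = pipeline("fill-mask", model="ufal/robeczech-base")

# Illustrative Czech input: "The capital of the Czech Republic is [MASK]."
# Using tokenizer.mask_token avoids hard-coding the model's mask symbol.
masked = f"Hlavní město České republiky je {fill_mask.tokenizer.mask_token}."

for prediction in fill_mask(masked):
    print(prediction["token_str"], round(prediction["score"], 4))
```

Since RobeCzech is pretrained with a masked language modeling objective, fill-mask is the natural smoke test before fine-tuning on a downstream task.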

Benchmarks

Benchmark: Semantic Parsing on PTG (Czech, MRP 2020)
Methodology: PERIN + RobeCzech
Metrics: F1: 92.36
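
The entry above pairs RobeCzech with the PERIN semantic parser, which consumes the model's contextual representations. The sketch below shows a generic way to extract such per-token embeddings with the transformers API; it is not PERIN's actual integration, and the input sentence is purely illustrative.

```python
import torch
from transformers import AutoModel, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("ufal/robeczech-base")
model = AutoModel.from_pretrained("ufal/robeczech-base")
model.eval()  # inference mode: disables dropout

inputs = tokenizer("Praha je hlavní město České republiky.", return_tensors="pt")
with torch.no_grad():
    outputs = model(**inputs)

# One contextual vector per subword token; hidden size is 768 for a base model.
print(outputs.last_hidden_state.shape)  # torch.Size([1, seq_len, 768])
```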
