
BERT Embeddings for Automatic Readability Assessment

Joseph Marvin Imperial

Abstract

Automatic readability assessment (ARA) is the task of evaluating the level of ease or difficulty of text documents for a target audience. One of the open problems in the field is making models trained for the task effective even for low-resource languages. In this study, we propose an alternative way of utilizing the information-rich embeddings of BERT models together with handcrafted linguistic features through a combined method for readability assessment. Results show that the proposed method outperforms classical approaches to readability assessment on English and Filipino datasets, obtaining up to a 12.4% increase in F1 performance. We also show that the general information encoded in BERT embeddings can serve as a substitute feature set for low-resource languages like Filipino, which have limited semantic and syntactic NLP tools for explicitly extracting feature values for the task.
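The core idea described in the abstract is to concatenate a contextual representation taken from a BERT model with a vector of handcrafted linguistic features and train a classical classifier on the combined vector. The snippet below is a minimal sketch of that combined approach, not the authors' exact pipeline: the model name (bert-base-uncased), the mean-pooling strategy, the toy surface-level features, and the placeholder texts and labels are all assumptions made for illustration.

```python
# Sketch: concatenate BERT embeddings with handcrafted features for readability
# classification. Not the paper's exact feature set or training setup.
import numpy as np
import torch
from transformers import AutoModel, AutoTokenizer
from sklearn.linear_model import LogisticRegression

tokenizer = AutoTokenizer.from_pretrained("bert-base-uncased")  # assumed model choice
bert = AutoModel.from_pretrained("bert-base-uncased")
bert.eval()

def bert_embedding(text: str) -> np.ndarray:
    """Mean-pool the last hidden states into a single document vector."""
    inputs = tokenizer(text, truncation=True, max_length=512, return_tensors="pt")
    with torch.no_grad():
        hidden = bert(**inputs).last_hidden_state  # shape (1, seq_len, 768)
    return hidden.mean(dim=1).squeeze(0).numpy()   # shape (768,)

def handcrafted_features(text: str) -> np.ndarray:
    """Toy surface features standing in for the paper's linguistic feature set."""
    words = text.split()
    n_words = max(len(words), 1)
    avg_word_len = sum(len(w) for w in words) / n_words
    n_sentences = max(text.count(".") + text.count("!") + text.count("?"), 1)
    return np.array([avg_word_len, n_words / n_sentences, n_words])

def combined_features(text: str) -> np.ndarray:
    # Concatenate the two feature views into one vector (768 + 3 dimensions here).
    return np.concatenate([bert_embedding(text), handcrafted_features(text)])

# Placeholder data: readability levels would normally come from a labeled corpus.
texts = ["The cat sat on the mat.", "Quantum entanglement defies classical intuition."]
labels = [0, 1]
X = np.vstack([combined_features(t) for t in texts])
clf = LogisticRegression(max_iter=1000).fit(X, labels)
print(clf.predict(X))
```

The design choice here is deliberately simple: BERT supplies general semantic and syntactic information, while the handcrafted vector adds explicit, interpretable signals, and a lightweight classifier combines the two without fine-tuning the transformer.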

Code Repositories

imperialite/BERT-Embeddings-For-ARA (official)

Benchmarks

Benchmark: text-classification-on-onestopenglish
Methodology: Logistic Regression
Metrics: Accuracy (5-fold): 0.744
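A 5-fold accuracy figure like the one above can be computed with scikit-learn roughly as follows; the feature matrix X and labels y below are placeholders standing in for combined features and readability levels from the OneStopEnglish corpus, not the reported experimental setup.

```python
# Sketch of 5-fold cross-validated accuracy with a Logistic Regression classifier.
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import StratifiedKFold, cross_val_score

rng = np.random.default_rng(0)
X = rng.normal(size=(60, 771))      # placeholder combined feature matrix
y = np.repeat([0, 1, 2], 20)        # placeholder 3-level readability labels

cv = StratifiedKFold(n_splits=5, shuffle=True, random_state=42)
scores = cross_val_score(LogisticRegression(max_iter=1000), X, y,
                         cv=cv, scoring="accuracy")
print(f"Accuracy (5-fold): {scores.mean():.3f}")
```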

