A Comparative Study of Feature Types for Age-Based Text Classification

Anna Glazkova Yury Egorov Maksim Glazkov

Abstract

The ability to automatically determine the target age group of a novel opens up many opportunities for developing information retrieval tools. First, developers of book recommendation systems and electronic libraries may want to filter texts by the age of their most likely readers. Second, parents may want to select literature for their children. Finally, writers and publishers can benefit from knowing which features influence whether a text is suitable for children. In this article, we compare the empirical effectiveness of various types of linguistic features for age-based classification of fiction texts. For this purpose, we collected a corpus of book previews, each labeled with one of two categories: children's or adult. We evaluated the following types of features: readability indices, sentiment, lexical, grammatical and general features, and publishing attributes. The results show that features describing the text at the document level can significantly increase the quality of machine learning models.
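As a rough illustration of what "features describing the text at the document level" might look like in practice, the sketch below computes a handful of simple statistics for one text. The function name `document_features`, the chosen statistics, and the naive sentence splitter are all assumptions made for this example; they are not the feature set used in the paper.

```python
import re


def document_features(text):
    """Compute a few simple document-level features for one text.

    Illustrative only: the feature names and the regex-based sentence
    splitting are assumptions, not the paper's actual pipeline.
    """
    # Naive sentence split: break after '.', '!' or '?' followed by whitespace.
    sentences = [s for s in re.split(r"(?<=[.!?])\s+", text.strip()) if s]
    # \w+ matches Unicode word characters, so Cyrillic text works as well.
    words = re.findall(r"\w+", text.lower())
    n_words = len(words)
    n_sentences = len(sentences)
    return {
        "n_words": n_words,
        "n_sentences": n_sentences,
        "avg_sentence_len": n_words / n_sentences if n_sentences else 0.0,
        "avg_word_len": sum(len(w) for w in words) / n_words if n_words else 0.0,
        "type_token_ratio": len(set(words)) / n_words if n_words else 0.0,
    }


sample = "The cat sat. The dog ran away!"
features = document_features(sample)
```

A dictionary like this could be converted to a numeric vector and passed, alongside lexical features, to a linear classifier; how the features are combined in the paper is not described in the abstract.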

Benchmarks

Benchmark | Methodology | Metrics
text-classification-on-rusage-corpus-for-age | LSVC + linguistic features + publishing attributes | F1: 95.77
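For readers unfamiliar with the reported metric, the following minimal sketch computes a binary F1 score from scratch. The page does not state which averaging scheme (binary, macro, or weighted) produced the 95.77 figure, so treating "children" as the positive class is an assumption, and the function name `f1_score_binary` and the toy labels are hypothetical.

```python
def f1_score_binary(y_true, y_pred, positive="children"):
    """Binary F1: harmonic mean of precision and recall for one class.

    Assumption: 'children' is the positive class; the source does not
    say how its F1 figure was averaged.
    """
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Toy labels, invented for illustration only.
y_true = ["children", "children", "adult", "adult"]
y_pred = ["children", "adult", "adult", "children"]
score = f1_score_binary(y_true, y_pred)
```

A score such as 95.77 corresponds to 0.9577 on this 0-to-1 scale.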
