A benchmark for toxic comment classification on Civil Comments dataset

Corentin Duchene, Henri Jamet, Pierre Guillaume, Reda Dehak

Abstract

Toxic comment detection on social media has proven to be essential for content moderation. This paper compares a wide set of models on a highly skewed multi-label hate speech dataset. We consider inference time and several metrics measuring both performance and bias in our comparison. We show that all BERT-based models reach similar performance regardless of their size, of optimizations, or of the language used to pre-train them. RNNs are much faster at inference than any of the BERT models, and the BiLSTM remains a good compromise between performance and inference time. RoBERTa trained with a focal loss offers the best performance on the bias metrics and on AUROC, while DistilBERT combines a good AUROC with a low inference time. All models are affected by the bias of identity association, although BERT, the RNNs, and XLNet are less sensitive to it than the CNN and the Compact Convolutional Transformer.
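
The focal loss mentioned in the abstract addresses exactly the label skew the paper highlights. As an illustration only (not the paper's training code), here is a minimal sketch of a multi-label focal loss in PyTorch, following the standard formulation of Lin et al. (2017); the hyperparameters gamma = 2 and alpha = 0.25 are illustrative defaults, not the paper's settings:

```python
import torch
import torch.nn.functional as F

def focal_loss(logits, targets, gamma=2.0, alpha=0.25):
    """Multi-label focal loss (Lin et al., 2017) over sigmoid outputs.

    gamma and alpha are illustrative defaults, not the paper's settings.
    """
    # Per-label binary cross-entropy, i.e. -log(p_t) for each label.
    bce = F.binary_cross_entropy_with_logits(logits, targets, reduction="none")
    p = torch.sigmoid(logits)
    # Probability the model assigns to the true class of each label.
    p_t = p * targets + (1 - p) * (1 - targets)
    # Class-balance weight: alpha for positives, 1 - alpha for negatives.
    alpha_t = alpha * targets + (1 - alpha) * (1 - targets)
    # (1 - p_t)**gamma down-weights labels the model already gets right.
    return (alpha_t * (1 - p_t) ** gamma * bce).mean()
```

Compared with plain binary cross-entropy (the RoBERTa BCE entry in the benchmarks below), the (1 - p_t)^gamma factor suppresses the loss from the abundant easy negatives, which is one plausible reason the focal-loss variant scores best on the bias metrics and AUROC.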

Benchmarks

All results are on the toxic-comment-classification-on-civil benchmark (Civil Comments); "–" marks a metric not listed for that entry.

| Methodology | AUROC | GMB BNSP | GMB BPSN | GMB Subgroup | Macro F1 | Micro F1 | Precision | Recall |
|---|---|---|---|---|---|---|---|---|
| BiLSTM | – | – | – | 0.8636 | – | 0.5115 | 0.3572 | – |
| Unfreeze GloVe ResNet 44 | 0.966 | – | 0.8493 | 0.8421 | 0.4648 | 0.5958 | 0.4835 | 0.7759 |
| Compact Convolutional Transformer (CCT) | 0.9526 | 0.9447 | 0.8307 | 0.8133 | 0.3428 | 0.4874 | 0.3507 | 0.7983 |
| BiGRU | – | – | 0.8616 | – | – | – | – | – |
| Freeze GloVe ResNet 44 | – | – | 0.7876 | 0.8219 | 0.4189 | 0.5591 | 0.4631 | 0.7053 |
| BERTweet | 0.979 | 0.9603 | 0.8945 | 0.878 | 0.3612 | 0.4928 | 0.3363 | 0.9216 |
| XLNet | – | 0.9597 | 0.8834 | 0.8689 | 0.3336 | 0.4586 | 0.3045 | 0.9254 |
| XLM-RoBERTa | – | – | 0.8859 | – | – | 0.468 | 0.3135 | 0.923 |
| DistilBERT | 0.9804 | 0.9644 | 0.874 | 0.8762 | 0.3879 | 0.5115 | 0.3572 | 0.9001 |
| RoBERTa Focal Loss | 0.9818 | 0.9581 | 0.901 | 0.8807 | 0.4648 | 0.5524 | 0.4017 | 0.8839 |
| RoBERTa BCE | 0.9813 | 0.9616 | 0.8901 | 0.88 | 0.4749 | 0.5359 | 0.3836 | 0.8891 |
| Unfreeze GloVe ResNet 56 | 0.9639 | – | 0.8445 | 0.8487 | 0.3778 | – | – | 0.8707 |
| HateBERT | 0.9791 | 0.9589 | 0.8915 | 0.8744 | 0.3679 | 0.4844 | 0.3297 | 0.9165 |
| ALBERT | 0.979 | 0.9499 | 0.8982 | 0.8734 | 0.3541 | 0.4845 | 0.3247 | 0.9104 |
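
The GMB columns are generalized means of per-identity bias AUCs, following the metric suite of the Jigsaw Unintended Bias challenge (Borkan et al., 2019): Subgroup AUC restricts scoring to comments that mention an identity; BPSN (Background Positive, Subgroup Negative) mixes toxic comments outside the identity with non-toxic comments inside it; BNSP is the reverse pairing. Below is a minimal sketch of how such scores are typically computed and aggregated, assuming NumPy and scikit-learn; the power p = -5 and the helper names are illustrative, not taken from the paper:

```python
import numpy as np
from sklearn.metrics import roc_auc_score

def generalized_mean_bias(subgroup_aucs, p=-5):
    """Generalized mean of per-identity bias AUCs (Borkan et al., 2019).

    With a strongly negative p the mean sits close to the minimum, so
    one badly handled identity drags the whole score down.
    """
    aucs = np.asarray(subgroup_aucs, dtype=float)
    return float(np.power(np.mean(np.power(aucs, p)), 1.0 / p))

def bpsn_auc(y_true, y_score, in_subgroup):
    """BPSN AUC for one identity: background-positive vs subgroup-negative.

    Keeps toxic comments *outside* the subgroup and non-toxic comments
    *inside* it; a low value means harmless mentions of the identity are
    being over-flagged as toxic.
    """
    mask = (in_subgroup & (y_true == 0)) | (~in_subgroup & (y_true == 1))
    return roc_auc_score(y_true[mask], y_score[mask])
```

Applying bpsn_auc once per identity and feeding the resulting list to generalized_mean_bias yields a GMB BPSN-style score; GMB BNSP and GMB Subgroup follow the same pattern with their respective example selections.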
