Hate-CLIPper: Multimodal Hateful Meme Classification based on Cross-modal Interaction of CLIP Features

Kumar Gokul Karthik ; Nandakumar Karthik

Abstract

Hateful memes are a growing menace on social media. While the image and its corresponding text in a meme are related, they do not necessarily convey the same meaning when viewed individually. Hence, detecting hateful memes requires careful consideration of both visual and textual information. Multimodal pre-training can be beneficial for this task because it effectively captures the relationship between the image and the text by representing them in a similar feature space. Furthermore, it is essential to model the interactions between the image and text features through intermediate fusion. Most existing methods either employ multimodal pre-training or intermediate fusion, but not both. In this work, we propose the Hate-CLIPper architecture, which explicitly models the cross-modal interactions between the image and text representations obtained using Contrastive Language-Image Pre-training (CLIP) encoders via a feature interaction matrix (FIM). A simple classifier based on the FIM representation is able to achieve state-of-the-art performance on the Hateful Memes Challenge (HMC) dataset with an AUROC of 85.8, which even surpasses the human performance of 82.65. Experiments on other meme datasets such as Propaganda Memes and TamilMemes also demonstrate the generalizability of the proposed approach. Finally, we analyze the interpretability of the FIM representation and show that cross-modal interactions can indeed facilitate the learning of meaningful concepts. The code for this work is available at https://github.com/gokulkarthik/hateclipper.
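
The feature interaction matrix mentioned in the abstract can be illustrated with a minimal sketch. It assumes the FIM is the outer product of an image feature vector and a text feature vector of equal dimension, which the abstract does not state explicitly. Random vectors stand in for CLIP encoder outputs, and the function names (`feature_interaction_matrix`, `cross_fusion`, `align_fusion`) are hypothetical labels chosen for illustration, not identifiers from the official repository. The diagonal variant is one plausible reading of the "Align" configuration named in the benchmark listing.

```python
import numpy as np


def feature_interaction_matrix(image_feat, text_feat):
    """Per-sample outer product of image and text features.

    image_feat, text_feat: arrays of shape (batch, n).
    Returns an array of shape (batch, n, n), where entry [b, i, j]
    is image_feat[b, i] * text_feat[b, j].
    ASSUMPTION: the FIM is an outer product; not stated in the abstract.
    """
    return np.einsum("bi,bj->bij", image_feat, text_feat)


def cross_fusion(image_feat, text_feat):
    """Flatten the full FIM into a vector of length n * n per sample."""
    fim = feature_interaction_matrix(image_feat, text_feat)
    return fim.reshape(fim.shape[0], -1)


def align_fusion(image_feat, text_feat):
    """Keep only the FIM diagonal, i.e. the element-wise product."""
    fim = feature_interaction_matrix(image_feat, text_feat)
    return np.diagonal(fim, axis1=1, axis2=2)


# Random stand-ins for CLIP image/text embeddings (hypothetical sizes).
rng = np.random.default_rng(0)
batch, n = 4, 8
image_feat = rng.standard_normal((batch, n))
text_feat = rng.standard_normal((batch, n))

cross = cross_fusion(image_feat, text_feat)   # shape (4, 64)
align = align_fusion(image_feat, text_feat)   # shape (4, 8)

# An untrained linear head with a sigmoid, standing in for the
# "simple classifier" applied to the fused representation.
weights = rng.standard_normal(cross.shape[1]) * 0.01
bias = 0.0
probs = 1.0 / (1.0 + np.exp(-(cross @ weights + bias)))
```

The full flattened matrix grows quadratically with the feature dimension, while the diagonal grows linearly, so the choice between the two trades expressiveness against classifier size.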

Code Repositories

gokulkarthik/hateclipper (official; PyTorch; mentioned in GitHub)

Benchmarks

Benchmark | Methodology | Metrics
hateful-meme-classification-on-harm-p | hateclipper | Accuracy: 87.6; F1: 86.9
hateful-meme-classification-on-harmeme | Hate-CLIPper | AUROC: 91.87; Accuracy: 83.90
hateful-meme-classification-on-pridemm | HateCLIPper | Accuracy: 75.5; F1: 74.1
meme-classification-on-hateful-memes | Hate-CLIPper - Align | ROC-AUC: 0.858
meme-classification-on-multioff | HateCLIPper | Accuracy: 62.4; F1: 54.8
meme-classification-on-tamil-memes | Hate-CLIPper | Micro-F1: 0.59
