5 months ago

SocialIQA: Commonsense Reasoning about Social Interactions

Maarten Sap; Hannah Rashkin; Derek Chen; Ronan LeBras; Yejin Choi

Abstract

We introduce Social IQa, the first large-scale benchmark for commonsense reasoning about social situations. Social IQa contains 38,000 multiple-choice questions for probing emotional and social intelligence in a variety of everyday situations (e.g., Q: "Jordan wanted to tell Tracy a secret, so Jordan leaned towards Tracy. Why did Jordan do this?" A: "Make sure no one else could hear"). Through crowdsourcing, we collect commonsense questions along with correct and incorrect answers about social interactions, using a new framework that mitigates stylistic artifacts in incorrect answers by asking workers to provide the right answer to a different but related question. Empirical results show that our benchmark is challenging for existing question-answering models based on pretrained language models, compared to human performance (>20% gap). Notably, we further establish Social IQa as a resource for transfer learning of commonsense knowledge, achieving state-of-the-art performance on multiple commonsense reasoning tasks (Winograd Schemas, COPA).
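The abstract's idea of taking incorrect answers from the right answers to a different but related question can be illustrated with a minimal Python sketch. Everything beyond the Jordan/Tracy example is an assumption made for illustration: the class names, the `build_item` helper, and in particular the related question and its answer are hypothetical and are not taken from the dataset or the authors' collection pipeline.

```python
import random
from dataclasses import dataclass


@dataclass(frozen=True)
class QA:
    """A question about a context, with the answer a worker judged correct."""
    question: str
    correct_answer: str


@dataclass(frozen=True)
class MultipleChoiceItem:
    context: str
    question: str
    options: tuple
    label: int  # index of the correct option within `options`


def build_item(context, target, related, seed=0):
    """Build one multiple-choice item (hypothetical helper).

    Incorrect options are the correct answers to *related* questions about
    the same context, so they are written in the same style as right answers.
    """
    distractors = [
        qa.correct_answer
        for qa in related
        if qa.correct_answer != target.correct_answer
    ]
    options = [target.correct_answer] + distractors
    random.Random(seed).shuffle(options)  # avoid a fixed position for the answer
    label = options.index(target.correct_answer)
    return MultipleChoiceItem(context, target.question, tuple(options), label)


context = "Jordan wanted to tell Tracy a secret, so Jordan leaned towards Tracy."
# Question and answer quoted in the abstract.
target = QA("Why did Jordan do this?", "Make sure no one else could hear")
# HYPOTHETICAL related question and answer, invented for this sketch only.
related = [QA("What will Tracy want to do next?", "Listen to what Jordan has to say")]

item = build_item(context, target, related)
print(item.question, item.options, item.label)
```

Because the distractor was written as a genuine answer to another question, it is plausible on its face, which is the property the abstract credits with reducing stylistic artifacts.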

Code Repositories

clear-nus/llm-human-model
Mentioned in GitHub

Benchmarks

| Benchmark | Methodology | Metrics |
| --- | --- | --- |
| coreference-resolution-on-winograd-schema | BERT-large 340M | Accuracy: 67 |
| coreference-resolution-on-winograd-schema | BERT-SocialIQA 340M | Accuracy: 72.5 |
| question-answering-on-copa | BERT-large 340M | Accuracy: 80.8 |
| question-answering-on-copa | BERT-SocialIQA 340M | Accuracy: 83.4 |
| question-answering-on-social-iqa | Random chance baseline | Accuracy: 33.3 |
| question-answering-on-social-iqa | BERT-base 110M (fine-tuned) | Accuracy: 63.1 |
| question-answering-on-social-iqa | BERT-large 340M (fine-tuned) | Accuracy: 64.5 |
| question-answering-on-social-iqa | GPT-1 117M (fine-tuned) | Accuracy: 63 |
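
The accuracy figures listed above can be compared with a few lines of arithmetic. The following Python sketch copies the numbers from the listing and computes the gain of BERT-SocialIQA over BERT-large on each transfer task, plus each model's margin over the random chance baseline on Social IQa. The dictionary layout and function names are assumptions chosen for illustration.

```python
# Accuracy figures (percent) copied from the benchmark listing.
TRANSFER_RESULTS = {
    "winograd-schema": {"BERT-large": 67.0, "BERT-SocialIQA": 72.5},
    "copa": {"BERT-large": 80.8, "BERT-SocialIQA": 83.4},
}
SOCIAL_IQA_RESULTS = {
    "Random chance baseline": 33.3,
    "BERT-base (fine-tuned)": 63.1,
    "BERT-large (fine-tuned)": 64.5,
    "GPT-1 (fine-tuned)": 63.0,
}


def transfer_gain(scores):
    """Accuracy points gained by BERT-SocialIQA over plain BERT-large."""
    return round(scores["BERT-SocialIQA"] - scores["BERT-large"], 1)


def margin_over_chance(results, chance_key="Random chance baseline"):
    """Accuracy points by which each model exceeds the chance baseline."""
    chance = results[chance_key]
    return {
        name: round(acc - chance, 1)
        for name, acc in results.items()
        if name != chance_key
    }


gains = {task: transfer_gain(scores) for task, scores in TRANSFER_RESULTS.items()}
margins = margin_over_chance(SOCIAL_IQA_RESULTS)
best_model = max(margins, key=margins.get)

print(gains)       # {'winograd-schema': 5.5, 'copa': 2.6}
print(margins)
print(best_model)  # BERT-large (fine-tuned)
```

A chance accuracy of 33.3 is what uniform guessing among three options would give. The listing does not report a human accuracy figure, so the >20% gap mentioned in the abstract cannot be recomputed from these rows alone.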
