5 months ago

SocialGPT: Prompting LLMs for Social Relation Reasoning via Greedy Segment Optimization

Wanhua Li Zibin Meng Jiawei Zhou Donglai Wei Chuang Gan Hanspeter Pfister

Abstract

Social relation reasoning aims to identify relation categories such as friends, spouses, and colleagues from images. While current methods adopt the paradigm of training a dedicated network end-to-end using labeled image data, they are limited in terms of generalizability and interpretability. To address these issues, we first present a simple yet well-crafted framework named SocialGPT, which combines the perception capability of Vision Foundation Models (VFMs) and the reasoning capability of Large Language Models (LLMs) within a modular framework, providing a strong baseline for social relation recognition. Specifically, we instruct VFMs to translate image content into a textual social story, and then utilize LLMs for text-based reasoning. SocialGPT introduces systematic design principles to adapt VFMs and LLMs separately and bridge their gaps. Without additional model training, it achieves competitive zero-shot results on two databases while offering interpretable answers, as LLMs can generate language-based explanations for the decisions. The manual prompt design process for LLMs at the reasoning phase is tedious and an automated prompt optimization method is desired. As we essentially convert a visual classification task into a generative task of LLMs, automatic prompt optimization encounters a unique long prompt optimization issue. To address this issue, we further propose the Greedy Segment Prompt Optimization (GSPO), which performs a greedy search by utilizing gradient information at the segment level. Experimental results show that GSPO significantly improves performance, and our method also generalizes to different image styles. The code is available at https://github.com/Mengzibin/SocialGPT.
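
To make the idea of a greedy search at the segment level more concrete, the following is a minimal sketch under stated assumptions, not the authors' GSPO implementation. The abstract says GSPO uses gradient information to guide the search; this toy version has no gradients and instead tries a fixed list of candidate replacements per segment, keeping a swap only when a score improves. The names `greedy_segment_search` and `toy_score`, the candidate segments, and the keyword-counting score are all hypothetical and stand in for whatever evaluation the real method uses.

```python
from typing import Callable, Dict, List, Sequence


def greedy_segment_search(
    segments: List[str],
    candidates: Dict[int, Sequence[str]],
    score: Callable[[List[str]], float],
    rounds: int = 1,
) -> List[str]:
    """Greedily swap one prompt segment at a time, keeping a swap only if it raises the score."""
    best = list(segments)
    best_score = score(best)
    for _ in range(rounds):
        for idx, options in candidates.items():
            for option in options:
                # Replace only the segment at position idx; all others stay fixed.
                trial = best[:idx] + [option] + best[idx + 1:]
                trial_score = score(trial)
                if trial_score > best_score:
                    best, best_score = trial, trial_score
    return best


def toy_score(prompt_segments: List[str]) -> float:
    # Hypothetical stand-in for a real evaluation signal (e.g. validation accuracy):
    # it simply counts occurrences of a few keywords in the assembled prompt.
    text = " ".join(prompt_segments).lower()
    return float(sum(text.count(word) for word in ("relation", "story", "explain")))


# Illustrative prompt segments and candidates (assumptions, not from the paper).
initial = ["You are a helpful assistant.", "Read the text.", "Give an answer."]
options = {
    0: ["You are an expert in social relation analysis."],
    1: ["Read the social story describing the image."],
    2: ["Choose one relation and explain your choice."],
}
optimized = greedy_segment_search(initial, options, toy_score)
print(optimized)
```

Treating the prompt as a list of segments keeps each search step small: only one segment changes per trial, so a long prompt never has to be optimized as a whole.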

Code Repositories

mengzibin/socialgpt
Official
pytorch
Mentioned in GitHub

Benchmarks

Benchmark | Methodology | Metrics
visual-social-relationship-recognition-on-1 | SocialGPT | Accuracy: 66.7
