Grounded Situation Recognition

Sarah Pratt, Mark Yatskar, Luca Weihs, Ali Farhadi, Aniruddha Kembhavi

Abstract

We introduce Grounded Situation Recognition (GSR), a task that requires producing structured semantic summaries of images describing: the primary activity, entities engaged in the activity with their roles (e.g. agent, tool), and bounding-box groundings of entities. GSR presents important technical challenges: identifying semantic saliency, categorizing and localizing a large and diverse set of entities, overcoming semantic sparsity, and disambiguating roles. Moreover, unlike in captioning, GSR is straightforward to evaluate. To study this new task we create the Situations With Groundings (SWiG) dataset which adds 278,336 bounding-box groundings to the 11,538 entity classes in the imSitu dataset. We propose a Joint Situation Localizer and find that jointly predicting situations and groundings with end-to-end training handily outperforms independent training on the entire grounding metric suite with relative gains between 8% and 32%. Finally, we show initial findings on three exciting future directions enabled by our models: conditional querying, visual chaining, and grounded semantic aware image retrieval. Code and data available at https://prior.allenai.org/projects/gsr.
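As a rough illustration of the structured summary the abstract describes (a primary activity, entities with roles, and a bounding box per entity), the following minimal sketch models one GSR output in Python. The class names, the box format, and the example values are all hypothetical and chosen only for illustration; they are not taken from the SWiG data format or the authors' code.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional, Tuple

# Assumed box format (hypothetical): (x_min, y_min, x_max, y_max) in pixels.
Box = Tuple[float, float, float, float]


@dataclass
class GroundedEntity:
    noun: str                  # entity class filling the role
    box: Optional[Box] = None  # None when the entity has no grounding


@dataclass
class GroundedSituation:
    verb: str                                        # primary activity
    roles: Dict[str, GroundedEntity] = field(default_factory=dict)

    def grounded_roles(self) -> List[str]:
        """Return the names of roles whose entity has a bounding box."""
        return [name for name, ent in self.roles.items() if ent.box is not None]


# Made-up example values, not drawn from the dataset.
example = GroundedSituation(
    verb="cutting",
    roles={
        "agent": GroundedEntity("person", (40.0, 30.0, 220.0, 400.0)),
        "tool": GroundedEntity("knife", (180.0, 210.0, 260.0, 250.0)),
        "place": GroundedEntity("kitchen"),  # role filled, but not localized
    },
)

print(example.verb, example.grounded_roles())
```

Keeping the box optional reflects that a role can be filled by an entity that is not localized in the image; whether and how the dataset encodes that case is an assumption of this sketch.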

Code Repositories

allenai/swig (PyTorch, mentioned in GitHub)

Benchmarks

Benchmark | Methodology | Metrics
grounded-situation-recognition-on-swig | JSL | Top-1 Verb: 39.94; Top-1 Verb & Grounded-Value: 24.86; Top-1 Verb & Value: 31.44; Top-5 Verbs: 67.6; Top-5 Verbs & Grounded-Value: 40.6; Top-5 Verbs & Value: 51.88
grounded-situation-recognition-on-swig | ISL | Top-1 Verb: 39.36; Top-1 Verb & Grounded-Value: 22.73; Top-1 Verb & Value: 30.09; Top-5 Verbs: 65.51; Top-5 Verbs & Grounded-Value: 36.6; Top-5 Verbs & Value: 50.16
situation-recognition-on-imsitu | JSL | Top-1 Verb: 39.94; Top-1 Verb & Value: 31.44; Top-5 Verbs: 67.6; Top-5 Verbs & Value: 51.88
situation-recognition-on-imsitu | ISL | Top-1 Verb: 39.36; Top-1 Verb & Value: 30.09; Top-5 Verbs: 65.51; Top-5 Verbs & Value: 50.16
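
The abstract reports relative gains for joint over independent training. As a small worked check, the sketch below computes the relative gain of JSL over ISL for the two grounded metrics listed in the benchmark rows above. It assumes that JSL is the jointly trained model and ISL the independently trained baseline, and that "relative gain" means (new - baseline) / baseline; the function and variable names are hypothetical. Only two grounded metrics appear on this page, so this does not reproduce the full 8% to 32% range quoted for the whole grounding metric suite.

```python
# Scores copied from the SWiG benchmark rows on this page.
scores = {
    "Top-1 Verb & Grounded-Value": {"JSL": 24.86, "ISL": 22.73},
    "Top-5 Verbs & Grounded-Value": {"JSL": 40.6, "ISL": 36.6},
}


def relative_gain(new: float, baseline: float) -> float:
    """Relative improvement of `new` over `baseline`, in percent."""
    return 100.0 * (new - baseline) / baseline


gains = {
    metric: relative_gain(vals["JSL"], vals["ISL"])
    for metric, vals in scores.items()
}

for metric, gain in gains.items():
    print(f"{metric}: {gain:.1f}% relative gain")
```

Under these assumptions, both listed metrics show a relative gain of roughly 9% to 11%, which falls inside the range stated in the abstract.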
