5 months ago

Grounded Situation Recognition with Transformers

Cho Junhyeong ; Yoon Youngseok ; Lee Hyeonjun ; Kwak Suha

Abstract

Grounded Situation Recognition (GSR) is the task that not only classifies a salient action (verb), but also predicts entities (nouns) associated with semantic roles and their locations in the given image. Inspired by the remarkable success of Transformers in vision tasks, we propose a GSR model based on a Transformer encoder-decoder architecture. The attention mechanism of our model enables accurate verb classification by capturing high-level semantic feature of an image effectively, and allows the model to flexibly deal with the complicated and image-dependent relations between entities for improved noun classification and localization. Our model is the first Transformer architecture for GSR, and achieves the state of the art in every evaluation metric on the SWiG benchmark. Our code is available at https://github.com/jhcho99/gsrtr.
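To make the task output concrete, here is a minimal Python sketch of what a GSR prediction might contain: one verb, plus a noun and an optional bounding box for each semantic role. The class names, the box convention, and the example values are illustrative assumptions and are not taken from the paper or its repository.

```python
from dataclasses import dataclass, field
from typing import Dict, Optional, Tuple

# Assumed box convention: (x_min, y_min, x_max, y_max) in pixels.
Box = Tuple[float, float, float, float]


@dataclass
class GroundedEntity:
    """A noun filling one semantic role, with its location if it has one."""
    noun: str
    box: Optional[Box]  # None when the entity has no box in the image


@dataclass
class GroundedSituation:
    """One salient verb plus the grounded entity for each of its roles."""
    verb: str
    roles: Dict[str, GroundedEntity] = field(default_factory=dict)


# Hypothetical example values, not drawn from the SWiG dataset.
situation = GroundedSituation(
    verb="feeding",
    roles={
        "agent": GroundedEntity("person", (10.0, 20.0, 110.0, 220.0)),
        "food": GroundedEntity("hay", None),
    },
)
```

A model for this task would have to fill in all three parts (verb, nouns, boxes), which is why the benchmark reports verb, value, and grounded-value scores separately.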

Code Repositories

jhcho99/gsrtr (official, PyTorch)

Benchmarks

Benchmark: grounded-situation-recognition-on-swig
Methodology: GSRTR
Metrics:
Top-1 Verb: 40.63
Top-1 Verb & Grounded-Value: 25.49
Top-1 Verb & Value: 32.15
Top-5 Verbs: 69.81
Top-5 Verbs & Grounded-Value: 42.5
Top-5 Verbs & Value: 54.13

Benchmark: situation-recognition-on-imsitu
Methodology: GSRTR
Metrics:
Top-1 Verb: 40.63
Top-1 Verb & Value: 32.15
Top-5 Verbs: 69.81
Top-5 Verbs & Value: 54.13
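
The joint "Verb & Value" and "Verb & Grounded-Value" metrics can be read as follows: a role counts as correct only if the verb is correct and the noun matches, and for the grounded variant the predicted box must also overlap the ground-truth box sufficiently. The sketch below is one rough way to compute this for a single role. The function names, the single-annotation comparison, and the 0.5 IoU threshold are assumptions for illustration; the official SWiG evaluation code may differ in its details.

```python
from typing import Optional, Tuple

# Assumed box convention: (x_min, y_min, x_max, y_max).
Box = Tuple[float, float, float, float]


def iou(a: Box, b: Box) -> float:
    """Intersection over union of two axis-aligned boxes."""
    inter_w = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    inter_h = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = inter_w * inter_h
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def value_correct(pred_verb: str, gt_verb: str,
                  pred_noun: str, gt_noun: str) -> bool:
    """Verb & Value: the noun only counts if the verb is also right."""
    return pred_verb == gt_verb and pred_noun == gt_noun


def grounded_value_correct(pred_verb: str, gt_verb: str,
                           pred_noun: str, gt_noun: str,
                           pred_box: Optional[Box], gt_box: Optional[Box],
                           iou_threshold: float = 0.5) -> bool:
    """Verb & Grounded-Value: value is correct and the box matches too."""
    if not value_correct(pred_verb, gt_verb, pred_noun, gt_noun):
        return False
    if pred_box is None or gt_box is None:
        # Assumption: a missing box is correct only if both are missing.
        return pred_box is None and gt_box is None
    return iou(pred_box, gt_box) >= iou_threshold
```

Because each grounded check adds a condition on top of the value check, a grounded-value score cannot exceed the corresponding value score, which matches the ordering of the numbers reported above (for example 25.49 versus 32.15 in the top-1 setting).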
