3 months ago

KnowZRel: Common Sense Knowledge-based Zero-Shot Relationship Retrieval for Generalised Scene Graph Generation

Edward Curry, John G. Breslin, M. Jaleed Khan

Abstract

A scene graph is a key image representation in visual reasoning. The generalisability of Scene Graph Generation (SGG) methods is crucial for reliable reasoning and real-world applicability. However, imbalanced training datasets limit generalisability because they underrepresent meaningful visual relationships. Current SGG methods that use external knowledge sources are limited by these imbalances or by restricted relationship coverage, which affects their reasoning and generalisation capabilities. We propose a novel neurosymbolic approach that integrates data-driven object detection with heterogeneous knowledge graph-based object refinement and zero-shot relationship retrieval, highlighting the loosely coupled synergy between neural and symbolic components. This combination addresses the limitations of imbalanced training datasets in scene graph generation and enables effective prediction of unseen visual relationships. Objects are detected using a region-based deep neural network and refined based on their positional and structural similarity, followed by retrieval of pairwise visual relationships using a heterogeneous knowledge graph. Redundant and irrelevant visual relationships are discarded based on the similarity of relationship labels and node embeddings. Finally, the visual relationships are interlinked to generate the scene graph. The employed heterogeneous knowledge graph combines diverse knowledge sources, offering rich common sense knowledge about objects and their interactions in the world. Our method, evaluated using the benchmark Visual Genome dataset and the zero-shot recall (zR@K) metric, shows a 59.96% improvement over existing state-of-the-art methods, highlighting its effectiveness in generalised SGG. The object refinement step improved object detection performance by 57.1%. Additional evaluation using the GQA dataset confirms the cross-dataset generalisability of our method. We also compared various knowledge sources and embedding models to determine an optimal combination for zero-shot SGG. The source code is available at https://github.com/jaleedkhan/zsrr-sgg.
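
The retrieval and filtering steps described in the abstract can be illustrated with a minimal Python sketch. This is not the authors' implementation: the toy knowledge graph `TOY_KG`, the 2-D embeddings `TOY_EMB`, the cosine-similarity test, and the `threshold` value are all assumptions made for illustration, and the object detection and object refinement stages are omitted (the object labels are simply given as input).

```python
from itertools import permutations
from math import sqrt

# Toy stand-in for a heterogeneous common sense knowledge graph (hypothetical).
# Keys are (subject, object) pairs; values are candidate predicates.
TOY_KG = {
    ("man", "horse"): ["riding", "sitting on", "on"],
    ("man", "hat"): ["wearing", "has"],
    ("hat", "man"): ["on"],
}

# Hypothetical 2-D predicate embeddings; a real system would use learned
# label or node embeddings derived from the knowledge graph.
TOY_EMB = {
    "riding": (0.9, 0.1),
    "sitting on": (0.85, 0.2),
    "on": (0.2, 0.9),
    "wearing": (0.1, 0.8),
    "has": (0.7, 0.7),
}


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    return dot / (sqrt(sum(a * a for a in u)) * sqrt(sum(b * b for b in v)))


def retrieve_relationships(objects, kg=TOY_KG):
    """Look up candidate predicates for every ordered pair of objects."""
    triples = []
    for subj, obj in permutations(objects, 2):
        for pred in kg.get((subj, obj), []):
            triples.append((subj, pred, obj))
    return triples


def drop_redundant(triples, emb=TOY_EMB, threshold=0.95):
    """Keep a triple unless an already-kept triple for the same object pair
    has a predicate whose embedding is nearly identical (assumed criterion)."""
    kept = []
    for subj, pred, obj in triples:
        duplicate = any(
            s == subj and o == obj and cosine(emb[pred], emb[p]) >= threshold
            for s, p, o in kept
        )
        if not duplicate:
            kept.append((subj, pred, obj))
    return kept


# Interlinking the surviving triples yields the edges of the scene graph.
scene_graph = drop_redundant(retrieve_relationships(["man", "horse", "hat"]))
```

With these toy values, `sitting on` is discarded as a near-duplicate of `riding` for the pair (man, horse), while the remaining five triples are kept. Because candidate predicates come from the knowledge graph rather than from training annotations, relationships never seen during training can still be proposed, which is the property the abstract refers to as zero-shot relationship retrieval.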

Benchmarks

| Benchmark | Methodology | Metrics |
| --- | --- | --- |
| object-detection-on-gqa | KnowZRel | mAP: 39 |
| object-detection-on-visual-genome | KnowZRel | mAP: 44 |
| scene-graph-generation-on-gqa | KnowZRel | zR@20: 12.47, zR@50: 22.51, zR@100: 29.56 |
| scene-graph-generation-on-visual-genome | KnowZRel | zR@20: 14.22, zR@50: 25.43, zR@100: 35.65 |
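
For readers unfamiliar with the zR@K values reported above, the following Python sketch shows one common way zero-shot recall is computed for a single image: recall over the top K ranked predictions, restricted to ground-truth triples whose (subject, predicate, object) combination does not occur in the training set. The function name and the example triples are hypothetical, and the sketch matches triples by label only; benchmark evaluations typically also require the predicted boxes to overlap the ground-truth boxes and average the score over images, and the page does not state the exact protocol used here.

```python
def zero_shot_recall_at_k(ranked_predictions, ground_truth, train_triples, k):
    """Fraction of unseen ground-truth triples found in the top-k predictions.

    ranked_predictions: list of (subject, predicate, object), best first.
    ground_truth: iterable of annotated triples for the image.
    train_triples: set of triples that occur in the training data.
    Returns None when the image has no zero-shot ground-truth triples.
    """
    zero_shot_gt = {t for t in ground_truth if t not in train_triples}
    if not zero_shot_gt:
        return None
    top_k = set(ranked_predictions[:k])
    return len(zero_shot_gt & top_k) / len(zero_shot_gt)


# Hypothetical example data.
train = {("man", "wearing", "hat")}
gt = [
    ("man", "wearing", "hat"),       # seen in training, so ignored by zR@K
    ("man", "riding", "horse"),      # unseen
    ("dog", "on", "surfboard"),      # unseen
]
preds = [
    ("man", "wearing", "hat"),
    ("man", "riding", "horse"),
    ("man", "on", "horse"),
    ("dog", "on", "surfboard"),
]

zr_at_2 = zero_shot_recall_at_k(preds, gt, train, k=2)
zr_at_4 = zero_shot_recall_at_k(preds, gt, train, k=4)
```

In this example one of the two unseen triples appears in the top 2 predictions and both appear in the top 4, so the score rises with K, which is the same pattern seen in the reported zR@20, zR@50 and zR@100 values.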
