FOSNet: An End-to-End Trainable Deep Neural Network for Scene Recognition

Hongje Seong; Junhyuk Hyun; Euntai Kim

Abstract

Scene recognition is an image recognition problem that aims to predict the category of the place where an image was taken. This paper proposes a new scene recognition method based on a convolutional neural network (CNN). The method fuses the object and scene information in a given image, and the CNN framework is named FOSNet, short for FOS (fusion of object and scene) Net. In addition, a new loss, the scene coherence loss (SCL), is developed to train FOSNet and improve scene recognition performance. The SCL is based on two traits unique to scenes: the 'sceneness' spreads across the image, and the scene class does not change anywhere in the image. FOSNet was evaluated on the three most popular scene recognition datasets and achieved state-of-the-art performance on two of them: 60.14% on Places 2 and 90.37% on MIT indoor 67. On SUN 397, it achieved the second-highest performance, 77.28%.
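
The abstract does not give the fusion rule or the SCL formula, so the following is only a minimal sketch of the two ideas under stated assumptions, not the authors' formulation. It assumes a hypothetical late fusion in which a scene branch and an object branch both output scores over the same scene classes, and it stands in for the coherence idea with a simple penalty on differences between class probabilities at neighboring cells of a spatial score map. The names `fuse_scores`, `coherence_penalty`, and `weight` are illustrative.

```python
import numpy as np


def softmax(x, axis=-1):
    """Numerically stable softmax along the given axis."""
    shifted = x - x.max(axis=axis, keepdims=True)
    e = np.exp(shifted)
    return e / e.sum(axis=axis, keepdims=True)


def fuse_scores(scene_logits, object_logits, weight=0.5):
    """Hypothetical late fusion (an assumption, not the paper's rule):
    a convex combination of the class probabilities of two branches."""
    return weight * softmax(scene_logits) + (1.0 - weight) * softmax(object_logits)


def coherence_penalty(score_map):
    """Illustrative spatial-coherence term (an assumption, not the paper's SCL).

    score_map has shape (H, W, C): class scores at each spatial cell.
    The penalty is the mean squared difference between class probabilities
    of vertically and horizontally adjacent cells, so it is zero when every
    cell predicts the same class distribution.
    """
    probs = softmax(score_map, axis=-1)
    dh = probs[1:, :, :] - probs[:-1, :, :]   # vertical neighbors
    dw = probs[:, 1:, :] - probs[:, :-1, :]   # horizontal neighbors
    return float((dh ** 2).mean() + (dw ** 2).mean())


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    fused = fuse_scores(rng.normal(size=5), rng.normal(size=5))
    print("fused probabilities sum:", fused.sum())
    print("penalty on a uniform map:", coherence_penalty(np.ones((4, 4, 5))))
    print("penalty on a random map:", coherence_penalty(rng.normal(size=(4, 4, 5))))
```

In a training setup, a term like this would typically be added to the usual classification loss with a weighting factor; how FOSNet actually combines its losses is described in the paper itself.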

Benchmarks

Benchmark | Methodology | Metrics
scene-recognition-on-mit-indoors-scenes | FOSNet | Accuracy: 90.3
scene-recognition-on-places365 | FOSNet | Top 1 Accuracy: 60.14; Top 5 Accuracy: 88.86
scene-recognition-on-sun397 | FOSNet | Accuracy: 77.28
