SciCap: Generating Captions for Scientific Figures

Ting-Yao Hsu, C. Lee Giles, Ting-Hao 'Kenneth' Huang

Abstract

Researchers use figures to communicate rich, complex information in scientific papers. The captions of these figures are critical to conveying effective messages. However, low-quality figure captions commonly occur in scientific articles and may decrease understanding. In this paper, we propose an end-to-end neural framework to automatically generate informative, high-quality captions for scientific figures. To this end, we introduce SCICAP, a large-scale figure-caption dataset based on computer science arXiv papers published between 2010 and 2020. After pre-processing (including figure-type classification, sub-figure identification, text normalization, and caption text selection), SCICAP contained more than two million figures extracted from over 290,000 papers. We then established baseline models that caption graph plots, the dominant (19.2%) figure type. The experimental results showed both opportunities and steep challenges of generating captions for scientific figures.

Code Repositories

tingyaohsu/scicap (official)

Benchmarks

| Benchmark | Methodology | Metrics |
| image-captioning-on-scicap | CNN+LSTM (Text only, Caption w/ <=100 words) | BLEU-4: 0.0165 |
| image-captioning-on-scicap | CNN+LSTM (Vision + Text, Caption w/ <=100 words) | BLEU-4: 0.0168 |
| image-captioning-on-scicap | CNN+LSTM (Vision only, Single-Sent Caption) | BLEU-4: 0.0207 |
| image-captioning-on-scicap | CNN+LSTM (Text only, Single-Sent Caption) | BLEU-4: 0.0212 |
| image-captioning-on-scicap | CNN+LSTM (Vision + Text, First sentence) | BLEU-4: 0.0205 |
| image-captioning-on-scicap | CNN+LSTM (Text only, First sentence) | BLEU-4: 0.0213 |
| image-captioning-on-scicap | CNN+LSTM (Vision only, First sentence) | BLEU-4: 0.0219 |
| image-captioning-on-scicap | CNN+LSTM (Vision only, Caption w/ <=100 words) | BLEU-4: 0.0172 |
| image-captioning-on-scicap | CNN+LSTM (Vision + Text, Single-Sent Caption) | BLEU-4: 0.0202 |
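
For readers unfamiliar with the metric, the sketch below shows how a sentence-level BLEU-4 score can be computed from a generated caption and a single reference caption. It is a minimal illustration, not the evaluation code behind the numbers above: the whitespace tokenization, lowercasing, absence of smoothing, and the function names `ngram_counts` and `bleu4` are all assumptions made for this example, and the scores listed here may have been produced with different settings.

```python
import math
from collections import Counter


def ngram_counts(tokens, n):
    """Count every contiguous n-gram in a token list."""
    return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))


def bleu4(candidate, reference):
    """Unsmoothed sentence-level BLEU-4 against one reference (illustrative only)."""
    # Assumption: lowercase + whitespace tokenization.
    cand = candidate.lower().split()
    ref = reference.lower().split()
    if not cand:
        return 0.0

    log_precisions = []
    for n in range(1, 5):
        cand_counts = ngram_counts(cand, n)
        ref_counts = ngram_counts(ref, n)
        total = sum(cand_counts.values())
        # Clip each candidate n-gram count by its count in the reference.
        clipped = sum(min(c, ref_counts[g]) for g, c in cand_counts.items())
        if total == 0 or clipped == 0:
            # Without smoothing, any zero precision makes the score zero.
            return 0.0
        log_precisions.append(math.log(clipped / total))

    # Brevity penalty: penalize candidates shorter than the reference.
    if len(cand) > len(ref):
        bp = 1.0
    else:
        bp = math.exp(1 - len(ref) / len(cand))

    # Geometric mean of the four modified n-gram precisions.
    return bp * math.exp(sum(log_precisions) / 4)


if __name__ == "__main__":
    print(bleu4("accuracy versus number of training epochs",
                "accuracy versus number of training epochs"))
```

Because the score is a geometric mean of 1-gram to 4-gram precisions, a caption that shares no 4-gram with its reference scores zero without smoothing, which is one reason BLEU-4 values for free-form captions tend to be small.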