5 months ago

UniChart: A Universal Vision-language Pretrained Model for Chart Comprehension and Reasoning

Ahmed Masry; Parsa Kavehzadeh; Xuan Long Do; Enamul Hoque; Shafiq Joty

Abstract

Charts are very popular for analyzing data, visualizing key insights, and answering complex reasoning questions about data. To facilitate chart-based data analysis using natural language, several downstream tasks have been introduced recently, such as chart question answering and chart summarization. However, most of the methods that solve these tasks use pretraining on language or vision-language tasks that do not attempt to explicitly model the structure of the charts (e.g., how data is visually encoded and how chart elements are related to each other). To address this, we first build a large corpus of charts covering a wide variety of topics and visual styles. We then present UniChart, a pretrained model for chart comprehension and reasoning. UniChart encodes the relevant text, data, and visual elements of charts and then uses a chart-grounded text decoder to generate the expected output in natural language. We propose several chart-specific pretraining tasks that include: (i) low-level tasks to extract the visual elements (e.g., bars, lines) and data from charts, and (ii) high-level tasks to acquire chart understanding and reasoning skills. We find that pretraining the model on a large corpus with chart-specific low- and high-level tasks, followed by finetuning, results in state-of-the-art performance on three downstream tasks.

Code Repositories

vis-nlp/unichart (Official, PyTorch, mentioned in GitHub)

Benchmarks

Benchmark | Methodology | Metrics
chart-question-answering-on-chartqa | UniChart | 1:1 Accuracy: 66.24
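
The page does not define how this accuracy is computed. ChartQA results are commonly reported as relaxed accuracy, where a numeric answer counts as correct if it falls within a small relative tolerance of the gold value and a textual answer must match exactly. The sketch below illustrates that scoring rule under stated assumptions: the 5% tolerance, the case-insensitive text comparison, and the function names `relaxed_match` and `relaxed_accuracy` are illustrative choices, not taken from this page or from the official repository.

```python
def relaxed_match(prediction: str, target: str, tolerance: float = 0.05) -> bool:
    """Return True if the prediction matches the target under a relaxed rule.

    Assumption: numeric answers may deviate by up to `tolerance` (relative
    error); all other answers are compared case-insensitively after stripping.
    """

    def to_float(text: str):
        # Accept plain numbers and percentages such as "12.5%".
        try:
            return float(text.strip().rstrip("%"))
        except ValueError:
            return None

    pred_num = to_float(prediction)
    target_num = to_float(target)

    if pred_num is not None and target_num is not None:
        if target_num == 0:
            # Relative error is undefined for a zero target; require equality.
            return pred_num == 0
        return abs(pred_num - target_num) / abs(target_num) <= tolerance

    return prediction.strip().lower() == target.strip().lower()


def relaxed_accuracy(predictions, targets, tolerance: float = 0.05) -> float:
    """Percentage of predictions that match their targets under the relaxed rule."""
    if len(predictions) != len(targets):
        raise ValueError("predictions and targets must have the same length")
    if not targets:
        return 0.0
    matches = sum(
        relaxed_match(p, t, tolerance) for p, t in zip(predictions, targets)
    )
    return 100.0 * matches / len(targets)
```

Under this rule a prediction of "10.4" against a gold answer of "10" counts as correct (4% relative error), while "10.6" does not.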
