FigureQA Visual Reasoning Dataset

Updated 29 October 2025
  • FigureQA is a synthetic visual reasoning corpus composed of annotated, scientific-style figures, designed to test comprehension of data visualizations.
  • It contains over one million binary yes/no QA pairs grounded in more than 100,000 images, including bar graphs, line plots, and pie charts.
  • Baseline models such as CNN-LSTM and Relation Network architectures fall well short of human accuracy, highlighting the difficulty of the statistical and relational reasoning involved.

The FigureQA dataset is an annotated visual reasoning corpus designed to evaluate a machine's ability to comprehend and reason about data visualizations. It focuses on synthetic, scientific-style figures such as line plots, bar graphs, and pie charts, structured around binary question-answer (QA) tasks. The dataset poses a significant challenge for machine learning systems by requiring them not only to recognize elements within figures but also to deduce relationships and properties such as maxima, minima, and intersections through visual reasoning.

1. Dataset Composition and Structure

FigureQA comprises over one million QA pairs grounded in more than 100,000 images. The images fall into five categories—vertical bar graphs, horizontal bar graphs, line graphs, dot-line graphs, and pie charts—and are designed to simulate common scientific figures. Each image is accompanied by questions generated from 15 distinct templates, each yielding a binary yes/no answer. This setup requires a model to derive answers by analyzing the spatial and visual information in the figure itself.
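
As a concrete illustration, the sketch below iterates over a FigureQA-style question file and tallies the binary answers. The file path and field names (`qa_pairs`, `question_string`, `answer`, `image_index`) are assumptions about the released JSON layout rather than a documented API, so they may need adjusting to the actual download.

```python
import json
from collections import Counter

# Minimal sketch of iterating over FigureQA-style QA pairs.
# File path and field names are assumptions about the JSON layout;
# adjust them to match the actual release.
with open("figureqa/train1/qa_pairs.json") as f:
    qa_pairs = json.load(f)["qa_pairs"]

answers = Counter(p["answer"] for p in qa_pairs)   # binary: 0 (no) / 1 (yes)
images = {p["image_index"] for p in qa_pairs}      # >100k figures overall

print(f"{len(qa_pairs)} questions over {len(images)} images, answers: {answers}")
```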

2. Question and Visual Complexity

Questions within FigureQA demand reasoning across entire plots, probing attributes such as maximum, minimum, area under the curve, smoothness, and intersection. Models must spatially integrate information distributed throughout the image to extract abstract relationships, mirroring the kind of reading a human analyst performs when interpreting a chart.
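
To make these attributes concrete, the following sketch expresses a few of the queried properties (area under the curve, intersection, global maximum) as operations on hypothetical underlying (x, y) series. These helper functions are illustrative only; the dataset itself exposes rendered figures and yes/no labels, not such utilities.

```python
import numpy as np

# Illustrative only: relational properties FigureQA asks about, expressed
# as operations on hypothetical underlying (x, y) series.
def auc(y, x):
    """Trapezoidal area under a sampled curve."""
    y, x = np.asarray(y, float), np.asarray(x, float)
    return float(np.sum((y[1:] + y[:-1]) * np.diff(x) / 2.0))

def curves_intersect(y_a, y_b):
    """Do two curves sampled on the same x grid touch or cross?"""
    diff = np.asarray(y_a, float) - np.asarray(y_b, float)
    return bool(np.any(diff == 0) or np.any(np.sign(diff[:-1]) != np.sign(diff[1:])))

def has_global_maximum(y_candidate, all_series):
    """Does this series reach the highest value anywhere in the plot?"""
    return max(y_candidate) >= max(max(s) for s in all_series)

x = [0, 1, 2, 3]
a, b = [1, 2, 3, 4], [2, 2, 2, 2]
print(auc(a, x) > auc(b, x), curves_intersect(a, b), has_global_maximum(a, [a, b]))
```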

3. Reasoning and Statistical Requirements

The underlying tasks embedded within FigureQA necessitate the integration of relational reasoning and statistical analysis. For instance, questions evaluating the "smoothest" graph require computing a metric based on differences between consecutive data points. Consequently, solving these questions entails not only detecting elements but also computing and comparing values extracted from the visuals themselves.
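
The dataset's exact smoothness definition is not reproduced here; as an illustrative assumption, one plausible roughness measure is the mean absolute second difference of a curve's y-values, so the "smoothest" curve is the one that minimizes it:

```python
import numpy as np

# One plausible roughness measure (an assumption, not necessarily the
# dataset's exact definition): the mean absolute second difference of the
# y-values, i.e. how much the local slope changes between consecutive points.
def roughness(y):
    return float(np.mean(np.abs(np.diff(np.asarray(y, float), n=2))))

def smoothest(series):
    """Index of the curve minimizing the roughness measure."""
    return int(np.argmin([roughness(y) for y in series]))

curves = [[1, 2, 3, 4, 5],            # straight line: zero second difference
          [1, 4, 2, 5, 1],            # jagged
          [2.0, 2.1, 2.3, 2.2, 2.4]]  # mildly wiggly
print(smoothest(curves))              # -> 0
```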

4. Auxiliary and Training Data

The dataset is supplemented by numerical data used to generate each visualization and bounding-box annotations for the elements within the figures, such as data points and axes. This auxiliary information supports the development of additional machine learning tasks, like data reconstruction, and aids in supervised learning applications by providing necessary ground-truth annotations.
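
A minimal sketch of using these annotations is shown below, overlaying bounding boxes on one rendered figure. The file names and nested keys (`models`, `bbox`, `x`, `y`, `w`, `h`) are guesses at the annotation layout and should be verified against the actual release.

```python
import json
from PIL import Image, ImageDraw

# Hedged sketch: draw ground-truth bounding boxes on a single figure.
# File names and nested keys are assumptions about the annotation layout.
with open("figureqa/train1/annotations.json") as f:
    annotations = json.load(f)

ann = annotations[0]                                   # first figure's metadata
img = Image.open("figureqa/train1/png/0.png").convert("RGB")
draw = ImageDraw.Draw(img)

for element in ann.get("models", []):                  # plotted series/elements
    box = element.get("bbox")
    if box:
        x, y, w, h = box["x"], box["y"], box["w"], box["h"]
        draw.rectangle([x, y, x + w, y + h], outline="red")

img.save("figure_with_boxes.png")
```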

5. Model Evaluations and Benchmarks

Several baseline models, including various configurations of CNN-LSTM and Relation Network architectures, have been evaluated on FigureQA. The results show a substantial gap between model and human performance: humans achieve approximately 91% accuracy, while the best baseline, a Relation Network, reaches about 72%. These results highlight both the challenge posed by FigureQA and the headroom remaining for future models.
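
For orientation, here is a minimal PyTorch sketch of a CNN-LSTM style baseline for this task, with illustrative layer sizes rather than the paper's exact configuration: a small CNN encodes the image, an LSTM over word embeddings encodes the question, and the concatenated codes feed a binary yes/no classifier.

```python
import torch
import torch.nn as nn

# Minimal CNN+LSTM baseline sketch (hyperparameters are illustrative,
# not the paper's exact configuration).
class CNNLSTMBaseline(nn.Module):
    def __init__(self, vocab_size, embed_dim=32, hidden_dim=256):
        super().__init__()
        self.cnn = nn.Sequential(
            nn.Conv2d(3, 64, 3, stride=2, padding=1), nn.ReLU(),
            nn.Conv2d(64, 64, 3, stride=2, padding=1), nn.ReLU(),
            nn.AdaptiveAvgPool2d(1), nn.Flatten(),     # -> (batch, 64)
        )
        self.embed = nn.Embedding(vocab_size, embed_dim)
        self.lstm = nn.LSTM(embed_dim, hidden_dim, batch_first=True)
        self.classifier = nn.Sequential(
            nn.Linear(64 + hidden_dim, 256), nn.ReLU(),
            nn.Linear(256, 1),                         # logit for "yes"
        )

    def forward(self, image, question_tokens):
        img_code = self.cnn(image)                     # image features
        _, (h_n, _) = self.lstm(self.embed(question_tokens))
        q_code = h_n[-1]                               # final question state
        return self.classifier(torch.cat([img_code, q_code], dim=1))

model = CNNLSTMBaseline(vocab_size=100)
logit = model(torch.randn(2, 3, 128, 128), torch.randint(0, 100, (2, 12)))
print(logit.shape)  # torch.Size([2, 1])
```

A Relation Network baseline would instead reason over pairs of CNN feature-map cells conditioned on the question code, rather than concatenating a single pooled image vector with the question encoding.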

6. Implications for Machine Learning

FigureQA establishes a benchmark for developing models that perform visual reasoning beyond basic object recognition, offering a synthetic but robust environment for evaluating the ability to understand and reason about complex data visualizations. Because the figures are generated synthetically, the dataset offers control and extensibility, allowing precise task formulation without the confounds present in real-world images. Its focus on relational and statistical reasoning makes it a useful step toward models that can read and interpret visual data in a manner closer to human analysis, and a foundation for advancing visual question answering and reasoning research.
