Scatteract: Automated extraction of data from scatter plots (1704.06687v1)

Published 21 Apr 2017 in cs.CV, cs.IR, and stat.ML

Abstract: Charts are an excellent way to convey patterns and trends in data, but they do not facilitate further modeling of the data or close inspection of individual data points. We present a fully automated system for extracting the numerical values of data points from images of scatter plots. We use deep learning techniques to identify the key components of the chart, and optical character recognition together with robust regression to map from pixels to the coordinate system of the chart. We focus on scatter plots with linear scales, which already have several interesting challenges. Previous work has done fully automatic extraction for other types of charts, but to our knowledge this is the first approach that is fully automatic for scatter plots. Our method performs well, achieving successful data extraction on 89% of the plots in our test set.
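The pixel-to-coordinate step mentioned in the abstract can be illustrated with a small sketch. It assumes tick positions (in pixels) and their OCR-read labels are already available for one axis, and it uses a Theil-Sen estimator (median of pairwise slopes) as a stand-in robust regression; the abstract does not say which robust estimator the authors use, so this is not their method. The function names and tick data below are hypothetical.

```python
from itertools import combinations
from statistics import median


def fit_axis_map(pixels, values):
    """Fit value = slope * pixel + intercept with a Theil-Sen estimator.

    Taking the median over all pairwise slopes keeps a few badly
    OCR-read tick labels from skewing the fit.
    """
    slopes = [
        (v2 - v1) / (p2 - p1)
        for (p1, v1), (p2, v2) in combinations(zip(pixels, values), 2)
        if p2 != p1
    ]
    if not slopes:
        raise ValueError("need at least two ticks at distinct pixel positions")
    slope = median(slopes)
    intercept = median(v - slope * p for p, v in zip(pixels, values))
    return slope, intercept


def pixel_to_value(pixel, slope, intercept):
    """Map a pixel coordinate on the axis to a chart coordinate (linear scale)."""
    return slope * pixel + intercept


# Hypothetical x-axis ticks: pixel positions and the labels OCR returned.
# The fourth label simulates an OCR misread (80 instead of 30).
tick_pixels = [100, 200, 300, 400, 500]
tick_values = [0.0, 10.0, 20.0, 80.0, 40.0]

slope, intercept = fit_axis_map(tick_pixels, tick_values)
x_value = pixel_to_value(250, slope, intercept)
```

With these made-up ticks the fit recovers a slope of about 0.1 chart units per pixel and an intercept of about -10 despite the misread label, so a point detected at pixel 250 maps to roughly 15. An ordinary least-squares fit on the same data would be pulled toward the outlier.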

Authors (4)
  1. Mathieu Cliche (7 papers)
  2. David Rosenberg (12 papers)
  3. Dhruv Madeka (16 papers)
  4. Connie Yee (1 paper)
Citations (66)
