
GenPlot: Increasing the Scale and Diversity of Chart Derendering Data (2306.11699v1)

Published 20 Jun 2023 in cs.CV

Abstract: Vertical bar, horizontal bar, dot, scatter, and line plots provide a diverse set of visualizations to represent data. To understand these plots, one must be able to recognize textual components, locate data points in a plot, and process diverse visual contexts to extract information. In recent works such as Pix2Struct, Matcha, and Deplot, OCR-free chart-to-text translation has achieved state-of-the-art results on visual language tasks. These results highlight the importance of chart derendering as a pre-training objective, yet existing datasets provide only a fixed set of training examples. In this paper, we propose GenPlot, a plot generator that can generate billions of additional plots for chart derendering using synthetic data.
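
To make the idea of synthetic chart-derendering data concrete, the following is a minimal sketch of how one training pair might be produced. It is not GenPlot's implementation: the function name `make_example`, the value ranges, and the `x | y` target format are all assumptions chosen for illustration. Only the five chart types come from the abstract. Rendering the spec to an image (for example with a plotting library) is left out so the sketch stays dependency-free.

```python
import random

# The five plot types named in the abstract.
CHART_TYPES = ["vertical_bar", "horizontal_bar", "dot", "scatter", "line"]


def make_example(seed):
    """Return one synthetic (chart spec, target text) pair.

    Hypothetical helper: the spec describes what would be drawn, and the
    target text is the table a derendering model would be asked to recover.
    """
    rng = random.Random(seed)  # seeded, so every example is reproducible
    chart_type = rng.choice(CHART_TYPES)
    n_points = rng.randint(3, 8)  # assumed range, not from the paper
    xs = list(range(n_points))
    ys = [round(rng.uniform(0.0, 100.0), 1) for _ in range(n_points)]

    # Assumed target format: one "x | y" row per data point.
    target = "\n".join(f"{x} | {y}" for x, y in zip(xs, ys))
    spec = {"chart_type": chart_type, "x": xs, "y": ys}
    return spec, target


spec, target = make_example(seed=0)
print(spec["chart_type"])
print(target)
```

Because each example is a pure function of its seed, the number of distinct examples is limited only by the seed space and the sampling ranges, which is the property that lets a generator go beyond a fixed dataset.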

References (17)
  1. VQA: Visual question answering.
  2. Reading and reasoning over chart images for evidence-based automated fact-checking. In Findings of the Association for Computational Linguistics: EACL 2023, pages 399–414, Dubrovnik, Croatia. Association for Computational Linguistics.
  3. Benetech - making graphs accessible.
  4. PaLI: A jointly-scaled multilingual language-image model.
  5. Array programming with NumPy. Nature, 585(7825):357–362.
  6. J. D. Hunter. 2007. Matplotlib: A 2D graphics environment. Computing in Science & Engineering, 9(3):90–95.
  7. DVQA: Understanding data visualizations via question answering.
  8. FigureQA: An annotated figure dataset for visual reasoning.
  9. PreSTU: Pre-training for scene-text understanding.
  10. OCR-free document understanding transformer.
  11. Pix2Struct: Screenshot parsing as pretraining for visual language understanding.
  12. DePlot: One-shot visual language reasoning by plot-to-table translation.
  13. MatCha: Enhancing visual language pretraining with math reasoning and chart derendering.
  14. ChartQA: A benchmark for question answering about charts with visual and logical reasoning.
  15. PlotQA: Reasoning over scientific plots.
  16. GloVe: Global vectors for word representation. In Proceedings of the 2014 Conference on Empirical Methods in Natural Language Processing (EMNLP), pages 1532–1543, Doha, Qatar. Association for Computational Linguistics.
  17. LayoutLM: Pre-training of text and layout for document image understanding. In Proceedings of the 26th ACM SIGKDD International Conference on Knowledge Discovery & Data Mining. ACM.
Authors (1)
  1. Brendan Artley (2 papers)
Citations (1)
