Do You See What I See? A Qualitative Study Eliciting High-Level Visualization Comprehension (2402.15605v1)
Abstract: Designers often create visualizations to achieve specific high-level analytical or communication goals. These goals require people to naturally extract complex, contextualized, and interconnected patterns in data. While limited prior work has studied general high-level interpretation, prevailing perceptual studies of visualization effectiveness primarily focus on isolated, predefined, low-level tasks, such as estimating statistical quantities. This study more holistically explores visualization interpretation to examine the alignment between designers' communicative goals and what their audience sees in a visualization, which we refer to as their comprehension. We found that statistics people effectively estimate from visualizations in classical graphical perception studies may differ from the patterns people intuitively comprehend in a visualization. We conducted a qualitative study on three types of visualizations -- line graphs, bar graphs, and scatterplots -- to investigate the high-level patterns people naturally draw from a visualization. Participants described a series of graphs using natural language and think-aloud protocols. We found that comprehension varies with a range of factors, including graph complexity and data distribution. Specifically, 1) a visualization's stated objective often does not align with people's comprehension, 2) results from traditional experiments may not predict the knowledge people build with a graph, and 3) chart type alone is insufficient to predict the information people extract from a graph. Our study confirms the importance of defining visualization effectiveness from multiple perspectives to assess and inform visualization practices.
Authors: Ghulam Jilani Quadri, Arran Zeyu Wang, Zhehao Wang, Jennifer Adorno, Paul Rosen, Danielle Albers Szafir