Explainability in JupyterLab and Beyond: Interactive XAI Systems for Integrated and Collaborative Workflows (2404.02081v1)
Abstract: Explainable AI (XAI) tools represent a turn toward more human-centered and human-in-the-loop AI approaches that emphasize user needs and perspectives in machine learning model development workflows. However, while the majority of ML resources available today are developed for Python computational environments such as JupyterLab and Jupyter Notebook, the same has not been true of interactive XAI systems, which are often still implemented as standalone interfaces. In this paper, we address this mismatch by identifying three design patterns for embedding front-end XAI interfaces into Jupyter, namely: (1) one-way communication from Python to JavaScript, (2) two-way data synchronization, and (3) bi-directional callbacks. We also provide an open-source toolkit, bonXAI, that demonstrates how each design pattern might be used to build interactive XAI tools for a PyTorch text classification workflow. Finally, we conclude with a discussion of best practices and open questions. Our aim in this paper is to discuss how interactive XAI tools might be developed for computational notebooks, and how they can better integrate into existing model development workflows to support more collaborative, human-centered AI.
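To make the three patterns concrete, the sketch below shows how a notebook-embedded explanation view might be built with the anywidget and traitlets libraries (both referenced below). This is a minimal illustration, not the bonXAI API: the `TokenSaliencyWidget` class, its trait names, and the example attribution scores are all hypothetical. Traits tagged `sync=True` realize two-way data synchronization (pattern 2), while a JavaScript click handler writing to a synced trait, paired with a Python `observe` handler reacting to it, realizes bi-directional callbacks (pattern 3).

```python
# Minimal sketch of patterns 2 and 3 using anywidget + traitlets.
# The widget class, trait names, and scores are illustrative, not bonXAI.
import anywidget
import traitlets

class TokenSaliencyWidget(anywidget.AnyWidget):
    # JavaScript front end, shipped to the browser as an ES module string.
    _esm = """
    function render({ model, el }) {
      function draw() {
        el.innerHTML = "";
        const tokens = model.get("tokens");
        const scores = model.get("saliency");
        tokens.forEach((tok, i) => {
          const span = document.createElement("span");
          span.textContent = tok + " ";
          span.style.background = `rgba(255, 0, 0, ${scores[i] ?? 0})`;
          // Pattern 3: a click in the browser writes to the synced trait,
          // which triggers the Python-side observer defined below.
          span.addEventListener("click", () => {
            model.set("selected_index", i);
            model.save_changes();
          });
          el.appendChild(span);
        });
      }
      // Pattern 2: re-render whenever the kernel pushes new values.
      model.on("change:tokens", draw);
      model.on("change:saliency", draw);
      draw();
    }
    export default { render };
    """
    # Pattern 2: traits tagged sync=True are mirrored between the
    # Python kernel and the JavaScript model.
    tokens = traitlets.List(traitlets.Unicode()).tag(sync=True)
    saliency = traitlets.List(traitlets.Float()).tag(sync=True)
    selected_index = traitlets.Int(-1).tag(sync=True)

widget = TokenSaliencyWidget(
    tokens=["the", "movie", "was", "wonderful"],
    saliency=[0.05, 0.20, 0.10, 0.90],  # placeholder attribution scores
)

def on_token_selected(change):
    # Pattern 3 (Python side): react to a selection made in JavaScript,
    # e.g., recompute attributions for the clicked token.
    print(f"Token {change['new']} selected: {widget.tokens[change['new']]}")

widget.observe(on_token_selected, names="selected_index")
widget  # displaying the widget renders the front end in the notebook
```

Pattern 1, one-way communication from Python to JavaScript, falls out of the same machinery: assigning `widget.saliency = new_scores` in a later cell pushes fresh data to the front end, which re-renders without sending anything back to the kernel.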
- NEVA: Visual analytics to identify fraudulent networks. In Computer Graphics Forum, Vol. 39. Wiley Online Library, 344–359.
- TensorFlow: Large-Scale Machine Learning on Heterogeneous Systems. https://www.tensorflow.org/ Software available from tensorflow.org.
- Mark S Ackerman. 2000. The intellectual challenge of CSCW: the gap between social requirements and technical feasibility. Human–Computer Interaction 15, 2-3 (2000), 179–203.
- Amina Adadi and Mohammed Berrada. 2018. Peeking inside the black-box: a survey on explainable artificial intelligence (XAI). IEEE Access 6 (2018), 52138–52160.
- Tribe or not? Critical inspection of group differences using TribalGram. ACM Transactions on Interactive Intelligent Systems (TiiS) 12, 1 (2022), 1–34.
- J Alammar. 2021. Ecco: An Open Source Library for the Explainability of Transformer Language Models. In Proceedings of the 59th Annual Meeting of the Association for Computational Linguistics and the 11th International Joint Conference on Natural Language Processing: System Demonstrations. Association for Computational Linguistics, 249–257.
- Explainable Artificial Intelligence (XAI): Concepts, taxonomies, opportunities and challenges toward responsible AI. Information Fusion 58 (2020), 82–115.
- One explanation does not fit all: A toolkit and taxonomy of AI explainability techniques. arXiv preprint arXiv:1909.03012 (2019).
- CausalNex. https://github.com/quantumblacklabs/causalnex
- Benjamin B Bederson. 2004. Interfaces for staying in the flow. Ubiquity 5, 27 (2004), 1.
- On Selective, Mutable and Dialogic XAI: a Review of What Users Say about Different Types of Interactive Explanations. In Proceedings of the 2023 CHI Conference on Human Factors in Computing Systems. 1–21.
- Explainable machine learning in deployment. In Proceedings of the 2020 Conference on Fairness, Accountability, and Transparency. 648–657.
- Human-centered tools for coping with imperfect algorithms during medical decision-making. In Proceedings of the 2019 CHI Conference on Human Factors in Computing Systems. 1–14.
- "Hello AI": Uncovering the onboarding needs of medical practitioners for human-AI collaborative decision-making. Proceedings of the ACM on Human-Computer Interaction 3, CSCW (2019), 1–24.
- François Chollet et al. 2015. Keras. https://keras.io.
- Deep reinforcement learning from human preferences. Advances in neural information processing systems 30 (2017).
- Intelligent tutoring systems. In Handbook of Human-Computer Interaction. Elsevier, 849–874.
- Who needs to know what, when?: Broadening the Explainable AI (XAI) Design Space by Looking at Explanations Across the AI Lifecycle. In Designing Interactive Systems Conference 2021. 1591–1602.
- Finale Doshi-Velez and Been Kim. 2017. Towards a rigorous science of interpretable machine learning. arXiv preprint arXiv:1702.08608 (2017).
- Expanding explainability: Towards social transparency in AI systems. In Proceedings of the 2021 CHI Conference on Human Factors in Computing Systems. 1–19.
- The who in explainable AI: How AI background shapes perceptions of AI explanations. arXiv preprint arXiv:2107.13509 (2021).
- Upol Ehsan and Mark O Riedl. 2020. Human-centered explainable ai: Towards a reflective sociotechnical approach. In HCI International 2020-Late Breaking Papers: Multimodality and Intelligence: 22nd HCI International Conference, HCII 2020, Copenhagen, Denmark, July 19–24, 2020, Proceedings 22. Springer, 449–466.
- Charting the Sociotechnical Gap in Explainable AI: A Framework to Address the Gap in XAI. Proceedings of the ACM on Human-Computer Interaction 7, CSCW1 (2023), 1–32.
- The algorithmic imprint. In Proceedings of the 2022 ACM Conference on Fairness, Accountability, and Transparency. 1305–1317.
- Operationalizing human-centered perspectives in explainable AI. In Extended Abstracts of the 2021 CHI Conference on Human Factors in Computing Systems. 1–6.
- Human-Centered Explainable AI (HCXAI): beyond opening the black-box of AI. In CHI Conference on Human Factors in Computing Systems Extended Abstracts. 1–7.
- eli5 community. 2021. Welcome to ELI5’s documentation! https://eli5.readthedocs.io/en/latest/ Accessed Feb 20, 2024.
- Hugging Face. [n. d.]. Text classification. https://huggingface.co/docs/transformers/en/tasks/sequence_classification Accessed Feb 20, 2024.
- Explainable artificial intelligence for education and training. The Journal of Defense Modeling and Simulation 19, 2 (2022), 133–144.
- PyTorch library for CAM methods. https://github.com/jacobgil/pytorch-grad-cam.
- VAINE: Visualization and AI for natural experiments. In 2021 IEEE Visualization Conference (VIS). IEEE, 21–25.
- Causalvis: Visualizations for Causal Inference. In Proceedings of the 2023 CHI Conference on Human Factors in Computing Systems. 1–20.
- Alexa Hagerty and Igor Rubinov. 2019. Global AI ethics: a review of the social impacts and ethical implications of artificial intelligence. arXiv preprint arXiv:1907.07892 (2019).
- Gillian R Hayes. 2011. The relationship of action research to human-computer interaction. ACM Transactions on Computer-Human Interaction (TOCHI) 18, 3 (2011), 1–20.
- Human factors in model interpretability: Industry practices, challenges, and needs. Proceedings of the ACM on Human-Computer Interaction 4, CSCW1 (2020), 1–26.
- J. D. Hunter. 2007. Matplotlib: A 2D graphics environment. Computing in Science & Engineering 9, 3 (2007), 90–95. https://doi.org/10.1109/MCSE.2007.55
- Plotly Technologies Inc. 2015. Collaborative data science. Montreal, QC. https://plot.ly
- Kaggle. 2022. State of Data Science and Machine Learning 2022. https://www.kaggle.com/kaggle-survey-2022 Accessed Feb 20, 2024.
- ToonNote: Improving communication in computational notebooks using interactive data comics. In Proceedings of the 2021 CHI Conference on Human Factors in Computing Systems. 1–14.
- The story in the notebook: Exploratory data science using a literate programming tool. In Proceedings of the 2018 CHI Conference on Human Factors in Computing Systems. 1–11.
- mage: Fluid moves between code and graphical work in computational notebooks. In Proceedings of the 33rd Annual ACM Symposium on User Interface Software and Technology. 140–151.
- Explainable artificial intelligence in education. Computers and Education: Artificial Intelligence 3 (2022), 100074.
- "Help Me Help the AI": Understanding How Explainability Can Support Human-AI Interaction. In Proceedings of the 2023 CHI Conference on Human Factors in Computing Systems. 1–17.
- Jupyter Notebooks - a publishing format for reproducible computational workflows. ELPUB 2016 (2016), 87–90.
- W Bradley Knox and Peter Stone. 2008. TAMER: Training an agent manually via evaluative reinforcement. In 2008 7th IEEE International Conference on Development and Learning. IEEE, 292–297.
- Donald Ervin Knuth. 1984. Literate programming. The computer journal 27, 2 (1984), 97–111.
- Captum: A unified and generic model interpretability library for PyTorch. arXiv:2009.07896 [cs.LG]
- Bot-Detective: An explainable Twitter bot detection service with crowdsourcing functionalities. In Proceedings of the 12th International Conference on Management of Digital EcoSystems. 55–63.
- Towards reliable interactive data cleaning: A user survey and recommendations. In Proceedings of the Workshop on Human-In-the-Loop Data Analytics. 1–5.
- ActiveClean: Interactive data cleaning for statistical modeling. Proceedings of the VLDB Endowment 9, 12 (2016), 948–959.
- Illustrating Reinforcement Learning from Human Feedback (RLHF). Hugging Face Blog (2022). https://huggingface.co/blog/rlhf.
- What do we want from Explainable Artificial Intelligence (XAI)? A stakeholder perspective on XAI and a conceptual model guiding interdisciplinary XAI research. Artificial Intelligence 296 (2021), 103473.
- EVA: Visual analytics to identify fraudulent events. IEEE Transactions on Visualization and Computer Graphics 24, 1 (2017), 330–339.
- Notable: On-the-fly Assistant for Data Storytelling in Computational Notebooks. In Proceedings of the 2023 CHI Conference on Human Factors in Computing Systems. 1–16.
- Questioning the AI: informing design practices for explainable AI user experiences. In Proceedings of the 2020 CHI Conference on Human Factors in Computing Systems. 1–15.
- Q Vera Liao and Kush R Varshney. 2021. Human-centered explainable AI (XAI): From algorithms to user experiences. arXiv preprint arXiv:2110.10790 (2021).
- Scott M Lundberg and Su-In Lee. 2017. A Unified Approach to Interpreting Model Predictions. In Advances in Neural Information Processing Systems 30, I. Guyon, U. V. Luxburg, S. Bengio, H. Wallach, R. Fergus, S. Vishwanathan, and R. Garnett (Eds.). Curran Associates, Inc., 4765–4774. http://papers.nips.cc/paper/7062-a-unified-approach-to-interpreting-model-predictions.pdf
- Interactive learning from policy-dependent human feedback. In International Conference on Machine Learning. PMLR, 2285–2294.
- Trevor Manz. [n. d.]. anywidget. https://github.com/manzt/anywidget
- The cost of interrupted work: more speed and stress. In Proceedings of the SIGCHI Conference on Human Factors in Computing Systems. 107–110.
- Action design research and visualization design. In Proceedings of the Sixth Workshop on Beyond Time and Errors on Novel Evaluation Methods for Visualization. 10–18.
- UMAP: Uniform Manifold Approximation and Projection. The Journal of Open Source Software 3, 29 (2018), 861.
- Design activity framework for visualization design. IEEE Transactions on Visualization and Computer Graphics 20, 12 (2014), 2191–2200.
- Tim Miller. 2019. Explanation in artificial intelligence: Insights from the social sciences. Artificial Intelligence 267 (2019), 1–38.
- InterpretML: A Unified Framework for Machine Learning Interpretability. arXiv preprint arXiv:1909.09223 (2019).
- PyTorch: An Imperative Style, High-Performance Deep Learning Library. In Advances in Neural Information Processing Systems 32, H. Wallach, H. Larochelle, A. Beygelzimer, F. d’Alché Buc, E. Fox, and R. Garnett (Eds.). Curran Associates, Inc., 8024–8035. http://papers.neurips.cc/paper/9015-pytorch-an-imperative-style-high-performance-deep-learning-library.pdf
- Scikit-learn: Machine Learning in Python. Journal of Machine Learning Research 12 (2011), 2825–2830.
- The science of interaction. Information Visualization 8, 4 (2009), 263–274.
- Lodestar: Supporting rapid prototyping of data science workflows through data-driven analysis recommendations. Information Visualization 23, 1 (2024), 21–39.
- ”Why Should I Trust You?”: Explaining the Predictions of Any Classifier. In Proceedings of the 22nd ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, San Francisco, CA, USA, August 13-17, 2016. 1135–1144.
- A survey on data collection for machine learning: a big data-AI integration perspective. IEEE Transactions on Knowledge and Data Engineering 33, 4 (2019), 1328–1347.
- Evaluating the interpretability of generative models by interactive reconstruction. In Proceedings of the 2021 CHI Conference on Human Factors in Computing Systems. 1–15.
- Machine Learning in Cardiovascular Imaging: A Scoping Review of Published Literature. Current Radiology Reports 11, 2 (Feb. 2023), 34–45. https://doi.org/10.1007/s40134-022-00407-8
- Making scientific computations reproducible. Computing in Science & Engineering 2, 6 (2000), 61–67.
- Design study methodology: Reflections from the trenches and the stacks. IEEE Transactions on Visualization and Computer Graphics 18, 12 (2012), 2431–2440.
- Applications of artificial intelligence in cardiovascular imaging. Nature Reviews Cardiology 18, 8 (Aug. 2021), 600–609. https://doi.org/10.1038/s41569-021-00527-2
- Helen Shen. 2014. Interactive notebooks: Sharing the code. Nature 515, 7525 (2014), 152–152.
- Francesco Sovrano and Fabio Vitali. 2021. From Philosophy to Interfaces: An Explanatory Method and a Tool Inspired by Achinstein’s Theory of Explanation. In 26th International Conference on Intelligent User Interfaces. 81–91.
- A Survey of Human-Centered Evaluations in Human-Centered Machine Learning. In Computer Graphics Forum, Vol. 40. Wiley Online Library, 543–568.
- explAIner: A visual analytics framework for interactive and explainable machine learning. IEEE Transactions on Visualization and Computer Graphics 26, 1 (2019), 1064–1074.
- Design Study "Lite" Methodology: Expediting Design Studies and Enabling the Synergy of Visualization Pedagogy and Social Good. In Proceedings of the 2020 CHI Conference on Human Factors in Computing Systems. 1–13.
- Artificial intelligence in cardiac imaging: where we are and what we want. European Heart Journal 44, 7 (Feb. 2023), 541–543. https://doi.org/10.1093/eurheartj/ehac700
- Altair: Interactive Statistical Visualizations for Python. Journal of Open Source Software 3, 32 (2018), 1057. https://doi.org/10.21105/joss.01057
- Yoland Wadsworth. 1993. What is participatory action research? Action Research Issues Association.
- Slide4N: Creating Presentation Slides from Computational Notebooks with Human-AI Collaboration. In Proceedings of the 2023 CHI Conference on Human Factors in Computing Systems. 1–18.
- Stickyland: Breaking the linear presentation of computational notebooks. In CHI Conference on Human Factors in Computing Systems Extended Abstracts. 1–7.
- NOVA: A practical method for creating notebook-ready visual analytics. arXiv preprint arXiv:2205.03963 (2022).
- Deep TAMER: Interactive agent shaping in high-dimensional state spaces. In Proceedings of the AAAI Conference on Artificial Intelligence, Vol. 32.
- Michael L. Waskom. 2021. seaborn: statistical data visualization. Journal of Open Source Software 6, 60 (2021), 3021. https://doi.org/10.21105/joss.03021
- Ethical and social risks of harm from language models. arXiv preprint arXiv:2112.04359 (2021).
- Fork it: Supporting stateful alternatives in computational notebooks. In Proceedings of the 2021 CHI Conference on Human Factors in Computing Systems. 1–12.
- Steven Euijong Whang and Jae-Gil Lee. 2020. Data collection and quality challenges for deep learning. Proceedings of the VLDB Endowment 13, 12 (2020), 3429–3432.
- Data collection and quality challenges in deep learning: A data-centric AI perspective. The VLDB Journal 32, 4 (2023), 791–813.
- Jupyter widgets community. 2023. Jupyter Widgets. https://ipywidgets.readthedocs.io/en/stable/ Accessed Feb 20, 2024.
- A survey of preference-based reinforcement learning methods. Journal of Machine Learning Research 18, 136 (2017), 1–46.
- B2: Bridging code and interactive visualization in computational notebooks. In Proceedings of the 33rd Annual ACM Symposium on User Interface Software and Technology. 152–165.
- OmniXAI: A Library for Explainable AI. (2022). https://doi.org/10.48550/ARXIV.2206.01612 arXiv:2206.01612
- Telling stories from computational notebooks: AI-assisted presentation slides creation for presenting data science work. In Proceedings of the 2022 CHI Conference on Human Factors in Computing Systems. 1–20.