
XAutoML: A Visual Analytics Tool for Understanding and Validating Automated Machine Learning (2202.11954v3)

Published 24 Feb 2022 in cs.LG, cs.AI, and cs.HC

Abstract: In the last ten years, various automated machine learning (AutoML) systems have been proposed to build end-to-end ML pipelines with minimal human interaction. Even though such automatically synthesized ML pipelines are able to achieve a competitive performance, recent studies have shown that users do not trust models constructed by AutoML due to missing transparency of AutoML systems and missing explanations for the constructed ML pipelines. In a requirements analysis study with 36 domain experts, data scientists, and AutoML researchers from different professions with vastly different expertise in ML, we collect detailed informational needs for AutoML. We propose XAutoML, an interactive visual analytics tool for explaining arbitrary AutoML optimization procedures and ML pipelines constructed by AutoML. XAutoML combines interactive visualizations with established techniques from explainable artificial intelligence (XAI) to make the complete AutoML procedure transparent and explainable. By integrating XAutoML with JupyterLab, experienced users can extend the visual analytics with ad-hoc visualizations based on information extracted from XAutoML. We validate our approach in a user study with the same diverse user group from the requirements analysis. All participants were able to extract useful information from XAutoML, leading to a significantly increased understanding of ML pipelines produced by AutoML and the AutoML optimization itself.

Authors (4)
  1. Marc-André Zöller
  2. Waldemar Titov
  3. Thomas Schlegel
  4. Marco F. Huber