EXMOS: Explanatory Model Steering Through Multifaceted Explanations and Data Configurations (2402.00491v1)

Published 1 Feb 2024 in cs.AI and cs.HC

Abstract: Explanations in interactive machine-learning systems facilitate debugging and improving prediction models. However, the effectiveness of various global model-centric and data-centric explanations in aiding domain experts to detect and resolve potential data issues for model improvement remains unexplored. This research investigates the influence of data-centric and model-centric global explanations in systems that support healthcare experts in optimising models through automated and manual data configurations. We conducted quantitative (n=70) and qualitative (n=30) studies with healthcare experts to explore the impact of different explanations on trust, understandability and model improvement. Our results reveal the insufficiency of global model-centric explanations for guiding users during data configuration. Although data-centric explanations enhanced understanding of post-configuration system changes, a hybrid fusion of both explanation types demonstrated the highest effectiveness. Based on our study results, we also present design implications for effective explanation-driven interactive machine-learning systems.
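To make the distinction studied in the paper concrete, below is a minimal sketch (not the authors' EXMOS implementation) of the two kinds of global explanation the abstract contrasts: permutation feature importance stands in for a global model-centric explanation, and simple training-data summaries stand in for a global data-centric one. The dataset and model here are illustrative assumptions, not the study's actual healthcare data.

```python
# Sketch: global model-centric vs. data-centric explanations.
# Assumptions: a toy scikit-learn dataset/classifier as stand-ins for the
# paper's healthcare prediction model; APIs used are standard scikit-learn.
import pandas as pd
from sklearn.datasets import load_breast_cancer
from sklearn.ensemble import RandomForestClassifier
from sklearn.inspection import permutation_importance

# Train a stand-in prediction model.
data = load_breast_cancer(as_frame=True)
X, y = data.data, data.target
model = RandomForestClassifier(random_state=0).fit(X, y)

# Global model-centric explanation: which features drive the model overall.
result = permutation_importance(model, X, y, n_repeats=5, random_state=0)
importances = pd.Series(result.importances_mean, index=X.columns)
print(importances.sort_values(ascending=False).head())

# Global data-centric explanation: properties of the training data itself
# (per-feature distributions, missingness) surfaced to the domain expert.
print(X.describe().T[["mean", "std", "min", "max"]].head())
print(X.isna().mean().head())  # fraction of missing values per feature
```

In the paper's setting, a healthcare expert would inspect views like these, reconfigure the training data (automatically or manually), and retrain; the study measures how each explanation type, and their combination, supports that loop.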

Authors (5)
  1. Aditya Bhattacharya (12 papers)
  2. Simone Stumpf (16 papers)
  3. Lucija Gosak (5 papers)
  4. Gregor Stiglic (22 papers)
  5. Katrien Verbert (19 papers)
Citations (6)
