Effect of Confidence and Explanation on Accuracy and Trust Calibration in AI-Assisted Decision Making (2001.02114v1)

Published 7 Jan 2020 in cs.AI and cs.HC

Abstract: Today, AI is being increasingly used to help human experts make decisions in high-stakes scenarios. In these scenarios, full automation is often undesirable, not only due to the significance of the outcome, but also because human experts can draw on their domain knowledge complementary to the model's to ensure task success. We refer to these scenarios as AI-assisted decision making, where the individual strengths of the human and the AI come together to optimize the joint decision outcome. A key to their success is to appropriately calibrate human trust in the AI on a case-by-case basis; knowing when to trust or distrust the AI allows the human expert to appropriately apply their knowledge, improving decision outcomes in cases where the model is likely to perform poorly. This research conducts a case study of AI-assisted decision making in which humans and AI have comparable performance alone, and explores whether features that reveal case-specific model information can calibrate trust and improve the joint performance of the human and AI. Specifically, we study the effect of showing confidence score and local explanation for a particular prediction. Through two human experiments, we show that confidence score can help calibrate people's trust in an AI model, but trust calibration alone is not sufficient to improve AI-assisted decision making, which may also depend on whether the human can bring in enough unique knowledge to complement the AI's errors. We also highlight the problems in using local explanation for AI-assisted decision making scenarios and invite the research community to explore new approaches to explainability for calibrating human trust in AI.

AI-Assisted Decision Making: Confidence and Explanation Effects on Trust Calibration

The paper "Effect of Confidence and Explanation on Accuracy and Trust Calibration in AI-Assisted Decision Making" explores crucial aspects of human-AI collaborative decision-making. The authors, Yunfeng Zhang, Q. Vera Liao, and Rachel K. E. Bellamy, investigate how AI confidence scores and local explanations influence trust calibration and the resulting decision accuracy when humans and AI systems collaborate.

Context and Motivation

AI technologies are increasingly implemented in critical decision-making contexts, enhancing human capabilities. However, solely relying on AI is often not feasible due to potential model errors and biases in training data. The interaction of human judgment with AI recommendations aims to capitalize on complementary strengths, optimizing decision accuracy. Thus, calibrating trust—determining when to rely on AI—becomes vital.

Study Design and Hypotheses

The paper explores using confidence scores and local explanations to enhance trust calibration. Two experiments are detailed:

  1. Experiment 1 investigates whether AI confidence scores can help users appropriately calibrate their trust in AI recommendations.
  2. Experiment 2 analyzes the impact of local explanations on trust calibration.

The analysis covers both scenarios in which the AI's prediction is presented directly to users and scenarios in which users must instead decide whether to delegate the decision to the AI outright.
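To ground what participants were shown, the sketch below illustrates how a per-case confidence score can be derived from a classifier's predicted probability and displayed alongside its prediction. The synthetic dataset, logistic-regression model, and output format are illustrative assumptions, not the paper's actual task or model.

```python
# Minimal sketch (not the paper's exact setup): surfacing a per-case
# confidence score next to a model's prediction.
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split

# Synthetic stand-in data; the paper's real task and features differ.
X, y = make_classification(n_samples=2000, n_features=8, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

model = LogisticRegression(max_iter=1000).fit(X_train, y_train)

# Confidence = probability the model assigns to its predicted class.
proba = model.predict_proba(X_test)
pred = proba.argmax(axis=1)
confidence = proba.max(axis=1)

for i in range(3):  # what a participant might see for three cases
    print(f"case {i}: predicted class {pred[i]}, confidence {confidence[i]:.0%}")
```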

Results and Insights

Experiment 1 shows that displaying confidence scores aids trust calibration. Users adjusted their reliance on the AI according to its stated confidence, trusting it more on high-confidence predictions. Notably, this improved trust calibration did not translate into better overall decision accuracy. A likely reason is the alignment of error boundaries between humans and the AI: the cases that were challenging for the AI were equally difficult for the human participants in this task.
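A simple way to probe trust calibration, in the spirit of this analysis, is to compare how often people follow the AI's recommendation on high- versus low-confidence cases. The sketch below uses fabricated placeholder data and an arbitrary 0.8 cutoff purely to show the computation; it does not reproduce the paper's measures or results.

```python
# Illustrative trust-calibration check: does reliance on the AI
# (agreement with its recommendation) rise with its stated confidence?
import numpy as np

# Fabricated placeholder data, not results from the paper.
ai_confidence = np.array([0.95, 0.60, 0.88, 0.55, 0.72, 0.97, 0.58, 0.91])
human_agreed  = np.array([1,    0,    1,    1,    0,    1,    0,    1])  # 1 = followed AI

high = ai_confidence >= 0.8              # illustrative cutoff for "high confidence"
agree_high = human_agreed[high].mean()
agree_low  = human_agreed[~high].mean()

print(f"agreement on high-confidence cases: {agree_high:.0%}")
print(f"agreement on low-confidence cases:  {agree_low:.0%}")
# Well-calibrated trust would show markedly higher agreement on
# high-confidence cases than on low-confidence ones.
```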

Experiment 2 finds local explanations ineffective in improving trust calibration or decision accuracy. Contrary to some prior findings, the local explanations did not significantly enhance users' ability to discern when to trust AI predictions. This highlights limitations in the context of real-time decision-making, where explanation overload or lack of clarity may hinder trust calibration.
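For context, the sketch below shows one common style of local explanation: signed per-feature contributions to a single prediction from a linear model. It is an illustrative stand-in for the general idea and is not claimed to be the exact explanation technique evaluated in the paper.

```python
# Minimal sketch of a per-case (local) explanation as signed feature
# contributions from a linear model. Illustrative only.
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression

X, y = make_classification(n_samples=2000, n_features=5, random_state=0)
model = LogisticRegression(max_iter=1000).fit(X, y)

case = X[0]                                    # one case to explain
contributions = model.coef_[0] * case          # signed contribution per feature
feature_names = [f"feature_{j}" for j in range(len(case))]  # hypothetical names

# Rank features by how strongly they push the prediction either way.
for name, c in sorted(zip(feature_names, contributions),
                      key=lambda t: abs(t[1]), reverse=True):
    direction = "toward positive class" if c > 0 else "toward negative class"
    print(f"{name}: {c:+.2f} ({direction})")
```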

Implications and Future Directions

The results underscore the complexity of fostering effective human-AI collaboration. Displaying confidence scores improves trust calibration, but whether that calibration yields accuracy gains hinges on how human and AI error boundaries align; AI-assisted systems are most useful when the human can supply contextual awareness or domain knowledge that complements the AI's errors.
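This dependence on error-boundary alignment can be made concrete with a small check: joint gains are only possible on cases where the AI errs but the human does not. The sketch below uses fabricated labels solely to illustrate the computation, not data from the paper.

```python
# Illustrative check of error-boundary alignment between a human and an AI.
import numpy as np

# Fabricated ground truth and predictions, for illustration only.
truth = np.array([1, 0, 1, 1, 0, 1, 0, 0, 1, 1])
ai    = np.array([1, 0, 1, 0, 0, 1, 1, 0, 1, 1])   # AI predictions
human = np.array([1, 0, 0, 0, 1, 1, 0, 0, 1, 1])   # human judgments

ai_wrong    = ai != truth
human_wrong = human != truth

complementary = (ai_wrong & ~human_wrong).mean()   # AI errors the human could fix
shared_errors = (ai_wrong & human_wrong).mean()    # errors nobody can fix

print(f"AI errors the human could correct: {complementary:.0%}")
print(f"errors shared by human and AI:     {shared_errors:.0%}")
# When shared errors dominate, even perfectly calibrated trust cannot
# lift joint accuracy much above what either party achieves alone.
```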

The research also raises questions about the role of explainability. While explanations are touted for enhancing trust, they may not adequately address calibration needs unless designed to convey uncertainty intelligibly. The paper advocates for further exploration of explanation techniques that enhance users' understanding of decision boundaries and error zones.

Conclusion

The findings of this paper reveal the nuanced dynamics between AI assistance and human trust. By identifying the conditions under which confidence scores and explanations function effectively, it opens new avenues for developing more intuitive and reliable AI-assisted decision-making systems. Future research should focus on designing explanations that align with users' cognitive processes and exploring methods to dynamically assess and adjust the alignment of human and AI capabilities in diverse decision-making environments.

Authors (3)
  1. Yunfeng Zhang (45 papers)
  2. Q. Vera Liao (49 papers)
  3. Rachel K. E. Bellamy (9 papers)
Citations (586)