Does More Advice Help? The Effects of Second Opinions in AI-Assisted Decision Making (2401.07058v1)
Abstract: AI assistance in decision-making has become popular, yet people's inappropriate reliance on AI often leads to unsatisfactory human-AI collaboration performance. In this paper, through three pre-registered, randomized human subject experiments, we explore whether and how the provision of second opinions may affect decision-makers' behavior and performance in AI-assisted decision-making. We find that if both the AI model's decision recommendation and a second opinion are always presented together, decision-makers reduce their over-reliance on AI while increasing their under-reliance on AI, regardless of whether the second opinion is generated by a peer or another AI model. However, if decision-makers have control over when to solicit a peer's second opinion, we find that their active solicitation of second opinions has the potential to mitigate over-reliance on AI without inducing increased under-reliance in some cases. We conclude by discussing the implications of our findings for promoting effective human-AI collaboration in decision-making.