Towards Optimizing Human-Centric Objectives in AI-Assisted Decision-Making With Offline Reinforcement Learning (2403.05911v2)

Published 9 Mar 2024 in cs.HC and cs.AI

Abstract: Imagine if AI decision-support tools not only complemented our ability to make accurate decisions, but also improved our skills, boosted collaboration, and elevated the joy we derive from our tasks. Despite the potential to optimize a broad spectrum of such human-centric objectives, the design of current AI tools remains focused on decision accuracy alone. We propose offline reinforcement learning (RL) as a general approach for modeling human-AI decision-making to optimize human-AI interaction for diverse objectives. RL can optimize such objectives by tailoring decision support, providing the right type of assistance to the right person at the right time. We instantiated our approach with two objectives: human-AI accuracy on the decision-making task and human learning about the task and learned decision support policies from previous human-AI interaction data. We compared the optimized policies against several baselines in AI-assisted decision-making. Across two experiments (N=316 and N=964), our results demonstrated that people interacting with policies optimized for accuracy achieve significantly better accuracy -- and even human-AI complementarity -- compared to those interacting with any other type of AI support. Our results further indicated that human learning was more difficult to optimize than accuracy, with participants who interacted with learning-optimized policies showing significant learning improvement only at times. Our research (1) demonstrates offline RL to be a promising approach to model human-AI decision-making, leading to policies that may optimize human-centric objectives and provide novel insights about the AI-assisted decision-making space, and (2) emphasizes the importance of considering human-centric objectives beyond decision accuracy in AI-assisted decision-making, opening up the novel research challenge of optimizing human-AI interaction for such objectives.

Towards Optimizing Human-Centric Objectives in AI-Assisted Decision-Making With Offline Reinforcement Learning

The paper "Towards Optimizing Human-Centric Objectives in AI-Assisted Decision-Making With Offline Reinforcement Learning" addresses a significant gap in the current design paradigm of AI decision support systems by emphasizing the need to optimize human-centric objectives beyond mere decision accuracy. This work introduces offline reinforcement learning (RL) as a viable, customizable approach to model human-AI decision-making processes in a way that can adapt to various human-centric objectives and contextual factors.

Methodological Approach

The authors adopted a structured approach to instantiate their proposed method. Targeting both immediate decision accuracy and human learning as the critical objectives, they employed a Markov Decision Process (MDP) framework. The state space was defined to encompass individual differences in need for cognition (NFC), along with relevant contextual factors such as the AI's uncertainty and the decision-maker's task knowledge. The action space included various forms of AI assistance: no assistance, explanation only, recommendation and explanation (SXAI), and on-demand assistance. The reward structure differentiated between immediate accuracy (dense reward) and learning (sparse reward).
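A minimal sketch of how such an MDP could be encoded is given below; the exact state features, their discretization, and the reward values are illustrative assumptions rather than the paper's precise specification.

```python
from dataclasses import dataclass

# Illustrative state: the individual and contextual features described above
# (need for cognition, AI uncertainty, the decision-maker's task knowledge).
# The discretization chosen here is an assumption for this sketch.
@dataclass(frozen=True)
class State:
    nfc: str            # "low" or "high" need for cognition
    ai_uncertain: bool  # whether the AI is uncertain about this decision instance
    knowledge: int      # coarse level of task knowledge, e.g. 0, 1, or 2

# Action space: the forms of AI assistance listed above.
ACTIONS = [
    "no_assistance",
    "explanation_only",
    "recommendation_and_explanation",  # the SXAI condition
    "on_demand_assistance",
]

def reward(correct: bool, learning_gain: float, terminal: bool) -> float:
    """Dense per-decision reward for accuracy, plus a sparse learning reward
    granted only at the end of the interaction (numeric values are assumptions)."""
    r = 1.0 if correct else 0.0
    if terminal:
        r += learning_gain  # e.g., performance on unassisted post-test trials
    return r
```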

The experimental methodology included a data collection study leveraging an exploratory policy, followed by offline learning of optimal policies through Q-learning. This choice enabled deriving decision-support policies without the risks of real-time exploration with users, which is particularly beneficial for sensitive applications such as clinical settings.
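As a hedged sketch, offline policy learning over logged interaction tuples could look like the tabular Q-learning below; the `logged_data` format, hyperparameters, and helper names are assumptions, not the paper's exact implementation.

```python
from collections import defaultdict
import random

def offline_q_learning(logged_data, actions, alpha=0.1, gamma=0.95, epochs=50):
    """Fit Q(s, a) purely from previously collected human-AI interaction data,
    given as (state, action, reward, next_state, done) tuples; no new
    interaction with participants is needed while learning."""
    Q = defaultdict(float)  # keyed by (state, action)
    for _ in range(epochs):
        random.shuffle(logged_data)
        for state, action, r, next_state, done in logged_data:
            target = r if done else r + gamma * max(Q[(next_state, a)] for a in actions)
            Q[(state, action)] += alpha * (target - Q[(state, action)])
    return Q

def greedy_policy(Q, actions):
    """Extract a decision-support policy: choose the assistance type with the
    highest learned value in each state."""
    return lambda state: max(actions, key=lambda a: Q[(state, a)])
```

Because the policy is derived from logged data alone, exploratory or poorly tuned assistance is never shown to new users during training, which is what makes the approach attractive for sensitive settings.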

Key Findings

Computational Insights

The computational evaluation demonstrated that the RL-based policies for optimizing accuracy and learning differed significantly from the fixed SXAI policy. Optimal policies for learning, specifically, favored interactions known to induce cognitive engagement, notably for individuals low in NFC. This insight aligns with the hypothesis that people less inclined toward cognitive effort can benefit from specially designed interventions that promote engagement.
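One simple way to surface such differences, assuming policies are represented as state-to-action mappings like those sketched above, is to tabulate which assistance type each policy selects in each state and contrast it with the fixed SXAI baseline:

```python
def compare_policies(rl_policy, states):
    """List, per state, the action chosen by the learned policy next to the
    fixed baseline that always shows a recommendation plus explanation (SXAI)."""
    fixed_sxai = lambda state: "recommendation_and_explanation"
    return [(state, rl_policy(state), fixed_sxai(state)) for state in states]
```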

The effectiveness of the RL policies was validated through two user studies, reinforcing the notion that adaptive, context-aware AI support can lead to superior outcomes compared to static assistance models. Notably, individuals interacting with the accuracy-optimized policy achieved significantly better decision accuracy than those using baseline policies, confirming the strength of the RL approach for this objective. In contrast, policies optimized for learning showed mixed results, indicating that, while the approach holds promise, designing interactions that reliably foster learning requires further investigation.

Objective vs. Subjective Experience

Interestingly, the paper found no inherent trade-off between learning and subjective task enjoyment, particularly for individuals low in NFC, where cognitive engagement positively correlated with task enjoyment and perceived learning. This challenges previous assumptions that enhanced cognitive engagement might reduce subjective satisfaction, highlighting that well-designed interaction models can simultaneously enhance user experience and achieve pedagogical objectives.

Implications and Future Directions

The paper's contributions provide a robust foundation for future research aimed at refining AI decision support systems to better serve human-centric goals. The use of offline RL to model human-AI decision dynamics introduces a powerful toolkit for developing interaction policies that could adaptively enhance both operational performance and user satisfaction.

The findings underscore the necessity of extending the research to explore other human-centric objectives beyond accuracy and learning, such as promoting long-term user engagement or improving collaborative efficiency in team settings. Furthermore, there is a clear need for developing and empirically validating new forms of AI explanations and interactions that can reliably enhance learning across diverse user populations.

Finally, while this research focused on a non-critical domain (exercise prescription for laypeople), the methodology and results have broader applicability. Extending this work to high-stakes environments, such as healthcare, can offer substantial benefits. Embedding RL-based adaptive support into clinical decision support systems holds the potential to significantly improve patient outcomes by optimizing both decision accuracy and clinicians' learning, ultimately fostering a more effective and expert workforce.

Conclusion

This paper makes a substantial contribution to the field of AI-assisted decision-making by showcasing the potential of offline RL to achieve human-centric objectives. The findings advocate for a nuanced, dynamic approach to AI support that considers individual differences and specific context factors, paving the way for more effective and engaging human-AI collaborations. As AI continues to integrate into various aspects of decision-making, this research provides valuable insights and a practical framework for optimizing human-centric outcomes.

Authors (5)
  1. Zana Buçinca
  2. Siddharth Swaroop
  3. Amanda E. Paluch
  4. Susan A. Murphy
  5. Krzysztof Z. Gajos