Improving Human Sequential Decision-Making with Reinforcement Learning (2108.08454v5)

Published 19 Aug 2021 in cs.LG and cs.HC

Abstract: Workers spend a significant amount of time learning how to make good decisions. Evaluating the efficacy of a given decision, however, can be complicated -- e.g., decision outcomes are often long-term and relate to the original decision in complex ways. Surprisingly, even though learning good decision-making strategies is difficult, they can often be expressed in simple and concise forms. Focusing on sequential decision-making, we design a novel machine learning algorithm that is capable of extracting "best practices" from trace data and conveying its insights to humans in the form of interpretable "tips". Our algorithm selects the tip that best bridges the gap between the actions taken by human workers and those taken by the optimal policy in a way that accounts for which actions are consequential for achieving higher performance. We evaluate our approach through a series of randomized controlled experiments where participants manage a virtual kitchen. Our experiments show that the tips generated by our algorithm can significantly improve human performance relative to intuitive baselines. In addition, we discuss a number of empirical insights that can help inform the design of algorithms intended for human-AI interfaces. For instance, we find evidence that participants do not simply blindly follow our tips; instead, they combine them with their own experience to discover additional strategies for improving performance.
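The abstract's core selection rule can be sketched in a few lines. In this hedged toy example (not the paper's exact method), a "tip" is a single recommended action, the optimal policy is summarized by Q-values, and a tip is scored by the total Q-value workers would recover by adopting it in the states where it beats their observed action; all state names, actions, and numbers below are hypothetical.

```python
# Toy optimal Q-values, indexed as Q[state][action] (hypothetical numbers).
Q = {
    "backlog":  {"cook": 5.0, "wait": 1.0, "clean": 2.0},
    "idle":     {"cook": 2.0, "wait": 2.5, "clean": 4.0},
    "overload": {"cook": 1.0, "wait": 3.0, "clean": 0.5},
}

# Action workers were observed to take in each state (from trace data).
human_action = {"backlog": "wait", "idle": "wait", "overload": "wait"}

def tip_score(tip_action):
    """Total Q-value gained if workers adopted `tip_action` wherever it
    beats their current choice; consequential states dominate the score."""
    return sum(
        max(0.0, Q[s][tip_action] - Q[s][human_action[s]])
        for s in Q
    )

actions = {a for acts in Q.values() for a in acts}
best_tip = max(actions, key=tip_score)
print(best_tip, tip_score(best_tip))  # the tip closing the largest gap
```

Scoring by the Q-value gap (rather than by how often humans deviate from the optimal action) captures the abstract's point that the algorithm weights actions by how consequential they are for performance, not merely by how common the mistake is.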

