Decoding trust: A reinforcement learning perspective (2309.14598v2)
Abstract: Behavioral experiments on the trust game have shown that trust and trustworthiness are universal among human beings, contradicting the prediction obtained by assuming Homo economicus in orthodox economics. This means some mechanism must be at work that favors their emergence. Most previous explanations, however, resort to factors based on imitative learning, a simple version of social learning. Here, we turn to the paradigm of reinforcement learning, where individuals update their strategies by evaluating the long-term return through accumulated experience. Specifically, we investigate the trust game with the Q-learning algorithm, where each participant is associated with two evolving Q-tables that guide decision making as trustor and trustee, respectively. In the pairwise scenario, we reveal that high levels of trust and trustworthiness emerge when individuals appreciate both their historical experience and future returns. Mechanistically, the evolution of the Q-tables shows a crossover that resembles psychological changes in humans. We also provide the phase diagram for the game parameters, together with an analysis of the phase boundaries. These findings are robust when the scenario is extended to a population on a lattice. Our results thus provide a natural explanation for the emergence of trust and trustworthiness without invoking external factors. More importantly, the proposed paradigm shows potential for deciphering many puzzles in human behavior.
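The two-Q-table setup described in the abstract can be illustrated with a minimal Python sketch. Only the standard tabular Q-learning update is taken as given here; everything else is an assumption made for illustration and is not specified in the abstract: the state is taken to be the pair of actions from the previous round, the payoff values in `payoffs` are hypothetical, exploration is epsilon-greedy, and the names `Player`, `simulate`, `alpha`, `gamma`, and `epsilon` are invented for this example.

```python
import random

ACTIONS = (0, 1)  # 0 = withhold (do not trust / keep), 1 = cooperate (trust / return)


def make_q_table(n_states=4):
    """One row per state, one column per action, initialised to zero."""
    return [[0.0 for _ in ACTIONS] for _ in range(n_states)]


def q_update(q, state, action, reward, next_state, alpha, gamma):
    """Standard tabular Q-learning update, applied in place."""
    target = reward + gamma * max(q[next_state])
    q[state][action] = (1 - alpha) * q[state][action] + alpha * target


def choose(q, state, epsilon, rng):
    """Epsilon-greedy choice (an assumption); ties resolve to action 0."""
    if rng.random() < epsilon:
        return rng.choice(ACTIONS)
    return max(ACTIONS, key=lambda a: q[state][a])


def payoffs(trust, reciprocate):
    """Hypothetical (trustor, trustee) payoffs; not the paper's values."""
    if not trust:
        return 0.0, 0.0
    return (1.0, 1.0) if reciprocate else (-1.0, 2.0)


class Player:
    """Each participant carries two Q-tables, one per role."""

    def __init__(self):
        self.q_trustor = make_q_table()
        self.q_trustee = make_q_table()


def simulate(rounds=1000, alpha=0.1, gamma=0.9, epsilon=0.01, seed=0):
    """Pairwise scenario: in every round each player acts once in each role."""
    rng = random.Random(seed)
    players = (Player(), Player())
    # states[i]: previous outcome of the game in which player i was the trustor
    states = [0, 0]
    for _ in range(rounds):
        for i in (0, 1):
            trustor, trustee = players[i], players[1 - i]
            s = states[i]
            a_trust = choose(trustor.q_trustor, s, epsilon, rng)
            # The trustee's action is drawn even when no trust is placed;
            # in that case it has no effect on the payoffs.
            a_return = choose(trustee.q_trustee, s, epsilon, rng)
            r_trustor, r_trustee = payoffs(a_trust, a_return)
            s_next = 2 * a_trust + a_return  # assumed state encoding
            q_update(trustor.q_trustor, s, a_trust, r_trustor, s_next, alpha, gamma)
            q_update(trustee.q_trustee, s, a_return, r_trustee, s_next, alpha, gamma)
            states[i] = s_next
    return players
```

In this sketch, `alpha` weights new information against accumulated experience and `gamma` weights future returns, which loosely correspond to the two ingredients the abstract names. Calling `simulate()` returns the two players, whose `q_trustor` and `q_trustee` tables can then be inspected.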