Formalizing Trust in AI: A Comprehensive Examination
The paper "Formalizing Trust in Artificial Intelligence: Prerequisites, Causes and Goals of Human Trust in AI" by Jacovi et al. provides an exhaustive exploration into the essence of trust between humans and AI systems. In this paper, the authors propose a rigorous framework to understand and evaluate trust in AI, outlining the essential requirements, potential causes, and ultimate objectives of human trust within AI interactions.
Key Concepts and Definitions
At the heart of this work is a model of trust inspired by interpersonal trust, built around two key elements: the trustor's vulnerability to the AI's actions and the anticipation that those actions will have an impact. The authors introduce the notion of contractual trust, in which the trustor relies on an implicit or explicit contract that the AI will behave within an expected scope. Importantly, the model is not a direct import of sociological accounts of trust; it is adapted specifically to AI contexts.
The paper distinguishes trustworthiness from trust: trustworthiness is the AI's actual capacity to uphold the contractual expectations placed on it, and to maintain them under varying conditions, whereas trust exists when the user anticipates that the contract will be upheld despite the risk that it may not be. The further distinction between warranted and unwarranted trust is crucial: warranted trust is caused by the AI's genuine trustworthiness, while unwarranted trust arises from other factors and can lead to misplaced reliance.
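The warranted/unwarranted distinction can be paraphrased as a causal condition. The notation below is an illustrative paraphrase, not the paper's own formalism:

```latex
% Illustrative paraphrase (not the paper's notation): trust by human H in
% model M with respect to contract C is warranted exactly when it is caused
% by M actually being able to maintain C.
\[
\mathrm{warranted}\big(\mathrm{Trust}(H, M, C)\big)
  \;\iff\;
  \mathrm{Trustworthy}(M, C) \rightsquigarrow \mathrm{Trust}(H, M, C)
\]
% where $\rightsquigarrow$ is read ``is a cause of''.
```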
Analytical Insights
The authors carefully separate intrinsic and extrinsic causes of trust. Intrinsic trust arises when the AI's observable reasoning process aligns with the user's expectations, a goal promoted through explainable AI (XAI). Extrinsic trust, in contrast, rests on observable behavior: system performance under predefined evaluation protocols. The role of XAI is accordingly framed as enabling users to comprehend and anticipate model behavior so that trust in model outputs can be placed responsibly.
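A minimal sketch of the extrinsic idea in Python, using a contract format, slice names, and function names of our own choosing (nothing here is prescribed by the paper): observed behavior on predefined evaluation slices provides the evidence on which extrinsic trust rests.

```python
from dataclasses import dataclass
from typing import Callable, Dict, Sequence, Tuple


@dataclass
class Contract:
    """A single expectation the user relies on, e.g. a minimum accuracy."""
    description: str
    min_accuracy: float


def accuracy(model: Callable, inputs: Sequence, labels: Sequence) -> float:
    """Fraction of inputs for which the model's prediction matches the label."""
    predictions = [model(x) for x in inputs]
    return sum(p == y for p, y in zip(predictions, labels)) / len(labels)


def extrinsic_evidence(
    model: Callable,
    slices: Dict[str, Tuple[Sequence, Sequence]],
    contract: Contract,
) -> Dict[str, bool]:
    """For each evaluation slice (in-domain, shifted, adversarial, ...),
    record whether observed behaviour upholds the contract."""
    return {
        name: accuracy(model, xs, ys) >= contract.min_accuracy
        for name, (xs, ys) in slices.items()
    }
```

For example, `extrinsic_evidence(model, {"in_domain": (xs, ys), "shifted": (xs2, ys2)}, Contract("90% accuracy on support tickets", 0.9))` would report which evaluation slices the contract survives, which is the kind of behavioral evidence extrinsic trust is built on.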
The paper critiques the limited and inconsistent treatment of trust in the existing literature and takes a deliberate step by grounding AI trust in specific social constructs and behavioral expectations. The authors also connect these trust mechanisms to the broader ambitions of XAI: to provide verifiable, understandable, and actionable insight into AI behavior.
Methodological Contributions
Throughout, Jacovi et al. challenge readers to rethink how trust is evaluated. They propose assessing trust through experimental manipulation of a model's trustworthiness, in order to quantify both trust itself and its causes. The key is to differentiate warranted trust, which tracks trustworthiness, from trust driven by situational biases and therefore unwarranted.
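A toy sketch of the manipulation idea follows; the experimental design, function names, and the binary-classifier assumption are ours, not the authors'. The point is only to show how degrading trustworthiness and re-measuring trust can expose unwarranted trust.

```python
import random


def degrade(model, corruption_rate: float):
    """Return a deliberately less trustworthy variant of a binary classifier
    by replacing a fraction of its outputs with random labels."""
    def corrupted(x):
        if random.random() < corruption_rate:
            return random.choice([0, 1])
        return model(x)
    return corrupted


def trust_gap(measure_trust, model, corruption_rate: float = 0.5) -> float:
    """Difference in measured trust (e.g. a reliance rate from a user study)
    between the original model and its degraded variant. A gap near zero
    means the measured trust does not respond to trustworthiness, which is
    the signature of unwarranted trust."""
    return measure_trust(model) - measure_trust(degrade(model, corruption_rate))
```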
Practical and Theoretical Implications
The implications of this work span both theoretical and applied AI. Practically, developers are called on to design systems with explicit contracts that address user needs, emphasizing transparency and reliability; trust and distrust then become instruments for integrating AI into societal contexts ethically. Theoretically, the paper opens a more nuanced dialogue on trust, expanding future AI research to account for social dimensions and context-specific contracts.
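As one way of making the practical recommendation concrete, here is a small hypothetical wrapper, our own illustration rather than a design from the paper, that states its contract with every answer and abstains outside it, so that distrust is directed to where the contract is not guaranteed.

```python
def with_explicit_contract(model, in_scope, contract_text: str):
    """Wrap a model so that every answer carries its stated contract and
    out-of-scope queries are refused rather than answered silently."""
    def answer(query):
        if not in_scope(query):
            return {"refusal": f"Outside this system's contract: {contract_text}"}
        return {"prediction": model(query), "contract": contract_text}
    return answer
```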
Future Directions
The paper acknowledges that comprehensive methodologies for evaluating trust are still nascent and calls for further research on separating warranted from unwarranted trust in real-world applications. It also opens avenues for studying how personal attributes shape trusting behavior.
In conclusion, Jacovi et al.'s paper is a significant contribution to AI ethics and trust engineering, offering a structured roadmap for deploying AI systems that are not only capable but also warrant the trust users place in them. This framework lays the groundwork for AI systems that integrate responsibly into human workflows.