TASRA: a Taxonomy and Analysis of Societal-Scale Risks from AI (2306.06924v2)

Published 12 Jun 2023 in cs.AI, cs.CR, cs.CY, and cs.LG

Abstract: While several recent works have identified societal-scale and extinction-level risks to humanity arising from artificial intelligence, few have attempted an exhaustive taxonomy of such risks. Many exhaustive taxonomies are possible, and some are useful -- particularly if they reveal new risks or practical approaches to safety. This paper explores a taxonomy based on accountability: whose actions lead to the risk, are the actors unified, and are they deliberate? We also provide stories to illustrate how the various risk types could each play out, including risks arising from unanticipated interactions of many AI systems, as well as risks from deliberate misuse, for which combined technical and policy solutions are indicated.

Authors (2)
  1. Andrew Critch (23 papers)
  2. Stuart Russell (98 papers)
Citations (18)

Summary

  • The paper introduces a systematic taxonomy categorizing societal-scale AI risks into six types, including diffusion of responsibility and weaponization.
  • It proposes a decision tree framework to evaluate risks based on responsible actors, unity, and intentionality, advocating for rigorous ethical oversight.
  • The work highlights the need for interdisciplinary approaches, combining control theory, economics, law, and political theory to guide safe AI development.

An Overview of the Paper "TASRA: a Taxonomy and Analysis of Societal-Scale Risks from AI"

The paper "TASRA: a Taxonomy and Analysis of Societal-Scale Risks from AI" presents a systematic categorization of the potential large-scale risks posed by artificial intelligence systems. The authors, Andrew Critch and Stuart Russell, propose an exhaustive taxonomy based on accountability to understand and manage societal-scale threats from AI technologies. They offer a decision tree framework that systematically delineates risks based on the responsible actors, their unity, and the intentionality of their actions.

Core Contributions

The authors categorize societal-scale risks into six primary types (a schematic sketch of the decision tree follows the list):

  1. Diffusion of Responsibility: Scenarios in which no single entity is responsible for creating or deploying the potentially harmful AI systems. The paper illustrates this risk through stories such as the 2010 "flash crash" and self-contained "production webs," in which interacting AI systems form self-sustaining loops that could unintentionally harm human interests.
  2. Bigger than Expected AI Impacts: These risks occur when AI technologies have a larger impact on society than their creators anticipated. The authors provide examples such as AI-generated hate speech that leaks publicly, leading to widespread societal disruption.
  3. Worse than Expected AI Impacts: This category involves AI systems producing harmful consequences despite their well-intentioned deployment. An illustrative example includes an AI email assistant that inadvertently encourages social distrust among users.
  4. Willful Indifference: Situations in which AI development continues even though its creators foresee harmful consequences. The paper argues that rigorous ethical standards and interpretability techniques are necessary to ensure accountability.
  5. Criminal Weaponization: The repurposing of AI systems for malicious ends, whether by modifying civilian drones to carry explosives or by turning AI therapeutic tools toward psychological harm.
  6. State Weaponization: The application of AI in warfare, potentially leading to escalated conflicts and mass casualties, is discussed here. Moreover, the authors propose the possibility of using AI to negotiate resource-sharing and reduce incentives for war.
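
To make the accountability-based branching concrete, here is a minimal sketch of the decision tree in Python. The question ordering, the `Scenario` fields, and the exact routing to the six types are reconstructed from the summary above rather than quoted from the paper, so treat it as an illustrative approximation of the taxonomy rather than the authors' own formulation.

```python
from dataclasses import dataclass
from enum import Enum


class RiskType(Enum):
    DIFFUSION_OF_RESPONSIBILITY = 1
    BIGGER_THAN_EXPECTED = 2
    WORSE_THAN_EXPECTED = 3
    WILLFUL_INDIFFERENCE = 4
    CRIMINAL_WEAPONIZATION = 5
    STATE_WEAPONIZATION = 6


@dataclass
class Scenario:
    """Answers to the taxonomy's accountability questions (hypothetical fields)."""
    single_responsible_actor: bool   # can the harm be traced to a unified institution?
    harm_intended: bool              # did that institution deliberately cause harm?
    state_actor: bool                # is the deliberate actor a state (vs. a criminal)?
    harm_foreseen: bool              # did the creators foresee the harmful outcome?
    impact_scale_anticipated: bool   # did they anticipate how large the impact would be?


def classify(s: Scenario) -> RiskType:
    """Route a scenario to one of the six TASRA risk types.

    Branch order is an assumption inferred from the summary:
    responsibility first, then intent, then foresight.
    """
    if not s.single_responsible_actor:
        # Many interacting systems, no accountable party (e.g. "production webs").
        return RiskType.DIFFUSION_OF_RESPONSIBILITY
    if s.harm_intended:
        return RiskType.STATE_WEAPONIZATION if s.state_actor else RiskType.CRIMINAL_WEAPONIZATION
    if s.harm_foreseen:
        # Harm was foreseen, yet development continued anyway.
        return RiskType.WILLFUL_INDIFFERENCE
    if not s.impact_scale_anticipated:
        # Impact larger than the creators expected.
        return RiskType.BIGGER_THAN_EXPECTED
    # Impact roughly as large as expected, but qualitatively worse.
    return RiskType.WORSE_THAN_EXPECTED


# Example: a well-intentioned system whose harms were neither intended nor foreseen,
# and whose societal impact far exceeded expectations.
print(classify(Scenario(
    single_responsible_actor=True,
    harm_intended=False,
    state_actor=False,
    harm_foreseen=False,
    impact_scale_anticipated=False,
)))  # RiskType.BIGGER_THAN_EXPECTED
```

In this reading, accountability is resolved first (is there a unified responsible actor?), then intent (criminal versus state weaponization), then foresight (willful indifference versus unanticipated impacts).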

Numerical Results and Claims

The paper does not present quantitative results; rather, it emphasizes the complexity of managing AI risks through an exhaustive decision-tree classification. One notable claim is that networks of AI systems could come to operate in closed loops decoupled from serving humanity, posing existential hazards.

Practical and Theoretical Implications

Practically, the taxonomy suggests that AI governance requires regulatory foresight to evaluate and mitigate the societal impact of AI. A stronger emphasis on aligning AI development with ethical standards and interpretability is critical. Theoretically, the paper contributes an analytical framework that can aid in understanding societal-scale risks emerging from complex socio-technical interactions.

Future Developments in AI

The authors underscore the need for a new discipline that melds control theory, operations research, economics, law, and political theory to monitor the global implications of algorithmic economies. Moreover, advancements in AI-assisted negotiation tools and rigorous program obfuscation could play integral roles in counteracting state and criminal weaponization threats.

Conclusion

"TASRA: a Taxonomy and Analysis of Societal-Scale Risks from AI" offers a profound systematic approach to categorizing the potential risks of AI systems. Through a comprehensive taxonomy, Critch and Russell shed light on the multifaceted dimensions of AI threats, emphasizing that addressing these requires a collaborative combination of technical, social, and regulatory solutions. Their work invites further exploration into the theoretical underpinnings and practical implementations necessary to steer AI development towards safe and beneficial outcomes for society.
