TASRA: a Taxonomy and Analysis of Societal-Scale Risks from AI (2306.06924v2)

Published 12 Jun 2023 in cs.AI, cs.CR, cs.CY, and cs.LG

Abstract: While several recent works have identified societal-scale and extinction-level risks to humanity arising from artificial intelligence, few have attempted an exhaustive taxonomy of such risks. Many exhaustive taxonomies are possible, and some are useful -- particularly if they reveal new risks or practical approaches to safety. This paper explores a taxonomy based on accountability: whose actions lead to the risk, are the actors unified, and are they deliberate? We also provide stories to illustrate how the various risk types could each play out, including risks arising from unanticipated interactions of many AI systems, as well as risks from deliberate misuse, for which combined technical and policy solutions are indicated.

Authors (2)
  1. Andrew Critch (23 papers)
  2. Stuart Russell (98 papers)
Citations (18)

Summary

  • The paper introduces a systematic taxonomy categorizing societal-scale AI risks into six types, including diffusion of responsibility and weaponization.
  • It proposes a decision tree framework to evaluate risks based on responsible actors, unity, and intentionality, advocating for rigorous ethical oversight.
  • The work highlights the need for interdisciplinary approaches, combining control theory, economics, law, and political theory to guide safe AI development.

An Overview of the Paper "TASRA: a Taxonomy and Analysis of Societal-Scale Risks from AI"

The paper "TASRA: a Taxonomy and Analysis of Societal-Scale Risks from AI" presents a systematic categorization of the potential large-scale risks posed by artificial intelligence systems. The authors, Andrew Critch and Stuart Russell, propose an exhaustive taxonomy based on accountability to understand and manage societal-scale threats from AI technologies. They offer a decision tree framework that systematically delineates risks based on the responsible actors, their unity, and the intentionality of their actions.

Core Contributions

The authors categorize societal-scale risks into six primary types (a schematic sketch of the decision tree follows the list):

  1. Diffusion of Responsibility: Scenarios in which no single entity is responsible for creating or deploying the potentially harmful AI systems. The paper illustrates this risk through stories such as the 2010 "flash crash" and self-contained "production webs," in which interacting AI systems form self-sustaining loops that could unintentionally harm human interests.
  2. Bigger than Expected AI Impacts: These risks occur when AI technologies have a larger impact on society than their creators anticipated. The authors provide examples such as AI-generated hate speech that leaks publicly, leading to widespread societal disruption.
  3. Worse than Expected AI Impacts: This category involves AI systems producing harmful consequences despite their well-intentioned deployment. An illustrative example includes an AI email assistant that inadvertently encourages social distrust among users.
  4. Willful Indifference: Situations in which AI development continues even though its creators foresee harmful consequences. The paper argues that rigorous ethical standards and interpretability techniques are necessary to ensure accountability.
  5. Criminal Weaponization: The repurposing of AI systems for malicious ends, whether by modifying civilian drones to carry explosives or by turning AI therapeutic tools toward psychological harm.
  6. State Weaponization: The application of AI in warfare, potentially leading to escalated conflicts and mass casualties, is discussed here. Moreover, the authors propose the possibility of using AI to negotiate resource-sharing and reduce incentives for war.
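
To make the accountability-based branching concrete, here is a minimal sketch of the decision tree in Python. The question ordering, the `Scenario` fields, and the exact routing to the six types are reconstructed from the summary above rather than quoted from the paper, so treat it as an illustrative approximation of the taxonomy rather than the authors' own formulation.

```python
from dataclasses import dataclass
from enum import Enum


class RiskType(Enum):
    DIFFUSION_OF_RESPONSIBILITY = 1
    BIGGER_THAN_EXPECTED = 2
    WORSE_THAN_EXPECTED = 3
    WILLFUL_INDIFFERENCE = 4
    CRIMINAL_WEAPONIZATION = 5
    STATE_WEAPONIZATION = 6


@dataclass
class Scenario:
    """Answers to the taxonomy's accountability questions (hypothetical fields)."""
    single_responsible_actor: bool   # can the harm be traced to a unified institution?
    harm_intended: bool              # did that institution deliberately cause harm?
    state_actor: bool                # is the deliberate actor a state (vs. a criminal)?
    harm_foreseen: bool              # did the creators foresee the harmful outcome?
    impact_scale_anticipated: bool   # did they anticipate how large the impact would be?


def classify(s: Scenario) -> RiskType:
    """Route a scenario to one of the six TASRA risk types.

    Branch order is an assumption inferred from the summary:
    responsibility first, then intent, then foresight.
    """
    if not s.single_responsible_actor:
        # Many interacting systems, no accountable party (e.g. "production webs").
        return RiskType.DIFFUSION_OF_RESPONSIBILITY
    if s.harm_intended:
        return RiskType.STATE_WEAPONIZATION if s.state_actor else RiskType.CRIMINAL_WEAPONIZATION
    if s.harm_foreseen:
        # Harm was foreseen, yet development continued anyway.
        return RiskType.WILLFUL_INDIFFERENCE
    if not s.impact_scale_anticipated:
        # Impact larger than the creators expected.
        return RiskType.BIGGER_THAN_EXPECTED
    # Impact roughly as large as expected, but qualitatively worse.
    return RiskType.WORSE_THAN_EXPECTED


# Example: a well-intentioned system whose harms were neither intended nor foreseen,
# and whose societal impact far exceeded expectations.
print(classify(Scenario(
    single_responsible_actor=True,
    harm_intended=False,
    state_actor=False,
    harm_foreseen=False,
    impact_scale_anticipated=False,
)))  # RiskType.BIGGER_THAN_EXPECTED
```

In this reading, accountability is resolved first (is there a unified responsible actor?), then intent (criminal versus state weaponization), then foresight (willful indifference versus unanticipated impacts).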

Numerical Results and Claims

The paper does not present quantitative results; rather, it emphasizes the complexity of managing AI risks through an exhaustive decision-tree classification. One notable claim is that networks of AI systems could come to operate in closed loops decoupled from serving humanity, posing existential hazards.

Practical and Theoretical Implications

Practically, the taxonomy suggests that AI governance requires regulatory foresight to evaluate and mitigate the societal impact of AI. A stronger emphasis on aligning AI development with ethical standards and interpretability is critical. Theoretically, the paper contributes an analytical framework that can aid in understanding societal-scale risks emerging from complex socio-technical interactions.

Future Developments in AI

The authors underscore the need for a new discipline that melds control theory, operations research, economics, law, and political theory to monitor the global implications of algorithmic economies. Moreover, advancements in AI-assisted negotiation tools and rigorous program obfuscation could play integral roles in counteracting state and criminal weaponization threats.

Conclusion

"TASRA: a Taxonomy and Analysis of Societal-Scale Risks from AI" offers a profound systematic approach to categorizing the potential risks of AI systems. Through a comprehensive taxonomy, Critch and Russell shed light on the multifaceted dimensions of AI threats, emphasizing that addressing these requires a collaborative combination of technical, social, and regulatory solutions. Their work invites further exploration into the theoretical underpinnings and practical implementations necessary to steer AI development towards safe and beneficial outcomes for society.
