On the Controllability of AI: An Analysis
The paper "On Controllability of AI" by Roman V. Yampolskiy addresses the significant challenge of whether artificial general intelligence (AGI) and superintelligence can be controlled to ensure they contribute positively to humanity. Yampolskiy methodically scrutinizes the feasibility of AI control using various interdisciplinary insights and impossibility theorems to conclude that complete controllability of advanced AI systems is fundamentally unattainable. This essay provides an expert review of Yampolskiy's arguments, numerical findings, and the surrounding discourse.
Overview of the AI Control Problem
The AI Control Problem is the challenge of ensuring that highly capable AI systems behave in alignment with human values, preventing unfavorable outcomes. Because AI could alter the trajectory of civilization, it is imperative to establish that such systems are safe and beneficial before they are deployed. However, Yampolskiy emphasizes that there is no formal proof that the AI control problem is solvable. He dissects the complexity of the problem by delineating types of control (explicit, implicit, aligned, and delegated) across different classes of AI system, including narrow AI (NAI), AGI, and recursively self-improving superintelligent systems (RSISI).
Key Arguments Against Full Controllability
- Paradox and Uncertainty: Yampolskiy draws parallels to self-referential results such as Gödel's incompleteness theorems, arguing that any formal scheme for absolute control of a sufficiently expressive system runs into contradictions of the same diagonal kind.
- Evidence from Multidisciplinary Studies: The paper cites numerous impossibility results from control theory, cybernetics, public choice theory, and other fields to underscore the difficulty of modeling and regulating systems more complex than their would-be controllers, superintelligent ones above all.
- Loss of Human Control Over More Intelligent Systems: Because a less intelligent agent cannot reliably predict or constrain the behavior of a more intelligent one, the paper argues that humans cannot indefinitely exert control over a superintelligence.
- Unsolvability and Intractability: Given the complexity and unpredictability of intelligent systems, the AI control problem shares the structure of undecidable problems in computability theory: by Rice's Theorem, any non-trivial semantic property of programs, "never acts unsafely" included, is undecidable, just as the Halting Problem is (a sketch of the standard reduction follows this list).
- Safety and Security Implications: Yampolskiy extends the argument from theory to practice, emphasizing that no mechanism can guarantee both safety and control; strengthening one compromises the other.
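To make the undecidability point concrete, here is a minimal sketch of the classic diagonalization argument, written in Python for readability. This is not Yampolskiy's own construction: the names `is_safe`, `adversary`, and `perform_unsafe_action` are hypothetical stand-ins introduced here to show why no total procedure can decide the semantic property "this program never acts unsafely."

```python
def perform_unsafe_action():
    """Stand-in for any behavior the safety property forbids (hypothetical)."""
    print("unsafe action performed")


def is_safe(program, data):
    """Hypothetical total decider: True iff program(data) never acts unsafely.

    Assumed to exist only for the sake of contradiction; no body can
    actually implement it.
    """
    raise NotImplementedError("assumed oracle")


def adversary(program):
    """Acts unsafely exactly when the oracle predicts it is safe."""
    if is_safe(program, program):
        perform_unsafe_action()  # oracle said "safe" -> make it wrong
    # else: halt without doing anything unsafe -> oracle wrong again


# Diagonal step: run adversary on its own source/index.
# - If is_safe(adversary, adversary) returns True, adversary acts
#   unsafely, contradicting the verdict.
# - If it returns False, adversary does nothing unsafe, again
#   contradicting the verdict.
# Either way the assumed decider fails, so "never acts unsafely" is
# undecidable -- an instance of Rice's Theorem.
```

The same template with "acts unsafely" replaced by "halts" yields the Halting Problem, which is why the parallel between safety verification and classical undecidability results is apt.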
Implications of Uncontrollability
The assertion of AI uncontrollability has profound implications for AI research and humanity's future. Theoretically, it challenges researchers to rethink the foundations of AI alignment and the assumptions underpinning AI safety. Practically, it demands a cautious posture in which safety mechanisms are treated as mitigations rather than infallible solutions. Acknowledging these trade-offs requires strategically balancing system capability against the strength of control mechanisms.
Directions for Future Research
The paper suggests alternative pathways to potential safety, such as Comprehensive AI Services (CAIS) in place of a singular, all-encompassing superintelligent agent. It further advocates explicit investigation of the controllability of simpler AI systems, continued use of interdisciplinary tools, and reconsideration of ethical and governance frameworks.
Conclusion
Yampolskiy's paper is a valuable contribution to the ongoing discussion of AI safety. By synthesizing theoretical impossibility results with empirical evidence, it highlights the risks posed by advanced AI systems. While Yampolskiy's conclusions on the ultimate controllability of AI lean toward pessimism, they serve as a clarion call for the research community to intensify its focus on AI safety, risk management, and governance.