On Formally Undecidable Traits of Intelligent Machines (2402.09500v1)
Abstract: Building on work by Alfonseca et al. (2021), we study the conditions necessary for it to be logically possible to prove that an arbitrary artificially intelligent machine will exhibit certain behavior. To do this, we develop a formalism similar to, but mathematically distinct from, the theory of formal languages and their properties. Our formalism affords a precise means not only of talking about the traits we desire of machines (such as being intelligent, contained, or moral), but also of detailing the conditions necessary for it to be logically possible to decide whether a given arbitrary machine possesses such a trait. Contrary to the results of Alfonseca et al. (2021), we find that Rice's theorem from computability theory cannot in general be used to determine whether an arbitrary machine possesses a given trait. Therefore, deciding whether an arbitrary machine is intelligent, contained, moral, and so forth is not necessarily logically impossible.
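As general background, and not the authors' formalism: Rice's theorem covers only non-trivial properties of the function a program computes. A trait defined by bounded observation of a run falls outside its scope and can be decided by plain simulation. The minimal sketch below assumes a machine can be modeled as a Python generator that yields once per computation step; the names `halts_within`, `countdown`, and `loop_forever` are hypothetical and chosen only for illustration.

```python
from typing import Callable, Iterator


def halts_within(program: Callable[[int], Iterator[None]], x: int, max_steps: int) -> bool:
    """Decide the bounded trait: does `program` halt on `x` in at most `max_steps` steps?"""
    steps = 0
    for _ in program(x):          # each yield counts as one step (modeling assumption)
        steps += 1
        if steps > max_steps:     # budget exceeded: the trait does not hold
            return False
    return True                   # generator finished within the budget


def countdown(n: int) -> Iterator[None]:
    """Toy machine that takes exactly n steps, then halts."""
    while n > 0:
        n -= 1
        yield


def loop_forever(n: int) -> Iterator[None]:
    """Toy machine that never halts."""
    while True:
        yield


if __name__ == "__main__":
    print(halts_within(countdown, 3, 3))
    print(halts_within(countdown, 3, 2))
    print(halts_within(loop_forever, 0, 1000))
```

The decider always terminates because it inspects at most `max_steps + 1` steps. The unbounded question "does the program ever halt?" admits no such decider, which is the kind of distinction that determines whether a Rice-style argument applies to a given trait.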
- Achiam, J., et al. (2023). GPT-4 technical report. arXiv preprint arXiv:2303.08774.
- Alfonseca, M., et al. (2021). Superintelligence cannot be contained: Lessons from computability theory. J. Artif. Intell. Res., 70, 65–76.
- Allcott, H., et al. (2020). The welfare effects of social media. Am Econ Rev, 110(3), 629–676.
- AI risk skepticism, a comprehensive survey. arXiv preprint arXiv:2303.03885.
- Amodei, D., et al. (2016). Concrete problems in AI safety. arXiv preprint arXiv:1606.06565.
- Computational Complexity: A Modern Approach (1st edition). Cambridge University Press.
- Asimov, I. (1950). I, Robot (1st edition). Doubleday.
- In Search of Planet Vulcan, The Ghost of Newton’s Clockwork Machine. Plenum Press.
- Benson, H. P. (1998). An outer approximation algorithm for generating all efficient extreme points in the outcome set of a multiple objective linear programming problem. J Glob Optim, 13(1), 1–24.
- Blum, M. (1967). A machine-independent theory of the complexity of recursive functions. J. ACM, 14(2), 322–336.
- Bostrom, N. (2011). Information hazards: A typology of potential harms from knowledge. Rev. Contemp. Philos., 10, 44–79.
- Bostrom, N. (2014). Superintelligence: Paths, Dangers, Strategies. Oxford University Press.
- Bubeck, S., et al. (2023). Sparks of artificial general intelligence: Early experiments with GPT-4. arXiv preprint arXiv:2303.12712.
- Chalmers, D. (2010). The singularity: A philosophical analysis. J. Conscious Stud., 17, 7–65.
- Chollet, F. (2019). On the measure of intelligence. arXiv preprint arXiv:1911.01547.
- Corwin, J. (2002). AI boxing. Online. Accessed 12 December 2023.
- The potential for artificial intelligence in healthcare. Future Healthc J., 6, 94–98.
- Degrave, J., et al. (2022). Magnetic control of tokamak plasmas through deep reinforcement learning. Nature, 602, 414–419.
- Deutsch, D. (1985). Quantum theory, the Church-Turing principle and the universal quantum computer. Proc. R. Soc. Lond. A, 400, 97–117.
- Fawzi, A., et al. (2022). Discovering faster matrix multiplication algorithms with reinforcement learning. Nature, 610, 47–53.
- Frank, M. C. (2023). Baby steps in evaluating the capacities of large language models. Nat Rev Psychol, 2, 451–452.
- Processing speed, working memory, and fluid intelligence: Evidence for a developmental cascade. Psychol Sci, 7, 237–241.
- Ginsberg, A. (1959). Howl and Other Poems (Reissue edition). City Lights Publishers.
- Good, I. J. (1966). Speculations concerning the first ultraintelligent machine. Adv. Comput., 6, 31–88.
- Grace, K., et al. (2024). Thousands of AI authors on the future of AI. arXiv preprint arXiv:2401.02843.
- DARPA’s explainable artificial intelligence (XAI) program. AI Magazine, 40, 44–58.
- Hadfield-Menell, D., et al. (2017). The off-switch game. In Proceedings of the 26th International Joint Conference on Artificial Intelligence.
- Haidt, J. (2001). The emotional dog and its rational tail. Psychol Rev, 108, 814–834.
- Haier, R. J. (2016). The Neuroscience of Intelligence (1st edition). Cambridge University Press.
- Stephen Hawking: ‘Transcendence looks at the implications of artificial intelligence—but are we taking AI seriously enough?’. Online. Accessed 13 October 2023.
- Hendrycks, D., et al. (2022). Unsolved problems in ML safety. arXiv preprint arXiv:2109.13916.
- The Mind’s I: Fantasies and Reflections on Self and Soul (1st edition). Basic Books.
- Huang, L., et al. (2023). A survey on hallucination in large language models: Principles, taxonomy, challenges, and open questions. arXiv preprint arXiv:2311.05232.
- Johnson, G. (1997). To test a powerful computer, play an ancient game. Online. Accessed 03 January 2024.
- Jumper, J., et al. (2021). Highly accurate protein structure prediction with AlphaFold. Nature, 596, 583–589.
- Kamalov, F., et al. (2023). New era of artificial intelligence in education: Towards a sustainable multifaceted revolution. Sustainability, 15, 12451.
- Kleene, S. C. (1952). Introduction to Metamathematics (1st edition). P. Noordhoff N.V., Groningen.
- Knight, W. (2017). The dark secret at the heart of AI. Online. Accessed 24 December 2023.
- Kozen, D. (2006). Theory of Computation. Springer-Verlag, London.
- Lampson, B. W. (1973). A note on the confinement problem. Commun. ACM, 16(10), 613–615.
- Leben, D. (2018). Ethics for Robots: How to Design a Moral Algorithm (1st edition). Routledge.
- Lipton, Z. C. (2018). The mythos of model interpretability: In machine learning, the concept of interpretability is both important and slippery. Queue, 16(3), 31–57.
- Madhavan, A. (2023). Brain-inspired computing can help us create faster, more energy-efficient devices—if we win the race. Online. Accessed 16 January 2024.
- Marshall, A. (2023). GM’s cruise loses its self-driving license in San Francisco after a robotaxi dragged a person. Online. Accessed 20 December 2023.
- McCarthy, J. (1980). Circumscription—a form of non-monotonic reasoning. Artificial Intelligence, 13, 27–39.
- McCorduck, P. (1979). Machines Who Think: A Personal Inquiry into the History and Prospects of Artificial Intelligence (1st edition). W. H. Freeman.
- Moravec, H. (1988). Mind Children (1st edition). Harvard University Press.
- Quantum Computation and Quantum Information (Anniversary edition). Cambridge University Press.
- Quine, W. V. (1951). Two dogmas of empiricism. Philos Rev, 60(1), 20–43.
- Rice, H. G. (1953). Classes of recursively enumerable sets and their decision problems. Trans. Am. Math. Soc., 74, 358–366.
- Robič, B. (2020). The Foundations of Computability Theory (2nd edition). Springer.
- Why are we using black box models in AI when we don’t need to? a lesson from an explainable AI competition. Online. Accessed 24 December 2023.
- Russell, B. (1972). The philosophy of logical atomism (1st edition). Fontana.
- Russell, S. (2017). Provably beneficial artificial intelligence. In The Next Step: Exponential Life. BBVA OpenMind.
- Russell, S. (2019). Human Compatible: Artificial Intelligence and the Problem of Control (1st edition). Penguin Books.
- Russell, S. (2022). Artificial Intelligence and the Problem of Control, pp. 19–24. Springer International Publishing, Cham.
- Artificial Intelligence: A Modern Approach (4th edition). Pearson Education, Inc.
- Savage, N. (2022). Breaking into the black box of artificial intelligence. Online. Accessed 24 December 2023.
- Searle, J. R. (1980). Minds, brains, and programs. Behavioral and Brain Sciences, 3(3), 417–424.
- Sevilla, J., et al. (2022). Compute trends across three eras of machine learning. In 2022 International Joint Conference on Neural Networks (IJCNN), pp. 1–8.
- Sipser, M. (2013). Introduction to the Theory of Computation (3rd edition). Cengage Learning.
- Strogatz, S. (2022). Can computers be mathematicians? Online. Accessed 24 January 2024.
- Sutskever, I. (2023). The exciting, perilous journey toward AGI. Online. Accessed 17 November 2023.
- Tabbaa, B. (2018). The rise and fall of Knight Capital—buy high, sell low. Rinse and repeat. Online. Accessed 20 December 2023.
- Tegmark, M. (2017). Life 3.0: Being Human in the Age of Artificial Intelligence (1st edition). Knopf.
- Turing, A. M. (1936). On computable numbers, with an application to the Entscheidungsproblem. Proc. Lond. Math. Soc., 42, 230–265.
- Turing, A. M. (1948). Intelligent machinery. National Physical Laboratory.
- Turing, A. M. (1950). Computing machinery and intelligence. Mind, 59(236), 433–460.
- Turing, A. M. (1954). Solvable and unsolvable problems. Science News, 31, 7–23.
- Moral Machines: Teaching Robots Right from Wrong (1st edition). Oxford University Press.
- Wilbur, M., et al. (2023). Artificial intelligence for smart transportation. arXiv preprint arXiv:2308.07457.
- Artificial intelligence act: deal on comprehensive rules for trustworthy AI. Online. Accessed 28 December 2023.
- Yampolskiy, R. V. (2012). Leakproofing the singularity: Artificial intelligence confinement problem. J. Conscious Stud., 19, 194–214.
- Yampolskiy, R. V. (2020). On controllability of AI. arXiv preprint arXiv:2008.04071.
- Yudkowsky, E. (2002). The AI-box experiment. Online. Accessed 12 December 2023.
- Yudkowsky, E. (2008). Artificial intelligence as a positive and negative factor in global risk. In Bostrom, N., and Cirkovic, M. M. (Eds.), Global Catastrophic Risks, pp. 308–345. Oxford University Press.