Are Language Models More Like Libraries or Like Librarians? Bibliotechnism, the Novel Reference Problem, and the Attitudes of LLMs (2401.04854v3)
Abstract: Are LLMs cultural technologies like photocopiers or printing presses, which transmit information but cannot create new content? A challenge for this idea, which we call bibliotechnism, is that LLMs generate novel text. We begin with a defense of bibliotechnism, showing how even novel text may inherit its meaning from original human-generated text. We then argue that bibliotechnism faces an independent challenge from examples in which LLMs generate novel reference, using new names to refer to new entities. Such examples could be explained if LLMs were not cultural technologies but had beliefs, desires, and intentions. According to interpretationism in the philosophy of mind, a system has such attitudes if and only if its behavior is well explained by the hypothesis that it does. Interpretationists may hold that LLMs have attitudes, and thus have a simple solution to the novel reference problem. We emphasize, however, that interpretationism is compatible with very simple creatures having attitudes and differs sharply from views that presuppose these attitudes require consciousness, sentience, or intelligence (topics about which we make no claims).