
Are Language Models More Like Libraries or Like Librarians? Bibliotechnism, the Novel Reference Problem, and the Attitudes of LLMs (2401.04854v3)

Published 10 Jan 2024 in cs.CL

Abstract: Are LLMs cultural technologies like photocopiers or printing presses, which transmit information but cannot create new content? A challenge for this idea, which we call bibliotechnism, is that LLMs generate novel text. We begin with a defense of bibliotechnism, showing how even novel text may inherit its meaning from original human-generated text. We then argue that bibliotechnism faces an independent challenge from examples in which LLMs generate novel reference, using new names to refer to new entities. Such examples could be explained if LLMs were not cultural technologies but had beliefs, desires, and intentions. According to interpretationism in the philosophy of mind, a system has such attitudes if and only if its behavior is well explained by the hypothesis that it does. Interpretationists may hold that LLMs have attitudes, and thus have a simple solution to the novel reference problem. We emphasize, however, that interpretationism is compatible with very simple creatures having attitudes and differs sharply from views that presuppose these attitudes require consciousness, sentience, or intelligence (topics about which we make no claims).

Understanding the Intricacies of LLM References

Exploring the Bibliotechnism Hypothesis

LLMs have fueled debates across academic fields, particularly over how to understand their relationship to linguistic reference. Some argue that LLMs, like cultural technologies such as photocopiers or libraries, merely transmit existing information rather than create original content. This perspective, dubbed "bibliotechnism," holds that any novel text produced by LLMs is derivative: its meaning is inherited from the human-generated text on which the model was trained. A gap in this hypothesis surfaces, however, when LLMs appear to invent references, using new names for previously unnamed entities or coining entirely new expressions. Can we still classify LLMs as mere cultural tools if they display abilities that seem to mimic human creativity?

Aligning With Philosophical Interpretationism

Recent philosophical discussions ask whether the seemingly novel references made by LLMs imply some form of agency, in particular attitudes such as beliefs, desires, and intentions. The paper examines this question through the lens of "interpretationism," a view in the philosophy of mind on which an entity has such attitudes if and only if its behavior is well explained by the hypothesis that it does. By analyzing cases in which LLMs appear to create new references, the authors suggest that the best explanation may indeed involve ascribing such states to the models, stretching conventional conceptions of LLM capacities.

Delving into Previous Studies and Their Limitations

Prior scholarship has grappled with whether LLMs can produce meaningful language at all. Some hold that genuine reference requires sensory grounding, while others argue that LLMs can use words meaningfully in virtue of inferential relationships within a conceptual framework. These discussions, however, have not fully addressed the challenges posed by LLM-generated novel text and novel reference. The claim that LLMs are members of our language community, and hence capable of reference within it, is also scrutinized, highlighting the need to spell out how LLM outputs connect to real-world entities.

Assessing LLMs' Referential Generative Abilities

The paper then introduces the concept of "novel reference," in which LLMs use new names to denote new entities. This challenges the assumption that LLMs are purely imitative, since no prior associations for such references exist in the original human-generated text. The researchers examine several potential sources of derivative meaning, ranging from human feedback during training to the prompt and the reader's interpretation, to see whether any could account for the capacity for novel reference. After exploring these avenues, the paper suggests that LLMs might harbor a rudimentary form of agency, given the complexity and originality of some of the text they generate.

In summary, the debate over whether LLMs possess agency in the form of beliefs, desires, and intentions remains spirited, and the research suggests that settling it will require careful investigation of their behavior. The intricacies of how LLMs process and generate language offer a fascinating window into artificial intelligence, touching on deeper philosophical questions about cognition and the nature of creativity.

Authors (2)
  1. Harvey Lederman
  2. Kyle Mahowald