Language Evolution with Deep Learning (2403.11958v1)
Abstract: Computational modeling plays an essential role in the study of language emergence. It aims to simulate the conditions and learning processes that could trigger the emergence of a structured language within a simulated, controlled environment. Several methods have been used to investigate the origins of language, including agent-based systems, Bayesian agents, genetic algorithms, and rule-based systems. This chapter explores another class of computational models that has recently revolutionized the field of machine learning: deep learning models. The chapter introduces the basic concepts of deep learning and reinforcement learning methods and summarizes how they can be used to simulate language emergence. It also discusses key findings, limitations, and recent attempts to build more realistic simulations. The chapter targets linguists and cognitive scientists seeking an introduction to deep learning as a tool for investigating language evolution.
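A classic setup in this literature is the Lewis signaling game: a sender observes a state and emits a symbol, a receiver guesses the state from the symbol, and both are rewarded when the guess is correct. The sketch below is a minimal, hedged illustration of that idea, using simple tabular (Roth-Erev-style) reinforcement as a stand-in for the deep reinforcement learning agents the chapter discusses; all names and parameter values here are illustrative choices, not the chapter's actual implementation.

```python
import random

random.seed(0)
N = 3  # number of states and of available symbols

# Preference weights: sender[state][symbol], receiver[symbol][state].
# Uniform initial weights mean both agents start by signaling at random.
sender = [[1.0] * N for _ in range(N)]
receiver = [[1.0] * N for _ in range(N)]

def sample(weights):
    """Sample an index with probability proportional to its weight."""
    r = random.random() * sum(weights)
    for i, w in enumerate(weights):
        r -= w
        if r <= 0:
            return i
    return len(weights) - 1

def play_round():
    state = random.randrange(N)          # nature picks a state
    symbol = sample(sender[state])       # sender encodes it as a symbol
    guess = sample(receiver[symbol])     # receiver decodes the symbol
    reward = 1.0 if guess == state else 0.0
    # Reinforce the sampled choices on success (Roth-Erev update).
    sender[state][symbol] += reward
    receiver[symbol][guess] += reward
    return reward

# Let the agents interact, then measure communication accuracy.
for _ in range(5000):
    play_round()

accuracy = sum(play_round() for _ in range(1000)) / 1000
print(f"post-training accuracy: {accuracy:.2f}")  # typically well above chance (1/3)
```

Under this kind of reward-driven update, a shared state-to-symbol convention usually self-organizes even though neither agent is told the mapping; deep learning versions of the game replace the preference tables with neural networks trained by policy gradients, which lets the same dynamic scale to large state spaces and pixel inputs.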
Authors: Mathieu Rita, Paul Michel, Rahma Chaabouni, Olivier Pietquin, Emmanuel Dupoux, Florian Strub