Experimental Pragmatics with Machines: Testing LLM Predictions for the Inferences of Plain and Embedded Disjunctions (2405.05776v1)
Abstract: Human communication is based on a variety of inferences that we draw from sentences, often going beyond what is literally said. While there is wide agreement on the basic distinction between entailment, implicature, and presupposition, the status of many inferences remains controversial. In this paper, we focus on three inferences of plain and embedded disjunctions, and compare them with regular scalar implicatures. We investigate this comparison from the novel perspective of the predictions of state-of-the-art LLMs, using the same experimental paradigms as recent studies investigating the same inferences with humans. The results of our best-performing models mostly align with those of humans, both in the large differences we find between those inferences and implicatures and in the fine-grained distinctions among different aspects of those inferences.