SLANG: New Concept Comprehension of Large Language Models (2401.12585v6)

Published 23 Jan 2024 in cs.CL

Abstract: The dynamic nature of language, particularly evident in the realm of slang and memes on the Internet, poses serious challenges to the adaptability of LLMs. Traditionally anchored to static datasets, these models often struggle to keep up with the rapid linguistic evolution characteristic of online communities. This research aims to bridge this gap by enhancing LLMs' comprehension of evolving new concepts on the Internet, without the high cost of continual retraining. In pursuit of this goal, we introduce $\textbf{SLANG}$, a benchmark designed to autonomously integrate novel data and assess LLMs' ability to comprehend emerging concepts, alongside $\textbf{FOCUS}$, an approach that uses causal inference to enhance LLMs' understanding of new phrases and their colloquial context. Our benchmark and approach rely on real-world instances of linguistic shifts, which serve as contextual beacons for forming more precise and contextually relevant connections between newly emerging expressions and their meanings. The empirical analysis shows that our causal inference-based approach outperforms baseline methods in precision and relevance when comprehending Internet slang and memes.
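The Python sketch below illustrates, at a high level, the kind of evaluation such a benchmark implies: contrast a context-free query about a new slang term with one grounded in a real usage example (a "contextual beacon"), then score each generated definition against a reference gloss. This is a minimal illustration under stated assumptions; the record fields, the example entry, the prompt wording, the query_llm placeholder, and the use of all-mpnet-base-v2 embeddings as a relevance scorer are assumptions, not the paper's actual SLANG schema or the FOCUS causal-inference procedure.

    # Illustrative sketch only: the record fields, the example entry, the prompt wording,
    # and the query_llm() placeholder are assumptions for demonstration, not the paper's
    # actual SLANG benchmark schema or the FOCUS causal-inference method.
    from sentence_transformers import SentenceTransformer, util

    # Embedding model used here as a simple relevance scorer (an assumption,
    # not necessarily the paper's official metric).
    scorer = SentenceTransformer("all-mpnet-base-v2")


    def query_llm(prompt: str) -> str:
        """Placeholder for a call to the LLM under evaluation (e.g. an API client)."""
        raise NotImplementedError


    def relevance(candidate: str, reference: str) -> float:
        """Cosine similarity between a generated definition and the reference gloss."""
        emb = scorer.encode([candidate, reference], convert_to_tensor=True)
        return util.cos_sim(emb[0], emb[1]).item()


    # One hypothetical benchmark record: a new slang term, a real-world usage example
    # (the "contextual beacon"), and a reference definition to score against.
    benchmark = [
        {
            "term": "rizz",
            "usage": "He had no plan at all and still pulled it off, that's pure rizz.",
            "reference": "Charisma, or skill at charming and attracting a romantic partner.",
        },
    ]

    for item in benchmark:
        # Baseline: ask for the meaning of the term with no supporting context.
        baseline = query_llm(f"What does the slang term '{item['term']}' mean?")
        # Context-grounded: supply the usage example so the model can infer the
        # meaning from how the term is actually used.
        grounded = query_llm(
            f"Given this sentence: \"{item['usage']}\"\n"
            f"Infer what the slang term '{item['term']}' means from how it is used."
        )
        print(item["term"],
              relevance(baseline, item["reference"]),
              relevance(grounded, item["reference"]))

Scoring both the context-free and context-grounded definitions against the same reference makes the gain from contextual grounding directly comparable, which mirrors the abstract's claim that contextual beacons yield more precise and relevant interpretations.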
