
MEDs for PETs: Multilingual Euphemism Disambiguation for Potentially Euphemistic Terms (2401.14526v1)

Published 25 Jan 2024 in cs.CL

Abstract: This study investigates the computational processing of euphemisms, a universal linguistic phenomenon, across multiple languages. We train a multilingual transformer model (XLM-RoBERTa) to disambiguate potentially euphemistic terms (PETs) in multilingual and cross-lingual settings. In line with current trends, we demonstrate that zero-shot learning across languages takes place. We also show cases where multilingual models perform better on the task compared to monolingual models by a statistically significant margin, indicating that multilingual data presents additional opportunities for models to learn about cross-lingual, computational properties of euphemisms. In a follow-up analysis, we focus on universal euphemistic "categories" such as death and bodily functions among others. We test to see whether cross-lingual data of the same domain is more important than within-language data of other domains to further understand the nature of the cross-lingual transfer.
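The comparison described in the follow-up analysis can be pictured as two alternative training pools for one held-out test set. The sketch below is only an illustration under stated assumptions, not the authors' code: the `PetExample` record, the `build_transfer_splits` helper, the language codes (`lang_a`, `lang_b`, `lang_c`), and the placeholder sentences are all hypothetical. Only the category names "death" and "bodily functions" come from the abstract.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class PetExample:
    """One sentence containing a potentially euphemistic term (hypothetical schema)."""
    text: str
    lang: str       # placeholder language code, not taken from the paper
    category: str   # euphemistic category, e.g. "death"
    label: int      # 1 = used euphemistically, 0 = used literally


def build_transfer_splits(examples, target_lang, target_category):
    """Split data into a test set and two competing training pools.

    test:                            target language, target category
    cross_lingual_same_category:     other languages, target category
    within_language_other_category:  target language, other categories
    """
    test, cross, within = [], [], []
    for ex in examples:
        same_lang = ex.lang == target_lang
        same_cat = ex.category == target_category
        if same_lang and same_cat:
            test.append(ex)
        elif same_cat:
            cross.append(ex)
        elif same_lang:
            within.append(ex)
        # Examples that differ in both language and category are left out.
    return {
        "test": test,
        "cross_lingual_same_category": cross,
        "within_language_other_category": within,
    }


# Toy records with placeholder text; these are not real corpus sentences.
TOY = [
    PetExample("<sentence 1>", "lang_a", "death", 1),
    PetExample("<sentence 2>", "lang_a", "bodily functions", 0),
    PetExample("<sentence 3>", "lang_b", "death", 1),
    PetExample("<sentence 4>", "lang_b", "bodily functions", 1),
    PetExample("<sentence 5>", "lang_c", "death", 0),
]

splits = build_transfer_splits(TOY, target_lang="lang_a", target_category="death")
for name, subset in splits.items():
    print(name, [ex.text for ex in subset])
```

Under this framing, a classifier would be fine-tuned once on each training pool and scored on the same test set, so any gap in performance reflects which kind of data transfers better to the held-out language and category.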
