
Silent Signals, Loud Impact: LLMs for Word-Sense Disambiguation of Coded Dog Whistles (2406.06840v2)

Published 10 Jun 2024 in cs.CL and cs.LG

Abstract: A dog whistle is a form of coded communication that carries a secondary meaning to specific audiences and is often weaponized for racial and socioeconomic discrimination. Dog whistling historically originated from United States politics, but in recent years has taken root in social media as a means of evading hate speech detection systems and maintaining plausible deniability. In this paper, we present an approach for word-sense disambiguation of dog whistles from standard speech using LLMs, and leverage this technique to create a dataset of 16,550 high-confidence coded examples of dog whistles used in formal and informal communication. Silent Signals is the largest dataset of disambiguated dog whistle usage, created for applications in hate speech detection, neology, and political science. The dataset can be found at https://huggingface.co/datasets/SALT-NLP/silent_signals.

Summary

  • The paper introduces the Silent Signals dataset, the largest collection with 16,550 examples of coded dog whistles for hate speech analysis.
  • It demonstrates an LLM-based methodology using models like GPT-3.5 and GPT-4 to disambiguate complex, covert political language.
  • Experiments reveal that while GPT-4 achieved up to 96.2% precision, challenges remain in consistently detecting and interpreting nuanced dog whistles.

Insights into Coded Language Detection Using LLMs

The paper "Silent Signals, Loud Impact: LLMs for Word-Sense Disambiguation of Coded Dog Whistles" examines how LLMs can detect and interpret coded language in political and social contexts. Dog whistles, a form of covert communication, often serve as a tool for propagating discrimination while maintaining plausible deniability. Their transient, evolving nature, particularly their growing use on social media platforms, poses significant challenges for content moderation and hate speech detection, necessitating robust methodologies for identifying and understanding them.

Main Contributions

  1. Silent Signals Dataset: The research introduces the Silent Signals dataset, comprising 16,550 examples of coded dog whistle usage. This dataset is the largest of its kind, assembled to aid applications in hate speech detection and political discourse analysis. It consists of both formal communications (from Congressional records) and informal communications (from Reddit postings).
  2. LLM-based Methodology: The paper explores the application of advanced LLMs such as GPT-3.5 and GPT-4 for word-sense disambiguation of dog whistles. By experimenting with different models, the researchers aim to discern their efficacy in detecting nuanced meanings that escape traditional hate speech detection mechanisms.
  3. Task Innovation: The research delineates a novel task of dog whistle word-sense disambiguation, enabling a structured approach to differentiate between innocuous and coded language uses, thereby enhancing the effectiveness of hate speech detection systems.
  4. Evaluation and Experiments: Comprehensive evaluations were conducted using synthetic datasets for model training and testing. The experiments showed varying levels of success across models, with GPT-4 the most promising in some settings; the findings also highlighted challenges such as inconsistent performance and lower precision on certain tasks.
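The disambiguation task described above can be sketched as a simple prompt-and-parse loop. The prompt wording, label set, and `call_model` interface below are illustrative assumptions, not the paper's actual prompts or API:

```python
# Minimal sketch of prompt-based dog whistle word-sense disambiguation.
# Prompt text and label vocabulary are assumptions for illustration.
from typing import Callable, Optional

LABELS = ("coded", "innocuous")

def build_prompt(term: str, sentence: str) -> str:
    """Ask whether `term` is used as a coded dog whistle in `sentence`."""
    return (
        f'In the sentence below, is the term "{term}" used as a coded '
        f"dog whistle or in an innocuous sense? Answer with exactly one "
        f"word: coded or innocuous.\n\nSentence: {sentence}"
    )

def parse_label(response: str) -> Optional[str]:
    """Extract a label from free-form model output; None if unrecognized."""
    text = response.strip().lower()
    for label in LABELS:
        if text.startswith(label):
            return label
    return None

def disambiguate(term: str, sentence: str,
                 call_model: Callable[[str], str]) -> Optional[str]:
    """Run one inference; `call_model` wraps whatever LLM backend is used."""
    return parse_label(call_model(build_prompt(term, sentence)))
```

Keeping the model call behind a plain `Callable` makes the pipeline model-agnostic, so GPT-3.5, GPT-4, or any other backend can be swapped in without changing the disambiguation logic.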

Strong Numerical Results and Claims

  • Performance Benchmarks: The research reports detailed metrics from the automatic detection experiments; even the best-performing models achieved F1-scores well short of reliable performance on complex tasks such as dog whistle definition, underscoring the difficulty LLMs still face with subtle language nuances.
  • Disambiguation Precision: Using a simulated ensemble approach, GPT-4 achieved a precision of 96.2% on coded dog whistle instances, indicating a high level of accuracy when consistent predictions were made across multiple inferences.
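The consistency requirement behind that precision figure can be sketched as a unanimity filter: an instance is kept as high-confidence "coded" only when every one of several repeated inferences agrees. Function names and the data layout here are assumptions, not the paper's implementation:

```python
# Sketch of a consistency filter over repeated LLM inferences:
# keep an instance only if all of its predictions agree (unanimity).
from collections import Counter
from typing import Dict, List, Optional

def consistent_label(predictions: List[str]) -> Optional[str]:
    """Return the unanimous label across repeated inferences, else None."""
    if predictions and len(Counter(predictions)) == 1:
        return predictions[0]
    return None

def filter_high_confidence(instances: Dict[str, List[str]]) -> Dict[str, str]:
    """Map instance id -> label, keeping only unanimously labeled instances."""
    kept = {}
    for instance_id, preds in instances.items():
        label = consistent_label(preds)
        if label is not None:
            kept[instance_id] = label
    return kept
```

Filtering on agreement trades recall for precision: disagreeing instances are discarded rather than guessed at, which is consistent with the dataset's framing as "high-confidence" examples.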

Implications and Future Directions

The implications of this paper are far-reaching in both theoretical exploration and practical applications. The Silent Signals dataset forms a crucial foundation for further computational social science research, offering insights into the longitudinal dynamics of coded language use in political contexts. Additionally, this resource proposes a methodological framework that can extend beyond U.S. discourse, potentially applying to coded speech in various linguistic and cultural settings worldwide.

In practical settings, this research could guide the development of more sophisticated automated content moderation systems that attend to implicit language and its potential for discrimination. Given the rapid churn of dog whistle neologisms, future research must continuously update datasets and refine models to keep pace with emerging linguistic trends.

Moreover, the paper reflects on philosophical questions of intent and recognition in language, including whether unconscious use of coded messages counts as 'dog whistling'. This opens an avenue for interdisciplinary dialogue, inviting contributions from linguistics, political science, and philosophy on the ethical and social impacts of such communication.

Conclusion

This paper effectively interrogates the capacities and limitations of using LLMs for the disambiguation of politically and socially charged language. It advances the conversation on hate speech detection, emphasizing the nuances involved in coded language and the pivotal role of comprehensive datasets in informing model training. As LLM technologies continue to evolve, addressing these challenges will be crucial in safeguarding public discourse integrity on digital platforms.