Enabling Contextual Soft Moderation on Social Media through Contrastive Textual Deviation (2407.20910v1)
Abstract: Automated soft moderation systems are unable to ascertain whether a post supports or refutes a false claim, resulting in a large number of contextual false positives. This limits their effectiveness, for example by undermining trust in health experts when warnings are added to their posts, or by resorting to vague warnings instead of granular fact-checks, which desensitizes users. In this paper, we propose to incorporate stance detection into existing automated soft-moderation pipelines, with the goal of ruling out contextual false positives and providing more precise recommendations for social media content that should receive warnings. We develop a textual deviation task called Contrastive Textual Deviation (CTD) and show that it outperforms existing stance detection approaches when applied to soft moderation. We then integrate CTD into Lambretta, the state-of-the-art system for automated soft moderation, showing that our approach can reduce contextual false positives from 20% to 2.1%, providing another important building block towards deploying reliable automated soft moderation tools on social media.
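The idea of placing a stance check between claim matching and warning placement can be illustrated with a minimal sketch. It is not the paper's CTD model or Lambretta's implementation: the `Candidate` type, the `select_for_warning` helper, the label names, and the cue-word `toy_stance` function are all hypothetical stand-ins, and the example posts are invented for illustration. The sketch only shows where a stance classifier would sit so that posts refuting a false claim are not labeled.

```python
from dataclasses import dataclass
from typing import Callable, List

# Stance labels; the names are assumptions made for this sketch.
SUPPORTS, REFUTES = "supports", "refutes"


@dataclass
class Candidate:
    """A post that an upstream matching stage paired with a false claim."""
    post: str
    claim: str


def toy_stance(post: str, claim: str) -> str:
    """Placeholder stance function based on a few cue words.

    This is NOT the paper's CTD model. It stands in for any classifier
    that maps a (post, claim) pair to a stance label.
    """
    text = post.lower()
    refute_cues = ("false", "debunked", "myth", "no evidence", "not true")
    if any(cue in text for cue in refute_cues):
        return REFUTES
    return SUPPORTS


def select_for_warning(
    candidates: List[Candidate],
    stance_fn: Callable[[str, str], str],
) -> List[Candidate]:
    """Keep only candidates whose post supports the matched false claim."""
    return [c for c in candidates if stance_fn(c.post, c.claim) == SUPPORTS]


# Illustrative data: both posts match the claim by topic, but only the
# first one endorses it.
claim = "5G towers spread COVID-19"
candidates = [
    Candidate("5G towers are spreading COVID-19, wake up!", claim),
    Candidate("The idea that 5G spreads COVID-19 is a debunked myth.", claim),
]

warned = select_for_warning(candidates, toy_stance)
for c in warned:
    print("WARN:", c.post)
```

Passing the stance function in as a parameter keeps the gate independent of the classifier, so a trained model could replace `toy_stance` without changing the filtering step.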
Authors: Pujan Paudel, Mohammad Hammas Saeed, Rebecca Auger, Chris Wells, Gianluca Stringhini