Into the crossfire: evaluating the use of a language model to crowdsource gun violence reports (2401.12989v1)

Published 16 Jan 2024 in cs.CL and cs.IR

Abstract: Gun violence is a pressing and growing human rights issue that affects nearly every dimension of the social fabric, from healthcare and education to psychology and the economy. Reliable data on firearm events is paramount to developing more effective public policy and emergency responses. However, the lack of comprehensive databases and the risks of in-person surveys prevent human rights organizations from collecting needed data in most countries. Here, we partner with a Brazilian human rights organization to conduct a systematic evaluation of LLMs to assist with monitoring real-world firearm events from social media data. We propose a fine-tuned BERT-based model trained on Twitter (now X) texts to distinguish gun violence reports from ordinary Portuguese texts. Our model achieves a high AUC score of 0.97. We then incorporate our model into a web application and test it in a live intervention. We study and interview Brazilian analysts who continuously fact-check social media texts to identify new gun violence events. Qualitative assessments show that our solution helped all analysts use their time more efficiently and expanded their search capacities. Quantitative assessments show that the use of our model was associated with more interactions between analysts and online users reporting gun violence. Taken together, our findings suggest that modern Natural Language Processing techniques can help support the work of human rights organizations.
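
The abstract reports an AUC of 0.97 for the classifier that separates gun violence reports from ordinary texts. As a minimal sketch of what that metric measures, the snippet below computes AUC as the probability that a randomly chosen positive example receives a higher score than a randomly chosen negative one. The function name `roc_auc` and the labels and scores are hypothetical illustrations, not data or code from the paper.

```python
from typing import Sequence


def roc_auc(labels: Sequence[int], scores: Sequence[float]) -> float:
    """Return the AUC for binary labels (1 = positive, 0 = negative).

    Computed as the fraction of (positive, negative) pairs in which the
    positive example has the higher score; ties count as half a win.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative example")

    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


# Hypothetical classifier scores (assumed for illustration only):
# 1 = text reports gun violence, 0 = ordinary text.
labels = [1, 1, 1, 0, 0, 0]
scores = [0.95, 0.80, 0.40, 0.60, 0.20, 0.05]

auc = roc_auc(labels, scores)
print(f"AUC = {auc:.3f}")
```

In this toy example, eight of the nine positive/negative pairs are ordered correctly, so the AUC is 8/9. An AUC of 1.0 would mean every report outranks every ordinary text, and 0.5 would be no better than random ordering.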
