Computational Assessment of Hyperpartisanship in News Titles (2301.06270v2)

Published 16 Jan 2023 in cs.CL

Abstract: We first adopt a human-guided machine learning framework, applied in an active learning manner, to develop a new dataset for hyperpartisan news title detection comprising 2,200 manually labeled and 1.8 million machine-labeled titles posted from 2014 to the present by nine representative media organizations across three media bias groups: Left, Central, and Right. A fine-tuned transformer-based LLM achieves an overall accuracy of 0.84 and an F1 score of 0.78 on an external validation set. Next, we conduct a computational analysis to quantify the extent and dynamics of partisanship in news titles. While some aspects are as expected, our study reveals new or nuanced differences between the three media groups. We find that, overall, the Right media tend to use proportionally more hyperpartisan titles. Around the 2016 Presidential Election, the proportion of hyperpartisan titles increased across all media bias groups, with the Left media exhibiting the largest relative increase. Using logistic regression models and Shapley values, we identify three major topics (foreign issues, political systems, and societal issues) that are suggestive of hyperpartisanship in news titles. An analysis of the topic distribution shows that societal issues gradually gain more attention from all media groups. We further apply a lexicon-based language analysis tool to the titles within each topic and quantify the linguistic distance between each pair of the three media groups, uncovering three distinct patterns.
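
The abstract describes two computational steps that can be sketched in code. The first is a minimal sketch of fine-tuning a transformer classifier on labeled titles, assuming a Hugging Face Transformers setup; the model choice (roberta-base), column names, toy data, and hyperparameters are illustrative assumptions, not the authors' exact configuration.

```python
# Minimal sketch of a binary hyperpartisan-title classifier.
# Assumes Hugging Face Transformers; data and settings are illustrative.
import numpy as np
from datasets import Dataset
from sklearn.metrics import accuracy_score, f1_score
from transformers import (AutoModelForSequenceClassification, AutoTokenizer,
                          Trainer, TrainingArguments)

# Toy stand-in for the manually labeled titles (1 = hyperpartisan, 0 = not).
data = Dataset.from_dict({
    "title": ["Example neutral headline", "Example inflammatory headline"],
    "label": [0, 1],
})

tokenizer = AutoTokenizer.from_pretrained("roberta-base")
model = AutoModelForSequenceClassification.from_pretrained(
    "roberta-base", num_labels=2)

def tokenize(batch):
    # News titles are short, so a small max_length keeps training cheap.
    return tokenizer(batch["title"], truncation=True,
                     padding="max_length", max_length=64)

encoded = data.map(tokenize, batched=True)

def compute_metrics(eval_pred):
    logits, labels = eval_pred
    preds = np.argmax(logits, axis=-1)
    return {"accuracy": accuracy_score(labels, preds),
            "f1": f1_score(labels, preds)}

trainer = Trainer(
    model=model,
    args=TrainingArguments(output_dir="hyperpartisan-titles",
                           num_train_epochs=3,
                           per_device_train_batch_size=16),
    train_dataset=encoded,
    eval_dataset=encoded,  # replace with the external validation set
    compute_metrics=compute_metrics,
)
trainer.train()
print(trainer.evaluate())
```

The second sketch illustrates, under similarly hedged assumptions, how per-title topic features could be fed into a logistic regression and ranked by Shapley values; the feature names and synthetic data are hypothetical stand-ins for the paper's topic signals.

```python
# Hedged sketch of the topic-importance step: logistic regression + SHAP.
import numpy as np
import shap
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(0)
X = rng.random((200, 3))                      # stand-in topic scores per title
y = (X[:, 0] + X[:, 2] > 1).astype(int)       # stand-in hyperpartisan labels
feature_names = ["foreign_issues", "political_systems", "societal_issues"]

clf = LogisticRegression().fit(X, y)
explainer = shap.LinearExplainer(clf, X)      # X serves as the background data
shap_values = explainer.shap_values(X)

# Mean absolute Shapley value per topic as a rough importance ranking.
print(dict(zip(feature_names, np.abs(shap_values).mean(axis=0))))
```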
