Bias of AI-Generated Content: An Examination of News Produced by Large Language Models (2309.09825v3)

Published 18 Sep 2023 in cs.AI

Abstract: LLMs have the potential to transform our lives and work through the content they generate, known as AI-Generated Content (AIGC). To harness this transformation, we need to understand the limitations of LLMs. Here, we investigate the bias of AIGC produced by seven representative LLMs, including ChatGPT and LLaMA. We collect news articles from The New York Times and Reuters, both known for their dedication to providing unbiased news. We then apply each examined LLM to generate news content with the headlines of these news articles as prompts, and evaluate the gender and racial biases of the AIGC produced by the LLM by comparing the AIGC with the original news articles. We further analyze the gender bias of each LLM under biased prompts by adding gender-biased messages to prompts constructed from these news headlines. Our study reveals that the AIGC produced by each examined LLM demonstrates substantial gender and racial biases. Moreover, the AIGC generated by each LLM exhibits notable discrimination against females and individuals of the Black race. Among the LLMs, the AIGC generated by ChatGPT demonstrates the lowest level of bias, and ChatGPT is the sole model capable of declining content generation when provided with biased prompts.
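The evaluation pipeline the abstract describes, prompting each LLM with a news headline and then comparing the generated article against the original, can be illustrated with a minimal sketch. The gendered-term lexicons, the per-1,000-token rate, and the `generate` call below are assumptions made for exposition only; the paper's actual bias metrics are not reproduced here.

```python
# Minimal sketch of headline-prompted generation followed by a simple
# gender-term comparison between the AIGC and the original article.
# The lexicons, prompt, and `generate` function are illustrative assumptions,
# not the paper's exact methodology.
from collections import Counter
import re

FEMALE_TERMS = {"she", "her", "hers", "woman", "women", "female"}
MALE_TERMS = {"he", "him", "his", "man", "men", "male"}


def gender_term_rates(text: str) -> tuple[float, float]:
    """Return (female, male) gendered-term frequencies per 1,000 tokens."""
    tokens = re.findall(r"[a-z']+", text.lower())
    counts = Counter(tokens)
    n = max(len(tokens), 1)
    female = sum(counts[t] for t in FEMALE_TERMS) / n * 1000
    male = sum(counts[t] for t in MALE_TERMS) / n * 1000
    return female, male


def bias_gap(generated: str, original: str) -> float:
    """Difference in (female - male) term rates between the AIGC and the source
    article; a negative value means the generated text mentions women relatively
    less often than the original does."""
    gen_f, gen_m = gender_term_rates(generated)
    orig_f, orig_m = gender_term_rates(original)
    return (gen_f - gen_m) - (orig_f - orig_m)


# Usage: `generate(headline)` stands in for any LLM completion call.
# generated_article = generate("Headline drawn from the news corpus")
# print(bias_gap(generated_article, original_article))
```

A full evaluation along the paper's lines would aggregate such per-article comparisons across the corpus and across demographic dimensions rather than relying on a single lexicon-based gap.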

Authors (6)
  1. Xiao Fang (90 papers)
  2. Shangkun Che (2 papers)
  3. Minjia Mao (4 papers)
  4. Hongzhe Zhang (5 papers)
  5. Ming Zhao (106 papers)
  6. Xiaohang Zhao (5 papers)
Citations (44)