
$Q_{bias}$ -- A Dataset on Media Bias in Search Queries and Query Suggestions (2311.17780v1)

Published 29 Nov 2023 in cs.IR

Abstract: This publication describes the motivation for and generation of $Q_{bias}$, a large dataset of Google and Bing search queries, together with a scraping tool and dataset for biased news articles and LLMs for investigating bias in online search. Web search engines are a major factor and trusted source in information search, especially in the political domain. However, biased information can influence opinion formation and lead to biased opinions. To interact with search engines, users formulate search queries and interact with the query suggestions the engines provide. A lack of datasets on search queries inhibits research on the subject. We use $Q_{bias}$ to evaluate different approaches to fine-tuning transformer-based LLMs, with the goal of producing models capable of biasing text toward a left or right political stance. In addition to this work, we provide datasets and LLMs for biasing texts that allow further research on bias in online information search.

Authors (2)
  1. Fabian Haak (5 papers)
  2. Philipp Schaer (63 papers)
Citations (1)
