
A comparison of online search engine autocompletion in Google and Baidu (2405.01917v1)

Published 3 May 2024 in cs.CY

Abstract: Warning: This paper contains content that may be offensive or upsetting. Online search engine auto-completions make it faster for users to search and access information. However, they also have the potential to reinforce and promote stereotypes and negative opinions about a variety of social groups. We study the characteristics of search auto-completions in two different linguistic and cultural contexts: Baidu and Google. We find differences between the two search engines in the way they suppress or modify original queries, and we highlight a concerning presence of negative suggestions across all social groups. Our study highlights the need for more refined, culturally sensitive moderation strategies in current language technologies.

Authors (4)
  1. Geng Liu (16 papers)
  2. Pietro Pinoli (1 paper)
  3. Stefano Ceri (17 papers)
  4. Francesco Pierri (44 papers)
Citations (2)