Designing and Testing a Mobile Application for Collecting WhatsApp Chat Data While Preserving Privacy (2401.15221v1)

Published 26 Jan 2024 in cs.HC

Abstract: It is common practice for researchers to join public WhatsApp chats and scrape their contents for analysis. However, research shows that collecting data this way contradicts user expectations and preferences, even if the data is effectively public. To overcome these issues, we outline design considerations for collecting WhatsApp chat data with improved user privacy: heightening user control and oversight of data collection, and minimizing the data that researchers collect and process off a user's device. We refer to these design principles as User-Centered Data Sharing (UCDS). We evaluated our UCDS principles in two ways. First, we implemented a mobile application representing one possible instance of these improved data collection techniques and evaluated the viability of using the app to collect WhatsApp chat data. Second, we surveyed WhatsApp users to gather their perceptions of common existing WhatsApp data collection methods as well as UCDS methods. Our results show that, using UCDS principles in our prototype app, we were able to glean insights into WhatsApp chats that were similarly informative to those from common, less privacy-preserving methods. Our survey showed that users prefer methods following the UCDS principles because these methods offer more control over the data collection process. Future user studies could expand upon the UCDS principles to overcome complications of researcher-to-group communication in research on WhatsApp chats and evaluate these principles in other data sharing contexts.
