
Cybersecurity Misinformation Detection on Social Media: Case Studies on Phishing Reports and Zoom's Threats (2110.12296v3)

Published 23 Oct 2021 in cs.CY, cs.CR, and cs.SI

Abstract: Prior work has extensively studied misinformation related to news, politics, and health; however, misinformation can also concern technological topics. While less controversial, such misinformation can severely harm companies' reputations and revenues, as well as users' online experiences. Recently, social media has also been increasingly used as a novel knowledge source for extracting timely and relevant security threats, which are fed into threat intelligence systems to improve their performance. However, because campaigns may spread false security threats, these systems can become vulnerable to poisoning attacks. In this work, we propose novel approaches for detecting misinformation about cybersecurity and privacy threats on social media, focusing on two topics with different types of misinformation: phishing websites and Zoom's security and privacy threats. We developed a framework for detecting inaccurate phishing claims on Twitter. Using this framework, we labeled about 9% of URLs and 22% of phishing reports as misinformation. We also propose another framework for detecting misinformation related to Zoom's security and privacy threats on multiple platforms. Our classifiers achieved more than 98% accuracy. Applying these classifiers to posts from Facebook, Instagram, Reddit, and Twitter, we found that about 18%, 3%, 4%, and 3% of posts, respectively, were misinformation. In addition, we studied the characteristics of misinformation posts, their authors, and their timelines, which helped us identify campaigns.

Authors (4)
  1. Mohit Singhal
  2. Nihal Kumarswamy
  3. Shreyasi Kinhekar
  4. Shirin Nilizadeh
