
Username Squatting on Online Social Networks: A Study on X (2401.09209v2)

Published 17 Jan 2024 in cs.CR and cs.SI

Abstract: Adversaries have been targeting unique identifiers to launch typo-squatting, mobile app squatting and even voice squatting attacks. Anecdotal evidence suggests that online social networks (OSNs) are also plagued with accounts that use similar usernames. This can confuse users and can also be exploited by adversaries. However, to date no study has characterized this problem on OSNs. In this work, we define the username squatting problem and design the first multi-faceted measurement study to characterize it on X. We develop a username generation tool (UsernameCrazy) to help us analyze hundreds of thousands of username variants derived from celebrity accounts. Our study reveals that thousands of squatted usernames have been suspended by X, while tens of thousands that still exist on the network are likely bots. Of these, a large number share similar profile pictures and profile names with the original account, signalling impersonation attempts. We found that squatted accounts are mentioned by mistake in tweets hundreds of thousands of times and are even prioritized in searches by the network's search recommendation algorithm, exacerbating the negative impact squatted accounts can have on OSNs. We use our insights and take the first step to address this issue by designing a framework (SQUAD) that combines UsernameCrazy with a new classifier to efficiently detect suspicious squatted accounts. Our evaluation of SQUAD's prototype implementation shows that it can achieve a 94% F1-score when trained on a small dataset.
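
To make the idea of "username variants" concrete, the following is a minimal sketch of how typo-style variants of a target username could be enumerated. It is not the authors' UsernameCrazy tool, whose generation rules are not described in the abstract; the function name `username_variants`, the five edit operations chosen (omission, repetition, substitution, transposition, insertion), and the assumed constraints (letters, digits and underscore only, at most 15 characters) are illustrative assumptions.

```python
from __future__ import annotations

import string

# Assumed constraints, for illustration only: lowercase letters, digits and
# underscore, with a maximum length of 15 characters.
ALPHABET = string.ascii_lowercase + string.digits + "_"
MAX_LEN = 15


def username_variants(name: str) -> set[str]:
    """Enumerate single-edit variants of `name` (hypothetical helper)."""
    name = name.lower()
    variants: set[str] = set()

    for i in range(len(name)):
        # Omission: drop the character at position i.
        variants.add(name[:i] + name[i + 1:])
        # Repetition: double the character at position i.
        variants.add(name[:i + 1] + name[i] + name[i + 1:])
        # Substitution: replace the character at position i.
        for c in ALPHABET:
            variants.add(name[:i] + c + name[i + 1:])

    # Transposition: swap each pair of adjacent characters.
    for i in range(len(name) - 1):
        variants.add(name[:i] + name[i + 1] + name[i] + name[i + 2:])

    # Insertion: add one character at every possible position.
    for i in range(len(name) + 1):
        for c in ALPHABET:
            variants.add(name[:i] + c + name[i:])

    # The original name is not a variant of itself.
    variants.discard(name)
    return {v for v in variants if 1 <= len(v) <= MAX_LEN}


if __name__ == "__main__":
    candidates = username_variants("nasa")
    print(len(candidates), sorted(candidates)[:5])
```

In a measurement pipeline of the kind the abstract describes, each candidate would then be checked against the network to see whether an account with that username exists; that lookup step is omitted here because it depends on API access not covered in this text.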
