
PagPassGPT: Pattern Guided Password Guessing via Generative Pretrained Transformer (2404.04886v2)

Published 7 Apr 2024 in cs.CR and cs.AI

Abstract: Amid the surge in deep learning-based password guessing models, the challenges of generating high-quality passwords and reducing duplicate guesses persist. To address these challenges, we present PagPassGPT, a password guessing model built on the Generative Pretrained Transformer (GPT). It performs pattern-guided guessing by incorporating pattern structure information as background knowledge, resulting in a significant increase in the hit rate. Furthermore, we propose D&C-GEN, which adopts a divide-and-conquer approach to reduce the repeat rate of generated passwords: the primary task of guessing passwords is recursively divided into non-overlapping subtasks, and each subtask inherits the knowledge of its parent task and predicts the succeeding tokens. Compared with the state-of-the-art model, our scheme correctly guesses 12% more passwords while producing 25% fewer duplicates.
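
The abstract describes D&C-GEN only at a high level; the toy Python sketch below illustrates the divide-and-conquer idea it outlines: a guessing budget is recursively split across non-overlapping prefixes, each subtask inheriting its parent's prefix, so distinct subtasks cannot emit duplicate guesses. The model interface `next_token_probs`, the uniform toy distribution, and the budget/threshold splitting rule are illustrative assumptions, not the paper's actual implementation (which builds on GPT with pattern conditioning).

```python
# Minimal sketch of a divide-and-conquer guess generator, assuming a hypothetical
# next_token_probs() stand-in for the model's conditional distribution
# P(token | pattern, prefix). Not the paper's D&C-GEN implementation.
from typing import Dict, List

VOCAB = list("abc123")   # toy vocabulary (assumption)
END = "<eos>"            # end-of-password marker (assumption)
MAX_LEN = 6

def next_token_probs(prefix: str) -> Dict[str, float]:
    """Toy uniform distribution, purely to make the sketch runnable."""
    options = VOCAB + ([END] if prefix else [])
    p = 1.0 / len(options)
    return {tok: p for tok in options}

def dc_gen(prefix: str, budget: int, threshold: int, out: List[str]) -> None:
    """Recursively divide the guessing budget among non-overlapping prefixes."""
    if budget <= 0 or len(prefix) >= MAX_LEN:
        if prefix:
            out.append(prefix)
        return
    probs = next_token_probs(prefix)
    if budget <= threshold:
        # Subtask small enough: emit the most likely completions directly (greedy here).
        best = sorted(probs, key=probs.get, reverse=True)
        for tok in best[:budget]:
            out.append(prefix if tok == END else prefix + tok)
        return
    # Otherwise split: each child fixes one more token, inheriting the parent prefix,
    # and receives a budget share proportional to its predicted probability
    # (rounding may leave a small part of the budget unused in this sketch).
    for tok, p in probs.items():
        share = int(round(budget * p))
        if share == 0:
            continue
        if tok == END:
            out.append(prefix)        # complete guess; no further splitting
        else:
            dc_gen(prefix + tok, share, threshold, out)

if __name__ == "__main__":
    guesses: List[str] = []
    dc_gen(prefix="", budget=50, threshold=5, out=guesses)
    print(len(guesses), guesses[:10])
```

Because every leaf of the recursion works under a distinct fixed prefix, no two subtasks can produce the same string, which is the mechanism the abstract credits for the lower repeat rate.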

