RAT: Reinforcement-Learning-Driven and Adaptive Testing for Vulnerability Discovery in Web Application Firewalls (2312.07885v1)
Abstract: Because web attacks are growing more sophisticated, Web Application Firewalls (WAFs) must be tested and updated regularly to withstand them. In practice, brute-force testing to discover vulnerabilities is infeasible because of the wide variety of attack patterns. Various black-box testing techniques have therefore been proposed in the literature, but they suffer from low efficiency. This paper presents Reinforcement-Learning-Driven and Adaptive Testing (RAT), an automated black-box testing strategy to discover injection vulnerabilities in WAFs. In particular, we focus on SQL injection and Cross-Site Scripting (XSS), which have been among the top ten vulnerabilities over the past decade. More specifically, RAT first clusters similar attack samples together. It then combines a reinforcement learning technique with a novel adaptive search algorithm to discover almost all bypassing attack patterns efficiently. We compare RAT with three state-of-the-art methods with respect to their objectives. The experiments show that, when testing well-configured WAFs, RAT outperforms its counterparts on average by 33.53% in the number of bypassing payloads discovered and by 63.16% in reducing the number of attempts needed to find the first bypassing payload.
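The cluster-then-learn idea in the abstract can be illustrated with a minimal sketch. It is not RAT's actual algorithm: it swaps in a plain epsilon-greedy bandit over payload clusters, and every name in it (`toy_waf_blocks`, `bandit_test`, the cluster labels, the signature list) is a hypothetical stand-in chosen for illustration. The clusters are assumed to be given; the clustering step itself is not shown.

```python
import random


def toy_waf_blocks(payload: str) -> bool:
    """Hypothetical stand-in for a WAF: blocks two fixed signatures."""
    signatures = ("union select", "<script>")
    return any(sig in payload.lower() for sig in signatures)


def bandit_test(clusters, waf_blocks, budget, epsilon=0.2, seed=0):
    """Epsilon-greedy search over clusters (an assumed simplification).

    Each cluster is one bandit arm. The reward is 1.0 when a payload
    bypasses the WAF and 0.0 when it is blocked.
    """
    rng = random.Random(seed)
    remaining = {cid: list(payloads) for cid, payloads in clusters.items()}
    pulls = {cid: 0 for cid in clusters}
    # Optimistic initial estimates so every cluster is tried at least once.
    value = {cid: 1.0 for cid in clusters}
    bypasses = []

    for _ in range(budget):
        live = [cid for cid, payloads in remaining.items() if payloads]
        if not live:
            break  # every payload has been tested
        if rng.random() < epsilon:
            cid = rng.choice(live)                    # explore
        else:
            cid = max(live, key=lambda c: value[c])   # exploit
        # Draw an untested payload from the chosen cluster.
        payload = remaining[cid].pop(rng.randrange(len(remaining[cid])))
        reward = 0.0 if waf_blocks(payload) else 1.0
        pulls[cid] += 1
        # Incremental mean of the rewards observed for this cluster.
        value[cid] += (reward - value[cid]) / pulls[cid]
        if reward:
            bypasses.append(payload)
    return bypasses, value


# Hypothetical, pre-clustered toy payloads.
clusters = {
    "sqli_plain": ["1 UNION SELECT a", "2 union select b"],
    "sqli_obfuscated": ["1 UNION/**/SELECT a", "1 UnIoN%0aSeLeCt b"],
    "xss_plain": ["<script>alert(1)</script>"],
    "xss_event": ["<img src=x onerror=alert(1)>"],
}
bypasses, value = bandit_test(clusters, toy_waf_blocks, budget=6)
```

Under these assumptions, clusters whose samples keep bypassing the toy filter retain a high estimated value and are sampled first, while clusters that are consistently blocked drop to zero and are only revisited during exploration.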