6Rover: Leveraging Reinforcement Learning-based Address Pattern Mining Approach for Discovering Active Targets in IPv6 Unseeded Space (2401.07081v1)
Abstract: Discovering active IPv6 addresses is a pivotal challenge in IPv6 network surveys, as it is a prerequisite for downstream tasks such as network topology measurement and security analysis. With the rapid spread of IPv6 networks in recent years, many researchers have focused on improving the hit rate, efficiency, and coverage of IPv6 scanning methods, resulting in considerable advances. However, existing approaches remain heavily dependent on seed addresses, which limits their effectiveness in unseeded prefixes. This paper therefore proposes 6Rover, a reinforcement learning-based model for active address discovery in unseeded environments. To overcome the reliance on seed addresses, 6Rover constructs more general patterns that reflect the actual address allocation strategies of network administrators, thereby avoiding biased transfer of patterns from seeded to unseeded prefixes. 6Rover then employs a multi-armed bandit model to optimize the allocation of probing resources when applying patterns to unseeded spaces. It models the challenge of discovering optimal patterns in unseeded spaces as an exploration-exploitation dilemma and progressively uncovers the patterns used in those spaces, leading to efficient discovery of active addresses without seed addresses as prior knowledge. Experiments on large-scale unseeded datasets show that 6Rover achieves a higher hit rate than existing methods in the absence of any seed addresses. In real network environments, 6Rover achieved a 5% to 8% hit rate in seedless spaces at a budget scale of 100 million, an improvement of approximately 200% over existing state-of-the-art methods.
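The exploration-exploitation framing in the abstract can be illustrated with a small bandit sketch. The abstract does not say which bandit algorithm 6Rover uses, so this example assumes the standard UCB1 rule. The function name `ucb1_allocate`, the simulated probe, and the per-pattern hit probabilities are hypothetical and stand in for a real scanner.

```python
import math
import random


def ucb1_allocate(hit_probs, budget, seed=0):
    """Allocate `budget` probes across address patterns (arms) with UCB1.

    hit_probs: hypothetical per-pattern hit probabilities, used only to
               simulate probe responses; a real scanner would send packets.
    Returns (probes_per_pattern, hits_per_pattern).
    """
    rng = random.Random(seed)
    n = len(hit_probs)
    probes = [0] * n  # probes spent on each pattern so far
    hits = [0] * n    # responsive addresses found by each pattern

    def probe(arm):
        # Simulated probe: 1 if the generated address responds, else 0.
        return 1 if rng.random() < hit_probs[arm] else 0

    def ucb_score(arm, t):
        # Observed hit rate (exploitation) plus a confidence bonus
        # that shrinks as the pattern is probed more (exploration).
        mean = hits[arm] / probes[arm]
        bonus = math.sqrt(2.0 * math.log(t) / probes[arm])
        return mean + bonus

    for t in range(1, budget + 1):
        if t <= n:
            arm = t - 1  # try every pattern once before scoring
        else:
            arm = max(range(n), key=lambda a: ucb_score(a, t))
        hits[arm] += probe(arm)
        probes[arm] += 1

    return probes, hits


if __name__ == "__main__":
    probes, hits = ucb1_allocate([0.01, 0.05, 0.30], budget=20000)
    for i, (p, h) in enumerate(zip(probes, hits)):
        print(f"pattern {i}: probes={p} hits={h}")
```

In this sketch, the share of the budget spent on a pattern grows with its observed hit rate, while the confidence bonus ensures that rarely probed patterns are still revisited occasionally.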