KiNETGAN: Enabling Distributed Network Intrusion Detection through Knowledge-Infused Synthetic Data Generation (2405.16476v1)
Abstract: In IoT and cyber-physical systems (CPS) connected over mobile networks, traditional intrusion detection methods analyze network traffic across multiple devices using anomaly detection techniques to flag potential security threats. However, these methods face significant privacy challenges, particularly with deep packet inspection and network communication analysis. This type of monitoring is highly intrusive, as it involves examining the content of data packets, which can include personal and sensitive information. Such data scrutiny is often governed by stringent laws and regulations, especially in environments like smart homes where data privacy is paramount. Synthetic data offers a promising solution by mimicking real network behavior without revealing sensitive details. Generative models such as Generative Adversarial Networks (GANs) can produce synthetic data, but they often struggle to generate realistic data in specialized domains like network activity. This limitation stems from insufficient training data, which impedes the model's ability to learn the domain's rules and constraints adequately. Moreover, the scarcity of training data exacerbates the problem of class imbalance in intrusion detection methods. To address these challenges, we propose a privacy-driven framework that uses a knowledge-infused Generative Adversarial Network to generate synthetic network activity data (KiNETGAN). This approach enhances the resilience of distributed intrusion detection while addressing privacy concerns. Our knowledge-guided GAN produces realistic representations of network activity, validated through rigorous experimentation. We demonstrate that KiNETGAN incurs minimal accuracy loss in downstream tasks, effectively balancing data privacy and utility.
- Anantaa Kotal
- Brandon Luton
- Anupam Joshi