A Neural Network Architecture Combining Gated Recurrent Unit (GRU) and Support Vector Machine (SVM) for Intrusion Detection in Network Traffic Data (1709.03082v8)

Published 10 Sep 2017 in cs.NE, cs.CR, cs.LG, and stat.ML

Abstract: Gated Recurrent Unit (GRU) is a recently-developed variation of the long short-term memory (LSTM) unit, both of which are types of recurrent neural network (RNN). Through empirical evidence, both models have been proven to be effective in a wide variety of machine learning tasks such as natural language processing (Wen et al., 2015), speech recognition (Chorowski et al., 2015), and text classification (Yang et al., 2016). Conventionally, like most neural networks, both of the aforementioned RNN variants employ the Softmax function as their final output layer for prediction, and the cross-entropy function for computing their loss. In this paper, we present an amendment to this norm by introducing linear support vector machine (SVM) as the replacement for Softmax in the final output layer of a GRU model. Furthermore, the cross-entropy function shall be replaced with a margin-based function. While there have been similar studies (Alalshekmubarak & Smith, 2013; Tang, 2013), this proposal is primarily intended for binary classification on intrusion detection using the 2013 network traffic data from the honeypot systems of Kyoto University. Results show that the GRU-SVM model performs relatively higher than the conventional GRU-Softmax model. The proposed model reached a training accuracy of ~81.54% and a testing accuracy of ~84.15%, while the latter was able to reach a training accuracy of ~63.07% and a testing accuracy of ~70.75%. In addition, the juxtaposition of these two final output layers indicates that the SVM would outperform Softmax in prediction time, a theoretical implication which was supported by the actual training and testing time in the study.

Summary

The paper introduces a GRU-SVM model that replaces the conventional Softmax layer with an SVM to improve binary classification in intrusion detection.
The model achieved approximately 81.54% training and 84.15% testing accuracy, outperforming traditional GRU-Softmax architectures.
The study standardizes and bins the data to reduce computational cost, trains with the Adam algorithm, and reports shorter training and testing runtimes for GRU-SVM than for GRU-Softmax.

GRU-SVM: A Hybrid Approach for Intrusion Detection in Network Traffic Data

The paper by Abien Fred M. Agarap presents a neural network architecture that integrates a Gated Recurrent Unit (GRU) with a Support Vector Machine (SVM) for intrusion detection in network traffic data. It departs from the conventional use of the Softmax function as the final layer in recurrent neural networks (RNNs), instead using the margin-based objective function of SVMs. The paper uses the 2013 dataset from the honeypot systems of Kyoto University to provide empirical evidence for the proposed model's effectiveness.

Motivation and Methodology

Intrusion detection systems (IDS) are critical in identifying unauthorized network access, a significant contributor to global cybercrime. Manual analysis of user activity data is labor-intensive because of the data volume, which motivates automated detection through machine learning. Prior work has suggested that combining an artificial neural network (ANN) with an SVM can improve time-series classification. Building on this premise, the paper proposes a GRU-SVM model tailored for binary classification in intrusion detection.

Google TensorFlow was employed to implement the neural network models, with the Kyoto University dataset serving as the experimental data source. A subset, specifically $\approx$ 25% of the original 16.2 GB dataset, was preprocessed via standardization and binning techniques to improve computational efficiency and classification performance.
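The two preprocessing steps named above can be sketched as follows. This is a minimal illustration rather than the paper's pipeline: the function names, the sample `durations` values, the choice of quantile-based bins, and the bin count are all assumptions made for the example.

```python
import statistics


def standardize(values):
    """Rescale values to zero mean and unit variance (z-scores)."""
    mean = statistics.fmean(values)
    std = statistics.pstdev(values)
    if std == 0:
        # A constant feature carries no information; map it to zeros.
        return [0.0 for _ in values]
    return [(v - mean) / std for v in values]


def quantile_bin(values, n_bins=10):
    """Replace each value with the index of the quantile bin it falls in."""
    # statistics.quantiles returns the n_bins - 1 cut points between bins.
    edges = statistics.quantiles(values, n=n_bins, method="inclusive")
    # The bin index is the number of cut points strictly below the value.
    return [sum(v > e for e in edges) for v in values]


# Hypothetical continuous feature (e.g. a connection duration in seconds).
durations = [0.5, 1.2, 3.3, 0.1, 9.8, 2.4, 4.7, 0.9, 6.0, 1.5]
z = standardize(durations)
bins = quantile_bin(z, n_bins=5)
```

Binning turns each continuous feature into a small set of integer indices, which is one way such a step can lower the cost of later computation.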

The proposed GRU-SVM model uses a GRU layer to process sequential data, followed by a linear SVM classifier in place of Softmax. The model is trained with the Adam algorithm to minimize the L2-SVM loss. Because this squared hinge loss is differentiable, its gradient can be propagated back through the GRU during training; at prediction time, the class label comes from the SVM decision function.
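A minimal sketch of the output layer described above, in plain Python rather than TensorFlow. It assumes the common L2-SVM form (an L2 weight penalty plus a squared hinge term scaled by a constant `c`) with labels in {-1, +1}. The 1/2 penalty factor, the function names, and the toy hidden states standing in for GRU outputs are all assumptions made for illustration, not details taken from the paper.

```python
def decision_score(hidden, weights, bias):
    """Linear SVM score w . h + b for one GRU output vector h."""
    return sum(w * h for w, h in zip(weights, hidden)) + bias


def l2_svm_loss(scores, labels, weights, c=1.0):
    """L2 weight penalty plus squared hinge loss; labels are -1 or +1."""
    penalty = 0.5 * sum(w * w for w in weights)
    hinge = sum(max(0.0, 1.0 - y * s) ** 2 for s, y in zip(scores, labels))
    return penalty + c * hinge


def loss_grad_wrt_scores(scores, labels, c=1.0):
    """Derivative of the squared hinge term with respect to each score."""
    return [-2.0 * c * y * max(0.0, 1.0 - y * s) for s, y in zip(scores, labels)]


def predict(score):
    """Decision function: the sign of the score gives the class."""
    return 1 if score >= 0 else -1


# Toy stand-ins for the GRU's final hidden states (hypothetical values).
hidden_states = [[0.8, -0.2], [-0.5, 0.4], [0.1, 0.1]]
labels = [1, -1, 1]
weights, bias = [1.0, -1.0], 0.0

scores = [decision_score(h, weights, bias) for h in hidden_states]
loss = l2_svm_loss(scores, labels, weights)
grads = loss_grad_wrt_scores(scores, labels)
preds = [predict(s) for s in scores]
```

Samples that already clear the margin (here the first one) contribute zero gradient, so only samples inside the margin or on the wrong side of it drive the weight updates. In a full model the gradient with respect to each score would be passed back into the GRU by the framework's automatic differentiation.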

Results and Analysis

The GRU-SVM model demonstrated superior performance compared to the conventional GRU-Softmax model. In training, the GRU-SVM achieved an accuracy of $\approx$ 81.54%, while testing accuracy reached $\approx$ 84.15%. In contrast, the GRU-Softmax model obtained lower accuracies at both stages, with $\approx$ 63.07% during training and $\approx$ 70.75% during testing. Furthermore, the GRU-SVM model exhibited faster runtime during both training and testing phases.

The paper attributes the proposed model's efficiency to the SVM's suitability for binary classification and its lower computational complexity at prediction time; the measured training and testing times are consistent with that theoretical expectation. Because an SVM classifies by margin rather than by estimating a probability distribution over classes, it offers a practical advantage over Softmax in binary classification.
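The contrast between the two prediction rules can be sketched as below. The function names and input values are hypothetical, and the sketch only illustrates the difference in the work each rule performs; it does not measure runtime or reproduce the paper's comparison.

```python
import math


def softmax_predict(logits):
    """Exponentiate and normalize into class probabilities, then take argmax."""
    shift = max(logits)  # subtract the max for numerical stability
    exps = [math.exp(z - shift) for z in logits]
    total = sum(exps)
    probs = [e / total for e in exps]
    return probs.index(max(probs)), probs


def svm_predict(score):
    """Binary label from the sign of a single margin score."""
    return 1 if score >= 0 else 0


# Hypothetical two-class logits and the equivalent single margin score.
logits = [0.3, 1.7]
label_softmax, probs = softmax_predict(logits)
label_svm = svm_predict(logits[1] - logits[0])
```

For two classes the Softmax argmax matches the sign of the logit difference, so with fixed scores both rules return the same label. The margin-based rule gets there without exponentiation or normalization, and the two layers differ further in the loss each is trained with.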

Discussion and Implications

This research highlights the SVM's predictive efficiency and accuracy in binary classification. The GRU-SVM model achieved higher accuracy than GRU-Softmax and did so at lower computational cost. The findings suggest that the GRU-SVM architecture could be applied to other binary classification tasks beyond intrusion detection.

Despite these results, the work acknowledges the need for further empirical validation across different datasets and binary classification tasks. Exploring how well the GRU-SVM model adapts to multinomial classification could also yield insights into optimizing machine learning models for varied applications. The paper also hypothesizes potential issues with the Softmax function in binary contexts that warrant additional exploration.

Conclusion

Agarap's research adapts GRU networks by replacing the Softmax output layer with an SVM to improve binary classification in intrusion detection. The accuracy and runtime advantages reported for the GRU-SVM model position it as a practical alternative to GRU-Softmax in similar contexts. Future work to validate and extend these findings could benefit AI-driven cybersecurity and other applications that need efficient binary classification.