ExplainableDetector: Exploring Transformer-based Language Modeling Approach for SMS Spam Detection with Explainability Analysis (2405.08026v1)
Abstract: SMS (Short Message Service) is a widely used and cost-effective communication medium that has, unfortunately, become a haven for unwanted messages, commonly known as SMS spam. With the rapid adoption of smartphones and Internet connectivity, SMS spam has become a prevalent threat: spammers have recognized the importance of SMS to mobile phone users, and with the emergence of new cybersecurity threats the volume of SMS spam has grown significantly in recent years. The unstructured format of SMS data poses significant challenges for spam detection, making spam attacks harder to combat in the cybersecurity domain. In this work, we employ optimized, fine-tuned transformer-based large language models (LLMs) to detect spam messages. We use a benchmark SMS spam dataset, apply several preprocessing techniques to obtain clean, noise-free data, and address the class imbalance problem with text augmentation. In our experiments, the optimized, fine-tuned BERT (Bidirectional Encoder Representations from Transformers) variant RoBERTa achieved the highest accuracy, 99.84%. We also apply Explainable Artificial Intelligence (XAI) techniques to compute positive and negative coefficient scores, which explain the fine-tuned model's behavior and improve its transparency in this text-based SMS spam detection task. In addition, we evaluate traditional machine learning (ML) models to compare their performance with the transformer-based models. This analysis shows how LLMs can have a strong impact on complex text-based spam data in the cybersecurity field.
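The positive and negative coefficient scores mentioned above can be illustrated with a minimal, self-contained sketch. This is not the paper's method (the paper explains a fine-tuned RoBERTa model); it is a toy analogue using per-token log-odds over a tiny hand-made spam/ham set, where a positive score pushes a token toward the "spam" class and a negative score toward "ham". All data and function names here are hypothetical, chosen for illustration only.

```python
import math
from collections import Counter

def token_coefficients(messages, labels, smoothing=1.0):
    """Per-token log-odds scores: positive values indicate a token
    associated with spam, negative values a token associated with ham.
    This mimics the sign of coefficient-style XAI explanations."""
    spam_counts, ham_counts = Counter(), Counter()
    for text, label in zip(messages, labels):
        counts = spam_counts if label == "spam" else ham_counts
        counts.update(text.lower().split())
    vocab = set(spam_counts) | set(ham_counts)
    # Laplace smoothing so unseen tokens do not produce log(0)
    spam_total = sum(spam_counts.values()) + smoothing * len(vocab)
    ham_total = sum(ham_counts.values()) + smoothing * len(vocab)
    return {
        tok: math.log((spam_counts[tok] + smoothing) / spam_total)
           - math.log((ham_counts[tok] + smoothing) / ham_total)
        for tok in vocab
    }

# Hypothetical toy data, for illustration only
msgs = ["win a free prize now", "free cash win win",
        "see you at lunch", "meeting moved to noon"]
labels = ["spam", "spam", "ham", "ham"]
coeffs = token_coefficients(msgs, labels)
```

On this toy set, tokens like "win" and "free" receive positive scores while "lunch" receives a negative one, which is the kind of signed, per-feature attribution the abstract refers to; methods such as LIME produce analogous signed weights for a fine-tuned transformer's predictions.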
Authors: Mohammad Amaz Uddin, Muhammad Nazrul Islam, Leandros Maglaras, Helge Janicke, Iqbal H. Sarker