
Evaluating the Performance of ChatGPT for Spam Email Detection (2402.15537v2)

Published 23 Feb 2024 in cs.CL, cs.AI, cs.CY, and cs.LG

Abstract: Email continues to be a pivotal and extensively utilized communication medium within professional and commercial domains. Nonetheless, the prevalence of spam emails poses a significant challenge for users, disrupting their daily routines and diminishing productivity. Consequently, accurately identifying and filtering spam based on content has become crucial for cybersecurity. Recent advancements in natural language processing, particularly LLMs such as ChatGPT, have shown remarkable performance in tasks such as question answering and text generation, yet their potential for spam identification remains underexplored. To fill this gap, this study evaluates ChatGPT's capabilities for spam identification on both English and Chinese email datasets. We employ ChatGPT for spam email detection via in-context learning, which requires only a prompt instruction and a few demonstrations, and we investigate how the number of demonstrations in the prompt affects ChatGPT's performance. For comparison, we also implement five popular benchmark methods: naive Bayes, support vector machines (SVM), logistic regression (LR), feedforward dense neural networks (DNN), and BERT classifiers. Extensive experiments show that ChatGPT performs significantly worse than deep supervised learning methods on the large English dataset, while it achieves superior performance on the low-resource Chinese dataset.
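The in-context learning setup the abstract describes (a task instruction followed by a small number of labeled demonstrations) can be sketched as a prompt builder. This is a minimal illustration, not the authors' exact prompt: the function name, label vocabulary, and template wording are assumptions.

```python
def build_spam_prompt(demonstrations, query_email, k=5):
    """Assemble a k-shot in-context prompt for spam detection.

    demonstrations: list of (email_text, label) pairs, label in {"spam", "ham"}
    query_email:    the email to classify
    k:              number of demonstrations to include (the study varies this)
    """
    parts = ["Classify the following email as 'spam' or 'ham'."]
    for text, label in demonstrations[:k]:
        parts.append(f"Email: {text}\nLabel: {label}")
    # Leave the final label blank for the model to complete.
    parts.append(f"Email: {query_email}\nLabel:")
    return "\n\n".join(parts)
```

In this sketch, the resulting string would be sent to the model as a single message, and the model's one-word completion would be read off as the predicted class; varying `k` corresponds to the paper's experiment on the number of demonstrations.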

Authors (5)
  1. Yuwei Wu (66 papers)
  2. Shijing Si (32 papers)
  3. Yugui Zhang (4 papers)
  4. Jedrek Wosik (2 papers)
  5. Le Tang (2 papers)
Citations (6)