Benchmarking Large Language Models for Log Analysis, Security, and Interpretation (2311.14519v1)

Published 24 Nov 2023 in cs.NI

Abstract: Large Language Models (LLMs) continue to demonstrate their utility in a variety of emergent capabilities in different fields. An area that could benefit from effective language understanding in cybersecurity is the analysis of log files. This work explores LLMs with different architectures (BERT, RoBERTa, DistilRoBERTa, GPT-2, and GPT-Neo) that are benchmarked for their capacity to better analyze application and system log files for security. Specifically, 60 fine-tuned LLMs for log analysis are deployed and benchmarked. The resulting models demonstrate that they can be used to perform log analysis effectively, with fine-tuning being particularly important for appropriate domain adaptation to specific log types. The best-performing fine-tuned sequence classification model (DistilRoBERTa) outperforms the current state of the art, with an average F1-score of 0.998 across six datasets from both web application and system log sources. To achieve this, we propose and implement a new experimentation pipeline (LLM4Sec) which leverages LLMs for log analysis experimentation, evaluation, and analysis.
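The fine-tuning approach the abstract describes maps naturally onto the standard Hugging Face sequence-classification workflow. The sketch below is illustrative only: the sample log lines, binary label scheme, sequence length, and hyperparameters are assumptions chosen for demonstration, not the configuration reported in the paper or in the LLM4Sec pipeline.

```python
# Illustrative sketch: fine-tuning DistilRoBERTa as a binary log-line
# classifier, in the spirit of the paper's best-performing model.
# All data and hyperparameters below are hypothetical placeholders.
from datasets import Dataset
from transformers import (AutoModelForSequenceClassification, AutoTokenizer,
                          Trainer, TrainingArguments)

# Hypothetical labelled log lines: label 0 = benign, 1 = malicious.
train = Dataset.from_dict({
    "text": [
        "GET /index.html HTTP/1.1 200",
        "GET /login.php?user=admin' OR '1'='1 HTTP/1.1 403",
    ],
    "label": [0, 1],
})

tokenizer = AutoTokenizer.from_pretrained("distilroberta-base")
model = AutoModelForSequenceClassification.from_pretrained(
    "distilroberta-base", num_labels=2)

def tokenize(batch):
    # The 128-token cap is an assumption; log lines are typically short.
    return tokenizer(batch["text"], truncation=True, max_length=128,
                     padding="max_length")

train = train.map(tokenize, batched=True)

trainer = Trainer(
    model=model,
    args=TrainingArguments(output_dir="distilroberta-logs",
                           num_train_epochs=3,
                           per_device_train_batch_size=16),
    train_dataset=train,
)
trainer.train()
```

Evaluation in the same spirit would compute F1 on held-out predictions for each log dataset (e.g., with sklearn.metrics.f1_score) and average across the six sources, which is how the abstract's 0.998 figure is reported.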

References (32)
  1. Gallagher, B., Eliassi-Rad, T.: Classification of http attacks: A study on the ecml/pkdd 2007 discovery challenge (2009) Boffa et al. (2022) Boffa, M., Milan, G., Vassio, L., Drago, I., Mellia, M., Ben Houidi, Z.: Towards nlp-based processing of honeypot logs, 314–321 (2022) https://doi.org/10.1109/EuroSPW55150.2022.00038 Karlsen et al. (2023) Karlsen, E., Copstein, R., Luo, X., Schwartzentruber, J., Niblett, B., Johnston, A., Heywood, M.I., Zincir-Heywood, N.: Exploring semantic vs. syntactic features for unsupervised learning on application log files. In: 2023 7th Cyber Security in Networking Conference (CSNet), pp. 1–7 (2023). https://doi.org/(ÎnPress) Nguyen and Franke (2012) Nguyen, H.T., Franke, K.: Adaptive intrusion detection system via online machine learning. In: 2012 12th International Conference on Hybrid Intelligent Systems (HIS), pp. 271–277 (2012). https://doi.org/10.1109/HIS.2012.6421346 Moradi Vartouni et al. (2019) Moradi Vartouni, A., Teshnehlab, M., Sedighian Kashi, S.: Leveraging deep neural networks for anomaly-based web application firewall. IET Information Security 13(4), 352–361 (2019) Bhatnagar et al. (2022) Bhatnagar, M., Rozinaj, G., Yadav, P.K.: Web intrusion classification system using machine learning approaches. In: 2022 International Symposium ELMAR, pp. 57–60 (2022). https://doi.org/10.1109/ELMAR55880.2022.9899790 Farzad and Gulliver (2021) Farzad, A., Gulliver, T.A.: Log Message Anomaly Detection and Classification Using Auto-B/LSTM and Auto-GRU (2021) Tümer Sivri et al. (2022) Tümer Sivri, T., Pervan Akman, N., Berkol, A., Peker, C.: Web intrusion detection using character level machine learning approaches with upsampled data, pp. 269–274 (2022). https://doi.org/10.15439/2022F147 Adhikari and Bal (2023) Adhikari, A., Bal, B.K.: Machine learning technique for intrusion detection in the field of the intrusion detection system. (2023) Copstein et al. (2022) Copstein, R., Karlsen, E., Schwartzentruber, J., Zincir-Heywood, N., Heywood, M.: Exploring syntactical features for anomaly detection in application logs. it - Information Technology 64(1-2), 15–27 (2022) https://doi.org/10.1515/itit-2021-0064 Nam et al. (2022) Nam, S., Yoo, J.-H., Hong, J.W.-K.: Vm failure prediction with log analysis using bert-cnn model, 331–337 (2022) https://doi.org/10.23919/CNSM55787.2022.9965187 Wang et al. (2018) Wang, M., Xu, L., Guo, L.: Anomaly detection of system logs based on natural language processing and deep learning. In: 2018 4th International Conference on Frontiers of Signal Processing (ICFSP), pp. 140–144 (2018). https://doi.org/10.1109/ICFSP.2018.8552075 Qi et al. (2022) Qi, J., Luan, Z., Huang, S., Wang, Y., Fung, C., Yang, H., Qian, D.: Adanomaly: Adaptive anomaly detection for system logs with adversarial learning. In: NOMS 2022-2022 IEEE/IFIP Network Operations and Management Symposium, pp. 1–5. IEEE Press, ??? (2022). https://doi.org/10.1109/NOMS54207.2022.9789917 . https://doi.org/10.1109/NOMS54207.2022.9789917 Seyyar et al. (2022) Seyyar, Y.E., Yavuz, A.G., Ünver, H.M.: Detection of web attacks using the bert model. In: 2022 30th Signal Processing and Communications Applications Conference (SIU), pp. 1–4 (2022). https://doi.org/10.1109/SIU55565.2022.9864721 Guo et al. (2021) Guo, H., Yuan, S., Wu, X.: Logbert: Log anomaly detection via bert, 1–8 (2021). IEEE Shao et al. (2022) Shao, Y., Zhang, W., Liu, P., Huyue, R., Tang, R., Yin, Q., Li, Q.: Log anomaly detection method based on bert model optimization. 
In: 2022 7th International Conference on Cloud Computing and Big Data Analytics (ICCCBDA), pp. 161–166 (2022). https://doi.org/10.1109/ICCCBDA55098.2022.9778900 Guo et al. (2022) Guo, H., Lin, X., Yang, J., Zhuang, Y., Bai, J., Zheng, T., Zhang, B., Li, Z.: TransLog: A Unified Transformer-based Framework for Log Anomaly Detection (2022) Le and Zhang (2021) Le, V.-H., Zhang, H.: Log-based Anomaly Detection Without Log Parsing (2021) Gniewkowski et al. (2023) Gniewkowski, M., Maciejewski, H., Surmacz, T., Walentynowicz, W.: Sec2vec: Anomaly detection in http traffic and malicious urls. In: Proceedings of the 38th ACM/SIGAPP Symposium on Applied Computing. SAC ’23, pp. 1154–1162. Association for Computing Machinery, New York, NY, USA (2023). https://doi.org/10.1145/3555776.3577663 . https://doi.org/10.1145/3555776.3577663 Hilmi et al. (2020) Hilmi, M.A.A., Cahyanto, K.A., Mustamiin, M.: Apache Web Server - Access Log Pre-processing for Web Intrusion Detection. IEEE Dataport (2020). https://doi.org/10.21227/vvvq-6w47 . https://dx.doi.org/10.21227/vvvq-6w47 Giménez et al. (2010) Giménez, C.T., Villegas, A.P., Marañón, G. .: HTTP Dataset CSIC 2010. https://doi.org/10.7910/DVN/3QBYB5 . https://www.tic.itefi.csic.es/dataset/ ECML/PKDD (2021) ECML/PKDD: ECML/PKDD 2007 Discovery Challenge. https://gitlab.fing.edu.uy/gsi/web-application-attacks-datasets/-/tree/master/ecml_pkdd (2021) Oliner and Stearley (2007) Oliner, A.J., Stearley, J.: What supercomputers say: A study of five system logs. 37th Annual IEEE/IFIP International Conference on Dependable Systems and Networks (DSN’07), 575–584 (2007) Vaswani et al. (2017) Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, L., Polosukhin, I.: Attention is all you need. CoRR abs/1706.03762 (2017) 1706.03762 Schuster and Nakajima (2012) Schuster, M., Nakajima, K.: Japanese and korean voice search. In: 2012 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), pp. 5149–5152 (2012). https://doi.org/10.1109/ICASSP.2012.6289079 Liu et al. (2019) Liu, Y., Ott, M., Goyal, N., Du, J., Joshi, M., Chen, D., Levy, O., Lewis, M., Zettlemoyer, L., Stoyanov, V.: Roberta: A robustly optimized BERT pretraining approach. CoRR abs/1907.11692 (2019) 1907.11692 Sanh et al. (2019) Sanh, V., Debut, L., Chaumond, J., Wolf, T.: Distilbert, a distilled version of bert: smaller, faster, cheaper and lighter. ArXiv abs/1910.01108 (2019) Radford et al. (2019) Radford, A., Wu, J., Child, R., Luan, D., Amodei, D., Sutskever, I.: Language models are unsupervised multitask learners (2019) Black et al. (2021) Black, S., Leo, G., Wang, P., Leahy, C., Biderman, S.: GPT-Neo: Large Scale Autoregressive Language Modeling with Mesh-Tensorflow. https://doi.org/10.5281/zenodo.5297715 . If you use this software, please cite it using these metadata. https://doi.org/10.5281/zenodo.5297715 Kokalj et al. (2021) Kokalj, E., Škrlj, B., Lavrač, N., Pollak, S., Robnik-Šikonja, M.: BERT meets shapley: Extending SHAP explanations to transformer-based classifiers. In: Toivonen, H., Boggia, M. (eds.) Proceedings of the EACL Hackashop on News Media Content Analysis and Automated Report Generation, pp. 16–21. Association for Computational Linguistics, Online (2021). https://aclanthology.org/2021.hackashop-1.3 van der Maaten and Hinton (2008) Maaten, L., Hinton, G.: Visualizing data using t-SNE. Journal of Machine Learning Research 9, 2579–2605 (2008) Mohri et al. (2012) Mohri, M., Rostamizadeh, A., Talwalkar, A.: Foundations of Machine Learning. 
The MIT Press, ??? (2012) Boffa, M., Milan, G., Vassio, L., Drago, I., Mellia, M., Ben Houidi, Z.: Towards nlp-based processing of honeypot logs, 314–321 (2022) https://doi.org/10.1109/EuroSPW55150.2022.00038 Karlsen et al. (2023) Karlsen, E., Copstein, R., Luo, X., Schwartzentruber, J., Niblett, B., Johnston, A., Heywood, M.I., Zincir-Heywood, N.: Exploring semantic vs. syntactic features for unsupervised learning on application log files. In: 2023 7th Cyber Security in Networking Conference (CSNet), pp. 1–7 (2023). https://doi.org/(ÎnPress) Nguyen and Franke (2012) Nguyen, H.T., Franke, K.: Adaptive intrusion detection system via online machine learning. In: 2012 12th International Conference on Hybrid Intelligent Systems (HIS), pp. 271–277 (2012). https://doi.org/10.1109/HIS.2012.6421346 Moradi Vartouni et al. (2019) Moradi Vartouni, A., Teshnehlab, M., Sedighian Kashi, S.: Leveraging deep neural networks for anomaly-based web application firewall. IET Information Security 13(4), 352–361 (2019) Bhatnagar et al. (2022) Bhatnagar, M., Rozinaj, G., Yadav, P.K.: Web intrusion classification system using machine learning approaches. In: 2022 International Symposium ELMAR, pp. 57–60 (2022). https://doi.org/10.1109/ELMAR55880.2022.9899790 Farzad and Gulliver (2021) Farzad, A., Gulliver, T.A.: Log Message Anomaly Detection and Classification Using Auto-B/LSTM and Auto-GRU (2021) Tümer Sivri et al. (2022) Tümer Sivri, T., Pervan Akman, N., Berkol, A., Peker, C.: Web intrusion detection using character level machine learning approaches with upsampled data, pp. 269–274 (2022). https://doi.org/10.15439/2022F147 Adhikari and Bal (2023) Adhikari, A., Bal, B.K.: Machine learning technique for intrusion detection in the field of the intrusion detection system. (2023) Copstein et al. (2022) Copstein, R., Karlsen, E., Schwartzentruber, J., Zincir-Heywood, N., Heywood, M.: Exploring syntactical features for anomaly detection in application logs. it - Information Technology 64(1-2), 15–27 (2022) https://doi.org/10.1515/itit-2021-0064 Nam et al. (2022) Nam, S., Yoo, J.-H., Hong, J.W.-K.: Vm failure prediction with log analysis using bert-cnn model, 331–337 (2022) https://doi.org/10.23919/CNSM55787.2022.9965187 Wang et al. (2018) Wang, M., Xu, L., Guo, L.: Anomaly detection of system logs based on natural language processing and deep learning. In: 2018 4th International Conference on Frontiers of Signal Processing (ICFSP), pp. 140–144 (2018). https://doi.org/10.1109/ICFSP.2018.8552075 Qi et al. (2022) Qi, J., Luan, Z., Huang, S., Wang, Y., Fung, C., Yang, H., Qian, D.: Adanomaly: Adaptive anomaly detection for system logs with adversarial learning. In: NOMS 2022-2022 IEEE/IFIP Network Operations and Management Symposium, pp. 1–5. IEEE Press, ??? (2022). https://doi.org/10.1109/NOMS54207.2022.9789917 . https://doi.org/10.1109/NOMS54207.2022.9789917 Seyyar et al. (2022) Seyyar, Y.E., Yavuz, A.G., Ünver, H.M.: Detection of web attacks using the bert model. In: 2022 30th Signal Processing and Communications Applications Conference (SIU), pp. 1–4 (2022). https://doi.org/10.1109/SIU55565.2022.9864721 Guo et al. (2021) Guo, H., Yuan, S., Wu, X.: Logbert: Log anomaly detection via bert, 1–8 (2021). IEEE Shao et al. (2022) Shao, Y., Zhang, W., Liu, P., Huyue, R., Tang, R., Yin, Q., Li, Q.: Log anomaly detection method based on bert model optimization. In: 2022 7th International Conference on Cloud Computing and Big Data Analytics (ICCCBDA), pp. 161–166 (2022). 
https://doi.org/10.1109/ICCCBDA55098.2022.9778900 Guo et al. (2022) Guo, H., Lin, X., Yang, J., Zhuang, Y., Bai, J., Zheng, T., Zhang, B., Li, Z.: TransLog: A Unified Transformer-based Framework for Log Anomaly Detection (2022) Le and Zhang (2021) Le, V.-H., Zhang, H.: Log-based Anomaly Detection Without Log Parsing (2021) Gniewkowski et al. (2023) Gniewkowski, M., Maciejewski, H., Surmacz, T., Walentynowicz, W.: Sec2vec: Anomaly detection in http traffic and malicious urls. In: Proceedings of the 38th ACM/SIGAPP Symposium on Applied Computing. SAC ’23, pp. 1154–1162. Association for Computing Machinery, New York, NY, USA (2023). https://doi.org/10.1145/3555776.3577663 . https://doi.org/10.1145/3555776.3577663 Hilmi et al. (2020) Hilmi, M.A.A., Cahyanto, K.A., Mustamiin, M.: Apache Web Server - Access Log Pre-processing for Web Intrusion Detection. IEEE Dataport (2020). https://doi.org/10.21227/vvvq-6w47 . https://dx.doi.org/10.21227/vvvq-6w47 Giménez et al. (2010) Giménez, C.T., Villegas, A.P., Marañón, G. .: HTTP Dataset CSIC 2010. https://doi.org/10.7910/DVN/3QBYB5 . https://www.tic.itefi.csic.es/dataset/ ECML/PKDD (2021) ECML/PKDD: ECML/PKDD 2007 Discovery Challenge. https://gitlab.fing.edu.uy/gsi/web-application-attacks-datasets/-/tree/master/ecml_pkdd (2021) Oliner and Stearley (2007) Oliner, A.J., Stearley, J.: What supercomputers say: A study of five system logs. 37th Annual IEEE/IFIP International Conference on Dependable Systems and Networks (DSN’07), 575–584 (2007) Vaswani et al. (2017) Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, L., Polosukhin, I.: Attention is all you need. CoRR abs/1706.03762 (2017) 1706.03762 Schuster and Nakajima (2012) Schuster, M., Nakajima, K.: Japanese and korean voice search. In: 2012 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), pp. 5149–5152 (2012). https://doi.org/10.1109/ICASSP.2012.6289079 Liu et al. (2019) Liu, Y., Ott, M., Goyal, N., Du, J., Joshi, M., Chen, D., Levy, O., Lewis, M., Zettlemoyer, L., Stoyanov, V.: Roberta: A robustly optimized BERT pretraining approach. CoRR abs/1907.11692 (2019) 1907.11692 Sanh et al. (2019) Sanh, V., Debut, L., Chaumond, J., Wolf, T.: Distilbert, a distilled version of bert: smaller, faster, cheaper and lighter. ArXiv abs/1910.01108 (2019) Radford et al. (2019) Radford, A., Wu, J., Child, R., Luan, D., Amodei, D., Sutskever, I.: Language models are unsupervised multitask learners (2019) Black et al. (2021) Black, S., Leo, G., Wang, P., Leahy, C., Biderman, S.: GPT-Neo: Large Scale Autoregressive Language Modeling with Mesh-Tensorflow. https://doi.org/10.5281/zenodo.5297715 . If you use this software, please cite it using these metadata. https://doi.org/10.5281/zenodo.5297715 Kokalj et al. (2021) Kokalj, E., Škrlj, B., Lavrač, N., Pollak, S., Robnik-Šikonja, M.: BERT meets shapley: Extending SHAP explanations to transformer-based classifiers. In: Toivonen, H., Boggia, M. (eds.) Proceedings of the EACL Hackashop on News Media Content Analysis and Automated Report Generation, pp. 16–21. Association for Computational Linguistics, Online (2021). https://aclanthology.org/2021.hackashop-1.3 van der Maaten and Hinton (2008) Maaten, L., Hinton, G.: Visualizing data using t-SNE. Journal of Machine Learning Research 9, 2579–2605 (2008) Mohri et al. (2012) Mohri, M., Rostamizadeh, A., Talwalkar, A.: Foundations of Machine Learning. The MIT Press, ??? 
(2012) Karlsen, E., Copstein, R., Luo, X., Schwartzentruber, J., Niblett, B., Johnston, A., Heywood, M.I., Zincir-Heywood, N.: Exploring semantic vs. syntactic features for unsupervised learning on application log files. In: 2023 7th Cyber Security in Networking Conference (CSNet), pp. 1–7 (2023). https://doi.org/(ÎnPress) Nguyen and Franke (2012) Nguyen, H.T., Franke, K.: Adaptive intrusion detection system via online machine learning. In: 2012 12th International Conference on Hybrid Intelligent Systems (HIS), pp. 271–277 (2012). https://doi.org/10.1109/HIS.2012.6421346 Moradi Vartouni et al. (2019) Moradi Vartouni, A., Teshnehlab, M., Sedighian Kashi, S.: Leveraging deep neural networks for anomaly-based web application firewall. IET Information Security 13(4), 352–361 (2019) Bhatnagar et al. (2022) Bhatnagar, M., Rozinaj, G., Yadav, P.K.: Web intrusion classification system using machine learning approaches. In: 2022 International Symposium ELMAR, pp. 57–60 (2022). https://doi.org/10.1109/ELMAR55880.2022.9899790 Farzad and Gulliver (2021) Farzad, A., Gulliver, T.A.: Log Message Anomaly Detection and Classification Using Auto-B/LSTM and Auto-GRU (2021) Tümer Sivri et al. (2022) Tümer Sivri, T., Pervan Akman, N., Berkol, A., Peker, C.: Web intrusion detection using character level machine learning approaches with upsampled data, pp. 269–274 (2022). https://doi.org/10.15439/2022F147 Adhikari and Bal (2023) Adhikari, A., Bal, B.K.: Machine learning technique for intrusion detection in the field of the intrusion detection system. (2023) Copstein et al. (2022) Copstein, R., Karlsen, E., Schwartzentruber, J., Zincir-Heywood, N., Heywood, M.: Exploring syntactical features for anomaly detection in application logs. it - Information Technology 64(1-2), 15–27 (2022) https://doi.org/10.1515/itit-2021-0064 Nam et al. (2022) Nam, S., Yoo, J.-H., Hong, J.W.-K.: Vm failure prediction with log analysis using bert-cnn model, 331–337 (2022) https://doi.org/10.23919/CNSM55787.2022.9965187 Wang et al. (2018) Wang, M., Xu, L., Guo, L.: Anomaly detection of system logs based on natural language processing and deep learning. In: 2018 4th International Conference on Frontiers of Signal Processing (ICFSP), pp. 140–144 (2018). https://doi.org/10.1109/ICFSP.2018.8552075 Qi et al. (2022) Qi, J., Luan, Z., Huang, S., Wang, Y., Fung, C., Yang, H., Qian, D.: Adanomaly: Adaptive anomaly detection for system logs with adversarial learning. In: NOMS 2022-2022 IEEE/IFIP Network Operations and Management Symposium, pp. 1–5. IEEE Press, ??? (2022). https://doi.org/10.1109/NOMS54207.2022.9789917 . https://doi.org/10.1109/NOMS54207.2022.9789917 Seyyar et al. (2022) Seyyar, Y.E., Yavuz, A.G., Ünver, H.M.: Detection of web attacks using the bert model. In: 2022 30th Signal Processing and Communications Applications Conference (SIU), pp. 1–4 (2022). https://doi.org/10.1109/SIU55565.2022.9864721 Guo et al. (2021) Guo, H., Yuan, S., Wu, X.: Logbert: Log anomaly detection via bert, 1–8 (2021). IEEE Shao et al. (2022) Shao, Y., Zhang, W., Liu, P., Huyue, R., Tang, R., Yin, Q., Li, Q.: Log anomaly detection method based on bert model optimization. In: 2022 7th International Conference on Cloud Computing and Big Data Analytics (ICCCBDA), pp. 161–166 (2022). https://doi.org/10.1109/ICCCBDA55098.2022.9778900 Guo et al. 
(2022) Guo, H., Lin, X., Yang, J., Zhuang, Y., Bai, J., Zheng, T., Zhang, B., Li, Z.: TransLog: A Unified Transformer-based Framework for Log Anomaly Detection (2022) Le and Zhang (2021) Le, V.-H., Zhang, H.: Log-based Anomaly Detection Without Log Parsing (2021) Gniewkowski et al. (2023) Gniewkowski, M., Maciejewski, H., Surmacz, T., Walentynowicz, W.: Sec2vec: Anomaly detection in http traffic and malicious urls. In: Proceedings of the 38th ACM/SIGAPP Symposium on Applied Computing. SAC ’23, pp. 1154–1162. Association for Computing Machinery, New York, NY, USA (2023). https://doi.org/10.1145/3555776.3577663 . https://doi.org/10.1145/3555776.3577663 Hilmi et al. (2020) Hilmi, M.A.A., Cahyanto, K.A., Mustamiin, M.: Apache Web Server - Access Log Pre-processing for Web Intrusion Detection. IEEE Dataport (2020). https://doi.org/10.21227/vvvq-6w47 . https://dx.doi.org/10.21227/vvvq-6w47 Giménez et al. (2010) Giménez, C.T., Villegas, A.P., Marañón, G. .: HTTP Dataset CSIC 2010. https://doi.org/10.7910/DVN/3QBYB5 . https://www.tic.itefi.csic.es/dataset/ ECML/PKDD (2021) ECML/PKDD: ECML/PKDD 2007 Discovery Challenge. https://gitlab.fing.edu.uy/gsi/web-application-attacks-datasets/-/tree/master/ecml_pkdd (2021) Oliner and Stearley (2007) Oliner, A.J., Stearley, J.: What supercomputers say: A study of five system logs. 37th Annual IEEE/IFIP International Conference on Dependable Systems and Networks (DSN’07), 575–584 (2007) Vaswani et al. (2017) Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, L., Polosukhin, I.: Attention is all you need. CoRR abs/1706.03762 (2017) 1706.03762 Schuster and Nakajima (2012) Schuster, M., Nakajima, K.: Japanese and korean voice search. In: 2012 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), pp. 5149–5152 (2012). https://doi.org/10.1109/ICASSP.2012.6289079 Liu et al. (2019) Liu, Y., Ott, M., Goyal, N., Du, J., Joshi, M., Chen, D., Levy, O., Lewis, M., Zettlemoyer, L., Stoyanov, V.: Roberta: A robustly optimized BERT pretraining approach. CoRR abs/1907.11692 (2019) 1907.11692 Sanh et al. (2019) Sanh, V., Debut, L., Chaumond, J., Wolf, T.: Distilbert, a distilled version of bert: smaller, faster, cheaper and lighter. ArXiv abs/1910.01108 (2019) Radford et al. (2019) Radford, A., Wu, J., Child, R., Luan, D., Amodei, D., Sutskever, I.: Language models are unsupervised multitask learners (2019) Black et al. (2021) Black, S., Leo, G., Wang, P., Leahy, C., Biderman, S.: GPT-Neo: Large Scale Autoregressive Language Modeling with Mesh-Tensorflow. https://doi.org/10.5281/zenodo.5297715 . If you use this software, please cite it using these metadata. https://doi.org/10.5281/zenodo.5297715 Kokalj et al. (2021) Kokalj, E., Škrlj, B., Lavrač, N., Pollak, S., Robnik-Šikonja, M.: BERT meets shapley: Extending SHAP explanations to transformer-based classifiers. In: Toivonen, H., Boggia, M. (eds.) Proceedings of the EACL Hackashop on News Media Content Analysis and Automated Report Generation, pp. 16–21. Association for Computational Linguistics, Online (2021). https://aclanthology.org/2021.hackashop-1.3 van der Maaten and Hinton (2008) Maaten, L., Hinton, G.: Visualizing data using t-SNE. Journal of Machine Learning Research 9, 2579–2605 (2008) Mohri et al. (2012) Mohri, M., Rostamizadeh, A., Talwalkar, A.: Foundations of Machine Learning. The MIT Press, ??? (2012) Nguyen, H.T., Franke, K.: Adaptive intrusion detection system via online machine learning. 
In: 2012 12th International Conference on Hybrid Intelligent Systems (HIS), pp. 271–277 (2012). https://doi.org/10.1109/HIS.2012.6421346 Moradi Vartouni et al. (2019) Moradi Vartouni, A., Teshnehlab, M., Sedighian Kashi, S.: Leveraging deep neural networks for anomaly-based web application firewall. IET Information Security 13(4), 352–361 (2019) Bhatnagar et al. (2022) Bhatnagar, M., Rozinaj, G., Yadav, P.K.: Web intrusion classification system using machine learning approaches. In: 2022 International Symposium ELMAR, pp. 57–60 (2022). https://doi.org/10.1109/ELMAR55880.2022.9899790 Farzad and Gulliver (2021) Farzad, A., Gulliver, T.A.: Log Message Anomaly Detection and Classification Using Auto-B/LSTM and Auto-GRU (2021) Tümer Sivri et al. (2022) Tümer Sivri, T., Pervan Akman, N., Berkol, A., Peker, C.: Web intrusion detection using character level machine learning approaches with upsampled data, pp. 269–274 (2022). https://doi.org/10.15439/2022F147 Adhikari and Bal (2023) Adhikari, A., Bal, B.K.: Machine learning technique for intrusion detection in the field of the intrusion detection system. (2023) Copstein et al. (2022) Copstein, R., Karlsen, E., Schwartzentruber, J., Zincir-Heywood, N., Heywood, M.: Exploring syntactical features for anomaly detection in application logs. it - Information Technology 64(1-2), 15–27 (2022) https://doi.org/10.1515/itit-2021-0064 Nam et al. (2022) Nam, S., Yoo, J.-H., Hong, J.W.-K.: Vm failure prediction with log analysis using bert-cnn model, 331–337 (2022) https://doi.org/10.23919/CNSM55787.2022.9965187 Wang et al. (2018) Wang, M., Xu, L., Guo, L.: Anomaly detection of system logs based on natural language processing and deep learning. In: 2018 4th International Conference on Frontiers of Signal Processing (ICFSP), pp. 140–144 (2018). https://doi.org/10.1109/ICFSP.2018.8552075 Qi et al. (2022) Qi, J., Luan, Z., Huang, S., Wang, Y., Fung, C., Yang, H., Qian, D.: Adanomaly: Adaptive anomaly detection for system logs with adversarial learning. In: NOMS 2022-2022 IEEE/IFIP Network Operations and Management Symposium, pp. 1–5. IEEE Press, ??? (2022). https://doi.org/10.1109/NOMS54207.2022.9789917 . https://doi.org/10.1109/NOMS54207.2022.9789917 Seyyar et al. (2022) Seyyar, Y.E., Yavuz, A.G., Ünver, H.M.: Detection of web attacks using the bert model. In: 2022 30th Signal Processing and Communications Applications Conference (SIU), pp. 1–4 (2022). https://doi.org/10.1109/SIU55565.2022.9864721 Guo et al. (2021) Guo, H., Yuan, S., Wu, X.: Logbert: Log anomaly detection via bert, 1–8 (2021). IEEE Shao et al. (2022) Shao, Y., Zhang, W., Liu, P., Huyue, R., Tang, R., Yin, Q., Li, Q.: Log anomaly detection method based on bert model optimization. In: 2022 7th International Conference on Cloud Computing and Big Data Analytics (ICCCBDA), pp. 161–166 (2022). https://doi.org/10.1109/ICCCBDA55098.2022.9778900 Guo et al. (2022) Guo, H., Lin, X., Yang, J., Zhuang, Y., Bai, J., Zheng, T., Zhang, B., Li, Z.: TransLog: A Unified Transformer-based Framework for Log Anomaly Detection (2022) Le and Zhang (2021) Le, V.-H., Zhang, H.: Log-based Anomaly Detection Without Log Parsing (2021) Gniewkowski et al. (2023) Gniewkowski, M., Maciejewski, H., Surmacz, T., Walentynowicz, W.: Sec2vec: Anomaly detection in http traffic and malicious urls. In: Proceedings of the 38th ACM/SIGAPP Symposium on Applied Computing. SAC ’23, pp. 1154–1162. Association for Computing Machinery, New York, NY, USA (2023). https://doi.org/10.1145/3555776.3577663 . 
https://doi.org/10.1145/3555776.3577663 Hilmi et al. (2020) Hilmi, M.A.A., Cahyanto, K.A., Mustamiin, M.: Apache Web Server - Access Log Pre-processing for Web Intrusion Detection. IEEE Dataport (2020). https://doi.org/10.21227/vvvq-6w47 . https://dx.doi.org/10.21227/vvvq-6w47 Giménez et al. (2010) Giménez, C.T., Villegas, A.P., Marañón, G. .: HTTP Dataset CSIC 2010. https://doi.org/10.7910/DVN/3QBYB5 . https://www.tic.itefi.csic.es/dataset/ ECML/PKDD (2021) ECML/PKDD: ECML/PKDD 2007 Discovery Challenge. https://gitlab.fing.edu.uy/gsi/web-application-attacks-datasets/-/tree/master/ecml_pkdd (2021) Oliner and Stearley (2007) Oliner, A.J., Stearley, J.: What supercomputers say: A study of five system logs. 37th Annual IEEE/IFIP International Conference on Dependable Systems and Networks (DSN’07), 575–584 (2007) Vaswani et al. (2017) Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, L., Polosukhin, I.: Attention is all you need. CoRR abs/1706.03762 (2017) 1706.03762 Schuster and Nakajima (2012) Schuster, M., Nakajima, K.: Japanese and korean voice search. In: 2012 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), pp. 5149–5152 (2012). https://doi.org/10.1109/ICASSP.2012.6289079 Liu et al. (2019) Liu, Y., Ott, M., Goyal, N., Du, J., Joshi, M., Chen, D., Levy, O., Lewis, M., Zettlemoyer, L., Stoyanov, V.: Roberta: A robustly optimized BERT pretraining approach. CoRR abs/1907.11692 (2019) 1907.11692 Sanh et al. (2019) Sanh, V., Debut, L., Chaumond, J., Wolf, T.: Distilbert, a distilled version of bert: smaller, faster, cheaper and lighter. ArXiv abs/1910.01108 (2019) Radford et al. (2019) Radford, A., Wu, J., Child, R., Luan, D., Amodei, D., Sutskever, I.: Language models are unsupervised multitask learners (2019) Black et al. (2021) Black, S., Leo, G., Wang, P., Leahy, C., Biderman, S.: GPT-Neo: Large Scale Autoregressive Language Modeling with Mesh-Tensorflow. https://doi.org/10.5281/zenodo.5297715 . If you use this software, please cite it using these metadata. https://doi.org/10.5281/zenodo.5297715 Kokalj et al. (2021) Kokalj, E., Škrlj, B., Lavrač, N., Pollak, S., Robnik-Šikonja, M.: BERT meets shapley: Extending SHAP explanations to transformer-based classifiers. In: Toivonen, H., Boggia, M. (eds.) Proceedings of the EACL Hackashop on News Media Content Analysis and Automated Report Generation, pp. 16–21. Association for Computational Linguistics, Online (2021). https://aclanthology.org/2021.hackashop-1.3 van der Maaten and Hinton (2008) Maaten, L., Hinton, G.: Visualizing data using t-SNE. Journal of Machine Learning Research 9, 2579–2605 (2008) Mohri et al. (2012) Mohri, M., Rostamizadeh, A., Talwalkar, A.: Foundations of Machine Learning. The MIT Press, ??? (2012) Moradi Vartouni, A., Teshnehlab, M., Sedighian Kashi, S.: Leveraging deep neural networks for anomaly-based web application firewall. IET Information Security 13(4), 352–361 (2019) Bhatnagar et al. (2022) Bhatnagar, M., Rozinaj, G., Yadav, P.K.: Web intrusion classification system using machine learning approaches. In: 2022 International Symposium ELMAR, pp. 57–60 (2022). https://doi.org/10.1109/ELMAR55880.2022.9899790 Farzad and Gulliver (2021) Farzad, A., Gulliver, T.A.: Log Message Anomaly Detection and Classification Using Auto-B/LSTM and Auto-GRU (2021) Tümer Sivri et al. (2022) Tümer Sivri, T., Pervan Akman, N., Berkol, A., Peker, C.: Web intrusion detection using character level machine learning approaches with upsampled data, pp. 269–274 (2022). 
https://doi.org/10.15439/2022F147 Adhikari and Bal (2023) Adhikari, A., Bal, B.K.: Machine learning technique for intrusion detection in the field of the intrusion detection system. (2023) Copstein et al. (2022) Copstein, R., Karlsen, E., Schwartzentruber, J., Zincir-Heywood, N., Heywood, M.: Exploring syntactical features for anomaly detection in application logs. it - Information Technology 64(1-2), 15–27 (2022) https://doi.org/10.1515/itit-2021-0064 Nam et al. (2022) Nam, S., Yoo, J.-H., Hong, J.W.-K.: Vm failure prediction with log analysis using bert-cnn model, 331–337 (2022) https://doi.org/10.23919/CNSM55787.2022.9965187 Wang et al. (2018) Wang, M., Xu, L., Guo, L.: Anomaly detection of system logs based on natural language processing and deep learning. In: 2018 4th International Conference on Frontiers of Signal Processing (ICFSP), pp. 140–144 (2018). https://doi.org/10.1109/ICFSP.2018.8552075 Qi et al. (2022) Qi, J., Luan, Z., Huang, S., Wang, Y., Fung, C., Yang, H., Qian, D.: Adanomaly: Adaptive anomaly detection for system logs with adversarial learning. In: NOMS 2022-2022 IEEE/IFIP Network Operations and Management Symposium, pp. 1–5. IEEE Press, ??? (2022). https://doi.org/10.1109/NOMS54207.2022.9789917 . https://doi.org/10.1109/NOMS54207.2022.9789917 Seyyar et al. (2022) Seyyar, Y.E., Yavuz, A.G., Ünver, H.M.: Detection of web attacks using the bert model. In: 2022 30th Signal Processing and Communications Applications Conference (SIU), pp. 1–4 (2022). https://doi.org/10.1109/SIU55565.2022.9864721 Guo et al. (2021) Guo, H., Yuan, S., Wu, X.: Logbert: Log anomaly detection via bert, 1–8 (2021). IEEE Shao et al. (2022) Shao, Y., Zhang, W., Liu, P., Huyue, R., Tang, R., Yin, Q., Li, Q.: Log anomaly detection method based on bert model optimization. In: 2022 7th International Conference on Cloud Computing and Big Data Analytics (ICCCBDA), pp. 161–166 (2022). https://doi.org/10.1109/ICCCBDA55098.2022.9778900 Guo et al. (2022) Guo, H., Lin, X., Yang, J., Zhuang, Y., Bai, J., Zheng, T., Zhang, B., Li, Z.: TransLog: A Unified Transformer-based Framework for Log Anomaly Detection (2022) Le and Zhang (2021) Le, V.-H., Zhang, H.: Log-based Anomaly Detection Without Log Parsing (2021) Gniewkowski et al. (2023) Gniewkowski, M., Maciejewski, H., Surmacz, T., Walentynowicz, W.: Sec2vec: Anomaly detection in http traffic and malicious urls. In: Proceedings of the 38th ACM/SIGAPP Symposium on Applied Computing. SAC ’23, pp. 1154–1162. Association for Computing Machinery, New York, NY, USA (2023). https://doi.org/10.1145/3555776.3577663 . https://doi.org/10.1145/3555776.3577663 Hilmi et al. (2020) Hilmi, M.A.A., Cahyanto, K.A., Mustamiin, M.: Apache Web Server - Access Log Pre-processing for Web Intrusion Detection. IEEE Dataport (2020). https://doi.org/10.21227/vvvq-6w47 . https://dx.doi.org/10.21227/vvvq-6w47 Giménez et al. (2010) Giménez, C.T., Villegas, A.P., Marañón, G. .: HTTP Dataset CSIC 2010. https://doi.org/10.7910/DVN/3QBYB5 . https://www.tic.itefi.csic.es/dataset/ ECML/PKDD (2021) ECML/PKDD: ECML/PKDD 2007 Discovery Challenge. https://gitlab.fing.edu.uy/gsi/web-application-attacks-datasets/-/tree/master/ecml_pkdd (2021) Oliner and Stearley (2007) Oliner, A.J., Stearley, J.: What supercomputers say: A study of five system logs. 37th Annual IEEE/IFIP International Conference on Dependable Systems and Networks (DSN’07), 575–584 (2007) Vaswani et al. 
(2017) Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, L., Polosukhin, I.: Attention is all you need. CoRR abs/1706.03762 (2017) 1706.03762 Schuster and Nakajima (2012) Schuster, M., Nakajima, K.: Japanese and korean voice search. In: 2012 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), pp. 5149–5152 (2012). https://doi.org/10.1109/ICASSP.2012.6289079 Liu et al. (2019) Liu, Y., Ott, M., Goyal, N., Du, J., Joshi, M., Chen, D., Levy, O., Lewis, M., Zettlemoyer, L., Stoyanov, V.: Roberta: A robustly optimized BERT pretraining approach. CoRR abs/1907.11692 (2019) 1907.11692 Sanh et al. (2019) Sanh, V., Debut, L., Chaumond, J., Wolf, T.: Distilbert, a distilled version of bert: smaller, faster, cheaper and lighter. ArXiv abs/1910.01108 (2019) Radford et al. (2019) Radford, A., Wu, J., Child, R., Luan, D., Amodei, D., Sutskever, I.: Language models are unsupervised multitask learners (2019) Black et al. (2021) Black, S., Leo, G., Wang, P., Leahy, C., Biderman, S.: GPT-Neo: Large Scale Autoregressive Language Modeling with Mesh-Tensorflow. https://doi.org/10.5281/zenodo.5297715 . If you use this software, please cite it using these metadata. https://doi.org/10.5281/zenodo.5297715 Kokalj et al. (2021) Kokalj, E., Škrlj, B., Lavrač, N., Pollak, S., Robnik-Šikonja, M.: BERT meets shapley: Extending SHAP explanations to transformer-based classifiers. In: Toivonen, H., Boggia, M. (eds.) Proceedings of the EACL Hackashop on News Media Content Analysis and Automated Report Generation, pp. 16–21. Association for Computational Linguistics, Online (2021). https://aclanthology.org/2021.hackashop-1.3 van der Maaten and Hinton (2008) Maaten, L., Hinton, G.: Visualizing data using t-SNE. Journal of Machine Learning Research 9, 2579–2605 (2008) Mohri et al. (2012) Mohri, M., Rostamizadeh, A., Talwalkar, A.: Foundations of Machine Learning. The MIT Press, ??? (2012) Bhatnagar, M., Rozinaj, G., Yadav, P.K.: Web intrusion classification system using machine learning approaches. In: 2022 International Symposium ELMAR, pp. 57–60 (2022). https://doi.org/10.1109/ELMAR55880.2022.9899790 Farzad and Gulliver (2021) Farzad, A., Gulliver, T.A.: Log Message Anomaly Detection and Classification Using Auto-B/LSTM and Auto-GRU (2021) Tümer Sivri et al. (2022) Tümer Sivri, T., Pervan Akman, N., Berkol, A., Peker, C.: Web intrusion detection using character level machine learning approaches with upsampled data, pp. 269–274 (2022). https://doi.org/10.15439/2022F147 Adhikari and Bal (2023) Adhikari, A., Bal, B.K.: Machine learning technique for intrusion detection in the field of the intrusion detection system. (2023) Copstein et al. (2022) Copstein, R., Karlsen, E., Schwartzentruber, J., Zincir-Heywood, N., Heywood, M.: Exploring syntactical features for anomaly detection in application logs. it - Information Technology 64(1-2), 15–27 (2022) https://doi.org/10.1515/itit-2021-0064 Nam et al. (2022) Nam, S., Yoo, J.-H., Hong, J.W.-K.: Vm failure prediction with log analysis using bert-cnn model, 331–337 (2022) https://doi.org/10.23919/CNSM55787.2022.9965187 Wang et al. (2018) Wang, M., Xu, L., Guo, L.: Anomaly detection of system logs based on natural language processing and deep learning. In: 2018 4th International Conference on Frontiers of Signal Processing (ICFSP), pp. 140–144 (2018). https://doi.org/10.1109/ICFSP.2018.8552075 Qi et al. 
(2022) Qi, J., Luan, Z., Huang, S., Wang, Y., Fung, C., Yang, H., Qian, D.: Adanomaly: Adaptive anomaly detection for system logs with adversarial learning. In: NOMS 2022-2022 IEEE/IFIP Network Operations and Management Symposium, pp. 1–5. IEEE Press, ??? (2022). https://doi.org/10.1109/NOMS54207.2022.9789917 . https://doi.org/10.1109/NOMS54207.2022.9789917 Seyyar et al. (2022) Seyyar, Y.E., Yavuz, A.G., Ünver, H.M.: Detection of web attacks using the bert model. In: 2022 30th Signal Processing and Communications Applications Conference (SIU), pp. 1–4 (2022). https://doi.org/10.1109/SIU55565.2022.9864721 Guo et al. (2021) Guo, H., Yuan, S., Wu, X.: Logbert: Log anomaly detection via bert, 1–8 (2021). IEEE Shao et al. (2022) Shao, Y., Zhang, W., Liu, P., Huyue, R., Tang, R., Yin, Q., Li, Q.: Log anomaly detection method based on bert model optimization. In: 2022 7th International Conference on Cloud Computing and Big Data Analytics (ICCCBDA), pp. 161–166 (2022). https://doi.org/10.1109/ICCCBDA55098.2022.9778900 Guo et al. (2022) Guo, H., Lin, X., Yang, J., Zhuang, Y., Bai, J., Zheng, T., Zhang, B., Li, Z.: TransLog: A Unified Transformer-based Framework for Log Anomaly Detection (2022) Le and Zhang (2021) Le, V.-H., Zhang, H.: Log-based Anomaly Detection Without Log Parsing (2021) Gniewkowski et al. (2023) Gniewkowski, M., Maciejewski, H., Surmacz, T., Walentynowicz, W.: Sec2vec: Anomaly detection in http traffic and malicious urls. In: Proceedings of the 38th ACM/SIGAPP Symposium on Applied Computing. SAC ’23, pp. 1154–1162. Association for Computing Machinery, New York, NY, USA (2023). https://doi.org/10.1145/3555776.3577663 . https://doi.org/10.1145/3555776.3577663 Hilmi et al. (2020) Hilmi, M.A.A., Cahyanto, K.A., Mustamiin, M.: Apache Web Server - Access Log Pre-processing for Web Intrusion Detection. IEEE Dataport (2020). https://doi.org/10.21227/vvvq-6w47 . https://dx.doi.org/10.21227/vvvq-6w47 Giménez et al. (2010) Giménez, C.T., Villegas, A.P., Marañón, G. .: HTTP Dataset CSIC 2010. https://doi.org/10.7910/DVN/3QBYB5 . https://www.tic.itefi.csic.es/dataset/ ECML/PKDD (2021) ECML/PKDD: ECML/PKDD 2007 Discovery Challenge. https://gitlab.fing.edu.uy/gsi/web-application-attacks-datasets/-/tree/master/ecml_pkdd (2021) Oliner and Stearley (2007) Oliner, A.J., Stearley, J.: What supercomputers say: A study of five system logs. 37th Annual IEEE/IFIP International Conference on Dependable Systems and Networks (DSN’07), 575–584 (2007) Vaswani et al. (2017) Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, L., Polosukhin, I.: Attention is all you need. CoRR abs/1706.03762 (2017) 1706.03762 Schuster and Nakajima (2012) Schuster, M., Nakajima, K.: Japanese and korean voice search. In: 2012 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), pp. 5149–5152 (2012). https://doi.org/10.1109/ICASSP.2012.6289079 Liu et al. (2019) Liu, Y., Ott, M., Goyal, N., Du, J., Joshi, M., Chen, D., Levy, O., Lewis, M., Zettlemoyer, L., Stoyanov, V.: Roberta: A robustly optimized BERT pretraining approach. CoRR abs/1907.11692 (2019) 1907.11692 Sanh et al. (2019) Sanh, V., Debut, L., Chaumond, J., Wolf, T.: Distilbert, a distilled version of bert: smaller, faster, cheaper and lighter. ArXiv abs/1910.01108 (2019) Radford et al. (2019) Radford, A., Wu, J., Child, R., Luan, D., Amodei, D., Sutskever, I.: Language models are unsupervised multitask learners (2019) Black et al. 
(2021) Black, S., Leo, G., Wang, P., Leahy, C., Biderman, S.: GPT-Neo: Large Scale Autoregressive Language Modeling with Mesh-Tensorflow. https://doi.org/10.5281/zenodo.5297715 . If you use this software, please cite it using these metadata. https://doi.org/10.5281/zenodo.5297715 Kokalj et al. (2021) Kokalj, E., Škrlj, B., Lavrač, N., Pollak, S., Robnik-Šikonja, M.: BERT meets shapley: Extending SHAP explanations to transformer-based classifiers. In: Toivonen, H., Boggia, M. (eds.) Proceedings of the EACL Hackashop on News Media Content Analysis and Automated Report Generation, pp. 16–21. Association for Computational Linguistics, Online (2021). https://aclanthology.org/2021.hackashop-1.3 van der Maaten and Hinton (2008) Maaten, L., Hinton, G.: Visualizing data using t-SNE. Journal of Machine Learning Research 9, 2579–2605 (2008) Mohri et al. (2012) Mohri, M., Rostamizadeh, A., Talwalkar, A.: Foundations of Machine Learning. The MIT Press, ??? (2012) Farzad, A., Gulliver, T.A.: Log Message Anomaly Detection and Classification Using Auto-B/LSTM and Auto-GRU (2021) Tümer Sivri et al. (2022) Tümer Sivri, T., Pervan Akman, N., Berkol, A., Peker, C.: Web intrusion detection using character level machine learning approaches with upsampled data, pp. 269–274 (2022). https://doi.org/10.15439/2022F147 Adhikari and Bal (2023) Adhikari, A., Bal, B.K.: Machine learning technique for intrusion detection in the field of the intrusion detection system. (2023) Copstein et al. (2022) Copstein, R., Karlsen, E., Schwartzentruber, J., Zincir-Heywood, N., Heywood, M.: Exploring syntactical features for anomaly detection in application logs. it - Information Technology 64(1-2), 15–27 (2022) https://doi.org/10.1515/itit-2021-0064 Nam et al. (2022) Nam, S., Yoo, J.-H., Hong, J.W.-K.: Vm failure prediction with log analysis using bert-cnn model, 331–337 (2022) https://doi.org/10.23919/CNSM55787.2022.9965187 Wang et al. (2018) Wang, M., Xu, L., Guo, L.: Anomaly detection of system logs based on natural language processing and deep learning. In: 2018 4th International Conference on Frontiers of Signal Processing (ICFSP), pp. 140–144 (2018). https://doi.org/10.1109/ICFSP.2018.8552075 Qi et al. (2022) Qi, J., Luan, Z., Huang, S., Wang, Y., Fung, C., Yang, H., Qian, D.: Adanomaly: Adaptive anomaly detection for system logs with adversarial learning. In: NOMS 2022-2022 IEEE/IFIP Network Operations and Management Symposium, pp. 1–5. IEEE Press, ??? (2022). https://doi.org/10.1109/NOMS54207.2022.9789917 . https://doi.org/10.1109/NOMS54207.2022.9789917 Seyyar et al. (2022) Seyyar, Y.E., Yavuz, A.G., Ünver, H.M.: Detection of web attacks using the bert model. In: 2022 30th Signal Processing and Communications Applications Conference (SIU), pp. 1–4 (2022). https://doi.org/10.1109/SIU55565.2022.9864721 Guo et al. (2021) Guo, H., Yuan, S., Wu, X.: Logbert: Log anomaly detection via bert, 1–8 (2021). IEEE Shao et al. (2022) Shao, Y., Zhang, W., Liu, P., Huyue, R., Tang, R., Yin, Q., Li, Q.: Log anomaly detection method based on bert model optimization. In: 2022 7th International Conference on Cloud Computing and Big Data Analytics (ICCCBDA), pp. 161–166 (2022). https://doi.org/10.1109/ICCCBDA55098.2022.9778900 Guo et al. (2022) Guo, H., Lin, X., Yang, J., Zhuang, Y., Bai, J., Zheng, T., Zhang, B., Li, Z.: TransLog: A Unified Transformer-based Framework for Log Anomaly Detection (2022) Le and Zhang (2021) Le, V.-H., Zhang, H.: Log-based Anomaly Detection Without Log Parsing (2021) Gniewkowski et al. 
(2023) Gniewkowski, M., Maciejewski, H., Surmacz, T., Walentynowicz, W.: Sec2vec: Anomaly detection in http traffic and malicious urls. In: Proceedings of the 38th ACM/SIGAPP Symposium on Applied Computing. SAC ’23, pp. 1154–1162. Association for Computing Machinery, New York, NY, USA (2023). https://doi.org/10.1145/3555776.3577663 . https://doi.org/10.1145/3555776.3577663 Hilmi et al. (2020) Hilmi, M.A.A., Cahyanto, K.A., Mustamiin, M.: Apache Web Server - Access Log Pre-processing for Web Intrusion Detection. IEEE Dataport (2020). https://doi.org/10.21227/vvvq-6w47 . https://dx.doi.org/10.21227/vvvq-6w47 Giménez et al. (2010) Giménez, C.T., Villegas, A.P., Marañón, G. .: HTTP Dataset CSIC 2010. https://doi.org/10.7910/DVN/3QBYB5 . https://www.tic.itefi.csic.es/dataset/ ECML/PKDD (2021) ECML/PKDD: ECML/PKDD 2007 Discovery Challenge. https://gitlab.fing.edu.uy/gsi/web-application-attacks-datasets/-/tree/master/ecml_pkdd (2021) Oliner and Stearley (2007) Oliner, A.J., Stearley, J.: What supercomputers say: A study of five system logs. 37th Annual IEEE/IFIP International Conference on Dependable Systems and Networks (DSN’07), 575–584 (2007) Vaswani et al. (2017) Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, L., Polosukhin, I.: Attention is all you need. CoRR abs/1706.03762 (2017) 1706.03762 Schuster and Nakajima (2012) Schuster, M., Nakajima, K.: Japanese and korean voice search. In: 2012 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), pp. 5149–5152 (2012). https://doi.org/10.1109/ICASSP.2012.6289079 Liu et al. (2019) Liu, Y., Ott, M., Goyal, N., Du, J., Joshi, M., Chen, D., Levy, O., Lewis, M., Zettlemoyer, L., Stoyanov, V.: Roberta: A robustly optimized BERT pretraining approach. CoRR abs/1907.11692 (2019) 1907.11692 Sanh et al. (2019) Sanh, V., Debut, L., Chaumond, J., Wolf, T.: Distilbert, a distilled version of bert: smaller, faster, cheaper and lighter. ArXiv abs/1910.01108 (2019) Radford et al. (2019) Radford, A., Wu, J., Child, R., Luan, D., Amodei, D., Sutskever, I.: Language models are unsupervised multitask learners (2019) Black et al. (2021) Black, S., Leo, G., Wang, P., Leahy, C., Biderman, S.: GPT-Neo: Large Scale Autoregressive Language Modeling with Mesh-Tensorflow. https://doi.org/10.5281/zenodo.5297715 . If you use this software, please cite it using these metadata. https://doi.org/10.5281/zenodo.5297715 Kokalj et al. (2021) Kokalj, E., Škrlj, B., Lavrač, N., Pollak, S., Robnik-Šikonja, M.: BERT meets shapley: Extending SHAP explanations to transformer-based classifiers. In: Toivonen, H., Boggia, M. (eds.) Proceedings of the EACL Hackashop on News Media Content Analysis and Automated Report Generation, pp. 16–21. Association for Computational Linguistics, Online (2021). https://aclanthology.org/2021.hackashop-1.3 van der Maaten and Hinton (2008) Maaten, L., Hinton, G.: Visualizing data using t-SNE. Journal of Machine Learning Research 9, 2579–2605 (2008) Mohri et al. (2012) Mohri, M., Rostamizadeh, A., Talwalkar, A.: Foundations of Machine Learning. The MIT Press, ??? (2012) Tümer Sivri, T., Pervan Akman, N., Berkol, A., Peker, C.: Web intrusion detection using character level machine learning approaches with upsampled data, pp. 269–274 (2022). https://doi.org/10.15439/2022F147 Adhikari and Bal (2023) Adhikari, A., Bal, B.K.: Machine learning technique for intrusion detection in the field of the intrusion detection system. (2023) Copstein et al. 
(2022) Copstein, R., Karlsen, E., Schwartzentruber, J., Zincir-Heywood, N., Heywood, M.: Exploring syntactical features for anomaly detection in application logs. it - Information Technology 64(1-2), 15–27 (2022) https://doi.org/10.1515/itit-2021-0064 Nam et al. (2022) Nam, S., Yoo, J.-H., Hong, J.W.-K.: Vm failure prediction with log analysis using bert-cnn model, 331–337 (2022) https://doi.org/10.23919/CNSM55787.2022.9965187 Wang et al. (2018) Wang, M., Xu, L., Guo, L.: Anomaly detection of system logs based on natural language processing and deep learning. In: 2018 4th International Conference on Frontiers of Signal Processing (ICFSP), pp. 140–144 (2018). https://doi.org/10.1109/ICFSP.2018.8552075 Qi et al. (2022) Qi, J., Luan, Z., Huang, S., Wang, Y., Fung, C., Yang, H., Qian, D.: Adanomaly: Adaptive anomaly detection for system logs with adversarial learning. In: NOMS 2022-2022 IEEE/IFIP Network Operations and Management Symposium, pp. 1–5. IEEE Press, ??? (2022). https://doi.org/10.1109/NOMS54207.2022.9789917 . https://doi.org/10.1109/NOMS54207.2022.9789917 Seyyar et al. (2022) Seyyar, Y.E., Yavuz, A.G., Ünver, H.M.: Detection of web attacks using the bert model. In: 2022 30th Signal Processing and Communications Applications Conference (SIU), pp. 1–4 (2022). https://doi.org/10.1109/SIU55565.2022.9864721 Guo et al. (2021) Guo, H., Yuan, S., Wu, X.: Logbert: Log anomaly detection via bert, 1–8 (2021). IEEE Shao et al. (2022) Shao, Y., Zhang, W., Liu, P., Huyue, R., Tang, R., Yin, Q., Li, Q.: Log anomaly detection method based on bert model optimization. In: 2022 7th International Conference on Cloud Computing and Big Data Analytics (ICCCBDA), pp. 161–166 (2022). https://doi.org/10.1109/ICCCBDA55098.2022.9778900 Guo et al. (2022) Guo, H., Lin, X., Yang, J., Zhuang, Y., Bai, J., Zheng, T., Zhang, B., Li, Z.: TransLog: A Unified Transformer-based Framework for Log Anomaly Detection (2022) Le and Zhang (2021) Le, V.-H., Zhang, H.: Log-based Anomaly Detection Without Log Parsing (2021) Gniewkowski et al. (2023) Gniewkowski, M., Maciejewski, H., Surmacz, T., Walentynowicz, W.: Sec2vec: Anomaly detection in http traffic and malicious urls. In: Proceedings of the 38th ACM/SIGAPP Symposium on Applied Computing. SAC ’23, pp. 1154–1162. Association for Computing Machinery, New York, NY, USA (2023). https://doi.org/10.1145/3555776.3577663 . https://doi.org/10.1145/3555776.3577663 Hilmi et al. (2020) Hilmi, M.A.A., Cahyanto, K.A., Mustamiin, M.: Apache Web Server - Access Log Pre-processing for Web Intrusion Detection. IEEE Dataport (2020). https://doi.org/10.21227/vvvq-6w47 . https://dx.doi.org/10.21227/vvvq-6w47 Giménez et al. (2010) Giménez, C.T., Villegas, A.P., Marañón, G. .: HTTP Dataset CSIC 2010. https://doi.org/10.7910/DVN/3QBYB5 . https://www.tic.itefi.csic.es/dataset/ ECML/PKDD (2021) ECML/PKDD: ECML/PKDD 2007 Discovery Challenge. https://gitlab.fing.edu.uy/gsi/web-application-attacks-datasets/-/tree/master/ecml_pkdd (2021) Oliner and Stearley (2007) Oliner, A.J., Stearley, J.: What supercomputers say: A study of five system logs. 37th Annual IEEE/IFIP International Conference on Dependable Systems and Networks (DSN’07), 575–584 (2007) Vaswani et al. (2017) Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, L., Polosukhin, I.: Attention is all you need. CoRR abs/1706.03762 (2017) 1706.03762 Schuster and Nakajima (2012) Schuster, M., Nakajima, K.: Japanese and korean voice search. 
In: 2012 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), pp. 5149–5152 (2012). https://doi.org/10.1109/ICASSP.2012.6289079 Liu et al. (2019) Liu, Y., Ott, M., Goyal, N., Du, J., Joshi, M., Chen, D., Levy, O., Lewis, M., Zettlemoyer, L., Stoyanov, V.: Roberta: A robustly optimized BERT pretraining approach. CoRR abs/1907.11692 (2019) 1907.11692 Sanh et al. (2019) Sanh, V., Debut, L., Chaumond, J., Wolf, T.: Distilbert, a distilled version of bert: smaller, faster, cheaper and lighter. ArXiv abs/1910.01108 (2019) Radford et al. (2019) Radford, A., Wu, J., Child, R., Luan, D., Amodei, D., Sutskever, I.: Language models are unsupervised multitask learners (2019) Black et al. (2021) Black, S., Leo, G., Wang, P., Leahy, C., Biderman, S.: GPT-Neo: Large Scale Autoregressive Language Modeling with Mesh-Tensorflow. https://doi.org/10.5281/zenodo.5297715 . If you use this software, please cite it using these metadata. https://doi.org/10.5281/zenodo.5297715 Kokalj et al. (2021) Kokalj, E., Škrlj, B., Lavrač, N., Pollak, S., Robnik-Šikonja, M.: BERT meets shapley: Extending SHAP explanations to transformer-based classifiers. In: Toivonen, H., Boggia, M. (eds.) Proceedings of the EACL Hackashop on News Media Content Analysis and Automated Report Generation, pp. 16–21. Association for Computational Linguistics, Online (2021). https://aclanthology.org/2021.hackashop-1.3 van der Maaten and Hinton (2008) Maaten, L., Hinton, G.: Visualizing data using t-SNE. Journal of Machine Learning Research 9, 2579–2605 (2008) Mohri et al. (2012) Mohri, M., Rostamizadeh, A., Talwalkar, A.: Foundations of Machine Learning. The MIT Press, ??? (2012) Adhikari, A., Bal, B.K.: Machine learning technique for intrusion detection in the field of the intrusion detection system. (2023) Copstein et al. (2022) Copstein, R., Karlsen, E., Schwartzentruber, J., Zincir-Heywood, N., Heywood, M.: Exploring syntactical features for anomaly detection in application logs. it - Information Technology 64(1-2), 15–27 (2022) https://doi.org/10.1515/itit-2021-0064 Nam et al. (2022) Nam, S., Yoo, J.-H., Hong, J.W.-K.: Vm failure prediction with log analysis using bert-cnn model, 331–337 (2022) https://doi.org/10.23919/CNSM55787.2022.9965187 Wang et al. (2018) Wang, M., Xu, L., Guo, L.: Anomaly detection of system logs based on natural language processing and deep learning. In: 2018 4th International Conference on Frontiers of Signal Processing (ICFSP), pp. 140–144 (2018). https://doi.org/10.1109/ICFSP.2018.8552075 Qi et al. (2022) Qi, J., Luan, Z., Huang, S., Wang, Y., Fung, C., Yang, H., Qian, D.: Adanomaly: Adaptive anomaly detection for system logs with adversarial learning. In: NOMS 2022-2022 IEEE/IFIP Network Operations and Management Symposium, pp. 1–5. IEEE Press, ??? (2022). https://doi.org/10.1109/NOMS54207.2022.9789917 . https://doi.org/10.1109/NOMS54207.2022.9789917 Seyyar et al. (2022) Seyyar, Y.E., Yavuz, A.G., Ünver, H.M.: Detection of web attacks using the bert model. In: 2022 30th Signal Processing and Communications Applications Conference (SIU), pp. 1–4 (2022). https://doi.org/10.1109/SIU55565.2022.9864721 Guo et al. (2021) Guo, H., Yuan, S., Wu, X.: Logbert: Log anomaly detection via bert, 1–8 (2021). IEEE Shao et al. (2022) Shao, Y., Zhang, W., Liu, P., Huyue, R., Tang, R., Yin, Q., Li, Q.: Log anomaly detection method based on bert model optimization. In: 2022 7th International Conference on Cloud Computing and Big Data Analytics (ICCCBDA), pp. 161–166 (2022). 
  2. Boffa, M., Milan, G., Vassio, L., Drago, I., Mellia, M., Ben Houidi, Z.: Towards NLP-based processing of honeypot logs, pp. 314–321 (2022). https://doi.org/10.1109/EuroSPW55150.2022.00038
  3. Karlsen, E., Copstein, R., Luo, X., Schwartzentruber, J., Niblett, B., Johnston, A., Heywood, M.I., Zincir-Heywood, N.: Exploring semantic vs. syntactic features for unsupervised learning on application log files. In: 2023 7th Cyber Security in Networking Conference (CSNet), pp. 1–7 (2023). (In press)
  4. Nguyen, H.T., Franke, K.: Adaptive intrusion detection system via online machine learning. In: 2012 12th International Conference on Hybrid Intelligent Systems (HIS), pp. 271–277 (2012). https://doi.org/10.1109/HIS.2012.6421346
  5. Moradi Vartouni, A., Teshnehlab, M., Sedighian Kashi, S.: Leveraging deep neural networks for anomaly-based web application firewall. IET Information Security 13(4), 352–361 (2019)
  6. Bhatnagar, M., Rozinaj, G., Yadav, P.K.: Web intrusion classification system using machine learning approaches. In: 2022 International Symposium ELMAR, pp. 57–60 (2022). https://doi.org/10.1109/ELMAR55880.2022.9899790
  7. Farzad, A., Gulliver, T.A.: Log message anomaly detection and classification using Auto-B/LSTM and Auto-GRU (2021)
  8. Tümer Sivri, T., Pervan Akman, N., Berkol, A., Peker, C.: Web intrusion detection using character level machine learning approaches with upsampled data, pp. 269–274 (2022). https://doi.org/10.15439/2022F147
  9. Adhikari, A., Bal, B.K.: Machine learning technique for intrusion detection in the field of the intrusion detection system (2023)
  10. Copstein, R., Karlsen, E., Schwartzentruber, J., Zincir-Heywood, N., Heywood, M.: Exploring syntactical features for anomaly detection in application logs. it - Information Technology 64(1–2), 15–27 (2022). https://doi.org/10.1515/itit-2021-0064
  11. Nam, S., Yoo, J.-H., Hong, J.W.-K.: VM failure prediction with log analysis using BERT-CNN model, pp. 331–337 (2022). https://doi.org/10.23919/CNSM55787.2022.9965187
  12. Wang, M., Xu, L., Guo, L.: Anomaly detection of system logs based on natural language processing and deep learning. In: 2018 4th International Conference on Frontiers of Signal Processing (ICFSP), pp. 140–144 (2018). https://doi.org/10.1109/ICFSP.2018.8552075
  13. Qi, J., Luan, Z., Huang, S., Wang, Y., Fung, C., Yang, H., Qian, D.: Adanomaly: Adaptive anomaly detection for system logs with adversarial learning. In: NOMS 2022 - 2022 IEEE/IFIP Network Operations and Management Symposium, pp. 1–5. IEEE Press (2022). https://doi.org/10.1109/NOMS54207.2022.9789917
(2019) Liu, Y., Ott, M., Goyal, N., Du, J., Joshi, M., Chen, D., Levy, O., Lewis, M., Zettlemoyer, L., Stoyanov, V.: Roberta: A robustly optimized BERT pretraining approach. CoRR abs/1907.11692 (2019) 1907.11692 Sanh et al. (2019) Sanh, V., Debut, L., Chaumond, J., Wolf, T.: Distilbert, a distilled version of bert: smaller, faster, cheaper and lighter. ArXiv abs/1910.01108 (2019) Radford et al. (2019) Radford, A., Wu, J., Child, R., Luan, D., Amodei, D., Sutskever, I.: Language models are unsupervised multitask learners (2019) Black et al. (2021) Black, S., Leo, G., Wang, P., Leahy, C., Biderman, S.: GPT-Neo: Large Scale Autoregressive Language Modeling with Mesh-Tensorflow. https://doi.org/10.5281/zenodo.5297715 . If you use this software, please cite it using these metadata. https://doi.org/10.5281/zenodo.5297715 Kokalj et al. (2021) Kokalj, E., Škrlj, B., Lavrač, N., Pollak, S., Robnik-Šikonja, M.: BERT meets shapley: Extending SHAP explanations to transformer-based classifiers. In: Toivonen, H., Boggia, M. (eds.) Proceedings of the EACL Hackashop on News Media Content Analysis and Automated Report Generation, pp. 16–21. Association for Computational Linguistics, Online (2021). https://aclanthology.org/2021.hackashop-1.3 van der Maaten and Hinton (2008) Maaten, L., Hinton, G.: Visualizing data using t-SNE. Journal of Machine Learning Research 9, 2579–2605 (2008) Mohri et al. (2012) Mohri, M., Rostamizadeh, A., Talwalkar, A.: Foundations of Machine Learning. The MIT Press, ??? (2012) Seyyar, Y.E., Yavuz, A.G., Ünver, H.M.: Detection of web attacks using the bert model. In: 2022 30th Signal Processing and Communications Applications Conference (SIU), pp. 1–4 (2022). https://doi.org/10.1109/SIU55565.2022.9864721 Guo et al. (2021) Guo, H., Yuan, S., Wu, X.: Logbert: Log anomaly detection via bert, 1–8 (2021). IEEE Shao et al. (2022) Shao, Y., Zhang, W., Liu, P., Huyue, R., Tang, R., Yin, Q., Li, Q.: Log anomaly detection method based on bert model optimization. In: 2022 7th International Conference on Cloud Computing and Big Data Analytics (ICCCBDA), pp. 161–166 (2022). https://doi.org/10.1109/ICCCBDA55098.2022.9778900 Guo et al. (2022) Guo, H., Lin, X., Yang, J., Zhuang, Y., Bai, J., Zheng, T., Zhang, B., Li, Z.: TransLog: A Unified Transformer-based Framework for Log Anomaly Detection (2022) Le and Zhang (2021) Le, V.-H., Zhang, H.: Log-based Anomaly Detection Without Log Parsing (2021) Gniewkowski et al. (2023) Gniewkowski, M., Maciejewski, H., Surmacz, T., Walentynowicz, W.: Sec2vec: Anomaly detection in http traffic and malicious urls. In: Proceedings of the 38th ACM/SIGAPP Symposium on Applied Computing. SAC ’23, pp. 1154–1162. Association for Computing Machinery, New York, NY, USA (2023). https://doi.org/10.1145/3555776.3577663 . https://doi.org/10.1145/3555776.3577663 Hilmi et al. (2020) Hilmi, M.A.A., Cahyanto, K.A., Mustamiin, M.: Apache Web Server - Access Log Pre-processing for Web Intrusion Detection. IEEE Dataport (2020). https://doi.org/10.21227/vvvq-6w47 . https://dx.doi.org/10.21227/vvvq-6w47 Giménez et al. (2010) Giménez, C.T., Villegas, A.P., Marañón, G. .: HTTP Dataset CSIC 2010. https://doi.org/10.7910/DVN/3QBYB5 . https://www.tic.itefi.csic.es/dataset/ ECML/PKDD (2021) ECML/PKDD: ECML/PKDD 2007 Discovery Challenge. https://gitlab.fing.edu.uy/gsi/web-application-attacks-datasets/-/tree/master/ecml_pkdd (2021) Oliner and Stearley (2007) Oliner, A.J., Stearley, J.: What supercomputers say: A study of five system logs. 
37th Annual IEEE/IFIP International Conference on Dependable Systems and Networks (DSN’07), 575–584 (2007) Vaswani et al. (2017) Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, L., Polosukhin, I.: Attention is all you need. CoRR abs/1706.03762 (2017) 1706.03762 Schuster and Nakajima (2012) Schuster, M., Nakajima, K.: Japanese and korean voice search. In: 2012 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), pp. 5149–5152 (2012). https://doi.org/10.1109/ICASSP.2012.6289079 Liu et al. (2019) Liu, Y., Ott, M., Goyal, N., Du, J., Joshi, M., Chen, D., Levy, O., Lewis, M., Zettlemoyer, L., Stoyanov, V.: Roberta: A robustly optimized BERT pretraining approach. CoRR abs/1907.11692 (2019) 1907.11692 Sanh et al. (2019) Sanh, V., Debut, L., Chaumond, J., Wolf, T.: Distilbert, a distilled version of bert: smaller, faster, cheaper and lighter. ArXiv abs/1910.01108 (2019) Radford et al. (2019) Radford, A., Wu, J., Child, R., Luan, D., Amodei, D., Sutskever, I.: Language models are unsupervised multitask learners (2019) Black et al. (2021) Black, S., Leo, G., Wang, P., Leahy, C., Biderman, S.: GPT-Neo: Large Scale Autoregressive Language Modeling with Mesh-Tensorflow. https://doi.org/10.5281/zenodo.5297715 . If you use this software, please cite it using these metadata. https://doi.org/10.5281/zenodo.5297715 Kokalj et al. (2021) Kokalj, E., Škrlj, B., Lavrač, N., Pollak, S., Robnik-Šikonja, M.: BERT meets shapley: Extending SHAP explanations to transformer-based classifiers. In: Toivonen, H., Boggia, M. (eds.) Proceedings of the EACL Hackashop on News Media Content Analysis and Automated Report Generation, pp. 16–21. Association for Computational Linguistics, Online (2021). https://aclanthology.org/2021.hackashop-1.3 van der Maaten and Hinton (2008) Maaten, L., Hinton, G.: Visualizing data using t-SNE. Journal of Machine Learning Research 9, 2579–2605 (2008) Mohri et al. (2012) Mohri, M., Rostamizadeh, A., Talwalkar, A.: Foundations of Machine Learning. The MIT Press, ??? (2012) Guo, H., Yuan, S., Wu, X.: Logbert: Log anomaly detection via bert, 1–8 (2021). IEEE Shao et al. (2022) Shao, Y., Zhang, W., Liu, P., Huyue, R., Tang, R., Yin, Q., Li, Q.: Log anomaly detection method based on bert model optimization. In: 2022 7th International Conference on Cloud Computing and Big Data Analytics (ICCCBDA), pp. 161–166 (2022). https://doi.org/10.1109/ICCCBDA55098.2022.9778900 Guo et al. (2022) Guo, H., Lin, X., Yang, J., Zhuang, Y., Bai, J., Zheng, T., Zhang, B., Li, Z.: TransLog: A Unified Transformer-based Framework for Log Anomaly Detection (2022) Le and Zhang (2021) Le, V.-H., Zhang, H.: Log-based Anomaly Detection Without Log Parsing (2021) Gniewkowski et al. (2023) Gniewkowski, M., Maciejewski, H., Surmacz, T., Walentynowicz, W.: Sec2vec: Anomaly detection in http traffic and malicious urls. In: Proceedings of the 38th ACM/SIGAPP Symposium on Applied Computing. SAC ’23, pp. 1154–1162. Association for Computing Machinery, New York, NY, USA (2023). https://doi.org/10.1145/3555776.3577663 . https://doi.org/10.1145/3555776.3577663 Hilmi et al. (2020) Hilmi, M.A.A., Cahyanto, K.A., Mustamiin, M.: Apache Web Server - Access Log Pre-processing for Web Intrusion Detection. IEEE Dataport (2020). https://doi.org/10.21227/vvvq-6w47 . https://dx.doi.org/10.21227/vvvq-6w47 Giménez et al. (2010) Giménez, C.T., Villegas, A.P., Marañón, G. .: HTTP Dataset CSIC 2010. https://doi.org/10.7910/DVN/3QBYB5 . 
https://www.tic.itefi.csic.es/dataset/ ECML/PKDD (2021) ECML/PKDD: ECML/PKDD 2007 Discovery Challenge. https://gitlab.fing.edu.uy/gsi/web-application-attacks-datasets/-/tree/master/ecml_pkdd (2021) Oliner and Stearley (2007) Oliner, A.J., Stearley, J.: What supercomputers say: A study of five system logs. 37th Annual IEEE/IFIP International Conference on Dependable Systems and Networks (DSN’07), 575–584 (2007) Vaswani et al. (2017) Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, L., Polosukhin, I.: Attention is all you need. CoRR abs/1706.03762 (2017) 1706.03762 Schuster and Nakajima (2012) Schuster, M., Nakajima, K.: Japanese and korean voice search. In: 2012 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), pp. 5149–5152 (2012). https://doi.org/10.1109/ICASSP.2012.6289079 Liu et al. (2019) Liu, Y., Ott, M., Goyal, N., Du, J., Joshi, M., Chen, D., Levy, O., Lewis, M., Zettlemoyer, L., Stoyanov, V.: Roberta: A robustly optimized BERT pretraining approach. CoRR abs/1907.11692 (2019) 1907.11692 Sanh et al. (2019) Sanh, V., Debut, L., Chaumond, J., Wolf, T.: Distilbert, a distilled version of bert: smaller, faster, cheaper and lighter. ArXiv abs/1910.01108 (2019) Radford et al. (2019) Radford, A., Wu, J., Child, R., Luan, D., Amodei, D., Sutskever, I.: Language models are unsupervised multitask learners (2019) Black et al. (2021) Black, S., Leo, G., Wang, P., Leahy, C., Biderman, S.: GPT-Neo: Large Scale Autoregressive Language Modeling with Mesh-Tensorflow. https://doi.org/10.5281/zenodo.5297715 . If you use this software, please cite it using these metadata. https://doi.org/10.5281/zenodo.5297715 Kokalj et al. (2021) Kokalj, E., Škrlj, B., Lavrač, N., Pollak, S., Robnik-Šikonja, M.: BERT meets shapley: Extending SHAP explanations to transformer-based classifiers. In: Toivonen, H., Boggia, M. (eds.) Proceedings of the EACL Hackashop on News Media Content Analysis and Automated Report Generation, pp. 16–21. Association for Computational Linguistics, Online (2021). https://aclanthology.org/2021.hackashop-1.3 van der Maaten and Hinton (2008) Maaten, L., Hinton, G.: Visualizing data using t-SNE. Journal of Machine Learning Research 9, 2579–2605 (2008) Mohri et al. (2012) Mohri, M., Rostamizadeh, A., Talwalkar, A.: Foundations of Machine Learning. The MIT Press, ??? (2012) Shao, Y., Zhang, W., Liu, P., Huyue, R., Tang, R., Yin, Q., Li, Q.: Log anomaly detection method based on bert model optimization. In: 2022 7th International Conference on Cloud Computing and Big Data Analytics (ICCCBDA), pp. 161–166 (2022). https://doi.org/10.1109/ICCCBDA55098.2022.9778900 Guo et al. (2022) Guo, H., Lin, X., Yang, J., Zhuang, Y., Bai, J., Zheng, T., Zhang, B., Li, Z.: TransLog: A Unified Transformer-based Framework for Log Anomaly Detection (2022) Le and Zhang (2021) Le, V.-H., Zhang, H.: Log-based Anomaly Detection Without Log Parsing (2021) Gniewkowski et al. (2023) Gniewkowski, M., Maciejewski, H., Surmacz, T., Walentynowicz, W.: Sec2vec: Anomaly detection in http traffic and malicious urls. In: Proceedings of the 38th ACM/SIGAPP Symposium on Applied Computing. SAC ’23, pp. 1154–1162. Association for Computing Machinery, New York, NY, USA (2023). https://doi.org/10.1145/3555776.3577663 . https://doi.org/10.1145/3555776.3577663 Hilmi et al. (2020) Hilmi, M.A.A., Cahyanto, K.A., Mustamiin, M.: Apache Web Server - Access Log Pre-processing for Web Intrusion Detection. IEEE Dataport (2020). https://doi.org/10.21227/vvvq-6w47 . 
https://dx.doi.org/10.21227/vvvq-6w47 Giménez et al. (2010) Giménez, C.T., Villegas, A.P., Marañón, G. .: HTTP Dataset CSIC 2010. https://doi.org/10.7910/DVN/3QBYB5 . https://www.tic.itefi.csic.es/dataset/ ECML/PKDD (2021) ECML/PKDD: ECML/PKDD 2007 Discovery Challenge. https://gitlab.fing.edu.uy/gsi/web-application-attacks-datasets/-/tree/master/ecml_pkdd (2021) Oliner and Stearley (2007) Oliner, A.J., Stearley, J.: What supercomputers say: A study of five system logs. 37th Annual IEEE/IFIP International Conference on Dependable Systems and Networks (DSN’07), 575–584 (2007) Vaswani et al. (2017) Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, L., Polosukhin, I.: Attention is all you need. CoRR abs/1706.03762 (2017) 1706.03762 Schuster and Nakajima (2012) Schuster, M., Nakajima, K.: Japanese and korean voice search. In: 2012 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), pp. 5149–5152 (2012). https://doi.org/10.1109/ICASSP.2012.6289079 Liu et al. (2019) Liu, Y., Ott, M., Goyal, N., Du, J., Joshi, M., Chen, D., Levy, O., Lewis, M., Zettlemoyer, L., Stoyanov, V.: Roberta: A robustly optimized BERT pretraining approach. CoRR abs/1907.11692 (2019) 1907.11692 Sanh et al. (2019) Sanh, V., Debut, L., Chaumond, J., Wolf, T.: Distilbert, a distilled version of bert: smaller, faster, cheaper and lighter. ArXiv abs/1910.01108 (2019) Radford et al. (2019) Radford, A., Wu, J., Child, R., Luan, D., Amodei, D., Sutskever, I.: Language models are unsupervised multitask learners (2019) Black et al. (2021) Black, S., Leo, G., Wang, P., Leahy, C., Biderman, S.: GPT-Neo: Large Scale Autoregressive Language Modeling with Mesh-Tensorflow. https://doi.org/10.5281/zenodo.5297715 . If you use this software, please cite it using these metadata. https://doi.org/10.5281/zenodo.5297715 Kokalj et al. (2021) Kokalj, E., Škrlj, B., Lavrač, N., Pollak, S., Robnik-Šikonja, M.: BERT meets shapley: Extending SHAP explanations to transformer-based classifiers. In: Toivonen, H., Boggia, M. (eds.) Proceedings of the EACL Hackashop on News Media Content Analysis and Automated Report Generation, pp. 16–21. Association for Computational Linguistics, Online (2021). https://aclanthology.org/2021.hackashop-1.3 van der Maaten and Hinton (2008) Maaten, L., Hinton, G.: Visualizing data using t-SNE. Journal of Machine Learning Research 9, 2579–2605 (2008) Mohri et al. (2012) Mohri, M., Rostamizadeh, A., Talwalkar, A.: Foundations of Machine Learning. The MIT Press, ??? (2012) Guo, H., Lin, X., Yang, J., Zhuang, Y., Bai, J., Zheng, T., Zhang, B., Li, Z.: TransLog: A Unified Transformer-based Framework for Log Anomaly Detection (2022) Le and Zhang (2021) Le, V.-H., Zhang, H.: Log-based Anomaly Detection Without Log Parsing (2021) Gniewkowski et al. (2023) Gniewkowski, M., Maciejewski, H., Surmacz, T., Walentynowicz, W.: Sec2vec: Anomaly detection in http traffic and malicious urls. In: Proceedings of the 38th ACM/SIGAPP Symposium on Applied Computing. SAC ’23, pp. 1154–1162. Association for Computing Machinery, New York, NY, USA (2023). https://doi.org/10.1145/3555776.3577663 . https://doi.org/10.1145/3555776.3577663 Hilmi et al. (2020) Hilmi, M.A.A., Cahyanto, K.A., Mustamiin, M.: Apache Web Server - Access Log Pre-processing for Web Intrusion Detection. IEEE Dataport (2020). https://doi.org/10.21227/vvvq-6w47 . https://dx.doi.org/10.21227/vvvq-6w47 Giménez et al. (2010) Giménez, C.T., Villegas, A.P., Marañón, G. .: HTTP Dataset CSIC 2010. 
https://doi.org/10.7910/DVN/3QBYB5 . https://www.tic.itefi.csic.es/dataset/ ECML/PKDD (2021) ECML/PKDD: ECML/PKDD 2007 Discovery Challenge. https://gitlab.fing.edu.uy/gsi/web-application-attacks-datasets/-/tree/master/ecml_pkdd (2021) Oliner and Stearley (2007) Oliner, A.J., Stearley, J.: What supercomputers say: A study of five system logs. 37th Annual IEEE/IFIP International Conference on Dependable Systems and Networks (DSN’07), 575–584 (2007) Vaswani et al. (2017) Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, L., Polosukhin, I.: Attention is all you need. CoRR abs/1706.03762 (2017) 1706.03762 Schuster and Nakajima (2012) Schuster, M., Nakajima, K.: Japanese and korean voice search. In: 2012 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), pp. 5149–5152 (2012). https://doi.org/10.1109/ICASSP.2012.6289079 Liu et al. (2019) Liu, Y., Ott, M., Goyal, N., Du, J., Joshi, M., Chen, D., Levy, O., Lewis, M., Zettlemoyer, L., Stoyanov, V.: Roberta: A robustly optimized BERT pretraining approach. CoRR abs/1907.11692 (2019) 1907.11692 Sanh et al. (2019) Sanh, V., Debut, L., Chaumond, J., Wolf, T.: Distilbert, a distilled version of bert: smaller, faster, cheaper and lighter. ArXiv abs/1910.01108 (2019) Radford et al. (2019) Radford, A., Wu, J., Child, R., Luan, D., Amodei, D., Sutskever, I.: Language models are unsupervised multitask learners (2019) Black et al. (2021) Black, S., Leo, G., Wang, P., Leahy, C., Biderman, S.: GPT-Neo: Large Scale Autoregressive Language Modeling with Mesh-Tensorflow. https://doi.org/10.5281/zenodo.5297715 . If you use this software, please cite it using these metadata. https://doi.org/10.5281/zenodo.5297715 Kokalj et al. (2021) Kokalj, E., Škrlj, B., Lavrač, N., Pollak, S., Robnik-Šikonja, M.: BERT meets shapley: Extending SHAP explanations to transformer-based classifiers. In: Toivonen, H., Boggia, M. (eds.) Proceedings of the EACL Hackashop on News Media Content Analysis and Automated Report Generation, pp. 16–21. Association for Computational Linguistics, Online (2021). https://aclanthology.org/2021.hackashop-1.3 van der Maaten and Hinton (2008) Maaten, L., Hinton, G.: Visualizing data using t-SNE. Journal of Machine Learning Research 9, 2579–2605 (2008) Mohri et al. (2012) Mohri, M., Rostamizadeh, A., Talwalkar, A.: Foundations of Machine Learning. The MIT Press, ??? (2012) Le, V.-H., Zhang, H.: Log-based Anomaly Detection Without Log Parsing (2021) Gniewkowski et al. (2023) Gniewkowski, M., Maciejewski, H., Surmacz, T., Walentynowicz, W.: Sec2vec: Anomaly detection in http traffic and malicious urls. In: Proceedings of the 38th ACM/SIGAPP Symposium on Applied Computing. SAC ’23, pp. 1154–1162. Association for Computing Machinery, New York, NY, USA (2023). https://doi.org/10.1145/3555776.3577663 . https://doi.org/10.1145/3555776.3577663 Hilmi et al. (2020) Hilmi, M.A.A., Cahyanto, K.A., Mustamiin, M.: Apache Web Server - Access Log Pre-processing for Web Intrusion Detection. IEEE Dataport (2020). https://doi.org/10.21227/vvvq-6w47 . https://dx.doi.org/10.21227/vvvq-6w47 Giménez et al. (2010) Giménez, C.T., Villegas, A.P., Marañón, G. .: HTTP Dataset CSIC 2010. https://doi.org/10.7910/DVN/3QBYB5 . https://www.tic.itefi.csic.es/dataset/ ECML/PKDD (2021) ECML/PKDD: ECML/PKDD 2007 Discovery Challenge. 
https://gitlab.fing.edu.uy/gsi/web-application-attacks-datasets/-/tree/master/ecml_pkdd (2021) Oliner and Stearley (2007) Oliner, A.J., Stearley, J.: What supercomputers say: A study of five system logs. 37th Annual IEEE/IFIP International Conference on Dependable Systems and Networks (DSN’07), 575–584 (2007) Vaswani et al. (2017) Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, L., Polosukhin, I.: Attention is all you need. CoRR abs/1706.03762 (2017) 1706.03762 Schuster and Nakajima (2012) Schuster, M., Nakajima, K.: Japanese and korean voice search. In: 2012 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), pp. 5149–5152 (2012). https://doi.org/10.1109/ICASSP.2012.6289079 Liu et al. (2019) Liu, Y., Ott, M., Goyal, N., Du, J., Joshi, M., Chen, D., Levy, O., Lewis, M., Zettlemoyer, L., Stoyanov, V.: Roberta: A robustly optimized BERT pretraining approach. CoRR abs/1907.11692 (2019) 1907.11692 Sanh et al. (2019) Sanh, V., Debut, L., Chaumond, J., Wolf, T.: Distilbert, a distilled version of bert: smaller, faster, cheaper and lighter. ArXiv abs/1910.01108 (2019) Radford et al. (2019) Radford, A., Wu, J., Child, R., Luan, D., Amodei, D., Sutskever, I.: Language models are unsupervised multitask learners (2019) Black et al. (2021) Black, S., Leo, G., Wang, P., Leahy, C., Biderman, S.: GPT-Neo: Large Scale Autoregressive Language Modeling with Mesh-Tensorflow. https://doi.org/10.5281/zenodo.5297715 . If you use this software, please cite it using these metadata. https://doi.org/10.5281/zenodo.5297715 Kokalj et al. (2021) Kokalj, E., Škrlj, B., Lavrač, N., Pollak, S., Robnik-Šikonja, M.: BERT meets shapley: Extending SHAP explanations to transformer-based classifiers. In: Toivonen, H., Boggia, M. (eds.) Proceedings of the EACL Hackashop on News Media Content Analysis and Automated Report Generation, pp. 16–21. Association for Computational Linguistics, Online (2021). https://aclanthology.org/2021.hackashop-1.3 van der Maaten and Hinton (2008) Maaten, L., Hinton, G.: Visualizing data using t-SNE. Journal of Machine Learning Research 9, 2579–2605 (2008) Mohri et al. (2012) Mohri, M., Rostamizadeh, A., Talwalkar, A.: Foundations of Machine Learning. The MIT Press, ??? (2012) Gniewkowski, M., Maciejewski, H., Surmacz, T., Walentynowicz, W.: Sec2vec: Anomaly detection in http traffic and malicious urls. In: Proceedings of the 38th ACM/SIGAPP Symposium on Applied Computing. SAC ’23, pp. 1154–1162. Association for Computing Machinery, New York, NY, USA (2023). https://doi.org/10.1145/3555776.3577663 . https://doi.org/10.1145/3555776.3577663 Hilmi et al. (2020) Hilmi, M.A.A., Cahyanto, K.A., Mustamiin, M.: Apache Web Server - Access Log Pre-processing for Web Intrusion Detection. IEEE Dataport (2020). https://doi.org/10.21227/vvvq-6w47 . https://dx.doi.org/10.21227/vvvq-6w47 Giménez et al. (2010) Giménez, C.T., Villegas, A.P., Marañón, G. .: HTTP Dataset CSIC 2010. https://doi.org/10.7910/DVN/3QBYB5 . https://www.tic.itefi.csic.es/dataset/ ECML/PKDD (2021) ECML/PKDD: ECML/PKDD 2007 Discovery Challenge. https://gitlab.fing.edu.uy/gsi/web-application-attacks-datasets/-/tree/master/ecml_pkdd (2021) Oliner and Stearley (2007) Oliner, A.J., Stearley, J.: What supercomputers say: A study of five system logs. 37th Annual IEEE/IFIP International Conference on Dependable Systems and Networks (DSN’07), 575–584 (2007) Vaswani et al. 
(2017) Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, L., Polosukhin, I.: Attention is all you need. CoRR abs/1706.03762 (2017) 1706.03762 Schuster and Nakajima (2012) Schuster, M., Nakajima, K.: Japanese and korean voice search. In: 2012 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), pp. 5149–5152 (2012). https://doi.org/10.1109/ICASSP.2012.6289079 Liu et al. (2019) Liu, Y., Ott, M., Goyal, N., Du, J., Joshi, M., Chen, D., Levy, O., Lewis, M., Zettlemoyer, L., Stoyanov, V.: Roberta: A robustly optimized BERT pretraining approach. CoRR abs/1907.11692 (2019) 1907.11692 Sanh et al. (2019) Sanh, V., Debut, L., Chaumond, J., Wolf, T.: Distilbert, a distilled version of bert: smaller, faster, cheaper and lighter. ArXiv abs/1910.01108 (2019) Radford et al. (2019) Radford, A., Wu, J., Child, R., Luan, D., Amodei, D., Sutskever, I.: Language models are unsupervised multitask learners (2019) Black et al. (2021) Black, S., Leo, G., Wang, P., Leahy, C., Biderman, S.: GPT-Neo: Large Scale Autoregressive Language Modeling with Mesh-Tensorflow. https://doi.org/10.5281/zenodo.5297715 . If you use this software, please cite it using these metadata. https://doi.org/10.5281/zenodo.5297715 Kokalj et al. (2021) Kokalj, E., Škrlj, B., Lavrač, N., Pollak, S., Robnik-Šikonja, M.: BERT meets shapley: Extending SHAP explanations to transformer-based classifiers. In: Toivonen, H., Boggia, M. (eds.) Proceedings of the EACL Hackashop on News Media Content Analysis and Automated Report Generation, pp. 16–21. Association for Computational Linguistics, Online (2021). https://aclanthology.org/2021.hackashop-1.3 van der Maaten and Hinton (2008) Maaten, L., Hinton, G.: Visualizing data using t-SNE. Journal of Machine Learning Research 9, 2579–2605 (2008) Mohri et al. (2012) Mohri, M., Rostamizadeh, A., Talwalkar, A.: Foundations of Machine Learning. The MIT Press, ??? (2012) Hilmi, M.A.A., Cahyanto, K.A., Mustamiin, M.: Apache Web Server - Access Log Pre-processing for Web Intrusion Detection. IEEE Dataport (2020). https://doi.org/10.21227/vvvq-6w47 . https://dx.doi.org/10.21227/vvvq-6w47 Giménez et al. (2010) Giménez, C.T., Villegas, A.P., Marañón, G. .: HTTP Dataset CSIC 2010. https://doi.org/10.7910/DVN/3QBYB5 . https://www.tic.itefi.csic.es/dataset/ ECML/PKDD (2021) ECML/PKDD: ECML/PKDD 2007 Discovery Challenge. https://gitlab.fing.edu.uy/gsi/web-application-attacks-datasets/-/tree/master/ecml_pkdd (2021) Oliner and Stearley (2007) Oliner, A.J., Stearley, J.: What supercomputers say: A study of five system logs. 37th Annual IEEE/IFIP International Conference on Dependable Systems and Networks (DSN’07), 575–584 (2007) Vaswani et al. (2017) Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, L., Polosukhin, I.: Attention is all you need. CoRR abs/1706.03762 (2017) 1706.03762 Schuster and Nakajima (2012) Schuster, M., Nakajima, K.: Japanese and korean voice search. In: 2012 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), pp. 5149–5152 (2012). https://doi.org/10.1109/ICASSP.2012.6289079 Liu et al. (2019) Liu, Y., Ott, M., Goyal, N., Du, J., Joshi, M., Chen, D., Levy, O., Lewis, M., Zettlemoyer, L., Stoyanov, V.: Roberta: A robustly optimized BERT pretraining approach. CoRR abs/1907.11692 (2019) 1907.11692 Sanh et al. (2019) Sanh, V., Debut, L., Chaumond, J., Wolf, T.: Distilbert, a distilled version of bert: smaller, faster, cheaper and lighter. 
ArXiv abs/1910.01108 (2019) Radford et al. (2019) Radford, A., Wu, J., Child, R., Luan, D., Amodei, D., Sutskever, I.: Language models are unsupervised multitask learners (2019) Black et al. (2021) Black, S., Leo, G., Wang, P., Leahy, C., Biderman, S.: GPT-Neo: Large Scale Autoregressive Language Modeling with Mesh-Tensorflow. https://doi.org/10.5281/zenodo.5297715 . If you use this software, please cite it using these metadata. https://doi.org/10.5281/zenodo.5297715 Kokalj et al. (2021) Kokalj, E., Škrlj, B., Lavrač, N., Pollak, S., Robnik-Šikonja, M.: BERT meets shapley: Extending SHAP explanations to transformer-based classifiers. In: Toivonen, H., Boggia, M. (eds.) Proceedings of the EACL Hackashop on News Media Content Analysis and Automated Report Generation, pp. 16–21. Association for Computational Linguistics, Online (2021). https://aclanthology.org/2021.hackashop-1.3 van der Maaten and Hinton (2008) Maaten, L., Hinton, G.: Visualizing data using t-SNE. Journal of Machine Learning Research 9, 2579–2605 (2008) Mohri et al. (2012) Mohri, M., Rostamizadeh, A., Talwalkar, A.: Foundations of Machine Learning. The MIT Press, ??? (2012) Giménez, C.T., Villegas, A.P., Marañón, G. .: HTTP Dataset CSIC 2010. https://doi.org/10.7910/DVN/3QBYB5 . https://www.tic.itefi.csic.es/dataset/ ECML/PKDD (2021) ECML/PKDD: ECML/PKDD 2007 Discovery Challenge. https://gitlab.fing.edu.uy/gsi/web-application-attacks-datasets/-/tree/master/ecml_pkdd (2021) Oliner and Stearley (2007) Oliner, A.J., Stearley, J.: What supercomputers say: A study of five system logs. 37th Annual IEEE/IFIP International Conference on Dependable Systems and Networks (DSN’07), 575–584 (2007) Vaswani et al. (2017) Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, L., Polosukhin, I.: Attention is all you need. CoRR abs/1706.03762 (2017) 1706.03762 Schuster and Nakajima (2012) Schuster, M., Nakajima, K.: Japanese and korean voice search. In: 2012 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), pp. 5149–5152 (2012). https://doi.org/10.1109/ICASSP.2012.6289079 Liu et al. (2019) Liu, Y., Ott, M., Goyal, N., Du, J., Joshi, M., Chen, D., Levy, O., Lewis, M., Zettlemoyer, L., Stoyanov, V.: Roberta: A robustly optimized BERT pretraining approach. CoRR abs/1907.11692 (2019) 1907.11692 Sanh et al. (2019) Sanh, V., Debut, L., Chaumond, J., Wolf, T.: Distilbert, a distilled version of bert: smaller, faster, cheaper and lighter. ArXiv abs/1910.01108 (2019) Radford et al. (2019) Radford, A., Wu, J., Child, R., Luan, D., Amodei, D., Sutskever, I.: Language models are unsupervised multitask learners (2019) Black et al. (2021) Black, S., Leo, G., Wang, P., Leahy, C., Biderman, S.: GPT-Neo: Large Scale Autoregressive Language Modeling with Mesh-Tensorflow. https://doi.org/10.5281/zenodo.5297715 . If you use this software, please cite it using these metadata. https://doi.org/10.5281/zenodo.5297715 Kokalj et al. (2021) Kokalj, E., Škrlj, B., Lavrač, N., Pollak, S., Robnik-Šikonja, M.: BERT meets shapley: Extending SHAP explanations to transformer-based classifiers. In: Toivonen, H., Boggia, M. (eds.) Proceedings of the EACL Hackashop on News Media Content Analysis and Automated Report Generation, pp. 16–21. Association for Computational Linguistics, Online (2021). https://aclanthology.org/2021.hackashop-1.3 van der Maaten and Hinton (2008) Maaten, L., Hinton, G.: Visualizing data using t-SNE. Journal of Machine Learning Research 9, 2579–2605 (2008) Mohri et al. 
(2012) Mohri, M., Rostamizadeh, A., Talwalkar, A.: Foundations of Machine Learning. The MIT Press, ??? (2012) ECML/PKDD: ECML/PKDD 2007 Discovery Challenge. https://gitlab.fing.edu.uy/gsi/web-application-attacks-datasets/-/tree/master/ecml_pkdd (2021) Oliner and Stearley (2007) Oliner, A.J., Stearley, J.: What supercomputers say: A study of five system logs. 37th Annual IEEE/IFIP International Conference on Dependable Systems and Networks (DSN’07), 575–584 (2007) Vaswani et al. (2017) Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, L., Polosukhin, I.: Attention is all you need. CoRR abs/1706.03762 (2017) 1706.03762 Schuster and Nakajima (2012) Schuster, M., Nakajima, K.: Japanese and korean voice search. In: 2012 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), pp. 5149–5152 (2012). https://doi.org/10.1109/ICASSP.2012.6289079 Liu et al. (2019) Liu, Y., Ott, M., Goyal, N., Du, J., Joshi, M., Chen, D., Levy, O., Lewis, M., Zettlemoyer, L., Stoyanov, V.: Roberta: A robustly optimized BERT pretraining approach. CoRR abs/1907.11692 (2019) 1907.11692 Sanh et al. (2019) Sanh, V., Debut, L., Chaumond, J., Wolf, T.: Distilbert, a distilled version of bert: smaller, faster, cheaper and lighter. ArXiv abs/1910.01108 (2019) Radford et al. (2019) Radford, A., Wu, J., Child, R., Luan, D., Amodei, D., Sutskever, I.: Language models are unsupervised multitask learners (2019) Black et al. (2021) Black, S., Leo, G., Wang, P., Leahy, C., Biderman, S.: GPT-Neo: Large Scale Autoregressive Language Modeling with Mesh-Tensorflow. https://doi.org/10.5281/zenodo.5297715 . If you use this software, please cite it using these metadata. https://doi.org/10.5281/zenodo.5297715 Kokalj et al. (2021) Kokalj, E., Škrlj, B., Lavrač, N., Pollak, S., Robnik-Šikonja, M.: BERT meets shapley: Extending SHAP explanations to transformer-based classifiers. In: Toivonen, H., Boggia, M. (eds.) Proceedings of the EACL Hackashop on News Media Content Analysis and Automated Report Generation, pp. 16–21. Association for Computational Linguistics, Online (2021). https://aclanthology.org/2021.hackashop-1.3 van der Maaten and Hinton (2008) Maaten, L., Hinton, G.: Visualizing data using t-SNE. Journal of Machine Learning Research 9, 2579–2605 (2008) Mohri et al. (2012) Mohri, M., Rostamizadeh, A., Talwalkar, A.: Foundations of Machine Learning. The MIT Press, ??? (2012) Oliner, A.J., Stearley, J.: What supercomputers say: A study of five system logs. 37th Annual IEEE/IFIP International Conference on Dependable Systems and Networks (DSN’07), 575–584 (2007) Vaswani et al. (2017) Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, L., Polosukhin, I.: Attention is all you need. CoRR abs/1706.03762 (2017) 1706.03762 Schuster and Nakajima (2012) Schuster, M., Nakajima, K.: Japanese and korean voice search. In: 2012 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), pp. 5149–5152 (2012). https://doi.org/10.1109/ICASSP.2012.6289079 Liu et al. (2019) Liu, Y., Ott, M., Goyal, N., Du, J., Joshi, M., Chen, D., Levy, O., Lewis, M., Zettlemoyer, L., Stoyanov, V.: Roberta: A robustly optimized BERT pretraining approach. CoRR abs/1907.11692 (2019) 1907.11692 Sanh et al. (2019) Sanh, V., Debut, L., Chaumond, J., Wolf, T.: Distilbert, a distilled version of bert: smaller, faster, cheaper and lighter. ArXiv abs/1910.01108 (2019) Radford et al. 
(2019) Radford, A., Wu, J., Child, R., Luan, D., Amodei, D., Sutskever, I.: Language models are unsupervised multitask learners (2019) Black et al. (2021) Black, S., Leo, G., Wang, P., Leahy, C., Biderman, S.: GPT-Neo: Large Scale Autoregressive Language Modeling with Mesh-Tensorflow. https://doi.org/10.5281/zenodo.5297715 . If you use this software, please cite it using these metadata. https://doi.org/10.5281/zenodo.5297715 Kokalj et al. (2021) Kokalj, E., Škrlj, B., Lavrač, N., Pollak, S., Robnik-Šikonja, M.: BERT meets shapley: Extending SHAP explanations to transformer-based classifiers. In: Toivonen, H., Boggia, M. (eds.) Proceedings of the EACL Hackashop on News Media Content Analysis and Automated Report Generation, pp. 16–21. Association for Computational Linguistics, Online (2021). https://aclanthology.org/2021.hackashop-1.3 van der Maaten and Hinton (2008) Maaten, L., Hinton, G.: Visualizing data using t-SNE. Journal of Machine Learning Research 9, 2579–2605 (2008) Mohri et al. (2012) Mohri, M., Rostamizadeh, A., Talwalkar, A.: Foundations of Machine Learning. The MIT Press, ??? (2012) Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, L., Polosukhin, I.: Attention is all you need. CoRR abs/1706.03762 (2017) 1706.03762 Schuster and Nakajima (2012) Schuster, M., Nakajima, K.: Japanese and korean voice search. In: 2012 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), pp. 5149–5152 (2012). https://doi.org/10.1109/ICASSP.2012.6289079 Liu et al. (2019) Liu, Y., Ott, M., Goyal, N., Du, J., Joshi, M., Chen, D., Levy, O., Lewis, M., Zettlemoyer, L., Stoyanov, V.: Roberta: A robustly optimized BERT pretraining approach. CoRR abs/1907.11692 (2019) 1907.11692 Sanh et al. (2019) Sanh, V., Debut, L., Chaumond, J., Wolf, T.: Distilbert, a distilled version of bert: smaller, faster, cheaper and lighter. ArXiv abs/1910.01108 (2019) Radford et al. (2019) Radford, A., Wu, J., Child, R., Luan, D., Amodei, D., Sutskever, I.: Language models are unsupervised multitask learners (2019) Black et al. (2021) Black, S., Leo, G., Wang, P., Leahy, C., Biderman, S.: GPT-Neo: Large Scale Autoregressive Language Modeling with Mesh-Tensorflow. https://doi.org/10.5281/zenodo.5297715 . If you use this software, please cite it using these metadata. https://doi.org/10.5281/zenodo.5297715 Kokalj et al. (2021) Kokalj, E., Škrlj, B., Lavrač, N., Pollak, S., Robnik-Šikonja, M.: BERT meets shapley: Extending SHAP explanations to transformer-based classifiers. In: Toivonen, H., Boggia, M. (eds.) Proceedings of the EACL Hackashop on News Media Content Analysis and Automated Report Generation, pp. 16–21. Association for Computational Linguistics, Online (2021). https://aclanthology.org/2021.hackashop-1.3 van der Maaten and Hinton (2008) Maaten, L., Hinton, G.: Visualizing data using t-SNE. Journal of Machine Learning Research 9, 2579–2605 (2008) Mohri et al. (2012) Mohri, M., Rostamizadeh, A., Talwalkar, A.: Foundations of Machine Learning. The MIT Press, ??? (2012) Schuster, M., Nakajima, K.: Japanese and korean voice search. In: 2012 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), pp. 5149–5152 (2012). https://doi.org/10.1109/ICASSP.2012.6289079 Liu et al. (2019) Liu, Y., Ott, M., Goyal, N., Du, J., Joshi, M., Chen, D., Levy, O., Lewis, M., Zettlemoyer, L., Stoyanov, V.: Roberta: A robustly optimized BERT pretraining approach. CoRR abs/1907.11692 (2019) 1907.11692 Sanh et al. 
(2019) Sanh, V., Debut, L., Chaumond, J., Wolf, T.: Distilbert, a distilled version of bert: smaller, faster, cheaper and lighter. ArXiv abs/1910.01108 (2019) Radford et al. (2019) Radford, A., Wu, J., Child, R., Luan, D., Amodei, D., Sutskever, I.: Language models are unsupervised multitask learners (2019) Black et al. (2021) Black, S., Leo, G., Wang, P., Leahy, C., Biderman, S.: GPT-Neo: Large Scale Autoregressive Language Modeling with Mesh-Tensorflow. https://doi.org/10.5281/zenodo.5297715 . If you use this software, please cite it using these metadata. https://doi.org/10.5281/zenodo.5297715 Kokalj et al. (2021) Kokalj, E., Škrlj, B., Lavrač, N., Pollak, S., Robnik-Šikonja, M.: BERT meets shapley: Extending SHAP explanations to transformer-based classifiers. In: Toivonen, H., Boggia, M. (eds.) Proceedings of the EACL Hackashop on News Media Content Analysis and Automated Report Generation, pp. 16–21. Association for Computational Linguistics, Online (2021). https://aclanthology.org/2021.hackashop-1.3 van der Maaten and Hinton (2008) Maaten, L., Hinton, G.: Visualizing data using t-SNE. Journal of Machine Learning Research 9, 2579–2605 (2008) Mohri et al. (2012) Mohri, M., Rostamizadeh, A., Talwalkar, A.: Foundations of Machine Learning. The MIT Press, ??? (2012) Liu, Y., Ott, M., Goyal, N., Du, J., Joshi, M., Chen, D., Levy, O., Lewis, M., Zettlemoyer, L., Stoyanov, V.: Roberta: A robustly optimized BERT pretraining approach. CoRR abs/1907.11692 (2019) 1907.11692 Sanh et al. (2019) Sanh, V., Debut, L., Chaumond, J., Wolf, T.: Distilbert, a distilled version of bert: smaller, faster, cheaper and lighter. ArXiv abs/1910.01108 (2019) Radford et al. (2019) Radford, A., Wu, J., Child, R., Luan, D., Amodei, D., Sutskever, I.: Language models are unsupervised multitask learners (2019) Black et al. (2021) Black, S., Leo, G., Wang, P., Leahy, C., Biderman, S.: GPT-Neo: Large Scale Autoregressive Language Modeling with Mesh-Tensorflow. https://doi.org/10.5281/zenodo.5297715 . If you use this software, please cite it using these metadata. https://doi.org/10.5281/zenodo.5297715 Kokalj et al. (2021) Kokalj, E., Škrlj, B., Lavrač, N., Pollak, S., Robnik-Šikonja, M.: BERT meets shapley: Extending SHAP explanations to transformer-based classifiers. In: Toivonen, H., Boggia, M. (eds.) Proceedings of the EACL Hackashop on News Media Content Analysis and Automated Report Generation, pp. 16–21. Association for Computational Linguistics, Online (2021). https://aclanthology.org/2021.hackashop-1.3 van der Maaten and Hinton (2008) Maaten, L., Hinton, G.: Visualizing data using t-SNE. Journal of Machine Learning Research 9, 2579–2605 (2008) Mohri et al. (2012) Mohri, M., Rostamizadeh, A., Talwalkar, A.: Foundations of Machine Learning. The MIT Press, ??? (2012) Sanh, V., Debut, L., Chaumond, J., Wolf, T.: Distilbert, a distilled version of bert: smaller, faster, cheaper and lighter. ArXiv abs/1910.01108 (2019) Radford et al. (2019) Radford, A., Wu, J., Child, R., Luan, D., Amodei, D., Sutskever, I.: Language models are unsupervised multitask learners (2019) Black et al. (2021) Black, S., Leo, G., Wang, P., Leahy, C., Biderman, S.: GPT-Neo: Large Scale Autoregressive Language Modeling with Mesh-Tensorflow. https://doi.org/10.5281/zenodo.5297715 . If you use this software, please cite it using these metadata. https://doi.org/10.5281/zenodo.5297715 Kokalj et al. 
(2021) Kokalj, E., Škrlj, B., Lavrač, N., Pollak, S., Robnik-Šikonja, M.: BERT meets shapley: Extending SHAP explanations to transformer-based classifiers. In: Toivonen, H., Boggia, M. (eds.) Proceedings of the EACL Hackashop on News Media Content Analysis and Automated Report Generation, pp. 16–21. Association for Computational Linguistics, Online (2021). https://aclanthology.org/2021.hackashop-1.3 van der Maaten and Hinton (2008) Maaten, L., Hinton, G.: Visualizing data using t-SNE. Journal of Machine Learning Research 9, 2579–2605 (2008) Mohri et al. (2012) Mohri, M., Rostamizadeh, A., Talwalkar, A.: Foundations of Machine Learning. The MIT Press, ??? (2012) Radford, A., Wu, J., Child, R., Luan, D., Amodei, D., Sutskever, I.: Language models are unsupervised multitask learners (2019) Black et al. (2021) Black, S., Leo, G., Wang, P., Leahy, C., Biderman, S.: GPT-Neo: Large Scale Autoregressive Language Modeling with Mesh-Tensorflow. https://doi.org/10.5281/zenodo.5297715 . If you use this software, please cite it using these metadata. https://doi.org/10.5281/zenodo.5297715 Kokalj et al. (2021) Kokalj, E., Škrlj, B., Lavrač, N., Pollak, S., Robnik-Šikonja, M.: BERT meets shapley: Extending SHAP explanations to transformer-based classifiers. In: Toivonen, H., Boggia, M. (eds.) Proceedings of the EACL Hackashop on News Media Content Analysis and Automated Report Generation, pp. 16–21. Association for Computational Linguistics, Online (2021). https://aclanthology.org/2021.hackashop-1.3 van der Maaten and Hinton (2008) Maaten, L., Hinton, G.: Visualizing data using t-SNE. Journal of Machine Learning Research 9, 2579–2605 (2008) Mohri et al. (2012) Mohri, M., Rostamizadeh, A., Talwalkar, A.: Foundations of Machine Learning. The MIT Press, ??? (2012) Black, S., Leo, G., Wang, P., Leahy, C., Biderman, S.: GPT-Neo: Large Scale Autoregressive Language Modeling with Mesh-Tensorflow. https://doi.org/10.5281/zenodo.5297715 . If you use this software, please cite it using these metadata. https://doi.org/10.5281/zenodo.5297715 Kokalj et al. (2021) Kokalj, E., Škrlj, B., Lavrač, N., Pollak, S., Robnik-Šikonja, M.: BERT meets shapley: Extending SHAP explanations to transformer-based classifiers. In: Toivonen, H., Boggia, M. (eds.) Proceedings of the EACL Hackashop on News Media Content Analysis and Automated Report Generation, pp. 16–21. Association for Computational Linguistics, Online (2021). https://aclanthology.org/2021.hackashop-1.3 van der Maaten and Hinton (2008) Maaten, L., Hinton, G.: Visualizing data using t-SNE. Journal of Machine Learning Research 9, 2579–2605 (2008) Mohri et al. (2012) Mohri, M., Rostamizadeh, A., Talwalkar, A.: Foundations of Machine Learning. The MIT Press, ??? (2012) Kokalj, E., Škrlj, B., Lavrač, N., Pollak, S., Robnik-Šikonja, M.: BERT meets shapley: Extending SHAP explanations to transformer-based classifiers. In: Toivonen, H., Boggia, M. (eds.) Proceedings of the EACL Hackashop on News Media Content Analysis and Automated Report Generation, pp. 16–21. Association for Computational Linguistics, Online (2021). https://aclanthology.org/2021.hackashop-1.3 van der Maaten and Hinton (2008) Maaten, L., Hinton, G.: Visualizing data using t-SNE. Journal of Machine Learning Research 9, 2579–2605 (2008) Mohri et al. (2012) Mohri, M., Rostamizadeh, A., Talwalkar, A.: Foundations of Machine Learning. The MIT Press, ??? (2012) Maaten, L., Hinton, G.: Visualizing data using t-SNE. Journal of Machine Learning Research 9, 2579–2605 (2008) Mohri et al. 
(2012) Mohri, M., Rostamizadeh, A., Talwalkar, A.: Foundations of Machine Learning. The MIT Press, ??? (2012) Mohri, M., Rostamizadeh, A., Talwalkar, A.: Foundations of Machine Learning. The MIT Press, ??? (2012)
(2017) Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, L., Polosukhin, I.: Attention is all you need. CoRR abs/1706.03762 (2017) 1706.03762 Schuster and Nakajima (2012) Schuster, M., Nakajima, K.: Japanese and korean voice search. In: 2012 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), pp. 5149–5152 (2012). https://doi.org/10.1109/ICASSP.2012.6289079 Liu et al. (2019) Liu, Y., Ott, M., Goyal, N., Du, J., Joshi, M., Chen, D., Levy, O., Lewis, M., Zettlemoyer, L., Stoyanov, V.: Roberta: A robustly optimized BERT pretraining approach. CoRR abs/1907.11692 (2019) 1907.11692 Sanh et al. (2019) Sanh, V., Debut, L., Chaumond, J., Wolf, T.: Distilbert, a distilled version of bert: smaller, faster, cheaper and lighter. ArXiv abs/1910.01108 (2019) Radford et al. (2019) Radford, A., Wu, J., Child, R., Luan, D., Amodei, D., Sutskever, I.: Language models are unsupervised multitask learners (2019) Black et al. (2021) Black, S., Leo, G., Wang, P., Leahy, C., Biderman, S.: GPT-Neo: Large Scale Autoregressive Language Modeling with Mesh-Tensorflow. https://doi.org/10.5281/zenodo.5297715 . If you use this software, please cite it using these metadata. https://doi.org/10.5281/zenodo.5297715 Kokalj et al. (2021) Kokalj, E., Škrlj, B., Lavrač, N., Pollak, S., Robnik-Šikonja, M.: BERT meets shapley: Extending SHAP explanations to transformer-based classifiers. In: Toivonen, H., Boggia, M. (eds.) Proceedings of the EACL Hackashop on News Media Content Analysis and Automated Report Generation, pp. 16–21. Association for Computational Linguistics, Online (2021). https://aclanthology.org/2021.hackashop-1.3 van der Maaten and Hinton (2008) Maaten, L., Hinton, G.: Visualizing data using t-SNE. Journal of Machine Learning Research 9, 2579–2605 (2008) Mohri et al. (2012) Mohri, M., Rostamizadeh, A., Talwalkar, A.: Foundations of Machine Learning. The MIT Press, ??? (2012) Bhatnagar, M., Rozinaj, G., Yadav, P.K.: Web intrusion classification system using machine learning approaches. In: 2022 International Symposium ELMAR, pp. 57–60 (2022). https://doi.org/10.1109/ELMAR55880.2022.9899790 Farzad and Gulliver (2021) Farzad, A., Gulliver, T.A.: Log Message Anomaly Detection and Classification Using Auto-B/LSTM and Auto-GRU (2021) Tümer Sivri et al. (2022) Tümer Sivri, T., Pervan Akman, N., Berkol, A., Peker, C.: Web intrusion detection using character level machine learning approaches with upsampled data, pp. 269–274 (2022). https://doi.org/10.15439/2022F147 Adhikari and Bal (2023) Adhikari, A., Bal, B.K.: Machine learning technique for intrusion detection in the field of the intrusion detection system. (2023) Copstein et al. (2022) Copstein, R., Karlsen, E., Schwartzentruber, J., Zincir-Heywood, N., Heywood, M.: Exploring syntactical features for anomaly detection in application logs. it - Information Technology 64(1-2), 15–27 (2022) https://doi.org/10.1515/itit-2021-0064 Nam et al. (2022) Nam, S., Yoo, J.-H., Hong, J.W.-K.: Vm failure prediction with log analysis using bert-cnn model, 331–337 (2022) https://doi.org/10.23919/CNSM55787.2022.9965187 Wang et al. (2018) Wang, M., Xu, L., Guo, L.: Anomaly detection of system logs based on natural language processing and deep learning. In: 2018 4th International Conference on Frontiers of Signal Processing (ICFSP), pp. 140–144 (2018). https://doi.org/10.1109/ICFSP.2018.8552075 Qi et al. 
(2022) Qi, J., Luan, Z., Huang, S., Wang, Y., Fung, C., Yang, H., Qian, D.: Adanomaly: Adaptive anomaly detection for system logs with adversarial learning. In: NOMS 2022-2022 IEEE/IFIP Network Operations and Management Symposium, pp. 1–5. IEEE Press, ??? (2022). https://doi.org/10.1109/NOMS54207.2022.9789917 . https://doi.org/10.1109/NOMS54207.2022.9789917 Seyyar et al. (2022) Seyyar, Y.E., Yavuz, A.G., Ünver, H.M.: Detection of web attacks using the bert model. In: 2022 30th Signal Processing and Communications Applications Conference (SIU), pp. 1–4 (2022). https://doi.org/10.1109/SIU55565.2022.9864721 Guo et al. (2021) Guo, H., Yuan, S., Wu, X.: Logbert: Log anomaly detection via bert, 1–8 (2021). IEEE Shao et al. (2022) Shao, Y., Zhang, W., Liu, P., Huyue, R., Tang, R., Yin, Q., Li, Q.: Log anomaly detection method based on bert model optimization. In: 2022 7th International Conference on Cloud Computing and Big Data Analytics (ICCCBDA), pp. 161–166 (2022). https://doi.org/10.1109/ICCCBDA55098.2022.9778900 Guo et al. (2022) Guo, H., Lin, X., Yang, J., Zhuang, Y., Bai, J., Zheng, T., Zhang, B., Li, Z.: TransLog: A Unified Transformer-based Framework for Log Anomaly Detection (2022) Le and Zhang (2021) Le, V.-H., Zhang, H.: Log-based Anomaly Detection Without Log Parsing (2021) Gniewkowski et al. (2023) Gniewkowski, M., Maciejewski, H., Surmacz, T., Walentynowicz, W.: Sec2vec: Anomaly detection in http traffic and malicious urls. In: Proceedings of the 38th ACM/SIGAPP Symposium on Applied Computing. SAC ’23, pp. 1154–1162. Association for Computing Machinery, New York, NY, USA (2023). https://doi.org/10.1145/3555776.3577663 . https://doi.org/10.1145/3555776.3577663 Hilmi et al. (2020) Hilmi, M.A.A., Cahyanto, K.A., Mustamiin, M.: Apache Web Server - Access Log Pre-processing for Web Intrusion Detection. IEEE Dataport (2020). https://doi.org/10.21227/vvvq-6w47 . https://dx.doi.org/10.21227/vvvq-6w47 Giménez et al. (2010) Giménez, C.T., Villegas, A.P., Marañón, G. .: HTTP Dataset CSIC 2010. https://doi.org/10.7910/DVN/3QBYB5 . https://www.tic.itefi.csic.es/dataset/ ECML/PKDD (2021) ECML/PKDD: ECML/PKDD 2007 Discovery Challenge. https://gitlab.fing.edu.uy/gsi/web-application-attacks-datasets/-/tree/master/ecml_pkdd (2021) Oliner and Stearley (2007) Oliner, A.J., Stearley, J.: What supercomputers say: A study of five system logs. 37th Annual IEEE/IFIP International Conference on Dependable Systems and Networks (DSN’07), 575–584 (2007) Vaswani et al. (2017) Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, L., Polosukhin, I.: Attention is all you need. CoRR abs/1706.03762 (2017) 1706.03762 Schuster and Nakajima (2012) Schuster, M., Nakajima, K.: Japanese and korean voice search. In: 2012 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), pp. 5149–5152 (2012). https://doi.org/10.1109/ICASSP.2012.6289079 Liu et al. (2019) Liu, Y., Ott, M., Goyal, N., Du, J., Joshi, M., Chen, D., Levy, O., Lewis, M., Zettlemoyer, L., Stoyanov, V.: Roberta: A robustly optimized BERT pretraining approach. CoRR abs/1907.11692 (2019) 1907.11692 Sanh et al. (2019) Sanh, V., Debut, L., Chaumond, J., Wolf, T.: Distilbert, a distilled version of bert: smaller, faster, cheaper and lighter. ArXiv abs/1910.01108 (2019) Radford et al. (2019) Radford, A., Wu, J., Child, R., Luan, D., Amodei, D., Sutskever, I.: Language models are unsupervised multitask learners (2019) Black et al. 
(2021) Black, S., Leo, G., Wang, P., Leahy, C., Biderman, S.: GPT-Neo: Large Scale Autoregressive Language Modeling with Mesh-Tensorflow. https://doi.org/10.5281/zenodo.5297715 . If you use this software, please cite it using these metadata. https://doi.org/10.5281/zenodo.5297715 Kokalj et al. (2021) Kokalj, E., Škrlj, B., Lavrač, N., Pollak, S., Robnik-Šikonja, M.: BERT meets shapley: Extending SHAP explanations to transformer-based classifiers. In: Toivonen, H., Boggia, M. (eds.) Proceedings of the EACL Hackashop on News Media Content Analysis and Automated Report Generation, pp. 16–21. Association for Computational Linguistics, Online (2021). https://aclanthology.org/2021.hackashop-1.3 van der Maaten and Hinton (2008) Maaten, L., Hinton, G.: Visualizing data using t-SNE. Journal of Machine Learning Research 9, 2579–2605 (2008) Mohri et al. (2012) Mohri, M., Rostamizadeh, A., Talwalkar, A.: Foundations of Machine Learning. The MIT Press, ??? (2012) Farzad, A., Gulliver, T.A.: Log Message Anomaly Detection and Classification Using Auto-B/LSTM and Auto-GRU (2021) Tümer Sivri et al. (2022) Tümer Sivri, T., Pervan Akman, N., Berkol, A., Peker, C.: Web intrusion detection using character level machine learning approaches with upsampled data, pp. 269–274 (2022). https://doi.org/10.15439/2022F147 Adhikari and Bal (2023) Adhikari, A., Bal, B.K.: Machine learning technique for intrusion detection in the field of the intrusion detection system. (2023) Copstein et al. (2022) Copstein, R., Karlsen, E., Schwartzentruber, J., Zincir-Heywood, N., Heywood, M.: Exploring syntactical features for anomaly detection in application logs. it - Information Technology 64(1-2), 15–27 (2022) https://doi.org/10.1515/itit-2021-0064 Nam et al. (2022) Nam, S., Yoo, J.-H., Hong, J.W.-K.: Vm failure prediction with log analysis using bert-cnn model, 331–337 (2022) https://doi.org/10.23919/CNSM55787.2022.9965187 Wang et al. (2018) Wang, M., Xu, L., Guo, L.: Anomaly detection of system logs based on natural language processing and deep learning. In: 2018 4th International Conference on Frontiers of Signal Processing (ICFSP), pp. 140–144 (2018). https://doi.org/10.1109/ICFSP.2018.8552075 Qi et al. (2022) Qi, J., Luan, Z., Huang, S., Wang, Y., Fung, C., Yang, H., Qian, D.: Adanomaly: Adaptive anomaly detection for system logs with adversarial learning. In: NOMS 2022-2022 IEEE/IFIP Network Operations and Management Symposium, pp. 1–5. IEEE Press, ??? (2022). https://doi.org/10.1109/NOMS54207.2022.9789917 . https://doi.org/10.1109/NOMS54207.2022.9789917 Seyyar et al. (2022) Seyyar, Y.E., Yavuz, A.G., Ünver, H.M.: Detection of web attacks using the bert model. In: 2022 30th Signal Processing and Communications Applications Conference (SIU), pp. 1–4 (2022). https://doi.org/10.1109/SIU55565.2022.9864721 Guo et al. (2021) Guo, H., Yuan, S., Wu, X.: Logbert: Log anomaly detection via bert, 1–8 (2021). IEEE Shao et al. (2022) Shao, Y., Zhang, W., Liu, P., Huyue, R., Tang, R., Yin, Q., Li, Q.: Log anomaly detection method based on bert model optimization. In: 2022 7th International Conference on Cloud Computing and Big Data Analytics (ICCCBDA), pp. 161–166 (2022). https://doi.org/10.1109/ICCCBDA55098.2022.9778900 Guo et al. (2022) Guo, H., Lin, X., Yang, J., Zhuang, Y., Bai, J., Zheng, T., Zhang, B., Li, Z.: TransLog: A Unified Transformer-based Framework for Log Anomaly Detection (2022) Le and Zhang (2021) Le, V.-H., Zhang, H.: Log-based Anomaly Detection Without Log Parsing (2021) Gniewkowski et al. 
(2023) Gniewkowski, M., Maciejewski, H., Surmacz, T., Walentynowicz, W.: Sec2vec: Anomaly detection in http traffic and malicious urls. In: Proceedings of the 38th ACM/SIGAPP Symposium on Applied Computing. SAC ’23, pp. 1154–1162. Association for Computing Machinery, New York, NY, USA (2023). https://doi.org/10.1145/3555776.3577663 . https://doi.org/10.1145/3555776.3577663 Hilmi et al. (2020) Hilmi, M.A.A., Cahyanto, K.A., Mustamiin, M.: Apache Web Server - Access Log Pre-processing for Web Intrusion Detection. IEEE Dataport (2020). https://doi.org/10.21227/vvvq-6w47 . https://dx.doi.org/10.21227/vvvq-6w47 Giménez et al. (2010) Giménez, C.T., Villegas, A.P., Marañón, G. .: HTTP Dataset CSIC 2010. https://doi.org/10.7910/DVN/3QBYB5 . https://www.tic.itefi.csic.es/dataset/ ECML/PKDD (2021) ECML/PKDD: ECML/PKDD 2007 Discovery Challenge. https://gitlab.fing.edu.uy/gsi/web-application-attacks-datasets/-/tree/master/ecml_pkdd (2021) Oliner and Stearley (2007) Oliner, A.J., Stearley, J.: What supercomputers say: A study of five system logs. 37th Annual IEEE/IFIP International Conference on Dependable Systems and Networks (DSN’07), 575–584 (2007) Vaswani et al. (2017) Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, L., Polosukhin, I.: Attention is all you need. CoRR abs/1706.03762 (2017) 1706.03762 Schuster and Nakajima (2012) Schuster, M., Nakajima, K.: Japanese and korean voice search. In: 2012 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), pp. 5149–5152 (2012). https://doi.org/10.1109/ICASSP.2012.6289079 Liu et al. (2019) Liu, Y., Ott, M., Goyal, N., Du, J., Joshi, M., Chen, D., Levy, O., Lewis, M., Zettlemoyer, L., Stoyanov, V.: Roberta: A robustly optimized BERT pretraining approach. CoRR abs/1907.11692 (2019) 1907.11692 Sanh et al. (2019) Sanh, V., Debut, L., Chaumond, J., Wolf, T.: Distilbert, a distilled version of bert: smaller, faster, cheaper and lighter. ArXiv abs/1910.01108 (2019) Radford et al. (2019) Radford, A., Wu, J., Child, R., Luan, D., Amodei, D., Sutskever, I.: Language models are unsupervised multitask learners (2019) Black et al. (2021) Black, S., Leo, G., Wang, P., Leahy, C., Biderman, S.: GPT-Neo: Large Scale Autoregressive Language Modeling with Mesh-Tensorflow. https://doi.org/10.5281/zenodo.5297715 . If you use this software, please cite it using these metadata. https://doi.org/10.5281/zenodo.5297715 Kokalj et al. (2021) Kokalj, E., Škrlj, B., Lavrač, N., Pollak, S., Robnik-Šikonja, M.: BERT meets shapley: Extending SHAP explanations to transformer-based classifiers. In: Toivonen, H., Boggia, M. (eds.) Proceedings of the EACL Hackashop on News Media Content Analysis and Automated Report Generation, pp. 16–21. Association for Computational Linguistics, Online (2021). https://aclanthology.org/2021.hackashop-1.3 van der Maaten and Hinton (2008) Maaten, L., Hinton, G.: Visualizing data using t-SNE. Journal of Machine Learning Research 9, 2579–2605 (2008) Mohri et al. (2012) Mohri, M., Rostamizadeh, A., Talwalkar, A.: Foundations of Machine Learning. The MIT Press, ??? (2012) Tümer Sivri, T., Pervan Akman, N., Berkol, A., Peker, C.: Web intrusion detection using character level machine learning approaches with upsampled data, pp. 269–274 (2022). https://doi.org/10.15439/2022F147 Adhikari and Bal (2023) Adhikari, A., Bal, B.K.: Machine learning technique for intrusion detection in the field of the intrusion detection system. (2023) Copstein et al. 
(2022) Copstein, R., Karlsen, E., Schwartzentruber, J., Zincir-Heywood, N., Heywood, M.: Exploring syntactical features for anomaly detection in application logs. it - Information Technology 64(1-2), 15–27 (2022) https://doi.org/10.1515/itit-2021-0064 Nam et al. (2022) Nam, S., Yoo, J.-H., Hong, J.W.-K.: Vm failure prediction with log analysis using bert-cnn model, 331–337 (2022) https://doi.org/10.23919/CNSM55787.2022.9965187 Wang et al. (2018) Wang, M., Xu, L., Guo, L.: Anomaly detection of system logs based on natural language processing and deep learning. In: 2018 4th International Conference on Frontiers of Signal Processing (ICFSP), pp. 140–144 (2018). https://doi.org/10.1109/ICFSP.2018.8552075 Qi et al. (2022) Qi, J., Luan, Z., Huang, S., Wang, Y., Fung, C., Yang, H., Qian, D.: Adanomaly: Adaptive anomaly detection for system logs with adversarial learning. In: NOMS 2022-2022 IEEE/IFIP Network Operations and Management Symposium, pp. 1–5. IEEE Press, ??? (2022). https://doi.org/10.1109/NOMS54207.2022.9789917 . https://doi.org/10.1109/NOMS54207.2022.9789917 Seyyar et al. (2022) Seyyar, Y.E., Yavuz, A.G., Ünver, H.M.: Detection of web attacks using the bert model. In: 2022 30th Signal Processing and Communications Applications Conference (SIU), pp. 1–4 (2022). https://doi.org/10.1109/SIU55565.2022.9864721 Guo et al. (2021) Guo, H., Yuan, S., Wu, X.: Logbert: Log anomaly detection via bert, 1–8 (2021). IEEE Shao et al. (2022) Shao, Y., Zhang, W., Liu, P., Huyue, R., Tang, R., Yin, Q., Li, Q.: Log anomaly detection method based on bert model optimization. In: 2022 7th International Conference on Cloud Computing and Big Data Analytics (ICCCBDA), pp. 161–166 (2022). https://doi.org/10.1109/ICCCBDA55098.2022.9778900 Guo et al. (2022) Guo, H., Lin, X., Yang, J., Zhuang, Y., Bai, J., Zheng, T., Zhang, B., Li, Z.: TransLog: A Unified Transformer-based Framework for Log Anomaly Detection (2022) Le and Zhang (2021) Le, V.-H., Zhang, H.: Log-based Anomaly Detection Without Log Parsing (2021) Gniewkowski et al. (2023) Gniewkowski, M., Maciejewski, H., Surmacz, T., Walentynowicz, W.: Sec2vec: Anomaly detection in http traffic and malicious urls. In: Proceedings of the 38th ACM/SIGAPP Symposium on Applied Computing. SAC ’23, pp. 1154–1162. Association for Computing Machinery, New York, NY, USA (2023). https://doi.org/10.1145/3555776.3577663 . https://doi.org/10.1145/3555776.3577663 Hilmi et al. (2020) Hilmi, M.A.A., Cahyanto, K.A., Mustamiin, M.: Apache Web Server - Access Log Pre-processing for Web Intrusion Detection. IEEE Dataport (2020). https://doi.org/10.21227/vvvq-6w47 . https://dx.doi.org/10.21227/vvvq-6w47 Giménez et al. (2010) Giménez, C.T., Villegas, A.P., Marañón, G. .: HTTP Dataset CSIC 2010. https://doi.org/10.7910/DVN/3QBYB5 . https://www.tic.itefi.csic.es/dataset/ ECML/PKDD (2021) ECML/PKDD: ECML/PKDD 2007 Discovery Challenge. https://gitlab.fing.edu.uy/gsi/web-application-attacks-datasets/-/tree/master/ecml_pkdd (2021) Oliner and Stearley (2007) Oliner, A.J., Stearley, J.: What supercomputers say: A study of five system logs. 37th Annual IEEE/IFIP International Conference on Dependable Systems and Networks (DSN’07), 575–584 (2007) Vaswani et al. (2017) Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, L., Polosukhin, I.: Attention is all you need. CoRR abs/1706.03762 (2017) 1706.03762 Schuster and Nakajima (2012) Schuster, M., Nakajima, K.: Japanese and korean voice search. 
In: 2012 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), pp. 5149–5152 (2012). https://doi.org/10.1109/ICASSP.2012.6289079 Liu et al. (2019) Liu, Y., Ott, M., Goyal, N., Du, J., Joshi, M., Chen, D., Levy, O., Lewis, M., Zettlemoyer, L., Stoyanov, V.: Roberta: A robustly optimized BERT pretraining approach. CoRR abs/1907.11692 (2019) 1907.11692 Sanh et al. (2019) Sanh, V., Debut, L., Chaumond, J., Wolf, T.: Distilbert, a distilled version of bert: smaller, faster, cheaper and lighter. ArXiv abs/1910.01108 (2019) Radford et al. (2019) Radford, A., Wu, J., Child, R., Luan, D., Amodei, D., Sutskever, I.: Language models are unsupervised multitask learners (2019) Black et al. (2021) Black, S., Leo, G., Wang, P., Leahy, C., Biderman, S.: GPT-Neo: Large Scale Autoregressive Language Modeling with Mesh-Tensorflow. https://doi.org/10.5281/zenodo.5297715 . If you use this software, please cite it using these metadata. https://doi.org/10.5281/zenodo.5297715 Kokalj et al. (2021) Kokalj, E., Škrlj, B., Lavrač, N., Pollak, S., Robnik-Šikonja, M.: BERT meets shapley: Extending SHAP explanations to transformer-based classifiers. In: Toivonen, H., Boggia, M. (eds.) Proceedings of the EACL Hackashop on News Media Content Analysis and Automated Report Generation, pp. 16–21. Association for Computational Linguistics, Online (2021). https://aclanthology.org/2021.hackashop-1.3 van der Maaten and Hinton (2008) Maaten, L., Hinton, G.: Visualizing data using t-SNE. Journal of Machine Learning Research 9, 2579–2605 (2008) Mohri et al. (2012) Mohri, M., Rostamizadeh, A., Talwalkar, A.: Foundations of Machine Learning. The MIT Press, ??? (2012) Adhikari, A., Bal, B.K.: Machine learning technique for intrusion detection in the field of the intrusion detection system. (2023) Copstein et al. (2022) Copstein, R., Karlsen, E., Schwartzentruber, J., Zincir-Heywood, N., Heywood, M.: Exploring syntactical features for anomaly detection in application logs. it - Information Technology 64(1-2), 15–27 (2022) https://doi.org/10.1515/itit-2021-0064 Nam et al. (2022) Nam, S., Yoo, J.-H., Hong, J.W.-K.: Vm failure prediction with log analysis using bert-cnn model, 331–337 (2022) https://doi.org/10.23919/CNSM55787.2022.9965187 Wang et al. (2018) Wang, M., Xu, L., Guo, L.: Anomaly detection of system logs based on natural language processing and deep learning. In: 2018 4th International Conference on Frontiers of Signal Processing (ICFSP), pp. 140–144 (2018). https://doi.org/10.1109/ICFSP.2018.8552075 Qi et al. (2022) Qi, J., Luan, Z., Huang, S., Wang, Y., Fung, C., Yang, H., Qian, D.: Adanomaly: Adaptive anomaly detection for system logs with adversarial learning. In: NOMS 2022-2022 IEEE/IFIP Network Operations and Management Symposium, pp. 1–5. IEEE Press, ??? (2022). https://doi.org/10.1109/NOMS54207.2022.9789917 . https://doi.org/10.1109/NOMS54207.2022.9789917 Seyyar et al. (2022) Seyyar, Y.E., Yavuz, A.G., Ünver, H.M.: Detection of web attacks using the bert model. In: 2022 30th Signal Processing and Communications Applications Conference (SIU), pp. 1–4 (2022). https://doi.org/10.1109/SIU55565.2022.9864721 Guo et al. (2021) Guo, H., Yuan, S., Wu, X.: Logbert: Log anomaly detection via bert, 1–8 (2021). IEEE Shao et al. (2022) Shao, Y., Zhang, W., Liu, P., Huyue, R., Tang, R., Yin, Q., Li, Q.: Log anomaly detection method based on bert model optimization. In: 2022 7th International Conference on Cloud Computing and Big Data Analytics (ICCCBDA), pp. 161–166 (2022). 
https://doi.org/10.1109/ICCCBDA55098.2022.9778900 Guo et al. (2022) Guo, H., Lin, X., Yang, J., Zhuang, Y., Bai, J., Zheng, T., Zhang, B., Li, Z.: TransLog: A Unified Transformer-based Framework for Log Anomaly Detection (2022) Le and Zhang (2021) Le, V.-H., Zhang, H.: Log-based Anomaly Detection Without Log Parsing (2021) Gniewkowski et al. (2023) Gniewkowski, M., Maciejewski, H., Surmacz, T., Walentynowicz, W.: Sec2vec: Anomaly detection in http traffic and malicious urls. In: Proceedings of the 38th ACM/SIGAPP Symposium on Applied Computing. SAC ’23, pp. 1154–1162. Association for Computing Machinery, New York, NY, USA (2023). https://doi.org/10.1145/3555776.3577663 . https://doi.org/10.1145/3555776.3577663 Hilmi et al. (2020) Hilmi, M.A.A., Cahyanto, K.A., Mustamiin, M.: Apache Web Server - Access Log Pre-processing for Web Intrusion Detection. IEEE Dataport (2020). https://doi.org/10.21227/vvvq-6w47 . https://dx.doi.org/10.21227/vvvq-6w47 Giménez et al. (2010) Giménez, C.T., Villegas, A.P., Marañón, G. .: HTTP Dataset CSIC 2010. https://doi.org/10.7910/DVN/3QBYB5 . https://www.tic.itefi.csic.es/dataset/ ECML/PKDD (2021) ECML/PKDD: ECML/PKDD 2007 Discovery Challenge. https://gitlab.fing.edu.uy/gsi/web-application-attacks-datasets/-/tree/master/ecml_pkdd (2021) Oliner and Stearley (2007) Oliner, A.J., Stearley, J.: What supercomputers say: A study of five system logs. 37th Annual IEEE/IFIP International Conference on Dependable Systems and Networks (DSN’07), 575–584 (2007) Vaswani et al. (2017) Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, L., Polosukhin, I.: Attention is all you need. CoRR abs/1706.03762 (2017) 1706.03762 Schuster and Nakajima (2012) Schuster, M., Nakajima, K.: Japanese and korean voice search. In: 2012 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), pp. 5149–5152 (2012). https://doi.org/10.1109/ICASSP.2012.6289079 Liu et al. (2019) Liu, Y., Ott, M., Goyal, N., Du, J., Joshi, M., Chen, D., Levy, O., Lewis, M., Zettlemoyer, L., Stoyanov, V.: Roberta: A robustly optimized BERT pretraining approach. CoRR abs/1907.11692 (2019) 1907.11692 Sanh et al. (2019) Sanh, V., Debut, L., Chaumond, J., Wolf, T.: Distilbert, a distilled version of bert: smaller, faster, cheaper and lighter. ArXiv abs/1910.01108 (2019) Radford et al. (2019) Radford, A., Wu, J., Child, R., Luan, D., Amodei, D., Sutskever, I.: Language models are unsupervised multitask learners (2019) Black et al. (2021) Black, S., Leo, G., Wang, P., Leahy, C., Biderman, S.: GPT-Neo: Large Scale Autoregressive Language Modeling with Mesh-Tensorflow. https://doi.org/10.5281/zenodo.5297715 . If you use this software, please cite it using these metadata. https://doi.org/10.5281/zenodo.5297715 Kokalj et al. (2021) Kokalj, E., Škrlj, B., Lavrač, N., Pollak, S., Robnik-Šikonja, M.: BERT meets shapley: Extending SHAP explanations to transformer-based classifiers. In: Toivonen, H., Boggia, M. (eds.) Proceedings of the EACL Hackashop on News Media Content Analysis and Automated Report Generation, pp. 16–21. Association for Computational Linguistics, Online (2021). https://aclanthology.org/2021.hackashop-1.3 van der Maaten and Hinton (2008) Maaten, L., Hinton, G.: Visualizing data using t-SNE. Journal of Machine Learning Research 9, 2579–2605 (2008) Mohri et al. (2012) Mohri, M., Rostamizadeh, A., Talwalkar, A.: Foundations of Machine Learning. The MIT Press, ??? 
(2012) Copstein, R., Karlsen, E., Schwartzentruber, J., Zincir-Heywood, N., Heywood, M.: Exploring syntactical features for anomaly detection in application logs. it - Information Technology 64(1-2), 15–27 (2022) https://doi.org/10.1515/itit-2021-0064 Nam et al. (2022) Nam, S., Yoo, J.-H., Hong, J.W.-K.: Vm failure prediction with log analysis using bert-cnn model, 331–337 (2022) https://doi.org/10.23919/CNSM55787.2022.9965187 Wang et al. (2018) Wang, M., Xu, L., Guo, L.: Anomaly detection of system logs based on natural language processing and deep learning. In: 2018 4th International Conference on Frontiers of Signal Processing (ICFSP), pp. 140–144 (2018). https://doi.org/10.1109/ICFSP.2018.8552075 Qi et al. (2022) Qi, J., Luan, Z., Huang, S., Wang, Y., Fung, C., Yang, H., Qian, D.: Adanomaly: Adaptive anomaly detection for system logs with adversarial learning. In: NOMS 2022-2022 IEEE/IFIP Network Operations and Management Symposium, pp. 1–5. IEEE Press, ??? (2022). https://doi.org/10.1109/NOMS54207.2022.9789917 . https://doi.org/10.1109/NOMS54207.2022.9789917 Seyyar et al. (2022) Seyyar, Y.E., Yavuz, A.G., Ünver, H.M.: Detection of web attacks using the bert model. In: 2022 30th Signal Processing and Communications Applications Conference (SIU), pp. 1–4 (2022). https://doi.org/10.1109/SIU55565.2022.9864721 Guo et al. (2021) Guo, H., Yuan, S., Wu, X.: Logbert: Log anomaly detection via bert, 1–8 (2021). IEEE Shao et al. (2022) Shao, Y., Zhang, W., Liu, P., Huyue, R., Tang, R., Yin, Q., Li, Q.: Log anomaly detection method based on bert model optimization. In: 2022 7th International Conference on Cloud Computing and Big Data Analytics (ICCCBDA), pp. 161–166 (2022). https://doi.org/10.1109/ICCCBDA55098.2022.9778900 Guo et al. (2022) Guo, H., Lin, X., Yang, J., Zhuang, Y., Bai, J., Zheng, T., Zhang, B., Li, Z.: TransLog: A Unified Transformer-based Framework for Log Anomaly Detection (2022) Le and Zhang (2021) Le, V.-H., Zhang, H.: Log-based Anomaly Detection Without Log Parsing (2021) Gniewkowski et al. (2023) Gniewkowski, M., Maciejewski, H., Surmacz, T., Walentynowicz, W.: Sec2vec: Anomaly detection in http traffic and malicious urls. In: Proceedings of the 38th ACM/SIGAPP Symposium on Applied Computing. SAC ’23, pp. 1154–1162. Association for Computing Machinery, New York, NY, USA (2023). https://doi.org/10.1145/3555776.3577663 . https://doi.org/10.1145/3555776.3577663 Hilmi et al. (2020) Hilmi, M.A.A., Cahyanto, K.A., Mustamiin, M.: Apache Web Server - Access Log Pre-processing for Web Intrusion Detection. IEEE Dataport (2020). https://doi.org/10.21227/vvvq-6w47 . https://dx.doi.org/10.21227/vvvq-6w47 Giménez et al. (2010) Giménez, C.T., Villegas, A.P., Marañón, G. .: HTTP Dataset CSIC 2010. https://doi.org/10.7910/DVN/3QBYB5 . https://www.tic.itefi.csic.es/dataset/ ECML/PKDD (2021) ECML/PKDD: ECML/PKDD 2007 Discovery Challenge. https://gitlab.fing.edu.uy/gsi/web-application-attacks-datasets/-/tree/master/ecml_pkdd (2021) Oliner and Stearley (2007) Oliner, A.J., Stearley, J.: What supercomputers say: A study of five system logs. 37th Annual IEEE/IFIP International Conference on Dependable Systems and Networks (DSN’07), 575–584 (2007) Vaswani et al. (2017) Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, L., Polosukhin, I.: Attention is all you need. CoRR abs/1706.03762 (2017) 1706.03762 Schuster and Nakajima (2012) Schuster, M., Nakajima, K.: Japanese and korean voice search. 
In: 2012 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), pp. 5149–5152 (2012). https://doi.org/10.1109/ICASSP.2012.6289079 Liu et al. (2019) Liu, Y., Ott, M., Goyal, N., Du, J., Joshi, M., Chen, D., Levy, O., Lewis, M., Zettlemoyer, L., Stoyanov, V.: Roberta: A robustly optimized BERT pretraining approach. CoRR abs/1907.11692 (2019) 1907.11692 Sanh et al. (2019) Sanh, V., Debut, L., Chaumond, J., Wolf, T.: Distilbert, a distilled version of bert: smaller, faster, cheaper and lighter. ArXiv abs/1910.01108 (2019) Radford et al. (2019) Radford, A., Wu, J., Child, R., Luan, D., Amodei, D., Sutskever, I.: Language models are unsupervised multitask learners (2019) Black et al. (2021) Black, S., Leo, G., Wang, P., Leahy, C., Biderman, S.: GPT-Neo: Large Scale Autoregressive Language Modeling with Mesh-Tensorflow. https://doi.org/10.5281/zenodo.5297715 . If you use this software, please cite it using these metadata. https://doi.org/10.5281/zenodo.5297715 Kokalj et al. (2021) Kokalj, E., Škrlj, B., Lavrač, N., Pollak, S., Robnik-Šikonja, M.: BERT meets shapley: Extending SHAP explanations to transformer-based classifiers. In: Toivonen, H., Boggia, M. (eds.) Proceedings of the EACL Hackashop on News Media Content Analysis and Automated Report Generation, pp. 16–21. Association for Computational Linguistics, Online (2021). https://aclanthology.org/2021.hackashop-1.3 van der Maaten and Hinton (2008) Maaten, L., Hinton, G.: Visualizing data using t-SNE. Journal of Machine Learning Research 9, 2579–2605 (2008) Mohri et al. (2012) Mohri, M., Rostamizadeh, A., Talwalkar, A.: Foundations of Machine Learning. The MIT Press, ??? (2012) Nam, S., Yoo, J.-H., Hong, J.W.-K.: Vm failure prediction with log analysis using bert-cnn model, 331–337 (2022) https://doi.org/10.23919/CNSM55787.2022.9965187 Wang et al. (2018) Wang, M., Xu, L., Guo, L.: Anomaly detection of system logs based on natural language processing and deep learning. In: 2018 4th International Conference on Frontiers of Signal Processing (ICFSP), pp. 140–144 (2018). https://doi.org/10.1109/ICFSP.2018.8552075 Qi et al. (2022) Qi, J., Luan, Z., Huang, S., Wang, Y., Fung, C., Yang, H., Qian, D.: Adanomaly: Adaptive anomaly detection for system logs with adversarial learning. In: NOMS 2022-2022 IEEE/IFIP Network Operations and Management Symposium, pp. 1–5. IEEE Press, ??? (2022). https://doi.org/10.1109/NOMS54207.2022.9789917 . https://doi.org/10.1109/NOMS54207.2022.9789917 Seyyar et al. (2022) Seyyar, Y.E., Yavuz, A.G., Ünver, H.M.: Detection of web attacks using the bert model. In: 2022 30th Signal Processing and Communications Applications Conference (SIU), pp. 1–4 (2022). https://doi.org/10.1109/SIU55565.2022.9864721 Guo et al. (2021) Guo, H., Yuan, S., Wu, X.: Logbert: Log anomaly detection via bert, 1–8 (2021). IEEE Shao et al. (2022) Shao, Y., Zhang, W., Liu, P., Huyue, R., Tang, R., Yin, Q., Li, Q.: Log anomaly detection method based on bert model optimization. In: 2022 7th International Conference on Cloud Computing and Big Data Analytics (ICCCBDA), pp. 161–166 (2022). https://doi.org/10.1109/ICCCBDA55098.2022.9778900 Guo et al. (2022) Guo, H., Lin, X., Yang, J., Zhuang, Y., Bai, J., Zheng, T., Zhang, B., Li, Z.: TransLog: A Unified Transformer-based Framework for Log Anomaly Detection (2022) Le and Zhang (2021) Le, V.-H., Zhang, H.: Log-based Anomaly Detection Without Log Parsing (2021) Gniewkowski et al. 
(2023) Gniewkowski, M., Maciejewski, H., Surmacz, T., Walentynowicz, W.: Sec2vec: Anomaly detection in http traffic and malicious urls. In: Proceedings of the 38th ACM/SIGAPP Symposium on Applied Computing. SAC ’23, pp. 1154–1162. Association for Computing Machinery, New York, NY, USA (2023). https://doi.org/10.1145/3555776.3577663 . https://doi.org/10.1145/3555776.3577663 Hilmi et al. (2020) Hilmi, M.A.A., Cahyanto, K.A., Mustamiin, M.: Apache Web Server - Access Log Pre-processing for Web Intrusion Detection. IEEE Dataport (2020). https://doi.org/10.21227/vvvq-6w47 . https://dx.doi.org/10.21227/vvvq-6w47 Giménez et al. (2010) Giménez, C.T., Villegas, A.P., Marañón, G. .: HTTP Dataset CSIC 2010. https://doi.org/10.7910/DVN/3QBYB5 . https://www.tic.itefi.csic.es/dataset/ ECML/PKDD (2021) ECML/PKDD: ECML/PKDD 2007 Discovery Challenge. https://gitlab.fing.edu.uy/gsi/web-application-attacks-datasets/-/tree/master/ecml_pkdd (2021) Oliner and Stearley (2007) Oliner, A.J., Stearley, J.: What supercomputers say: A study of five system logs. 37th Annual IEEE/IFIP International Conference on Dependable Systems and Networks (DSN’07), 575–584 (2007) Vaswani et al. (2017) Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, L., Polosukhin, I.: Attention is all you need. CoRR abs/1706.03762 (2017) 1706.03762 Schuster and Nakajima (2012) Schuster, M., Nakajima, K.: Japanese and korean voice search. In: 2012 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), pp. 5149–5152 (2012). https://doi.org/10.1109/ICASSP.2012.6289079 Liu et al. (2019) Liu, Y., Ott, M., Goyal, N., Du, J., Joshi, M., Chen, D., Levy, O., Lewis, M., Zettlemoyer, L., Stoyanov, V.: Roberta: A robustly optimized BERT pretraining approach. CoRR abs/1907.11692 (2019) 1907.11692 Sanh et al. (2019) Sanh, V., Debut, L., Chaumond, J., Wolf, T.: Distilbert, a distilled version of bert: smaller, faster, cheaper and lighter. ArXiv abs/1910.01108 (2019) Radford et al. (2019) Radford, A., Wu, J., Child, R., Luan, D., Amodei, D., Sutskever, I.: Language models are unsupervised multitask learners (2019) Black et al. (2021) Black, S., Leo, G., Wang, P., Leahy, C., Biderman, S.: GPT-Neo: Large Scale Autoregressive Language Modeling with Mesh-Tensorflow. https://doi.org/10.5281/zenodo.5297715 . If you use this software, please cite it using these metadata. https://doi.org/10.5281/zenodo.5297715 Kokalj et al. (2021) Kokalj, E., Škrlj, B., Lavrač, N., Pollak, S., Robnik-Šikonja, M.: BERT meets shapley: Extending SHAP explanations to transformer-based classifiers. In: Toivonen, H., Boggia, M. (eds.) Proceedings of the EACL Hackashop on News Media Content Analysis and Automated Report Generation, pp. 16–21. Association for Computational Linguistics, Online (2021). https://aclanthology.org/2021.hackashop-1.3 van der Maaten and Hinton (2008) Maaten, L., Hinton, G.: Visualizing data using t-SNE. Journal of Machine Learning Research 9, 2579–2605 (2008) Mohri et al. (2012) Mohri, M., Rostamizadeh, A., Talwalkar, A.: Foundations of Machine Learning. The MIT Press, ??? (2012) Wang, M., Xu, L., Guo, L.: Anomaly detection of system logs based on natural language processing and deep learning. In: 2018 4th International Conference on Frontiers of Signal Processing (ICFSP), pp. 140–144 (2018). https://doi.org/10.1109/ICFSP.2018.8552075 Qi et al. (2022) Qi, J., Luan, Z., Huang, S., Wang, Y., Fung, C., Yang, H., Qian, D.: Adanomaly: Adaptive anomaly detection for system logs with adversarial learning. 
In: NOMS 2022-2022 IEEE/IFIP Network Operations and Management Symposium, pp. 1–5. IEEE Press, ??? (2022). https://doi.org/10.1109/NOMS54207.2022.9789917 . https://doi.org/10.1109/NOMS54207.2022.9789917 Seyyar et al. (2022) Seyyar, Y.E., Yavuz, A.G., Ünver, H.M.: Detection of web attacks using the bert model. In: 2022 30th Signal Processing and Communications Applications Conference (SIU), pp. 1–4 (2022). https://doi.org/10.1109/SIU55565.2022.9864721 Guo et al. (2021) Guo, H., Yuan, S., Wu, X.: Logbert: Log anomaly detection via bert, 1–8 (2021). IEEE Shao et al. (2022) Shao, Y., Zhang, W., Liu, P., Huyue, R., Tang, R., Yin, Q., Li, Q.: Log anomaly detection method based on bert model optimization. In: 2022 7th International Conference on Cloud Computing and Big Data Analytics (ICCCBDA), pp. 161–166 (2022). https://doi.org/10.1109/ICCCBDA55098.2022.9778900 Guo et al. (2022) Guo, H., Lin, X., Yang, J., Zhuang, Y., Bai, J., Zheng, T., Zhang, B., Li, Z.: TransLog: A Unified Transformer-based Framework for Log Anomaly Detection (2022) Le and Zhang (2021) Le, V.-H., Zhang, H.: Log-based Anomaly Detection Without Log Parsing (2021) Gniewkowski et al. (2023) Gniewkowski, M., Maciejewski, H., Surmacz, T., Walentynowicz, W.: Sec2vec: Anomaly detection in http traffic and malicious urls. In: Proceedings of the 38th ACM/SIGAPP Symposium on Applied Computing. SAC ’23, pp. 1154–1162. Association for Computing Machinery, New York, NY, USA (2023). https://doi.org/10.1145/3555776.3577663 . https://doi.org/10.1145/3555776.3577663 Hilmi et al. (2020) Hilmi, M.A.A., Cahyanto, K.A., Mustamiin, M.: Apache Web Server - Access Log Pre-processing for Web Intrusion Detection. IEEE Dataport (2020). https://doi.org/10.21227/vvvq-6w47 . https://dx.doi.org/10.21227/vvvq-6w47 Giménez et al. (2010) Giménez, C.T., Villegas, A.P., Marañón, G. .: HTTP Dataset CSIC 2010. https://doi.org/10.7910/DVN/3QBYB5 . https://www.tic.itefi.csic.es/dataset/ ECML/PKDD (2021) ECML/PKDD: ECML/PKDD 2007 Discovery Challenge. https://gitlab.fing.edu.uy/gsi/web-application-attacks-datasets/-/tree/master/ecml_pkdd (2021) Oliner and Stearley (2007) Oliner, A.J., Stearley, J.: What supercomputers say: A study of five system logs. 37th Annual IEEE/IFIP International Conference on Dependable Systems and Networks (DSN’07), 575–584 (2007) Vaswani et al. (2017) Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, L., Polosukhin, I.: Attention is all you need. CoRR abs/1706.03762 (2017) 1706.03762 Schuster and Nakajima (2012) Schuster, M., Nakajima, K.: Japanese and korean voice search. In: 2012 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), pp. 5149–5152 (2012). https://doi.org/10.1109/ICASSP.2012.6289079 Liu et al. (2019) Liu, Y., Ott, M., Goyal, N., Du, J., Joshi, M., Chen, D., Levy, O., Lewis, M., Zettlemoyer, L., Stoyanov, V.: Roberta: A robustly optimized BERT pretraining approach. CoRR abs/1907.11692 (2019) 1907.11692 Sanh et al. (2019) Sanh, V., Debut, L., Chaumond, J., Wolf, T.: Distilbert, a distilled version of bert: smaller, faster, cheaper and lighter. ArXiv abs/1910.01108 (2019) Radford et al. (2019) Radford, A., Wu, J., Child, R., Luan, D., Amodei, D., Sutskever, I.: Language models are unsupervised multitask learners (2019) Black et al. (2021) Black, S., Leo, G., Wang, P., Leahy, C., Biderman, S.: GPT-Neo: Large Scale Autoregressive Language Modeling with Mesh-Tensorflow. https://doi.org/10.5281/zenodo.5297715 . 
If you use this software, please cite it using these metadata. https://doi.org/10.5281/zenodo.5297715 Kokalj et al. (2021) Kokalj, E., Škrlj, B., Lavrač, N., Pollak, S., Robnik-Šikonja, M.: BERT meets shapley: Extending SHAP explanations to transformer-based classifiers. In: Toivonen, H., Boggia, M. (eds.) Proceedings of the EACL Hackashop on News Media Content Analysis and Automated Report Generation, pp. 16–21. Association for Computational Linguistics, Online (2021). https://aclanthology.org/2021.hackashop-1.3 van der Maaten and Hinton (2008) Maaten, L., Hinton, G.: Visualizing data using t-SNE. Journal of Machine Learning Research 9, 2579–2605 (2008) Mohri et al. (2012) Mohri, M., Rostamizadeh, A., Talwalkar, A.: Foundations of Machine Learning. The MIT Press, ??? (2012) Qi, J., Luan, Z., Huang, S., Wang, Y., Fung, C., Yang, H., Qian, D.: Adanomaly: Adaptive anomaly detection for system logs with adversarial learning. In: NOMS 2022-2022 IEEE/IFIP Network Operations and Management Symposium, pp. 1–5. IEEE Press, ??? (2022). https://doi.org/10.1109/NOMS54207.2022.9789917 . https://doi.org/10.1109/NOMS54207.2022.9789917 Seyyar et al. (2022) Seyyar, Y.E., Yavuz, A.G., Ünver, H.M.: Detection of web attacks using the bert model. In: 2022 30th Signal Processing and Communications Applications Conference (SIU), pp. 1–4 (2022). https://doi.org/10.1109/SIU55565.2022.9864721 Guo et al. (2021) Guo, H., Yuan, S., Wu, X.: Logbert: Log anomaly detection via bert, 1–8 (2021). IEEE Shao et al. (2022) Shao, Y., Zhang, W., Liu, P., Huyue, R., Tang, R., Yin, Q., Li, Q.: Log anomaly detection method based on bert model optimization. In: 2022 7th International Conference on Cloud Computing and Big Data Analytics (ICCCBDA), pp. 161–166 (2022). https://doi.org/10.1109/ICCCBDA55098.2022.9778900 Guo et al. (2022) Guo, H., Lin, X., Yang, J., Zhuang, Y., Bai, J., Zheng, T., Zhang, B., Li, Z.: TransLog: A Unified Transformer-based Framework for Log Anomaly Detection (2022) Le and Zhang (2021) Le, V.-H., Zhang, H.: Log-based Anomaly Detection Without Log Parsing (2021) Gniewkowski et al. (2023) Gniewkowski, M., Maciejewski, H., Surmacz, T., Walentynowicz, W.: Sec2vec: Anomaly detection in http traffic and malicious urls. In: Proceedings of the 38th ACM/SIGAPP Symposium on Applied Computing. SAC ’23, pp. 1154–1162. Association for Computing Machinery, New York, NY, USA (2023). https://doi.org/10.1145/3555776.3577663 . https://doi.org/10.1145/3555776.3577663 Hilmi et al. (2020) Hilmi, M.A.A., Cahyanto, K.A., Mustamiin, M.: Apache Web Server - Access Log Pre-processing for Web Intrusion Detection. IEEE Dataport (2020). https://doi.org/10.21227/vvvq-6w47 . https://dx.doi.org/10.21227/vvvq-6w47 Giménez et al. (2010) Giménez, C.T., Villegas, A.P., Marañón, G. .: HTTP Dataset CSIC 2010. https://doi.org/10.7910/DVN/3QBYB5 . https://www.tic.itefi.csic.es/dataset/ ECML/PKDD (2021) ECML/PKDD: ECML/PKDD 2007 Discovery Challenge. https://gitlab.fing.edu.uy/gsi/web-application-attacks-datasets/-/tree/master/ecml_pkdd (2021) Oliner and Stearley (2007) Oliner, A.J., Stearley, J.: What supercomputers say: A study of five system logs. 37th Annual IEEE/IFIP International Conference on Dependable Systems and Networks (DSN’07), 575–584 (2007) Vaswani et al. (2017) Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, L., Polosukhin, I.: Attention is all you need. CoRR abs/1706.03762 (2017) 1706.03762 Schuster and Nakajima (2012) Schuster, M., Nakajima, K.: Japanese and korean voice search. 
In: 2012 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), pp. 5149–5152 (2012). https://doi.org/10.1109/ICASSP.2012.6289079 Liu et al. (2019) Liu, Y., Ott, M., Goyal, N., Du, J., Joshi, M., Chen, D., Levy, O., Lewis, M., Zettlemoyer, L., Stoyanov, V.: Roberta: A robustly optimized BERT pretraining approach. CoRR abs/1907.11692 (2019) 1907.11692 Sanh et al. (2019) Sanh, V., Debut, L., Chaumond, J., Wolf, T.: Distilbert, a distilled version of bert: smaller, faster, cheaper and lighter. ArXiv abs/1910.01108 (2019) Radford et al. (2019) Radford, A., Wu, J., Child, R., Luan, D., Amodei, D., Sutskever, I.: Language models are unsupervised multitask learners (2019) Black et al. (2021) Black, S., Leo, G., Wang, P., Leahy, C., Biderman, S.: GPT-Neo: Large Scale Autoregressive Language Modeling with Mesh-Tensorflow. https://doi.org/10.5281/zenodo.5297715 . If you use this software, please cite it using these metadata. https://doi.org/10.5281/zenodo.5297715 Kokalj et al. (2021) Kokalj, E., Škrlj, B., Lavrač, N., Pollak, S., Robnik-Šikonja, M.: BERT meets shapley: Extending SHAP explanations to transformer-based classifiers. In: Toivonen, H., Boggia, M. (eds.) Proceedings of the EACL Hackashop on News Media Content Analysis and Automated Report Generation, pp. 16–21. Association for Computational Linguistics, Online (2021). https://aclanthology.org/2021.hackashop-1.3 van der Maaten and Hinton (2008) Maaten, L., Hinton, G.: Visualizing data using t-SNE. Journal of Machine Learning Research 9, 2579–2605 (2008) Mohri et al. (2012) Mohri, M., Rostamizadeh, A., Talwalkar, A.: Foundations of Machine Learning. The MIT Press, ??? (2012) Seyyar, Y.E., Yavuz, A.G., Ünver, H.M.: Detection of web attacks using the bert model. In: 2022 30th Signal Processing and Communications Applications Conference (SIU), pp. 1–4 (2022). https://doi.org/10.1109/SIU55565.2022.9864721 Guo et al. (2021) Guo, H., Yuan, S., Wu, X.: Logbert: Log anomaly detection via bert, 1–8 (2021). IEEE Shao et al. (2022) Shao, Y., Zhang, W., Liu, P., Huyue, R., Tang, R., Yin, Q., Li, Q.: Log anomaly detection method based on bert model optimization. In: 2022 7th International Conference on Cloud Computing and Big Data Analytics (ICCCBDA), pp. 161–166 (2022). https://doi.org/10.1109/ICCCBDA55098.2022.9778900 Guo et al. (2022) Guo, H., Lin, X., Yang, J., Zhuang, Y., Bai, J., Zheng, T., Zhang, B., Li, Z.: TransLog: A Unified Transformer-based Framework for Log Anomaly Detection (2022) Le and Zhang (2021) Le, V.-H., Zhang, H.: Log-based Anomaly Detection Without Log Parsing (2021) Gniewkowski et al. (2023) Gniewkowski, M., Maciejewski, H., Surmacz, T., Walentynowicz, W.: Sec2vec: Anomaly detection in http traffic and malicious urls. In: Proceedings of the 38th ACM/SIGAPP Symposium on Applied Computing. SAC ’23, pp. 1154–1162. Association for Computing Machinery, New York, NY, USA (2023). https://doi.org/10.1145/3555776.3577663 . https://doi.org/10.1145/3555776.3577663 Hilmi et al. (2020) Hilmi, M.A.A., Cahyanto, K.A., Mustamiin, M.: Apache Web Server - Access Log Pre-processing for Web Intrusion Detection. IEEE Dataport (2020). https://doi.org/10.21227/vvvq-6w47 . https://dx.doi.org/10.21227/vvvq-6w47 Giménez et al. (2010) Giménez, C.T., Villegas, A.P., Marañón, G. .: HTTP Dataset CSIC 2010. https://doi.org/10.7910/DVN/3QBYB5 . https://www.tic.itefi.csic.es/dataset/ ECML/PKDD (2021) ECML/PKDD: ECML/PKDD 2007 Discovery Challenge. 
https://gitlab.fing.edu.uy/gsi/web-application-attacks-datasets/-/tree/master/ecml_pkdd (2021) Oliner and Stearley (2007) Oliner, A.J., Stearley, J.: What supercomputers say: A study of five system logs. 37th Annual IEEE/IFIP International Conference on Dependable Systems and Networks (DSN’07), 575–584 (2007) Vaswani et al. (2017) Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, L., Polosukhin, I.: Attention is all you need. CoRR abs/1706.03762 (2017) 1706.03762 Schuster and Nakajima (2012) Schuster, M., Nakajima, K.: Japanese and korean voice search. In: 2012 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), pp. 5149–5152 (2012). https://doi.org/10.1109/ICASSP.2012.6289079 Liu et al. (2019) Liu, Y., Ott, M., Goyal, N., Du, J., Joshi, M., Chen, D., Levy, O., Lewis, M., Zettlemoyer, L., Stoyanov, V.: Roberta: A robustly optimized BERT pretraining approach. CoRR abs/1907.11692 (2019) 1907.11692 Sanh et al. (2019) Sanh, V., Debut, L., Chaumond, J., Wolf, T.: Distilbert, a distilled version of bert: smaller, faster, cheaper and lighter. ArXiv abs/1910.01108 (2019) Radford et al. (2019) Radford, A., Wu, J., Child, R., Luan, D., Amodei, D., Sutskever, I.: Language models are unsupervised multitask learners (2019) Black et al. (2021) Black, S., Leo, G., Wang, P., Leahy, C., Biderman, S.: GPT-Neo: Large Scale Autoregressive Language Modeling with Mesh-Tensorflow. https://doi.org/10.5281/zenodo.5297715 . If you use this software, please cite it using these metadata. https://doi.org/10.5281/zenodo.5297715 Kokalj et al. (2021) Kokalj, E., Škrlj, B., Lavrač, N., Pollak, S., Robnik-Šikonja, M.: BERT meets shapley: Extending SHAP explanations to transformer-based classifiers. In: Toivonen, H., Boggia, M. (eds.) Proceedings of the EACL Hackashop on News Media Content Analysis and Automated Report Generation, pp. 16–21. Association for Computational Linguistics, Online (2021). https://aclanthology.org/2021.hackashop-1.3 van der Maaten and Hinton (2008) Maaten, L., Hinton, G.: Visualizing data using t-SNE. Journal of Machine Learning Research 9, 2579–2605 (2008) Mohri et al. (2012) Mohri, M., Rostamizadeh, A., Talwalkar, A.: Foundations of Machine Learning. The MIT Press, ??? (2012) Guo, H., Yuan, S., Wu, X.: Logbert: Log anomaly detection via bert, 1–8 (2021). IEEE Shao et al. (2022) Shao, Y., Zhang, W., Liu, P., Huyue, R., Tang, R., Yin, Q., Li, Q.: Log anomaly detection method based on bert model optimization. In: 2022 7th International Conference on Cloud Computing and Big Data Analytics (ICCCBDA), pp. 161–166 (2022). https://doi.org/10.1109/ICCCBDA55098.2022.9778900 Guo et al. (2022) Guo, H., Lin, X., Yang, J., Zhuang, Y., Bai, J., Zheng, T., Zhang, B., Li, Z.: TransLog: A Unified Transformer-based Framework for Log Anomaly Detection (2022) Le and Zhang (2021) Le, V.-H., Zhang, H.: Log-based Anomaly Detection Without Log Parsing (2021) Gniewkowski et al. (2023) Gniewkowski, M., Maciejewski, H., Surmacz, T., Walentynowicz, W.: Sec2vec: Anomaly detection in http traffic and malicious urls. In: Proceedings of the 38th ACM/SIGAPP Symposium on Applied Computing. SAC ’23, pp. 1154–1162. Association for Computing Machinery, New York, NY, USA (2023). https://doi.org/10.1145/3555776.3577663 . https://doi.org/10.1145/3555776.3577663 Hilmi et al. (2020) Hilmi, M.A.A., Cahyanto, K.A., Mustamiin, M.: Apache Web Server - Access Log Pre-processing for Web Intrusion Detection. IEEE Dataport (2020). https://doi.org/10.21227/vvvq-6w47 . 
(2012) Copstein, R., Karlsen, E., Schwartzentruber, J., Zincir-Heywood, N., Heywood, M.: Exploring syntactical features for anomaly detection in application logs. it - Information Technology 64(1-2), 15–27 (2022) https://doi.org/10.1515/itit-2021-0064 Nam et al. (2022) Nam, S., Yoo, J.-H., Hong, J.W.-K.: Vm failure prediction with log analysis using bert-cnn model, 331–337 (2022) https://doi.org/10.23919/CNSM55787.2022.9965187 Wang et al. (2018) Wang, M., Xu, L., Guo, L.: Anomaly detection of system logs based on natural language processing and deep learning. In: 2018 4th International Conference on Frontiers of Signal Processing (ICFSP), pp. 140–144 (2018). https://doi.org/10.1109/ICFSP.2018.8552075 Qi et al. (2022) Qi, J., Luan, Z., Huang, S., Wang, Y., Fung, C., Yang, H., Qian, D.: Adanomaly: Adaptive anomaly detection for system logs with adversarial learning. In: NOMS 2022-2022 IEEE/IFIP Network Operations and Management Symposium, pp. 1–5. IEEE Press, ??? (2022). https://doi.org/10.1109/NOMS54207.2022.9789917 . https://doi.org/10.1109/NOMS54207.2022.9789917 Seyyar et al. (2022) Seyyar, Y.E., Yavuz, A.G., Ünver, H.M.: Detection of web attacks using the bert model. In: 2022 30th Signal Processing and Communications Applications Conference (SIU), pp. 1–4 (2022). https://doi.org/10.1109/SIU55565.2022.9864721 Guo et al. (2021) Guo, H., Yuan, S., Wu, X.: Logbert: Log anomaly detection via bert, 1–8 (2021). IEEE Shao et al. (2022) Shao, Y., Zhang, W., Liu, P., Huyue, R., Tang, R., Yin, Q., Li, Q.: Log anomaly detection method based on bert model optimization. In: 2022 7th International Conference on Cloud Computing and Big Data Analytics (ICCCBDA), pp. 161–166 (2022). https://doi.org/10.1109/ICCCBDA55098.2022.9778900 Guo et al. (2022) Guo, H., Lin, X., Yang, J., Zhuang, Y., Bai, J., Zheng, T., Zhang, B., Li, Z.: TransLog: A Unified Transformer-based Framework for Log Anomaly Detection (2022) Le and Zhang (2021) Le, V.-H., Zhang, H.: Log-based Anomaly Detection Without Log Parsing (2021) Gniewkowski et al. (2023) Gniewkowski, M., Maciejewski, H., Surmacz, T., Walentynowicz, W.: Sec2vec: Anomaly detection in http traffic and malicious urls. In: Proceedings of the 38th ACM/SIGAPP Symposium on Applied Computing. SAC ’23, pp. 1154–1162. Association for Computing Machinery, New York, NY, USA (2023). https://doi.org/10.1145/3555776.3577663 . https://doi.org/10.1145/3555776.3577663 Hilmi et al. (2020) Hilmi, M.A.A., Cahyanto, K.A., Mustamiin, M.: Apache Web Server - Access Log Pre-processing for Web Intrusion Detection. IEEE Dataport (2020). https://doi.org/10.21227/vvvq-6w47 . https://dx.doi.org/10.21227/vvvq-6w47 Giménez et al. (2010) Giménez, C.T., Villegas, A.P., Marañón, G. .: HTTP Dataset CSIC 2010. https://doi.org/10.7910/DVN/3QBYB5 . https://www.tic.itefi.csic.es/dataset/ ECML/PKDD (2021) ECML/PKDD: ECML/PKDD 2007 Discovery Challenge. https://gitlab.fing.edu.uy/gsi/web-application-attacks-datasets/-/tree/master/ecml_pkdd (2021) Oliner and Stearley (2007) Oliner, A.J., Stearley, J.: What supercomputers say: A study of five system logs. 37th Annual IEEE/IFIP International Conference on Dependable Systems and Networks (DSN’07), 575–584 (2007) Vaswani et al. (2017) Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, L., Polosukhin, I.: Attention is all you need. CoRR abs/1706.03762 (2017) 1706.03762 Schuster and Nakajima (2012) Schuster, M., Nakajima, K.: Japanese and korean voice search. 
In: 2012 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), pp. 5149–5152 (2012). https://doi.org/10.1109/ICASSP.2012.6289079 Liu et al. (2019) Liu, Y., Ott, M., Goyal, N., Du, J., Joshi, M., Chen, D., Levy, O., Lewis, M., Zettlemoyer, L., Stoyanov, V.: Roberta: A robustly optimized BERT pretraining approach. CoRR abs/1907.11692 (2019) 1907.11692 Sanh et al. (2019) Sanh, V., Debut, L., Chaumond, J., Wolf, T.: Distilbert, a distilled version of bert: smaller, faster, cheaper and lighter. ArXiv abs/1910.01108 (2019) Radford et al. (2019) Radford, A., Wu, J., Child, R., Luan, D., Amodei, D., Sutskever, I.: Language models are unsupervised multitask learners (2019) Black et al. (2021) Black, S., Leo, G., Wang, P., Leahy, C., Biderman, S.: GPT-Neo: Large Scale Autoregressive Language Modeling with Mesh-Tensorflow. https://doi.org/10.5281/zenodo.5297715 . If you use this software, please cite it using these metadata. https://doi.org/10.5281/zenodo.5297715 Kokalj et al. (2021) Kokalj, E., Škrlj, B., Lavrač, N., Pollak, S., Robnik-Šikonja, M.: BERT meets shapley: Extending SHAP explanations to transformer-based classifiers. In: Toivonen, H., Boggia, M. (eds.) Proceedings of the EACL Hackashop on News Media Content Analysis and Automated Report Generation, pp. 16–21. Association for Computational Linguistics, Online (2021). https://aclanthology.org/2021.hackashop-1.3 van der Maaten and Hinton (2008) Maaten, L., Hinton, G.: Visualizing data using t-SNE. Journal of Machine Learning Research 9, 2579–2605 (2008) Mohri et al. (2012) Mohri, M., Rostamizadeh, A., Talwalkar, A.: Foundations of Machine Learning. The MIT Press, ??? (2012) Nam, S., Yoo, J.-H., Hong, J.W.-K.: Vm failure prediction with log analysis using bert-cnn model, 331–337 (2022) https://doi.org/10.23919/CNSM55787.2022.9965187 Wang et al. (2018) Wang, M., Xu, L., Guo, L.: Anomaly detection of system logs based on natural language processing and deep learning. In: 2018 4th International Conference on Frontiers of Signal Processing (ICFSP), pp. 140–144 (2018). https://doi.org/10.1109/ICFSP.2018.8552075 Qi et al. (2022) Qi, J., Luan, Z., Huang, S., Wang, Y., Fung, C., Yang, H., Qian, D.: Adanomaly: Adaptive anomaly detection for system logs with adversarial learning. In: NOMS 2022-2022 IEEE/IFIP Network Operations and Management Symposium, pp. 1–5. IEEE Press, ??? (2022). https://doi.org/10.1109/NOMS54207.2022.9789917 . https://doi.org/10.1109/NOMS54207.2022.9789917 Seyyar et al. (2022) Seyyar, Y.E., Yavuz, A.G., Ünver, H.M.: Detection of web attacks using the bert model. In: 2022 30th Signal Processing and Communications Applications Conference (SIU), pp. 1–4 (2022). https://doi.org/10.1109/SIU55565.2022.9864721 Guo et al. (2021) Guo, H., Yuan, S., Wu, X.: Logbert: Log anomaly detection via bert, 1–8 (2021). IEEE Shao et al. (2022) Shao, Y., Zhang, W., Liu, P., Huyue, R., Tang, R., Yin, Q., Li, Q.: Log anomaly detection method based on bert model optimization. In: 2022 7th International Conference on Cloud Computing and Big Data Analytics (ICCCBDA), pp. 161–166 (2022). https://doi.org/10.1109/ICCCBDA55098.2022.9778900 Guo et al. (2022) Guo, H., Lin, X., Yang, J., Zhuang, Y., Bai, J., Zheng, T., Zhang, B., Li, Z.: TransLog: A Unified Transformer-based Framework for Log Anomaly Detection (2022) Le and Zhang (2021) Le, V.-H., Zhang, H.: Log-based Anomaly Detection Without Log Parsing (2021) Gniewkowski et al. 
(2023) Gniewkowski, M., Maciejewski, H., Surmacz, T., Walentynowicz, W.: Sec2vec: Anomaly detection in http traffic and malicious urls. In: Proceedings of the 38th ACM/SIGAPP Symposium on Applied Computing. SAC ’23, pp. 1154–1162. Association for Computing Machinery, New York, NY, USA (2023). https://doi.org/10.1145/3555776.3577663 . https://doi.org/10.1145/3555776.3577663 Hilmi et al. (2020) Hilmi, M.A.A., Cahyanto, K.A., Mustamiin, M.: Apache Web Server - Access Log Pre-processing for Web Intrusion Detection. IEEE Dataport (2020). https://doi.org/10.21227/vvvq-6w47 . https://dx.doi.org/10.21227/vvvq-6w47 Giménez et al. (2010) Giménez, C.T., Villegas, A.P., Marañón, G. .: HTTP Dataset CSIC 2010. https://doi.org/10.7910/DVN/3QBYB5 . https://www.tic.itefi.csic.es/dataset/ ECML/PKDD (2021) ECML/PKDD: ECML/PKDD 2007 Discovery Challenge. https://gitlab.fing.edu.uy/gsi/web-application-attacks-datasets/-/tree/master/ecml_pkdd (2021) Oliner and Stearley (2007) Oliner, A.J., Stearley, J.: What supercomputers say: A study of five system logs. 37th Annual IEEE/IFIP International Conference on Dependable Systems and Networks (DSN’07), 575–584 (2007) Vaswani et al. (2017) Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, L., Polosukhin, I.: Attention is all you need. CoRR abs/1706.03762 (2017) 1706.03762 Schuster and Nakajima (2012) Schuster, M., Nakajima, K.: Japanese and korean voice search. In: 2012 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), pp. 5149–5152 (2012). https://doi.org/10.1109/ICASSP.2012.6289079 Liu et al. (2019) Liu, Y., Ott, M., Goyal, N., Du, J., Joshi, M., Chen, D., Levy, O., Lewis, M., Zettlemoyer, L., Stoyanov, V.: Roberta: A robustly optimized BERT pretraining approach. CoRR abs/1907.11692 (2019) 1907.11692 Sanh et al. (2019) Sanh, V., Debut, L., Chaumond, J., Wolf, T.: Distilbert, a distilled version of bert: smaller, faster, cheaper and lighter. ArXiv abs/1910.01108 (2019) Radford et al. (2019) Radford, A., Wu, J., Child, R., Luan, D., Amodei, D., Sutskever, I.: Language models are unsupervised multitask learners (2019) Black et al. (2021) Black, S., Leo, G., Wang, P., Leahy, C., Biderman, S.: GPT-Neo: Large Scale Autoregressive Language Modeling with Mesh-Tensorflow. https://doi.org/10.5281/zenodo.5297715 . If you use this software, please cite it using these metadata. https://doi.org/10.5281/zenodo.5297715 Kokalj et al. (2021) Kokalj, E., Škrlj, B., Lavrač, N., Pollak, S., Robnik-Šikonja, M.: BERT meets shapley: Extending SHAP explanations to transformer-based classifiers. In: Toivonen, H., Boggia, M. (eds.) Proceedings of the EACL Hackashop on News Media Content Analysis and Automated Report Generation, pp. 16–21. Association for Computational Linguistics, Online (2021). https://aclanthology.org/2021.hackashop-1.3 van der Maaten and Hinton (2008) Maaten, L., Hinton, G.: Visualizing data using t-SNE. Journal of Machine Learning Research 9, 2579–2605 (2008) Mohri et al. (2012) Mohri, M., Rostamizadeh, A., Talwalkar, A.: Foundations of Machine Learning. The MIT Press, ??? (2012) Wang, M., Xu, L., Guo, L.: Anomaly detection of system logs based on natural language processing and deep learning. In: 2018 4th International Conference on Frontiers of Signal Processing (ICFSP), pp. 140–144 (2018). https://doi.org/10.1109/ICFSP.2018.8552075 Qi et al. (2022) Qi, J., Luan, Z., Huang, S., Wang, Y., Fung, C., Yang, H., Qian, D.: Adanomaly: Adaptive anomaly detection for system logs with adversarial learning. 
In: NOMS 2022-2022 IEEE/IFIP Network Operations and Management Symposium, pp. 1–5. IEEE Press, ??? (2022). https://doi.org/10.1109/NOMS54207.2022.9789917 . https://doi.org/10.1109/NOMS54207.2022.9789917 Seyyar et al. (2022) Seyyar, Y.E., Yavuz, A.G., Ünver, H.M.: Detection of web attacks using the bert model. In: 2022 30th Signal Processing and Communications Applications Conference (SIU), pp. 1–4 (2022). https://doi.org/10.1109/SIU55565.2022.9864721 Guo et al. (2021) Guo, H., Yuan, S., Wu, X.: Logbert: Log anomaly detection via bert, 1–8 (2021). IEEE Shao et al. (2022) Shao, Y., Zhang, W., Liu, P., Huyue, R., Tang, R., Yin, Q., Li, Q.: Log anomaly detection method based on bert model optimization. In: 2022 7th International Conference on Cloud Computing and Big Data Analytics (ICCCBDA), pp. 161–166 (2022). https://doi.org/10.1109/ICCCBDA55098.2022.9778900 Guo et al. (2022) Guo, H., Lin, X., Yang, J., Zhuang, Y., Bai, J., Zheng, T., Zhang, B., Li, Z.: TransLog: A Unified Transformer-based Framework for Log Anomaly Detection (2022) Le and Zhang (2021) Le, V.-H., Zhang, H.: Log-based Anomaly Detection Without Log Parsing (2021) Gniewkowski et al. (2023) Gniewkowski, M., Maciejewski, H., Surmacz, T., Walentynowicz, W.: Sec2vec: Anomaly detection in http traffic and malicious urls. In: Proceedings of the 38th ACM/SIGAPP Symposium on Applied Computing. SAC ’23, pp. 1154–1162. Association for Computing Machinery, New York, NY, USA (2023). https://doi.org/10.1145/3555776.3577663 . https://doi.org/10.1145/3555776.3577663 Hilmi et al. (2020) Hilmi, M.A.A., Cahyanto, K.A., Mustamiin, M.: Apache Web Server - Access Log Pre-processing for Web Intrusion Detection. IEEE Dataport (2020). https://doi.org/10.21227/vvvq-6w47 . https://dx.doi.org/10.21227/vvvq-6w47 Giménez et al. (2010) Giménez, C.T., Villegas, A.P., Marañón, G. .: HTTP Dataset CSIC 2010. https://doi.org/10.7910/DVN/3QBYB5 . https://www.tic.itefi.csic.es/dataset/ ECML/PKDD (2021) ECML/PKDD: ECML/PKDD 2007 Discovery Challenge. https://gitlab.fing.edu.uy/gsi/web-application-attacks-datasets/-/tree/master/ecml_pkdd (2021) Oliner and Stearley (2007) Oliner, A.J., Stearley, J.: What supercomputers say: A study of five system logs. 37th Annual IEEE/IFIP International Conference on Dependable Systems and Networks (DSN’07), 575–584 (2007) Vaswani et al. (2017) Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, L., Polosukhin, I.: Attention is all you need. CoRR abs/1706.03762 (2017) 1706.03762 Schuster and Nakajima (2012) Schuster, M., Nakajima, K.: Japanese and korean voice search. In: 2012 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), pp. 5149–5152 (2012). https://doi.org/10.1109/ICASSP.2012.6289079 Liu et al. (2019) Liu, Y., Ott, M., Goyal, N., Du, J., Joshi, M., Chen, D., Levy, O., Lewis, M., Zettlemoyer, L., Stoyanov, V.: Roberta: A robustly optimized BERT pretraining approach. CoRR abs/1907.11692 (2019) 1907.11692 Sanh et al. (2019) Sanh, V., Debut, L., Chaumond, J., Wolf, T.: Distilbert, a distilled version of bert: smaller, faster, cheaper and lighter. ArXiv abs/1910.01108 (2019) Radford et al. (2019) Radford, A., Wu, J., Child, R., Luan, D., Amodei, D., Sutskever, I.: Language models are unsupervised multitask learners (2019) Black et al. (2021) Black, S., Leo, G., Wang, P., Leahy, C., Biderman, S.: GPT-Neo: Large Scale Autoregressive Language Modeling with Mesh-Tensorflow. https://doi.org/10.5281/zenodo.5297715 . 
If you use this software, please cite it using these metadata. https://doi.org/10.5281/zenodo.5297715 Kokalj et al. (2021) Kokalj, E., Škrlj, B., Lavrač, N., Pollak, S., Robnik-Šikonja, M.: BERT meets shapley: Extending SHAP explanations to transformer-based classifiers. In: Toivonen, H., Boggia, M. (eds.) Proceedings of the EACL Hackashop on News Media Content Analysis and Automated Report Generation, pp. 16–21. Association for Computational Linguistics, Online (2021). https://aclanthology.org/2021.hackashop-1.3 van der Maaten and Hinton (2008) Maaten, L., Hinton, G.: Visualizing data using t-SNE. Journal of Machine Learning Research 9, 2579–2605 (2008) Mohri et al. (2012) Mohri, M., Rostamizadeh, A., Talwalkar, A.: Foundations of Machine Learning. The MIT Press, ??? (2012) Qi, J., Luan, Z., Huang, S., Wang, Y., Fung, C., Yang, H., Qian, D.: Adanomaly: Adaptive anomaly detection for system logs with adversarial learning. In: NOMS 2022-2022 IEEE/IFIP Network Operations and Management Symposium, pp. 1–5. IEEE Press, ??? (2022). https://doi.org/10.1109/NOMS54207.2022.9789917 . https://doi.org/10.1109/NOMS54207.2022.9789917 Seyyar et al. (2022) Seyyar, Y.E., Yavuz, A.G., Ünver, H.M.: Detection of web attacks using the bert model. In: 2022 30th Signal Processing and Communications Applications Conference (SIU), pp. 1–4 (2022). https://doi.org/10.1109/SIU55565.2022.9864721 Guo et al. (2021) Guo, H., Yuan, S., Wu, X.: Logbert: Log anomaly detection via bert, 1–8 (2021). IEEE Shao et al. (2022) Shao, Y., Zhang, W., Liu, P., Huyue, R., Tang, R., Yin, Q., Li, Q.: Log anomaly detection method based on bert model optimization. In: 2022 7th International Conference on Cloud Computing and Big Data Analytics (ICCCBDA), pp. 161–166 (2022). https://doi.org/10.1109/ICCCBDA55098.2022.9778900 Guo et al. (2022) Guo, H., Lin, X., Yang, J., Zhuang, Y., Bai, J., Zheng, T., Zhang, B., Li, Z.: TransLog: A Unified Transformer-based Framework for Log Anomaly Detection (2022) Le and Zhang (2021) Le, V.-H., Zhang, H.: Log-based Anomaly Detection Without Log Parsing (2021) Gniewkowski et al. (2023) Gniewkowski, M., Maciejewski, H., Surmacz, T., Walentynowicz, W.: Sec2vec: Anomaly detection in http traffic and malicious urls. In: Proceedings of the 38th ACM/SIGAPP Symposium on Applied Computing. SAC ’23, pp. 1154–1162. Association for Computing Machinery, New York, NY, USA (2023). https://doi.org/10.1145/3555776.3577663 . https://doi.org/10.1145/3555776.3577663 Hilmi et al. (2020) Hilmi, M.A.A., Cahyanto, K.A., Mustamiin, M.: Apache Web Server - Access Log Pre-processing for Web Intrusion Detection. IEEE Dataport (2020). https://doi.org/10.21227/vvvq-6w47 . https://dx.doi.org/10.21227/vvvq-6w47 Giménez et al. (2010) Giménez, C.T., Villegas, A.P., Marañón, G. .: HTTP Dataset CSIC 2010. https://doi.org/10.7910/DVN/3QBYB5 . https://www.tic.itefi.csic.es/dataset/ ECML/PKDD (2021) ECML/PKDD: ECML/PKDD 2007 Discovery Challenge. https://gitlab.fing.edu.uy/gsi/web-application-attacks-datasets/-/tree/master/ecml_pkdd (2021) Oliner and Stearley (2007) Oliner, A.J., Stearley, J.: What supercomputers say: A study of five system logs. 37th Annual IEEE/IFIP International Conference on Dependable Systems and Networks (DSN’07), 575–584 (2007) Vaswani et al. (2017) Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, L., Polosukhin, I.: Attention is all you need. CoRR abs/1706.03762 (2017) 1706.03762 Schuster and Nakajima (2012) Schuster, M., Nakajima, K.: Japanese and korean voice search. 
In: 2012 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), pp. 5149–5152 (2012). https://doi.org/10.1109/ICASSP.2012.6289079 Liu et al. (2019) Liu, Y., Ott, M., Goyal, N., Du, J., Joshi, M., Chen, D., Levy, O., Lewis, M., Zettlemoyer, L., Stoyanov, V.: Roberta: A robustly optimized BERT pretraining approach. CoRR abs/1907.11692 (2019) 1907.11692 Sanh et al. (2019) Sanh, V., Debut, L., Chaumond, J., Wolf, T.: Distilbert, a distilled version of bert: smaller, faster, cheaper and lighter. ArXiv abs/1910.01108 (2019) Radford et al. (2019) Radford, A., Wu, J., Child, R., Luan, D., Amodei, D., Sutskever, I.: Language models are unsupervised multitask learners (2019) Black et al. (2021) Black, S., Leo, G., Wang, P., Leahy, C., Biderman, S.: GPT-Neo: Large Scale Autoregressive Language Modeling with Mesh-Tensorflow. https://doi.org/10.5281/zenodo.5297715 . If you use this software, please cite it using these metadata. https://doi.org/10.5281/zenodo.5297715 Kokalj et al. (2021) Kokalj, E., Škrlj, B., Lavrač, N., Pollak, S., Robnik-Šikonja, M.: BERT meets shapley: Extending SHAP explanations to transformer-based classifiers. In: Toivonen, H., Boggia, M. (eds.) Proceedings of the EACL Hackashop on News Media Content Analysis and Automated Report Generation, pp. 16–21. Association for Computational Linguistics, Online (2021). https://aclanthology.org/2021.hackashop-1.3 van der Maaten and Hinton (2008) Maaten, L., Hinton, G.: Visualizing data using t-SNE. Journal of Machine Learning Research 9, 2579–2605 (2008) Mohri et al. (2012) Mohri, M., Rostamizadeh, A., Talwalkar, A.: Foundations of Machine Learning. The MIT Press, ??? (2012) Seyyar, Y.E., Yavuz, A.G., Ünver, H.M.: Detection of web attacks using the bert model. In: 2022 30th Signal Processing and Communications Applications Conference (SIU), pp. 1–4 (2022). https://doi.org/10.1109/SIU55565.2022.9864721 Guo et al. (2021) Guo, H., Yuan, S., Wu, X.: Logbert: Log anomaly detection via bert, 1–8 (2021). IEEE Shao et al. (2022) Shao, Y., Zhang, W., Liu, P., Huyue, R., Tang, R., Yin, Q., Li, Q.: Log anomaly detection method based on bert model optimization. In: 2022 7th International Conference on Cloud Computing and Big Data Analytics (ICCCBDA), pp. 161–166 (2022). https://doi.org/10.1109/ICCCBDA55098.2022.9778900 Guo et al. (2022) Guo, H., Lin, X., Yang, J., Zhuang, Y., Bai, J., Zheng, T., Zhang, B., Li, Z.: TransLog: A Unified Transformer-based Framework for Log Anomaly Detection (2022) Le and Zhang (2021) Le, V.-H., Zhang, H.: Log-based Anomaly Detection Without Log Parsing (2021) Gniewkowski et al. (2023) Gniewkowski, M., Maciejewski, H., Surmacz, T., Walentynowicz, W.: Sec2vec: Anomaly detection in http traffic and malicious urls. In: Proceedings of the 38th ACM/SIGAPP Symposium on Applied Computing. SAC ’23, pp. 1154–1162. Association for Computing Machinery, New York, NY, USA (2023). https://doi.org/10.1145/3555776.3577663 . https://doi.org/10.1145/3555776.3577663 Hilmi et al. (2020) Hilmi, M.A.A., Cahyanto, K.A., Mustamiin, M.: Apache Web Server - Access Log Pre-processing for Web Intrusion Detection. IEEE Dataport (2020). https://doi.org/10.21227/vvvq-6w47 . https://dx.doi.org/10.21227/vvvq-6w47 Giménez et al. (2010) Giménez, C.T., Villegas, A.P., Marañón, G. .: HTTP Dataset CSIC 2010. https://doi.org/10.7910/DVN/3QBYB5 . https://www.tic.itefi.csic.es/dataset/ ECML/PKDD (2021) ECML/PKDD: ECML/PKDD 2007 Discovery Challenge. 
https://gitlab.fing.edu.uy/gsi/web-application-attacks-datasets/-/tree/master/ecml_pkdd (2021) Oliner and Stearley (2007) Oliner, A.J., Stearley, J.: What supercomputers say: A study of five system logs. 37th Annual IEEE/IFIP International Conference on Dependable Systems and Networks (DSN’07), 575–584 (2007) Vaswani et al. (2017) Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, L., Polosukhin, I.: Attention is all you need. CoRR abs/1706.03762 (2017) 1706.03762 Schuster and Nakajima (2012) Schuster, M., Nakajima, K.: Japanese and korean voice search. In: 2012 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), pp. 5149–5152 (2012). https://doi.org/10.1109/ICASSP.2012.6289079 Liu et al. (2019) Liu, Y., Ott, M., Goyal, N., Du, J., Joshi, M., Chen, D., Levy, O., Lewis, M., Zettlemoyer, L., Stoyanov, V.: Roberta: A robustly optimized BERT pretraining approach. CoRR abs/1907.11692 (2019) 1907.11692 Sanh et al. (2019) Sanh, V., Debut, L., Chaumond, J., Wolf, T.: Distilbert, a distilled version of bert: smaller, faster, cheaper and lighter. ArXiv abs/1910.01108 (2019) Radford et al. (2019) Radford, A., Wu, J., Child, R., Luan, D., Amodei, D., Sutskever, I.: Language models are unsupervised multitask learners (2019) Black et al. (2021) Black, S., Leo, G., Wang, P., Leahy, C., Biderman, S.: GPT-Neo: Large Scale Autoregressive Language Modeling with Mesh-Tensorflow. https://doi.org/10.5281/zenodo.5297715 . If you use this software, please cite it using these metadata. https://doi.org/10.5281/zenodo.5297715 Kokalj et al. (2021) Kokalj, E., Škrlj, B., Lavrač, N., Pollak, S., Robnik-Šikonja, M.: BERT meets shapley: Extending SHAP explanations to transformer-based classifiers. In: Toivonen, H., Boggia, M. (eds.) Proceedings of the EACL Hackashop on News Media Content Analysis and Automated Report Generation, pp. 16–21. Association for Computational Linguistics, Online (2021). https://aclanthology.org/2021.hackashop-1.3 van der Maaten and Hinton (2008) Maaten, L., Hinton, G.: Visualizing data using t-SNE. Journal of Machine Learning Research 9, 2579–2605 (2008) Mohri et al. (2012) Mohri, M., Rostamizadeh, A., Talwalkar, A.: Foundations of Machine Learning. The MIT Press, ??? (2012) Guo, H., Yuan, S., Wu, X.: Logbert: Log anomaly detection via bert, 1–8 (2021). IEEE Shao et al. (2022) Shao, Y., Zhang, W., Liu, P., Huyue, R., Tang, R., Yin, Q., Li, Q.: Log anomaly detection method based on bert model optimization. In: 2022 7th International Conference on Cloud Computing and Big Data Analytics (ICCCBDA), pp. 161–166 (2022). https://doi.org/10.1109/ICCCBDA55098.2022.9778900 Guo et al. (2022) Guo, H., Lin, X., Yang, J., Zhuang, Y., Bai, J., Zheng, T., Zhang, B., Li, Z.: TransLog: A Unified Transformer-based Framework for Log Anomaly Detection (2022) Le and Zhang (2021) Le, V.-H., Zhang, H.: Log-based Anomaly Detection Without Log Parsing (2021) Gniewkowski et al. (2023) Gniewkowski, M., Maciejewski, H., Surmacz, T., Walentynowicz, W.: Sec2vec: Anomaly detection in http traffic and malicious urls. In: Proceedings of the 38th ACM/SIGAPP Symposium on Applied Computing. SAC ’23, pp. 1154–1162. Association for Computing Machinery, New York, NY, USA (2023). https://doi.org/10.1145/3555776.3577663 . https://doi.org/10.1145/3555776.3577663 Hilmi et al. (2020) Hilmi, M.A.A., Cahyanto, K.A., Mustamiin, M.: Apache Web Server - Access Log Pre-processing for Web Intrusion Detection. IEEE Dataport (2020). https://doi.org/10.21227/vvvq-6w47 . 
https://dx.doi.org/10.21227/vvvq-6w47 Giménez et al. (2010) Giménez, C.T., Villegas, A.P., Marañón, G. .: HTTP Dataset CSIC 2010. https://doi.org/10.7910/DVN/3QBYB5 . https://www.tic.itefi.csic.es/dataset/ ECML/PKDD (2021) ECML/PKDD: ECML/PKDD 2007 Discovery Challenge. https://gitlab.fing.edu.uy/gsi/web-application-attacks-datasets/-/tree/master/ecml_pkdd (2021) Oliner and Stearley (2007) Oliner, A.J., Stearley, J.: What supercomputers say: A study of five system logs. 37th Annual IEEE/IFIP International Conference on Dependable Systems and Networks (DSN’07), 575–584 (2007) Vaswani et al. (2017) Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, L., Polosukhin, I.: Attention is all you need. CoRR abs/1706.03762 (2017) 1706.03762 Schuster and Nakajima (2012) Schuster, M., Nakajima, K.: Japanese and korean voice search. In: 2012 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), pp. 5149–5152 (2012). https://doi.org/10.1109/ICASSP.2012.6289079 Liu et al. (2019) Liu, Y., Ott, M., Goyal, N., Du, J., Joshi, M., Chen, D., Levy, O., Lewis, M., Zettlemoyer, L., Stoyanov, V.: Roberta: A robustly optimized BERT pretraining approach. CoRR abs/1907.11692 (2019) 1907.11692 Sanh et al. (2019) Sanh, V., Debut, L., Chaumond, J., Wolf, T.: Distilbert, a distilled version of bert: smaller, faster, cheaper and lighter. ArXiv abs/1910.01108 (2019) Radford et al. (2019) Radford, A., Wu, J., Child, R., Luan, D., Amodei, D., Sutskever, I.: Language models are unsupervised multitask learners (2019) Black et al. (2021) Black, S., Leo, G., Wang, P., Leahy, C., Biderman, S.: GPT-Neo: Large Scale Autoregressive Language Modeling with Mesh-Tensorflow. https://doi.org/10.5281/zenodo.5297715 . If you use this software, please cite it using these metadata. https://doi.org/10.5281/zenodo.5297715 Kokalj et al. (2021) Kokalj, E., Škrlj, B., Lavrač, N., Pollak, S., Robnik-Šikonja, M.: BERT meets shapley: Extending SHAP explanations to transformer-based classifiers. In: Toivonen, H., Boggia, M. (eds.) Proceedings of the EACL Hackashop on News Media Content Analysis and Automated Report Generation, pp. 16–21. Association for Computational Linguistics, Online (2021). https://aclanthology.org/2021.hackashop-1.3 van der Maaten and Hinton (2008) Maaten, L., Hinton, G.: Visualizing data using t-SNE. Journal of Machine Learning Research 9, 2579–2605 (2008) Mohri et al. (2012) Mohri, M., Rostamizadeh, A., Talwalkar, A.: Foundations of Machine Learning. The MIT Press, ??? (2012) Shao, Y., Zhang, W., Liu, P., Huyue, R., Tang, R., Yin, Q., Li, Q.: Log anomaly detection method based on bert model optimization. In: 2022 7th International Conference on Cloud Computing and Big Data Analytics (ICCCBDA), pp. 161–166 (2022). https://doi.org/10.1109/ICCCBDA55098.2022.9778900 Guo et al. (2022) Guo, H., Lin, X., Yang, J., Zhuang, Y., Bai, J., Zheng, T., Zhang, B., Li, Z.: TransLog: A Unified Transformer-based Framework for Log Anomaly Detection (2022) Le and Zhang (2021) Le, V.-H., Zhang, H.: Log-based Anomaly Detection Without Log Parsing (2021) Gniewkowski et al. (2023) Gniewkowski, M., Maciejewski, H., Surmacz, T., Walentynowicz, W.: Sec2vec: Anomaly detection in http traffic and malicious urls. In: Proceedings of the 38th ACM/SIGAPP Symposium on Applied Computing. SAC ’23, pp. 1154–1162. Association for Computing Machinery, New York, NY, USA (2023). https://doi.org/10.1145/3555776.3577663 . https://doi.org/10.1145/3555776.3577663 Hilmi et al. 
(2020) Hilmi, M.A.A., Cahyanto, K.A., Mustamiin, M.: Apache Web Server - Access Log Pre-processing for Web Intrusion Detection. IEEE Dataport (2020). https://doi.org/10.21227/vvvq-6w47 . https://dx.doi.org/10.21227/vvvq-6w47 Giménez et al. (2010) Giménez, C.T., Villegas, A.P., Marañón, G. .: HTTP Dataset CSIC 2010. https://doi.org/10.7910/DVN/3QBYB5 . https://www.tic.itefi.csic.es/dataset/ ECML/PKDD (2021) ECML/PKDD: ECML/PKDD 2007 Discovery Challenge. https://gitlab.fing.edu.uy/gsi/web-application-attacks-datasets/-/tree/master/ecml_pkdd (2021) Oliner and Stearley (2007) Oliner, A.J., Stearley, J.: What supercomputers say: A study of five system logs. 37th Annual IEEE/IFIP International Conference on Dependable Systems and Networks (DSN’07), 575–584 (2007) Vaswani et al. (2017) Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, L., Polosukhin, I.: Attention is all you need. CoRR abs/1706.03762 (2017) 1706.03762 Schuster and Nakajima (2012) Schuster, M., Nakajima, K.: Japanese and korean voice search. In: 2012 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), pp. 5149–5152 (2012). https://doi.org/10.1109/ICASSP.2012.6289079 Liu et al. (2019) Liu, Y., Ott, M., Goyal, N., Du, J., Joshi, M., Chen, D., Levy, O., Lewis, M., Zettlemoyer, L., Stoyanov, V.: Roberta: A robustly optimized BERT pretraining approach. CoRR abs/1907.11692 (2019) 1907.11692 Sanh et al. (2019) Sanh, V., Debut, L., Chaumond, J., Wolf, T.: Distilbert, a distilled version of bert: smaller, faster, cheaper and lighter. ArXiv abs/1910.01108 (2019) Radford et al. (2019) Radford, A., Wu, J., Child, R., Luan, D., Amodei, D., Sutskever, I.: Language models are unsupervised multitask learners (2019) Black et al. (2021) Black, S., Leo, G., Wang, P., Leahy, C., Biderman, S.: GPT-Neo: Large Scale Autoregressive Language Modeling with Mesh-Tensorflow. https://doi.org/10.5281/zenodo.5297715 . If you use this software, please cite it using these metadata. https://doi.org/10.5281/zenodo.5297715 Kokalj et al. (2021) Kokalj, E., Škrlj, B., Lavrač, N., Pollak, S., Robnik-Šikonja, M.: BERT meets shapley: Extending SHAP explanations to transformer-based classifiers. In: Toivonen, H., Boggia, M. (eds.) Proceedings of the EACL Hackashop on News Media Content Analysis and Automated Report Generation, pp. 16–21. Association for Computational Linguistics, Online (2021). https://aclanthology.org/2021.hackashop-1.3 van der Maaten and Hinton (2008) Maaten, L., Hinton, G.: Visualizing data using t-SNE. Journal of Machine Learning Research 9, 2579–2605 (2008) Mohri et al. (2012) Mohri, M., Rostamizadeh, A., Talwalkar, A.: Foundations of Machine Learning. The MIT Press, ??? (2012) Guo, H., Lin, X., Yang, J., Zhuang, Y., Bai, J., Zheng, T., Zhang, B., Li, Z.: TransLog: A Unified Transformer-based Framework for Log Anomaly Detection (2022) Le and Zhang (2021) Le, V.-H., Zhang, H.: Log-based Anomaly Detection Without Log Parsing (2021) Gniewkowski et al. (2023) Gniewkowski, M., Maciejewski, H., Surmacz, T., Walentynowicz, W.: Sec2vec: Anomaly detection in http traffic and malicious urls. In: Proceedings of the 38th ACM/SIGAPP Symposium on Applied Computing. SAC ’23, pp. 1154–1162. Association for Computing Machinery, New York, NY, USA (2023). https://doi.org/10.1145/3555776.3577663 . https://doi.org/10.1145/3555776.3577663 Hilmi et al. (2020) Hilmi, M.A.A., Cahyanto, K.A., Mustamiin, M.: Apache Web Server - Access Log Pre-processing for Web Intrusion Detection. IEEE Dataport (2020). 
https://doi.org/10.21227/vvvq-6w47 . https://dx.doi.org/10.21227/vvvq-6w47 Giménez et al. (2010) Giménez, C.T., Villegas, A.P., Marañón, G. .: HTTP Dataset CSIC 2010. https://doi.org/10.7910/DVN/3QBYB5 . https://www.tic.itefi.csic.es/dataset/ ECML/PKDD (2021) ECML/PKDD: ECML/PKDD 2007 Discovery Challenge. https://gitlab.fing.edu.uy/gsi/web-application-attacks-datasets/-/tree/master/ecml_pkdd (2021) Oliner and Stearley (2007) Oliner, A.J., Stearley, J.: What supercomputers say: A study of five system logs. 37th Annual IEEE/IFIP International Conference on Dependable Systems and Networks (DSN’07), 575–584 (2007) Vaswani et al. (2017) Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, L., Polosukhin, I.: Attention is all you need. CoRR abs/1706.03762 (2017) 1706.03762 Schuster and Nakajima (2012) Schuster, M., Nakajima, K.: Japanese and korean voice search. In: 2012 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), pp. 5149–5152 (2012). https://doi.org/10.1109/ICASSP.2012.6289079 Liu et al. (2019) Liu, Y., Ott, M., Goyal, N., Du, J., Joshi, M., Chen, D., Levy, O., Lewis, M., Zettlemoyer, L., Stoyanov, V.: Roberta: A robustly optimized BERT pretraining approach. CoRR abs/1907.11692 (2019) 1907.11692 Sanh et al. (2019) Sanh, V., Debut, L., Chaumond, J., Wolf, T.: Distilbert, a distilled version of bert: smaller, faster, cheaper and lighter. ArXiv abs/1910.01108 (2019) Radford et al. (2019) Radford, A., Wu, J., Child, R., Luan, D., Amodei, D., Sutskever, I.: Language models are unsupervised multitask learners (2019) Black et al. (2021) Black, S., Leo, G., Wang, P., Leahy, C., Biderman, S.: GPT-Neo: Large Scale Autoregressive Language Modeling with Mesh-Tensorflow. https://doi.org/10.5281/zenodo.5297715 . If you use this software, please cite it using these metadata. https://doi.org/10.5281/zenodo.5297715 Kokalj et al. (2021) Kokalj, E., Škrlj, B., Lavrač, N., Pollak, S., Robnik-Šikonja, M.: BERT meets shapley: Extending SHAP explanations to transformer-based classifiers. In: Toivonen, H., Boggia, M. (eds.) Proceedings of the EACL Hackashop on News Media Content Analysis and Automated Report Generation, pp. 16–21. Association for Computational Linguistics, Online (2021). https://aclanthology.org/2021.hackashop-1.3 van der Maaten and Hinton (2008) Maaten, L., Hinton, G.: Visualizing data using t-SNE. Journal of Machine Learning Research 9, 2579–2605 (2008) Mohri et al. (2012) Mohri, M., Rostamizadeh, A., Talwalkar, A.: Foundations of Machine Learning. The MIT Press, ??? (2012) Le, V.-H., Zhang, H.: Log-based Anomaly Detection Without Log Parsing (2021) Gniewkowski et al. (2023) Gniewkowski, M., Maciejewski, H., Surmacz, T., Walentynowicz, W.: Sec2vec: Anomaly detection in http traffic and malicious urls. In: Proceedings of the 38th ACM/SIGAPP Symposium on Applied Computing. SAC ’23, pp. 1154–1162. Association for Computing Machinery, New York, NY, USA (2023). https://doi.org/10.1145/3555776.3577663 . https://doi.org/10.1145/3555776.3577663 Hilmi et al. (2020) Hilmi, M.A.A., Cahyanto, K.A., Mustamiin, M.: Apache Web Server - Access Log Pre-processing for Web Intrusion Detection. IEEE Dataport (2020). https://doi.org/10.21227/vvvq-6w47 . https://dx.doi.org/10.21227/vvvq-6w47 Giménez et al. (2010) Giménez, C.T., Villegas, A.P., Marañón, G. .: HTTP Dataset CSIC 2010. https://doi.org/10.7910/DVN/3QBYB5 . https://www.tic.itefi.csic.es/dataset/ ECML/PKDD (2021) ECML/PKDD: ECML/PKDD 2007 Discovery Challenge. 
https://gitlab.fing.edu.uy/gsi/web-application-attacks-datasets/-/tree/master/ecml_pkdd (2021) Oliner and Stearley (2007) Oliner, A.J., Stearley, J.: What supercomputers say: A study of five system logs. 37th Annual IEEE/IFIP International Conference on Dependable Systems and Networks (DSN’07), 575–584 (2007) Vaswani et al. (2017) Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, L., Polosukhin, I.: Attention is all you need. CoRR abs/1706.03762 (2017) 1706.03762 Schuster and Nakajima (2012) Schuster, M., Nakajima, K.: Japanese and korean voice search. In: 2012 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), pp. 5149–5152 (2012). https://doi.org/10.1109/ICASSP.2012.6289079 Liu et al. (2019) Liu, Y., Ott, M., Goyal, N., Du, J., Joshi, M., Chen, D., Levy, O., Lewis, M., Zettlemoyer, L., Stoyanov, V.: Roberta: A robustly optimized BERT pretraining approach. CoRR abs/1907.11692 (2019) 1907.11692 Sanh et al. (2019) Sanh, V., Debut, L., Chaumond, J., Wolf, T.: Distilbert, a distilled version of bert: smaller, faster, cheaper and lighter. ArXiv abs/1910.01108 (2019) Radford et al. (2019) Radford, A., Wu, J., Child, R., Luan, D., Amodei, D., Sutskever, I.: Language models are unsupervised multitask learners (2019) Black et al. (2021) Black, S., Leo, G., Wang, P., Leahy, C., Biderman, S.: GPT-Neo: Large Scale Autoregressive Language Modeling with Mesh-Tensorflow. https://doi.org/10.5281/zenodo.5297715 . If you use this software, please cite it using these metadata. https://doi.org/10.5281/zenodo.5297715 Kokalj et al. (2021) Kokalj, E., Škrlj, B., Lavrač, N., Pollak, S., Robnik-Šikonja, M.: BERT meets shapley: Extending SHAP explanations to transformer-based classifiers. In: Toivonen, H., Boggia, M. (eds.) Proceedings of the EACL Hackashop on News Media Content Analysis and Automated Report Generation, pp. 16–21. Association for Computational Linguistics, Online (2021). https://aclanthology.org/2021.hackashop-1.3 van der Maaten and Hinton (2008) Maaten, L., Hinton, G.: Visualizing data using t-SNE. Journal of Machine Learning Research 9, 2579–2605 (2008) Mohri et al. (2012) Mohri, M., Rostamizadeh, A., Talwalkar, A.: Foundations of Machine Learning. The MIT Press, ??? (2012) Gniewkowski, M., Maciejewski, H., Surmacz, T., Walentynowicz, W.: Sec2vec: Anomaly detection in http traffic and malicious urls. In: Proceedings of the 38th ACM/SIGAPP Symposium on Applied Computing. SAC ’23, pp. 1154–1162. Association for Computing Machinery, New York, NY, USA (2023). https://doi.org/10.1145/3555776.3577663 . https://doi.org/10.1145/3555776.3577663 Hilmi et al. (2020) Hilmi, M.A.A., Cahyanto, K.A., Mustamiin, M.: Apache Web Server - Access Log Pre-processing for Web Intrusion Detection. IEEE Dataport (2020). https://doi.org/10.21227/vvvq-6w47 . https://dx.doi.org/10.21227/vvvq-6w47 Giménez et al. (2010) Giménez, C.T., Villegas, A.P., Marañón, G. .: HTTP Dataset CSIC 2010. https://doi.org/10.7910/DVN/3QBYB5 . https://www.tic.itefi.csic.es/dataset/ ECML/PKDD (2021) ECML/PKDD: ECML/PKDD 2007 Discovery Challenge. https://gitlab.fing.edu.uy/gsi/web-application-attacks-datasets/-/tree/master/ecml_pkdd (2021) Oliner and Stearley (2007) Oliner, A.J., Stearley, J.: What supercomputers say: A study of five system logs. 37th Annual IEEE/IFIP International Conference on Dependable Systems and Networks (DSN’07), 575–584 (2007) Vaswani et al. 
(2017) Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, L., Polosukhin, I.: Attention is all you need. CoRR abs/1706.03762 (2017) 1706.03762 Schuster and Nakajima (2012) Schuster, M., Nakajima, K.: Japanese and korean voice search. In: 2012 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), pp. 5149–5152 (2012). https://doi.org/10.1109/ICASSP.2012.6289079 Liu et al. (2019) Liu, Y., Ott, M., Goyal, N., Du, J., Joshi, M., Chen, D., Levy, O., Lewis, M., Zettlemoyer, L., Stoyanov, V.: Roberta: A robustly optimized BERT pretraining approach. CoRR abs/1907.11692 (2019) 1907.11692 Sanh et al. (2019) Sanh, V., Debut, L., Chaumond, J., Wolf, T.: Distilbert, a distilled version of bert: smaller, faster, cheaper and lighter. ArXiv abs/1910.01108 (2019) Radford et al. (2019) Radford, A., Wu, J., Child, R., Luan, D., Amodei, D., Sutskever, I.: Language models are unsupervised multitask learners (2019) Black et al. (2021) Black, S., Leo, G., Wang, P., Leahy, C., Biderman, S.: GPT-Neo: Large Scale Autoregressive Language Modeling with Mesh-Tensorflow. https://doi.org/10.5281/zenodo.5297715 . If you use this software, please cite it using these metadata. https://doi.org/10.5281/zenodo.5297715 Kokalj et al. (2021) Kokalj, E., Škrlj, B., Lavrač, N., Pollak, S., Robnik-Šikonja, M.: BERT meets shapley: Extending SHAP explanations to transformer-based classifiers. In: Toivonen, H., Boggia, M. (eds.) Proceedings of the EACL Hackashop on News Media Content Analysis and Automated Report Generation, pp. 16–21. Association for Computational Linguistics, Online (2021). https://aclanthology.org/2021.hackashop-1.3 van der Maaten and Hinton (2008) Maaten, L., Hinton, G.: Visualizing data using t-SNE. Journal of Machine Learning Research 9, 2579–2605 (2008) Mohri et al. (2012) Mohri, M., Rostamizadeh, A., Talwalkar, A.: Foundations of Machine Learning. The MIT Press, ??? (2012) Hilmi, M.A.A., Cahyanto, K.A., Mustamiin, M.: Apache Web Server - Access Log Pre-processing for Web Intrusion Detection. IEEE Dataport (2020). https://doi.org/10.21227/vvvq-6w47 . https://dx.doi.org/10.21227/vvvq-6w47 Giménez et al. (2010) Giménez, C.T., Villegas, A.P., Marañón, G. .: HTTP Dataset CSIC 2010. https://doi.org/10.7910/DVN/3QBYB5 . https://www.tic.itefi.csic.es/dataset/ ECML/PKDD (2021) ECML/PKDD: ECML/PKDD 2007 Discovery Challenge. https://gitlab.fing.edu.uy/gsi/web-application-attacks-datasets/-/tree/master/ecml_pkdd (2021) Oliner and Stearley (2007) Oliner, A.J., Stearley, J.: What supercomputers say: A study of five system logs. 37th Annual IEEE/IFIP International Conference on Dependable Systems and Networks (DSN’07), 575–584 (2007) Vaswani et al. (2017) Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, L., Polosukhin, I.: Attention is all you need. CoRR abs/1706.03762 (2017) 1706.03762 Schuster and Nakajima (2012) Schuster, M., Nakajima, K.: Japanese and korean voice search. In: 2012 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), pp. 5149–5152 (2012). https://doi.org/10.1109/ICASSP.2012.6289079 Liu et al. (2019) Liu, Y., Ott, M., Goyal, N., Du, J., Joshi, M., Chen, D., Levy, O., Lewis, M., Zettlemoyer, L., Stoyanov, V.: Roberta: A robustly optimized BERT pretraining approach. CoRR abs/1907.11692 (2019) 1907.11692 Sanh et al. (2019) Sanh, V., Debut, L., Chaumond, J., Wolf, T.: Distilbert, a distilled version of bert: smaller, faster, cheaper and lighter. 
ArXiv abs/1910.01108 (2019) Radford et al. (2019) Radford, A., Wu, J., Child, R., Luan, D., Amodei, D., Sutskever, I.: Language models are unsupervised multitask learners (2019) Black et al. (2021) Black, S., Leo, G., Wang, P., Leahy, C., Biderman, S.: GPT-Neo: Large Scale Autoregressive Language Modeling with Mesh-Tensorflow. https://doi.org/10.5281/zenodo.5297715 . If you use this software, please cite it using these metadata. https://doi.org/10.5281/zenodo.5297715 Kokalj et al. (2021) Kokalj, E., Škrlj, B., Lavrač, N., Pollak, S., Robnik-Šikonja, M.: BERT meets shapley: Extending SHAP explanations to transformer-based classifiers. In: Toivonen, H., Boggia, M. (eds.) Proceedings of the EACL Hackashop on News Media Content Analysis and Automated Report Generation, pp. 16–21. Association for Computational Linguistics, Online (2021). https://aclanthology.org/2021.hackashop-1.3 van der Maaten and Hinton (2008) Maaten, L., Hinton, G.: Visualizing data using t-SNE. Journal of Machine Learning Research 9, 2579–2605 (2008) Mohri et al. (2012) Mohri, M., Rostamizadeh, A., Talwalkar, A.: Foundations of Machine Learning. The MIT Press, ??? (2012) Giménez, C.T., Villegas, A.P., Marañón, G. .: HTTP Dataset CSIC 2010. https://doi.org/10.7910/DVN/3QBYB5 . https://www.tic.itefi.csic.es/dataset/ ECML/PKDD (2021) ECML/PKDD: ECML/PKDD 2007 Discovery Challenge. https://gitlab.fing.edu.uy/gsi/web-application-attacks-datasets/-/tree/master/ecml_pkdd (2021) Oliner and Stearley (2007) Oliner, A.J., Stearley, J.: What supercomputers say: A study of five system logs. 37th Annual IEEE/IFIP International Conference on Dependable Systems and Networks (DSN’07), 575–584 (2007) Vaswani et al. (2017) Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, L., Polosukhin, I.: Attention is all you need. CoRR abs/1706.03762 (2017) 1706.03762 Schuster and Nakajima (2012) Schuster, M., Nakajima, K.: Japanese and korean voice search. In: 2012 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), pp. 5149–5152 (2012). https://doi.org/10.1109/ICASSP.2012.6289079 Liu et al. (2019) Liu, Y., Ott, M., Goyal, N., Du, J., Joshi, M., Chen, D., Levy, O., Lewis, M., Zettlemoyer, L., Stoyanov, V.: Roberta: A robustly optimized BERT pretraining approach. CoRR abs/1907.11692 (2019) 1907.11692 Sanh et al. (2019) Sanh, V., Debut, L., Chaumond, J., Wolf, T.: Distilbert, a distilled version of bert: smaller, faster, cheaper and lighter. ArXiv abs/1910.01108 (2019) Radford et al. (2019) Radford, A., Wu, J., Child, R., Luan, D., Amodei, D., Sutskever, I.: Language models are unsupervised multitask learners (2019) Black et al. (2021) Black, S., Leo, G., Wang, P., Leahy, C., Biderman, S.: GPT-Neo: Large Scale Autoregressive Language Modeling with Mesh-Tensorflow. https://doi.org/10.5281/zenodo.5297715 . If you use this software, please cite it using these metadata. https://doi.org/10.5281/zenodo.5297715 Kokalj et al. (2021) Kokalj, E., Škrlj, B., Lavrač, N., Pollak, S., Robnik-Šikonja, M.: BERT meets shapley: Extending SHAP explanations to transformer-based classifiers. In: Toivonen, H., Boggia, M. (eds.) Proceedings of the EACL Hackashop on News Media Content Analysis and Automated Report Generation, pp. 16–21. Association for Computational Linguistics, Online (2021). https://aclanthology.org/2021.hackashop-1.3 van der Maaten and Hinton (2008) Maaten, L., Hinton, G.: Visualizing data using t-SNE. Journal of Machine Learning Research 9, 2579–2605 (2008) Mohri et al. 
  5. Moradi Vartouni, A., Teshnehlab, M., Sedighian Kashi, S.: Leveraging deep neural networks for anomaly-based web application firewall. IET Information Security 13(4), 352–361 (2019)
  6. Bhatnagar, M., Rozinaj, G., Yadav, P.K.: Web intrusion classification system using machine learning approaches. In: 2022 International Symposium ELMAR, pp. 57–60 (2022). https://doi.org/10.1109/ELMAR55880.2022.9899790
  7. Farzad, A., Gulliver, T.A.: Log Message Anomaly Detection and Classification Using Auto-B/LSTM and Auto-GRU (2021)
  8. Tümer Sivri, T., Pervan Akman, N., Berkol, A., Peker, C.: Web intrusion detection using character level machine learning approaches with upsampled data, pp. 269–274 (2022). https://doi.org/10.15439/2022F147
  9. Adhikari, A., Bal, B.K.: Machine learning technique for intrusion detection in the field of the intrusion detection system (2023)
 10. Copstein, R., Karlsen, E., Schwartzentruber, J., Zincir-Heywood, N., Heywood, M.: Exploring syntactical features for anomaly detection in application logs. it - Information Technology 64(1-2), 15–27 (2022). https://doi.org/10.1515/itit-2021-0064
 11. Nam, S., Yoo, J.-H., Hong, J.W.-K.: VM failure prediction with log analysis using BERT-CNN model, pp. 331–337 (2022). https://doi.org/10.23919/CNSM55787.2022.9965187
 12. Wang, M., Xu, L., Guo, L.: Anomaly detection of system logs based on natural language processing and deep learning. In: 2018 4th International Conference on Frontiers of Signal Processing (ICFSP), pp. 140–144 (2018). https://doi.org/10.1109/ICFSP.2018.8552075
 13. Qi, J., Luan, Z., Huang, S., Wang, Y., Fung, C., Yang, H., Qian, D.: Adanomaly: Adaptive anomaly detection for system logs with adversarial learning. In: NOMS 2022-2022 IEEE/IFIP Network Operations and Management Symposium, pp. 1–5. IEEE Press (2022). https://doi.org/10.1109/NOMS54207.2022.9789917
 14. Seyyar, Y.E., Yavuz, A.G., Ünver, H.M.: Detection of web attacks using the BERT model. In: 2022 30th Signal Processing and Communications Applications Conference (SIU), pp. 1–4 (2022). https://doi.org/10.1109/SIU55565.2022.9864721
 15. Guo, H., Yuan, S., Wu, X.: LogBERT: Log anomaly detection via BERT, pp. 1–8 (2021). IEEE
 16. Shao, Y., Zhang, W., Liu, P., Huyue, R., Tang, R., Yin, Q., Li, Q.: Log anomaly detection method based on BERT model optimization. In: 2022 7th International Conference on Cloud Computing and Big Data Analytics (ICCCBDA), pp. 161–166 (2022). https://doi.org/10.1109/ICCCBDA55098.2022.9778900
 17. Guo, H., Lin, X., Yang, J., Zhuang, Y., Bai, J., Zheng, T., Zhang, B., Li, Z.: TransLog: A Unified Transformer-based Framework for Log Anomaly Detection (2022)
 18. Le, V.-H., Zhang, H.: Log-based Anomaly Detection Without Log Parsing (2021)
 19. Gniewkowski, M., Maciejewski, H., Surmacz, T., Walentynowicz, W.: Sec2vec: Anomaly detection in HTTP traffic and malicious URLs. In: Proceedings of the 38th ACM/SIGAPP Symposium on Applied Computing, SAC ’23, pp. 1154–1162. Association for Computing Machinery, New York, NY, USA (2023). https://doi.org/10.1145/3555776.3577663
 20. Hilmi, M.A.A., Cahyanto, K.A., Mustamiin, M.: Apache Web Server - Access Log Pre-processing for Web Intrusion Detection. IEEE Dataport (2020). https://doi.org/10.21227/vvvq-6w47
 21. Giménez, C.T., Villegas, A.P., Marañón, G.: HTTP Dataset CSIC 2010 (2010). https://doi.org/10.7910/DVN/3QBYB5. https://www.tic.itefi.csic.es/dataset/
 22. ECML/PKDD: ECML/PKDD 2007 Discovery Challenge (2021). https://gitlab.fing.edu.uy/gsi/web-application-attacks-datasets/-/tree/master/ecml_pkdd
 23. Oliner, A.J., Stearley, J.: What supercomputers say: A study of five system logs. In: 37th Annual IEEE/IFIP International Conference on Dependable Systems and Networks (DSN’07), pp. 575–584 (2007)
 24. Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, L., Polosukhin, I.: Attention is all you need. CoRR abs/1706.03762 (2017)
 25. Schuster, M., Nakajima, K.: Japanese and Korean voice search. In: 2012 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), pp. 5149–5152 (2012). https://doi.org/10.1109/ICASSP.2012.6289079
 26. Liu, Y., Ott, M., Goyal, N., Du, J., Joshi, M., Chen, D., Levy, O., Lewis, M., Zettlemoyer, L., Stoyanov, V.: RoBERTa: A robustly optimized BERT pretraining approach. CoRR abs/1907.11692 (2019)
 27. Sanh, V., Debut, L., Chaumond, J., Wolf, T.: DistilBERT, a distilled version of BERT: smaller, faster, cheaper and lighter. ArXiv abs/1910.01108 (2019)
 28. Radford, A., Wu, J., Child, R., Luan, D., Amodei, D., Sutskever, I.: Language models are unsupervised multitask learners (2019)
 29. Black, S., Leo, G., Wang, P., Leahy, C., Biderman, S.: GPT-Neo: Large Scale Autoregressive Language Modeling with Mesh-Tensorflow (2021). https://doi.org/10.5281/zenodo.5297715
 30. Kokalj, E., Škrlj, B., Lavrač, N., Pollak, S., Robnik-Šikonja, M.: BERT meets Shapley: Extending SHAP explanations to transformer-based classifiers. In: Toivonen, H., Boggia, M. (eds.) Proceedings of the EACL Hackashop on News Media Content Analysis and Automated Report Generation, pp. 16–21. Association for Computational Linguistics, Online (2021). https://aclanthology.org/2021.hackashop-1.3
 31. van der Maaten, L., Hinton, G.: Visualizing data using t-SNE. Journal of Machine Learning Research 9, 2579–2605 (2008)
 32. Mohri, M., Rostamizadeh, A., Talwalkar, A.: Foundations of Machine Learning. The MIT Press (2012)
(2012) Guo, H., Lin, X., Yang, J., Zhuang, Y., Bai, J., Zheng, T., Zhang, B., Li, Z.: TransLog: A Unified Transformer-based Framework for Log Anomaly Detection (2022) Le and Zhang (2021) Le, V.-H., Zhang, H.: Log-based Anomaly Detection Without Log Parsing (2021) Gniewkowski et al. (2023) Gniewkowski, M., Maciejewski, H., Surmacz, T., Walentynowicz, W.: Sec2vec: Anomaly detection in http traffic and malicious urls. In: Proceedings of the 38th ACM/SIGAPP Symposium on Applied Computing. SAC ’23, pp. 1154–1162. Association for Computing Machinery, New York, NY, USA (2023). https://doi.org/10.1145/3555776.3577663 . https://doi.org/10.1145/3555776.3577663 Hilmi et al. (2020) Hilmi, M.A.A., Cahyanto, K.A., Mustamiin, M.: Apache Web Server - Access Log Pre-processing for Web Intrusion Detection. IEEE Dataport (2020). https://doi.org/10.21227/vvvq-6w47 . https://dx.doi.org/10.21227/vvvq-6w47 Giménez et al. (2010) Giménez, C.T., Villegas, A.P., Marañón, G. .: HTTP Dataset CSIC 2010. https://doi.org/10.7910/DVN/3QBYB5 . https://www.tic.itefi.csic.es/dataset/ ECML/PKDD (2021) ECML/PKDD: ECML/PKDD 2007 Discovery Challenge. https://gitlab.fing.edu.uy/gsi/web-application-attacks-datasets/-/tree/master/ecml_pkdd (2021) Oliner and Stearley (2007) Oliner, A.J., Stearley, J.: What supercomputers say: A study of five system logs. 37th Annual IEEE/IFIP International Conference on Dependable Systems and Networks (DSN’07), 575–584 (2007) Vaswani et al. (2017) Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, L., Polosukhin, I.: Attention is all you need. CoRR abs/1706.03762 (2017) 1706.03762 Schuster and Nakajima (2012) Schuster, M., Nakajima, K.: Japanese and korean voice search. In: 2012 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), pp. 5149–5152 (2012). https://doi.org/10.1109/ICASSP.2012.6289079 Liu et al. (2019) Liu, Y., Ott, M., Goyal, N., Du, J., Joshi, M., Chen, D., Levy, O., Lewis, M., Zettlemoyer, L., Stoyanov, V.: Roberta: A robustly optimized BERT pretraining approach. CoRR abs/1907.11692 (2019) 1907.11692 Sanh et al. (2019) Sanh, V., Debut, L., Chaumond, J., Wolf, T.: Distilbert, a distilled version of bert: smaller, faster, cheaper and lighter. ArXiv abs/1910.01108 (2019) Radford et al. (2019) Radford, A., Wu, J., Child, R., Luan, D., Amodei, D., Sutskever, I.: Language models are unsupervised multitask learners (2019) Black et al. (2021) Black, S., Leo, G., Wang, P., Leahy, C., Biderman, S.: GPT-Neo: Large Scale Autoregressive Language Modeling with Mesh-Tensorflow. https://doi.org/10.5281/zenodo.5297715 . If you use this software, please cite it using these metadata. https://doi.org/10.5281/zenodo.5297715 Kokalj et al. (2021) Kokalj, E., Škrlj, B., Lavrač, N., Pollak, S., Robnik-Šikonja, M.: BERT meets shapley: Extending SHAP explanations to transformer-based classifiers. In: Toivonen, H., Boggia, M. (eds.) Proceedings of the EACL Hackashop on News Media Content Analysis and Automated Report Generation, pp. 16–21. Association for Computational Linguistics, Online (2021). https://aclanthology.org/2021.hackashop-1.3 van der Maaten and Hinton (2008) Maaten, L., Hinton, G.: Visualizing data using t-SNE. Journal of Machine Learning Research 9, 2579–2605 (2008) Mohri et al. (2012) Mohri, M., Rostamizadeh, A., Talwalkar, A.: Foundations of Machine Learning. The MIT Press, ??? (2012) Le, V.-H., Zhang, H.: Log-based Anomaly Detection Without Log Parsing (2021) Gniewkowski et al. 
(2023) Gniewkowski, M., Maciejewski, H., Surmacz, T., Walentynowicz, W.: Sec2vec: Anomaly detection in http traffic and malicious urls. In: Proceedings of the 38th ACM/SIGAPP Symposium on Applied Computing. SAC ’23, pp. 1154–1162. Association for Computing Machinery, New York, NY, USA (2023). https://doi.org/10.1145/3555776.3577663 . https://doi.org/10.1145/3555776.3577663 Hilmi et al. (2020) Hilmi, M.A.A., Cahyanto, K.A., Mustamiin, M.: Apache Web Server - Access Log Pre-processing for Web Intrusion Detection. IEEE Dataport (2020). https://doi.org/10.21227/vvvq-6w47 . https://dx.doi.org/10.21227/vvvq-6w47 Giménez et al. (2010) Giménez, C.T., Villegas, A.P., Marañón, G. .: HTTP Dataset CSIC 2010. https://doi.org/10.7910/DVN/3QBYB5 . https://www.tic.itefi.csic.es/dataset/ ECML/PKDD (2021) ECML/PKDD: ECML/PKDD 2007 Discovery Challenge. https://gitlab.fing.edu.uy/gsi/web-application-attacks-datasets/-/tree/master/ecml_pkdd (2021) Oliner and Stearley (2007) Oliner, A.J., Stearley, J.: What supercomputers say: A study of five system logs. 37th Annual IEEE/IFIP International Conference on Dependable Systems and Networks (DSN’07), 575–584 (2007) Vaswani et al. (2017) Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, L., Polosukhin, I.: Attention is all you need. CoRR abs/1706.03762 (2017) 1706.03762 Schuster and Nakajima (2012) Schuster, M., Nakajima, K.: Japanese and korean voice search. In: 2012 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), pp. 5149–5152 (2012). https://doi.org/10.1109/ICASSP.2012.6289079 Liu et al. (2019) Liu, Y., Ott, M., Goyal, N., Du, J., Joshi, M., Chen, D., Levy, O., Lewis, M., Zettlemoyer, L., Stoyanov, V.: Roberta: A robustly optimized BERT pretraining approach. CoRR abs/1907.11692 (2019) 1907.11692 Sanh et al. (2019) Sanh, V., Debut, L., Chaumond, J., Wolf, T.: Distilbert, a distilled version of bert: smaller, faster, cheaper and lighter. ArXiv abs/1910.01108 (2019) Radford et al. (2019) Radford, A., Wu, J., Child, R., Luan, D., Amodei, D., Sutskever, I.: Language models are unsupervised multitask learners (2019) Black et al. (2021) Black, S., Leo, G., Wang, P., Leahy, C., Biderman, S.: GPT-Neo: Large Scale Autoregressive Language Modeling with Mesh-Tensorflow. https://doi.org/10.5281/zenodo.5297715 . If you use this software, please cite it using these metadata. https://doi.org/10.5281/zenodo.5297715 Kokalj et al. (2021) Kokalj, E., Škrlj, B., Lavrač, N., Pollak, S., Robnik-Šikonja, M.: BERT meets shapley: Extending SHAP explanations to transformer-based classifiers. In: Toivonen, H., Boggia, M. (eds.) Proceedings of the EACL Hackashop on News Media Content Analysis and Automated Report Generation, pp. 16–21. Association for Computational Linguistics, Online (2021). https://aclanthology.org/2021.hackashop-1.3 van der Maaten and Hinton (2008) Maaten, L., Hinton, G.: Visualizing data using t-SNE. Journal of Machine Learning Research 9, 2579–2605 (2008) Mohri et al. (2012) Mohri, M., Rostamizadeh, A., Talwalkar, A.: Foundations of Machine Learning. The MIT Press, ??? (2012) Gniewkowski, M., Maciejewski, H., Surmacz, T., Walentynowicz, W.: Sec2vec: Anomaly detection in http traffic and malicious urls. In: Proceedings of the 38th ACM/SIGAPP Symposium on Applied Computing. SAC ’23, pp. 1154–1162. Association for Computing Machinery, New York, NY, USA (2023). https://doi.org/10.1145/3555776.3577663 . https://doi.org/10.1145/3555776.3577663 Hilmi et al. 
(2020) Hilmi, M.A.A., Cahyanto, K.A., Mustamiin, M.: Apache Web Server - Access Log Pre-processing for Web Intrusion Detection. IEEE Dataport (2020). https://doi.org/10.21227/vvvq-6w47 . https://dx.doi.org/10.21227/vvvq-6w47 Giménez et al. (2010) Giménez, C.T., Villegas, A.P., Marañón, G. .: HTTP Dataset CSIC 2010. https://doi.org/10.7910/DVN/3QBYB5 . https://www.tic.itefi.csic.es/dataset/ ECML/PKDD (2021) ECML/PKDD: ECML/PKDD 2007 Discovery Challenge. https://gitlab.fing.edu.uy/gsi/web-application-attacks-datasets/-/tree/master/ecml_pkdd (2021) Oliner and Stearley (2007) Oliner, A.J., Stearley, J.: What supercomputers say: A study of five system logs. 37th Annual IEEE/IFIP International Conference on Dependable Systems and Networks (DSN’07), 575–584 (2007) Vaswani et al. (2017) Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, L., Polosukhin, I.: Attention is all you need. CoRR abs/1706.03762 (2017) 1706.03762 Schuster and Nakajima (2012) Schuster, M., Nakajima, K.: Japanese and korean voice search. In: 2012 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), pp. 5149–5152 (2012). https://doi.org/10.1109/ICASSP.2012.6289079 Liu et al. (2019) Liu, Y., Ott, M., Goyal, N., Du, J., Joshi, M., Chen, D., Levy, O., Lewis, M., Zettlemoyer, L., Stoyanov, V.: Roberta: A robustly optimized BERT pretraining approach. CoRR abs/1907.11692 (2019) 1907.11692 Sanh et al. (2019) Sanh, V., Debut, L., Chaumond, J., Wolf, T.: Distilbert, a distilled version of bert: smaller, faster, cheaper and lighter. ArXiv abs/1910.01108 (2019) Radford et al. (2019) Radford, A., Wu, J., Child, R., Luan, D., Amodei, D., Sutskever, I.: Language models are unsupervised multitask learners (2019) Black et al. (2021) Black, S., Leo, G., Wang, P., Leahy, C., Biderman, S.: GPT-Neo: Large Scale Autoregressive Language Modeling with Mesh-Tensorflow. https://doi.org/10.5281/zenodo.5297715 . If you use this software, please cite it using these metadata. https://doi.org/10.5281/zenodo.5297715 Kokalj et al. (2021) Kokalj, E., Škrlj, B., Lavrač, N., Pollak, S., Robnik-Šikonja, M.: BERT meets shapley: Extending SHAP explanations to transformer-based classifiers. In: Toivonen, H., Boggia, M. (eds.) Proceedings of the EACL Hackashop on News Media Content Analysis and Automated Report Generation, pp. 16–21. Association for Computational Linguistics, Online (2021). https://aclanthology.org/2021.hackashop-1.3 van der Maaten and Hinton (2008) Maaten, L., Hinton, G.: Visualizing data using t-SNE. Journal of Machine Learning Research 9, 2579–2605 (2008) Mohri et al. (2012) Mohri, M., Rostamizadeh, A., Talwalkar, A.: Foundations of Machine Learning. The MIT Press, ??? (2012) Hilmi, M.A.A., Cahyanto, K.A., Mustamiin, M.: Apache Web Server - Access Log Pre-processing for Web Intrusion Detection. IEEE Dataport (2020). https://doi.org/10.21227/vvvq-6w47 . https://dx.doi.org/10.21227/vvvq-6w47 Giménez et al. (2010) Giménez, C.T., Villegas, A.P., Marañón, G. .: HTTP Dataset CSIC 2010. https://doi.org/10.7910/DVN/3QBYB5 . https://www.tic.itefi.csic.es/dataset/ ECML/PKDD (2021) ECML/PKDD: ECML/PKDD 2007 Discovery Challenge. https://gitlab.fing.edu.uy/gsi/web-application-attacks-datasets/-/tree/master/ecml_pkdd (2021) Oliner and Stearley (2007) Oliner, A.J., Stearley, J.: What supercomputers say: A study of five system logs. 37th Annual IEEE/IFIP International Conference on Dependable Systems and Networks (DSN’07), 575–584 (2007) Vaswani et al. 
(2017) Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, L., Polosukhin, I.: Attention is all you need. CoRR abs/1706.03762 (2017) 1706.03762 Schuster and Nakajima (2012) Schuster, M., Nakajima, K.: Japanese and korean voice search. In: 2012 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), pp. 5149–5152 (2012). https://doi.org/10.1109/ICASSP.2012.6289079 Liu et al. (2019) Liu, Y., Ott, M., Goyal, N., Du, J., Joshi, M., Chen, D., Levy, O., Lewis, M., Zettlemoyer, L., Stoyanov, V.: Roberta: A robustly optimized BERT pretraining approach. CoRR abs/1907.11692 (2019) 1907.11692 Sanh et al. (2019) Sanh, V., Debut, L., Chaumond, J., Wolf, T.: Distilbert, a distilled version of bert: smaller, faster, cheaper and lighter. ArXiv abs/1910.01108 (2019) Radford et al. (2019) Radford, A., Wu, J., Child, R., Luan, D., Amodei, D., Sutskever, I.: Language models are unsupervised multitask learners (2019) Black et al. (2021) Black, S., Leo, G., Wang, P., Leahy, C., Biderman, S.: GPT-Neo: Large Scale Autoregressive Language Modeling with Mesh-Tensorflow. https://doi.org/10.5281/zenodo.5297715 . If you use this software, please cite it using these metadata. https://doi.org/10.5281/zenodo.5297715 Kokalj et al. (2021) Kokalj, E., Škrlj, B., Lavrač, N., Pollak, S., Robnik-Šikonja, M.: BERT meets shapley: Extending SHAP explanations to transformer-based classifiers. In: Toivonen, H., Boggia, M. (eds.) Proceedings of the EACL Hackashop on News Media Content Analysis and Automated Report Generation, pp. 16–21. Association for Computational Linguistics, Online (2021). https://aclanthology.org/2021.hackashop-1.3 van der Maaten and Hinton (2008) Maaten, L., Hinton, G.: Visualizing data using t-SNE. Journal of Machine Learning Research 9, 2579–2605 (2008) Mohri et al. (2012) Mohri, M., Rostamizadeh, A., Talwalkar, A.: Foundations of Machine Learning. The MIT Press, ??? (2012) Giménez, C.T., Villegas, A.P., Marañón, G. .: HTTP Dataset CSIC 2010. https://doi.org/10.7910/DVN/3QBYB5 . https://www.tic.itefi.csic.es/dataset/ ECML/PKDD (2021) ECML/PKDD: ECML/PKDD 2007 Discovery Challenge. https://gitlab.fing.edu.uy/gsi/web-application-attacks-datasets/-/tree/master/ecml_pkdd (2021) Oliner and Stearley (2007) Oliner, A.J., Stearley, J.: What supercomputers say: A study of five system logs. 37th Annual IEEE/IFIP International Conference on Dependable Systems and Networks (DSN’07), 575–584 (2007) Vaswani et al. (2017) Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, L., Polosukhin, I.: Attention is all you need. CoRR abs/1706.03762 (2017) 1706.03762 Schuster and Nakajima (2012) Schuster, M., Nakajima, K.: Japanese and korean voice search. In: 2012 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), pp. 5149–5152 (2012). https://doi.org/10.1109/ICASSP.2012.6289079 Liu et al. (2019) Liu, Y., Ott, M., Goyal, N., Du, J., Joshi, M., Chen, D., Levy, O., Lewis, M., Zettlemoyer, L., Stoyanov, V.: Roberta: A robustly optimized BERT pretraining approach. CoRR abs/1907.11692 (2019) 1907.11692 Sanh et al. (2019) Sanh, V., Debut, L., Chaumond, J., Wolf, T.: Distilbert, a distilled version of bert: smaller, faster, cheaper and lighter. ArXiv abs/1910.01108 (2019) Radford et al. (2019) Radford, A., Wu, J., Child, R., Luan, D., Amodei, D., Sutskever, I.: Language models are unsupervised multitask learners (2019) Black et al. 
(2021) Black, S., Leo, G., Wang, P., Leahy, C., Biderman, S.: GPT-Neo: Large Scale Autoregressive Language Modeling with Mesh-Tensorflow. https://doi.org/10.5281/zenodo.5297715 . If you use this software, please cite it using these metadata. https://doi.org/10.5281/zenodo.5297715 Kokalj et al. (2021) Kokalj, E., Škrlj, B., Lavrač, N., Pollak, S., Robnik-Šikonja, M.: BERT meets shapley: Extending SHAP explanations to transformer-based classifiers. In: Toivonen, H., Boggia, M. (eds.) Proceedings of the EACL Hackashop on News Media Content Analysis and Automated Report Generation, pp. 16–21. Association for Computational Linguistics, Online (2021). https://aclanthology.org/2021.hackashop-1.3 van der Maaten and Hinton (2008) Maaten, L., Hinton, G.: Visualizing data using t-SNE. Journal of Machine Learning Research 9, 2579–2605 (2008) Mohri et al. (2012) Mohri, M., Rostamizadeh, A., Talwalkar, A.: Foundations of Machine Learning. The MIT Press, ??? (2012) ECML/PKDD: ECML/PKDD 2007 Discovery Challenge. https://gitlab.fing.edu.uy/gsi/web-application-attacks-datasets/-/tree/master/ecml_pkdd (2021) Oliner and Stearley (2007) Oliner, A.J., Stearley, J.: What supercomputers say: A study of five system logs. 37th Annual IEEE/IFIP International Conference on Dependable Systems and Networks (DSN’07), 575–584 (2007) Vaswani et al. (2017) Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, L., Polosukhin, I.: Attention is all you need. CoRR abs/1706.03762 (2017) 1706.03762 Schuster and Nakajima (2012) Schuster, M., Nakajima, K.: Japanese and korean voice search. In: 2012 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), pp. 5149–5152 (2012). https://doi.org/10.1109/ICASSP.2012.6289079 Liu et al. (2019) Liu, Y., Ott, M., Goyal, N., Du, J., Joshi, M., Chen, D., Levy, O., Lewis, M., Zettlemoyer, L., Stoyanov, V.: Roberta: A robustly optimized BERT pretraining approach. CoRR abs/1907.11692 (2019) 1907.11692 Sanh et al. (2019) Sanh, V., Debut, L., Chaumond, J., Wolf, T.: Distilbert, a distilled version of bert: smaller, faster, cheaper and lighter. ArXiv abs/1910.01108 (2019) Radford et al. (2019) Radford, A., Wu, J., Child, R., Luan, D., Amodei, D., Sutskever, I.: Language models are unsupervised multitask learners (2019) Black et al. (2021) Black, S., Leo, G., Wang, P., Leahy, C., Biderman, S.: GPT-Neo: Large Scale Autoregressive Language Modeling with Mesh-Tensorflow. https://doi.org/10.5281/zenodo.5297715 . If you use this software, please cite it using these metadata. https://doi.org/10.5281/zenodo.5297715 Kokalj et al. (2021) Kokalj, E., Škrlj, B., Lavrač, N., Pollak, S., Robnik-Šikonja, M.: BERT meets shapley: Extending SHAP explanations to transformer-based classifiers. In: Toivonen, H., Boggia, M. (eds.) Proceedings of the EACL Hackashop on News Media Content Analysis and Automated Report Generation, pp. 16–21. Association for Computational Linguistics, Online (2021). https://aclanthology.org/2021.hackashop-1.3 van der Maaten and Hinton (2008) Maaten, L., Hinton, G.: Visualizing data using t-SNE. Journal of Machine Learning Research 9, 2579–2605 (2008) Mohri et al. (2012) Mohri, M., Rostamizadeh, A., Talwalkar, A.: Foundations of Machine Learning. The MIT Press, ??? (2012) Oliner, A.J., Stearley, J.: What supercomputers say: A study of five system logs. 37th Annual IEEE/IFIP International Conference on Dependable Systems and Networks (DSN’07), 575–584 (2007) Vaswani et al. 
(2017) Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, L., Polosukhin, I.: Attention is all you need. CoRR abs/1706.03762 (2017) 1706.03762 Schuster and Nakajima (2012) Schuster, M., Nakajima, K.: Japanese and korean voice search. In: 2012 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), pp. 5149–5152 (2012). https://doi.org/10.1109/ICASSP.2012.6289079 Liu et al. (2019) Liu, Y., Ott, M., Goyal, N., Du, J., Joshi, M., Chen, D., Levy, O., Lewis, M., Zettlemoyer, L., Stoyanov, V.: Roberta: A robustly optimized BERT pretraining approach. CoRR abs/1907.11692 (2019) 1907.11692 Sanh et al. (2019) Sanh, V., Debut, L., Chaumond, J., Wolf, T.: Distilbert, a distilled version of bert: smaller, faster, cheaper and lighter. ArXiv abs/1910.01108 (2019) Radford et al. (2019) Radford, A., Wu, J., Child, R., Luan, D., Amodei, D., Sutskever, I.: Language models are unsupervised multitask learners (2019) Black et al. (2021) Black, S., Leo, G., Wang, P., Leahy, C., Biderman, S.: GPT-Neo: Large Scale Autoregressive Language Modeling with Mesh-Tensorflow. https://doi.org/10.5281/zenodo.5297715 . If you use this software, please cite it using these metadata. https://doi.org/10.5281/zenodo.5297715 Kokalj et al. (2021) Kokalj, E., Škrlj, B., Lavrač, N., Pollak, S., Robnik-Šikonja, M.: BERT meets shapley: Extending SHAP explanations to transformer-based classifiers. In: Toivonen, H., Boggia, M. (eds.) Proceedings of the EACL Hackashop on News Media Content Analysis and Automated Report Generation, pp. 16–21. Association for Computational Linguistics, Online (2021). https://aclanthology.org/2021.hackashop-1.3 van der Maaten and Hinton (2008) Maaten, L., Hinton, G.: Visualizing data using t-SNE. Journal of Machine Learning Research 9, 2579–2605 (2008) Mohri et al. (2012) Mohri, M., Rostamizadeh, A., Talwalkar, A.: Foundations of Machine Learning. The MIT Press, ??? (2012) Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, L., Polosukhin, I.: Attention is all you need. CoRR abs/1706.03762 (2017) 1706.03762 Schuster and Nakajima (2012) Schuster, M., Nakajima, K.: Japanese and korean voice search. In: 2012 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), pp. 5149–5152 (2012). https://doi.org/10.1109/ICASSP.2012.6289079 Liu et al. (2019) Liu, Y., Ott, M., Goyal, N., Du, J., Joshi, M., Chen, D., Levy, O., Lewis, M., Zettlemoyer, L., Stoyanov, V.: Roberta: A robustly optimized BERT pretraining approach. CoRR abs/1907.11692 (2019) 1907.11692 Sanh et al. (2019) Sanh, V., Debut, L., Chaumond, J., Wolf, T.: Distilbert, a distilled version of bert: smaller, faster, cheaper and lighter. ArXiv abs/1910.01108 (2019) Radford et al. (2019) Radford, A., Wu, J., Child, R., Luan, D., Amodei, D., Sutskever, I.: Language models are unsupervised multitask learners (2019) Black et al. (2021) Black, S., Leo, G., Wang, P., Leahy, C., Biderman, S.: GPT-Neo: Large Scale Autoregressive Language Modeling with Mesh-Tensorflow. https://doi.org/10.5281/zenodo.5297715 . If you use this software, please cite it using these metadata. https://doi.org/10.5281/zenodo.5297715 Kokalj et al. (2021) Kokalj, E., Škrlj, B., Lavrač, N., Pollak, S., Robnik-Šikonja, M.: BERT meets shapley: Extending SHAP explanations to transformer-based classifiers. In: Toivonen, H., Boggia, M. (eds.) Proceedings of the EACL Hackashop on News Media Content Analysis and Automated Report Generation, pp. 16–21. 
Association for Computational Linguistics, Online (2021). https://aclanthology.org/2021.hackashop-1.3 van der Maaten and Hinton (2008) Maaten, L., Hinton, G.: Visualizing data using t-SNE. Journal of Machine Learning Research 9, 2579–2605 (2008) Mohri et al. (2012) Mohri, M., Rostamizadeh, A., Talwalkar, A.: Foundations of Machine Learning. The MIT Press, ??? (2012) Schuster, M., Nakajima, K.: Japanese and korean voice search. In: 2012 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), pp. 5149–5152 (2012). https://doi.org/10.1109/ICASSP.2012.6289079 Liu et al. (2019) Liu, Y., Ott, M., Goyal, N., Du, J., Joshi, M., Chen, D., Levy, O., Lewis, M., Zettlemoyer, L., Stoyanov, V.: Roberta: A robustly optimized BERT pretraining approach. CoRR abs/1907.11692 (2019) 1907.11692 Sanh et al. (2019) Sanh, V., Debut, L., Chaumond, J., Wolf, T.: Distilbert, a distilled version of bert: smaller, faster, cheaper and lighter. ArXiv abs/1910.01108 (2019) Radford et al. (2019) Radford, A., Wu, J., Child, R., Luan, D., Amodei, D., Sutskever, I.: Language models are unsupervised multitask learners (2019) Black et al. (2021) Black, S., Leo, G., Wang, P., Leahy, C., Biderman, S.: GPT-Neo: Large Scale Autoregressive Language Modeling with Mesh-Tensorflow. https://doi.org/10.5281/zenodo.5297715 . If you use this software, please cite it using these metadata. https://doi.org/10.5281/zenodo.5297715 Kokalj et al. (2021) Kokalj, E., Škrlj, B., Lavrač, N., Pollak, S., Robnik-Šikonja, M.: BERT meets shapley: Extending SHAP explanations to transformer-based classifiers. In: Toivonen, H., Boggia, M. (eds.) Proceedings of the EACL Hackashop on News Media Content Analysis and Automated Report Generation, pp. 16–21. Association for Computational Linguistics, Online (2021). https://aclanthology.org/2021.hackashop-1.3 van der Maaten and Hinton (2008) Maaten, L., Hinton, G.: Visualizing data using t-SNE. Journal of Machine Learning Research 9, 2579–2605 (2008) Mohri et al. (2012) Mohri, M., Rostamizadeh, A., Talwalkar, A.: Foundations of Machine Learning. The MIT Press, ??? (2012) Liu, Y., Ott, M., Goyal, N., Du, J., Joshi, M., Chen, D., Levy, O., Lewis, M., Zettlemoyer, L., Stoyanov, V.: Roberta: A robustly optimized BERT pretraining approach. CoRR abs/1907.11692 (2019) 1907.11692 Sanh et al. (2019) Sanh, V., Debut, L., Chaumond, J., Wolf, T.: Distilbert, a distilled version of bert: smaller, faster, cheaper and lighter. ArXiv abs/1910.01108 (2019) Radford et al. (2019) Radford, A., Wu, J., Child, R., Luan, D., Amodei, D., Sutskever, I.: Language models are unsupervised multitask learners (2019) Black et al. (2021) Black, S., Leo, G., Wang, P., Leahy, C., Biderman, S.: GPT-Neo: Large Scale Autoregressive Language Modeling with Mesh-Tensorflow. https://doi.org/10.5281/zenodo.5297715 . If you use this software, please cite it using these metadata. https://doi.org/10.5281/zenodo.5297715 Kokalj et al. (2021) Kokalj, E., Škrlj, B., Lavrač, N., Pollak, S., Robnik-Šikonja, M.: BERT meets shapley: Extending SHAP explanations to transformer-based classifiers. In: Toivonen, H., Boggia, M. (eds.) Proceedings of the EACL Hackashop on News Media Content Analysis and Automated Report Generation, pp. 16–21. Association for Computational Linguistics, Online (2021). https://aclanthology.org/2021.hackashop-1.3 van der Maaten and Hinton (2008) Maaten, L., Hinton, G.: Visualizing data using t-SNE. Journal of Machine Learning Research 9, 2579–2605 (2008) Mohri et al. 
(2012) Mohri, M., Rostamizadeh, A., Talwalkar, A.: Foundations of Machine Learning. The MIT Press, ??? (2012) Sanh, V., Debut, L., Chaumond, J., Wolf, T.: Distilbert, a distilled version of bert: smaller, faster, cheaper and lighter. ArXiv abs/1910.01108 (2019) Radford et al. (2019) Radford, A., Wu, J., Child, R., Luan, D., Amodei, D., Sutskever, I.: Language models are unsupervised multitask learners (2019) Black et al. (2021) Black, S., Leo, G., Wang, P., Leahy, C., Biderman, S.: GPT-Neo: Large Scale Autoregressive Language Modeling with Mesh-Tensorflow. https://doi.org/10.5281/zenodo.5297715 . If you use this software, please cite it using these metadata. https://doi.org/10.5281/zenodo.5297715 Kokalj et al. (2021) Kokalj, E., Škrlj, B., Lavrač, N., Pollak, S., Robnik-Šikonja, M.: BERT meets shapley: Extending SHAP explanations to transformer-based classifiers. In: Toivonen, H., Boggia, M. (eds.) Proceedings of the EACL Hackashop on News Media Content Analysis and Automated Report Generation, pp. 16–21. Association for Computational Linguistics, Online (2021). https://aclanthology.org/2021.hackashop-1.3 van der Maaten and Hinton (2008) Maaten, L., Hinton, G.: Visualizing data using t-SNE. Journal of Machine Learning Research 9, 2579–2605 (2008) Mohri et al. (2012) Mohri, M., Rostamizadeh, A., Talwalkar, A.: Foundations of Machine Learning. The MIT Press, ??? (2012) Radford, A., Wu, J., Child, R., Luan, D., Amodei, D., Sutskever, I.: Language models are unsupervised multitask learners (2019) Black et al. (2021) Black, S., Leo, G., Wang, P., Leahy, C., Biderman, S.: GPT-Neo: Large Scale Autoregressive Language Modeling with Mesh-Tensorflow. https://doi.org/10.5281/zenodo.5297715 . If you use this software, please cite it using these metadata. https://doi.org/10.5281/zenodo.5297715 Kokalj et al. (2021) Kokalj, E., Škrlj, B., Lavrač, N., Pollak, S., Robnik-Šikonja, M.: BERT meets shapley: Extending SHAP explanations to transformer-based classifiers. In: Toivonen, H., Boggia, M. (eds.) Proceedings of the EACL Hackashop on News Media Content Analysis and Automated Report Generation, pp. 16–21. Association for Computational Linguistics, Online (2021). https://aclanthology.org/2021.hackashop-1.3 van der Maaten and Hinton (2008) Maaten, L., Hinton, G.: Visualizing data using t-SNE. Journal of Machine Learning Research 9, 2579–2605 (2008) Mohri et al. (2012) Mohri, M., Rostamizadeh, A., Talwalkar, A.: Foundations of Machine Learning. The MIT Press, ??? (2012) Black, S., Leo, G., Wang, P., Leahy, C., Biderman, S.: GPT-Neo: Large Scale Autoregressive Language Modeling with Mesh-Tensorflow. https://doi.org/10.5281/zenodo.5297715 . If you use this software, please cite it using these metadata. https://doi.org/10.5281/zenodo.5297715 Kokalj et al. (2021) Kokalj, E., Škrlj, B., Lavrač, N., Pollak, S., Robnik-Šikonja, M.: BERT meets shapley: Extending SHAP explanations to transformer-based classifiers. In: Toivonen, H., Boggia, M. (eds.) Proceedings of the EACL Hackashop on News Media Content Analysis and Automated Report Generation, pp. 16–21. Association for Computational Linguistics, Online (2021). https://aclanthology.org/2021.hackashop-1.3 van der Maaten and Hinton (2008) Maaten, L., Hinton, G.: Visualizing data using t-SNE. Journal of Machine Learning Research 9, 2579–2605 (2008) Mohri et al. (2012) Mohri, M., Rostamizadeh, A., Talwalkar, A.: Foundations of Machine Learning. The MIT Press, ??? 
(2012) Kokalj, E., Škrlj, B., Lavrač, N., Pollak, S., Robnik-Šikonja, M.: BERT meets shapley: Extending SHAP explanations to transformer-based classifiers. In: Toivonen, H., Boggia, M. (eds.) Proceedings of the EACL Hackashop on News Media Content Analysis and Automated Report Generation, pp. 16–21. Association for Computational Linguistics, Online (2021). https://aclanthology.org/2021.hackashop-1.3 van der Maaten and Hinton (2008) Maaten, L., Hinton, G.: Visualizing data using t-SNE. Journal of Machine Learning Research 9, 2579–2605 (2008) Mohri et al. (2012) Mohri, M., Rostamizadeh, A., Talwalkar, A.: Foundations of Machine Learning. The MIT Press, ??? (2012) Maaten, L., Hinton, G.: Visualizing data using t-SNE. Journal of Machine Learning Research 9, 2579–2605 (2008) Mohri et al. (2012) Mohri, M., Rostamizadeh, A., Talwalkar, A.: Foundations of Machine Learning. The MIT Press, ??? (2012) Mohri, M., Rostamizadeh, A., Talwalkar, A.: Foundations of Machine Learning. The MIT Press, ??? (2012)
  6. Bhatnagar, M., Rozinaj, G., Yadav, P.K.: Web intrusion classification system using machine learning approaches. In: 2022 International Symposium ELMAR, pp. 57–60 (2022). https://doi.org/10.1109/ELMAR55880.2022.9899790 Farzad and Gulliver (2021) Farzad, A., Gulliver, T.A.: Log Message Anomaly Detection and Classification Using Auto-B/LSTM and Auto-GRU (2021) Tümer Sivri et al. (2022) Tümer Sivri, T., Pervan Akman, N., Berkol, A., Peker, C.: Web intrusion detection using character level machine learning approaches with upsampled data, pp. 269–274 (2022). https://doi.org/10.15439/2022F147 Adhikari and Bal (2023) Adhikari, A., Bal, B.K.: Machine learning technique for intrusion detection in the field of the intrusion detection system. (2023) Copstein et al. (2022) Copstein, R., Karlsen, E., Schwartzentruber, J., Zincir-Heywood, N., Heywood, M.: Exploring syntactical features for anomaly detection in application logs. it - Information Technology 64(1-2), 15–27 (2022) https://doi.org/10.1515/itit-2021-0064 Nam et al. (2022) Nam, S., Yoo, J.-H., Hong, J.W.-K.: Vm failure prediction with log analysis using bert-cnn model, 331–337 (2022) https://doi.org/10.23919/CNSM55787.2022.9965187 Wang et al. (2018) Wang, M., Xu, L., Guo, L.: Anomaly detection of system logs based on natural language processing and deep learning. In: 2018 4th International Conference on Frontiers of Signal Processing (ICFSP), pp. 140–144 (2018). https://doi.org/10.1109/ICFSP.2018.8552075 Qi et al. (2022) Qi, J., Luan, Z., Huang, S., Wang, Y., Fung, C., Yang, H., Qian, D.: Adanomaly: Adaptive anomaly detection for system logs with adversarial learning. In: NOMS 2022-2022 IEEE/IFIP Network Operations and Management Symposium, pp. 1–5. IEEE Press, ??? (2022). https://doi.org/10.1109/NOMS54207.2022.9789917 . https://doi.org/10.1109/NOMS54207.2022.9789917 Seyyar et al. (2022) Seyyar, Y.E., Yavuz, A.G., Ünver, H.M.: Detection of web attacks using the bert model. In: 2022 30th Signal Processing and Communications Applications Conference (SIU), pp. 1–4 (2022). https://doi.org/10.1109/SIU55565.2022.9864721 Guo et al. (2021) Guo, H., Yuan, S., Wu, X.: Logbert: Log anomaly detection via bert, 1–8 (2021). IEEE Shao et al. (2022) Shao, Y., Zhang, W., Liu, P., Huyue, R., Tang, R., Yin, Q., Li, Q.: Log anomaly detection method based on bert model optimization. In: 2022 7th International Conference on Cloud Computing and Big Data Analytics (ICCCBDA), pp. 161–166 (2022). https://doi.org/10.1109/ICCCBDA55098.2022.9778900 Guo et al. (2022) Guo, H., Lin, X., Yang, J., Zhuang, Y., Bai, J., Zheng, T., Zhang, B., Li, Z.: TransLog: A Unified Transformer-based Framework for Log Anomaly Detection (2022) Le and Zhang (2021) Le, V.-H., Zhang, H.: Log-based Anomaly Detection Without Log Parsing (2021) Gniewkowski et al. (2023) Gniewkowski, M., Maciejewski, H., Surmacz, T., Walentynowicz, W.: Sec2vec: Anomaly detection in http traffic and malicious urls. In: Proceedings of the 38th ACM/SIGAPP Symposium on Applied Computing. SAC ’23, pp. 1154–1162. Association for Computing Machinery, New York, NY, USA (2023). https://doi.org/10.1145/3555776.3577663 . https://doi.org/10.1145/3555776.3577663 Hilmi et al. (2020) Hilmi, M.A.A., Cahyanto, K.A., Mustamiin, M.: Apache Web Server - Access Log Pre-processing for Web Intrusion Detection. IEEE Dataport (2020). https://doi.org/10.21227/vvvq-6w47 . https://dx.doi.org/10.21227/vvvq-6w47 Giménez et al. (2010) Giménez, C.T., Villegas, A.P., Marañón, G. .: HTTP Dataset CSIC 2010. https://doi.org/10.7910/DVN/3QBYB5 . 
https://www.tic.itefi.csic.es/dataset/ ECML/PKDD (2021) ECML/PKDD: ECML/PKDD 2007 Discovery Challenge. https://gitlab.fing.edu.uy/gsi/web-application-attacks-datasets/-/tree/master/ecml_pkdd (2021) Oliner and Stearley (2007) Oliner, A.J., Stearley, J.: What supercomputers say: A study of five system logs. 37th Annual IEEE/IFIP International Conference on Dependable Systems and Networks (DSN’07), 575–584 (2007) Vaswani et al. (2017) Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, L., Polosukhin, I.: Attention is all you need. CoRR abs/1706.03762 (2017) 1706.03762 Schuster and Nakajima (2012) Schuster, M., Nakajima, K.: Japanese and korean voice search. In: 2012 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), pp. 5149–5152 (2012). https://doi.org/10.1109/ICASSP.2012.6289079 Liu et al. (2019) Liu, Y., Ott, M., Goyal, N., Du, J., Joshi, M., Chen, D., Levy, O., Lewis, M., Zettlemoyer, L., Stoyanov, V.: Roberta: A robustly optimized BERT pretraining approach. CoRR abs/1907.11692 (2019) 1907.11692 Sanh et al. (2019) Sanh, V., Debut, L., Chaumond, J., Wolf, T.: Distilbert, a distilled version of bert: smaller, faster, cheaper and lighter. ArXiv abs/1910.01108 (2019) Radford et al. (2019) Radford, A., Wu, J., Child, R., Luan, D., Amodei, D., Sutskever, I.: Language models are unsupervised multitask learners (2019) Black et al. (2021) Black, S., Leo, G., Wang, P., Leahy, C., Biderman, S.: GPT-Neo: Large Scale Autoregressive Language Modeling with Mesh-Tensorflow. https://doi.org/10.5281/zenodo.5297715 . If you use this software, please cite it using these metadata. https://doi.org/10.5281/zenodo.5297715 Kokalj et al. (2021) Kokalj, E., Škrlj, B., Lavrač, N., Pollak, S., Robnik-Šikonja, M.: BERT meets shapley: Extending SHAP explanations to transformer-based classifiers. In: Toivonen, H., Boggia, M. (eds.) Proceedings of the EACL Hackashop on News Media Content Analysis and Automated Report Generation, pp. 16–21. Association for Computational Linguistics, Online (2021). https://aclanthology.org/2021.hackashop-1.3 van der Maaten and Hinton (2008) Maaten, L., Hinton, G.: Visualizing data using t-SNE. Journal of Machine Learning Research 9, 2579–2605 (2008) Mohri et al. (2012) Mohri, M., Rostamizadeh, A., Talwalkar, A.: Foundations of Machine Learning. The MIT Press, ??? (2012) Farzad, A., Gulliver, T.A.: Log Message Anomaly Detection and Classification Using Auto-B/LSTM and Auto-GRU (2021) Tümer Sivri et al. (2022) Tümer Sivri, T., Pervan Akman, N., Berkol, A., Peker, C.: Web intrusion detection using character level machine learning approaches with upsampled data, pp. 269–274 (2022). https://doi.org/10.15439/2022F147 Adhikari and Bal (2023) Adhikari, A., Bal, B.K.: Machine learning technique for intrusion detection in the field of the intrusion detection system. (2023) Copstein et al. (2022) Copstein, R., Karlsen, E., Schwartzentruber, J., Zincir-Heywood, N., Heywood, M.: Exploring syntactical features for anomaly detection in application logs. it - Information Technology 64(1-2), 15–27 (2022) https://doi.org/10.1515/itit-2021-0064 Nam et al. (2022) Nam, S., Yoo, J.-H., Hong, J.W.-K.: Vm failure prediction with log analysis using bert-cnn model, 331–337 (2022) https://doi.org/10.23919/CNSM55787.2022.9965187 Wang et al. (2018) Wang, M., Xu, L., Guo, L.: Anomaly detection of system logs based on natural language processing and deep learning. In: 2018 4th International Conference on Frontiers of Signal Processing (ICFSP), pp. 
140–144 (2018). https://doi.org/10.1109/ICFSP.2018.8552075 Qi et al. (2022) Qi, J., Luan, Z., Huang, S., Wang, Y., Fung, C., Yang, H., Qian, D.: Adanomaly: Adaptive anomaly detection for system logs with adversarial learning. In: NOMS 2022-2022 IEEE/IFIP Network Operations and Management Symposium, pp. 1–5. IEEE Press, ??? (2022). https://doi.org/10.1109/NOMS54207.2022.9789917 . https://doi.org/10.1109/NOMS54207.2022.9789917 Seyyar et al. (2022) Seyyar, Y.E., Yavuz, A.G., Ünver, H.M.: Detection of web attacks using the bert model. In: 2022 30th Signal Processing and Communications Applications Conference (SIU), pp. 1–4 (2022). https://doi.org/10.1109/SIU55565.2022.9864721 Guo et al. (2021) Guo, H., Yuan, S., Wu, X.: Logbert: Log anomaly detection via bert, 1–8 (2021). IEEE Shao et al. (2022) Shao, Y., Zhang, W., Liu, P., Huyue, R., Tang, R., Yin, Q., Li, Q.: Log anomaly detection method based on bert model optimization. In: 2022 7th International Conference on Cloud Computing and Big Data Analytics (ICCCBDA), pp. 161–166 (2022). https://doi.org/10.1109/ICCCBDA55098.2022.9778900 Guo et al. (2022) Guo, H., Lin, X., Yang, J., Zhuang, Y., Bai, J., Zheng, T., Zhang, B., Li, Z.: TransLog: A Unified Transformer-based Framework for Log Anomaly Detection (2022) Le and Zhang (2021) Le, V.-H., Zhang, H.: Log-based Anomaly Detection Without Log Parsing (2021) Gniewkowski et al. (2023) Gniewkowski, M., Maciejewski, H., Surmacz, T., Walentynowicz, W.: Sec2vec: Anomaly detection in http traffic and malicious urls. In: Proceedings of the 38th ACM/SIGAPP Symposium on Applied Computing. SAC ’23, pp. 1154–1162. Association for Computing Machinery, New York, NY, USA (2023). https://doi.org/10.1145/3555776.3577663 . https://doi.org/10.1145/3555776.3577663 Hilmi et al. (2020) Hilmi, M.A.A., Cahyanto, K.A., Mustamiin, M.: Apache Web Server - Access Log Pre-processing for Web Intrusion Detection. IEEE Dataport (2020). https://doi.org/10.21227/vvvq-6w47 . https://dx.doi.org/10.21227/vvvq-6w47 Giménez et al. (2010) Giménez, C.T., Villegas, A.P., Marañón, G. .: HTTP Dataset CSIC 2010. https://doi.org/10.7910/DVN/3QBYB5 . https://www.tic.itefi.csic.es/dataset/ ECML/PKDD (2021) ECML/PKDD: ECML/PKDD 2007 Discovery Challenge. https://gitlab.fing.edu.uy/gsi/web-application-attacks-datasets/-/tree/master/ecml_pkdd (2021) Oliner and Stearley (2007) Oliner, A.J., Stearley, J.: What supercomputers say: A study of five system logs. 37th Annual IEEE/IFIP International Conference on Dependable Systems and Networks (DSN’07), 575–584 (2007) Vaswani et al. (2017) Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, L., Polosukhin, I.: Attention is all you need. CoRR abs/1706.03762 (2017) 1706.03762 Schuster and Nakajima (2012) Schuster, M., Nakajima, K.: Japanese and korean voice search. In: 2012 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), pp. 5149–5152 (2012). https://doi.org/10.1109/ICASSP.2012.6289079 Liu et al. (2019) Liu, Y., Ott, M., Goyal, N., Du, J., Joshi, M., Chen, D., Levy, O., Lewis, M., Zettlemoyer, L., Stoyanov, V.: Roberta: A robustly optimized BERT pretraining approach. CoRR abs/1907.11692 (2019) 1907.11692 Sanh et al. (2019) Sanh, V., Debut, L., Chaumond, J., Wolf, T.: Distilbert, a distilled version of bert: smaller, faster, cheaper and lighter. ArXiv abs/1910.01108 (2019) Radford et al. (2019) Radford, A., Wu, J., Child, R., Luan, D., Amodei, D., Sutskever, I.: Language models are unsupervised multitask learners (2019) Black et al. 
(2021) Black, S., Leo, G., Wang, P., Leahy, C., Biderman, S.: GPT-Neo: Large Scale Autoregressive Language Modeling with Mesh-Tensorflow. https://doi.org/10.5281/zenodo.5297715 . If you use this software, please cite it using these metadata. https://doi.org/10.5281/zenodo.5297715 Kokalj et al. (2021) Kokalj, E., Škrlj, B., Lavrač, N., Pollak, S., Robnik-Šikonja, M.: BERT meets shapley: Extending SHAP explanations to transformer-based classifiers. In: Toivonen, H., Boggia, M. (eds.) Proceedings of the EACL Hackashop on News Media Content Analysis and Automated Report Generation, pp. 16–21. Association for Computational Linguistics, Online (2021). https://aclanthology.org/2021.hackashop-1.3 van der Maaten and Hinton (2008) Maaten, L., Hinton, G.: Visualizing data using t-SNE. Journal of Machine Learning Research 9, 2579–2605 (2008) Mohri et al. (2012) Mohri, M., Rostamizadeh, A., Talwalkar, A.: Foundations of Machine Learning. The MIT Press, ??? (2012) Tümer Sivri, T., Pervan Akman, N., Berkol, A., Peker, C.: Web intrusion detection using character level machine learning approaches with upsampled data, pp. 269–274 (2022). https://doi.org/10.15439/2022F147 Adhikari and Bal (2023) Adhikari, A., Bal, B.K.: Machine learning technique for intrusion detection in the field of the intrusion detection system. (2023) Copstein et al. (2022) Copstein, R., Karlsen, E., Schwartzentruber, J., Zincir-Heywood, N., Heywood, M.: Exploring syntactical features for anomaly detection in application logs. it - Information Technology 64(1-2), 15–27 (2022) https://doi.org/10.1515/itit-2021-0064 Nam et al. (2022) Nam, S., Yoo, J.-H., Hong, J.W.-K.: Vm failure prediction with log analysis using bert-cnn model, 331–337 (2022) https://doi.org/10.23919/CNSM55787.2022.9965187 Wang et al. (2018) Wang, M., Xu, L., Guo, L.: Anomaly detection of system logs based on natural language processing and deep learning. In: 2018 4th International Conference on Frontiers of Signal Processing (ICFSP), pp. 140–144 (2018). https://doi.org/10.1109/ICFSP.2018.8552075 Qi et al. (2022) Qi, J., Luan, Z., Huang, S., Wang, Y., Fung, C., Yang, H., Qian, D.: Adanomaly: Adaptive anomaly detection for system logs with adversarial learning. In: NOMS 2022-2022 IEEE/IFIP Network Operations and Management Symposium, pp. 1–5. IEEE Press, ??? (2022). https://doi.org/10.1109/NOMS54207.2022.9789917 . https://doi.org/10.1109/NOMS54207.2022.9789917 Seyyar et al. (2022) Seyyar, Y.E., Yavuz, A.G., Ünver, H.M.: Detection of web attacks using the bert model. In: 2022 30th Signal Processing and Communications Applications Conference (SIU), pp. 1–4 (2022). https://doi.org/10.1109/SIU55565.2022.9864721 Guo et al. (2021) Guo, H., Yuan, S., Wu, X.: Logbert: Log anomaly detection via bert, 1–8 (2021). IEEE Shao et al. (2022) Shao, Y., Zhang, W., Liu, P., Huyue, R., Tang, R., Yin, Q., Li, Q.: Log anomaly detection method based on bert model optimization. In: 2022 7th International Conference on Cloud Computing and Big Data Analytics (ICCCBDA), pp. 161–166 (2022). https://doi.org/10.1109/ICCCBDA55098.2022.9778900 Guo et al. (2022) Guo, H., Lin, X., Yang, J., Zhuang, Y., Bai, J., Zheng, T., Zhang, B., Li, Z.: TransLog: A Unified Transformer-based Framework for Log Anomaly Detection (2022) Le and Zhang (2021) Le, V.-H., Zhang, H.: Log-based Anomaly Detection Without Log Parsing (2021) Gniewkowski et al. (2023) Gniewkowski, M., Maciejewski, H., Surmacz, T., Walentynowicz, W.: Sec2vec: Anomaly detection in http traffic and malicious urls. 
In: Proceedings of the 38th ACM/SIGAPP Symposium on Applied Computing. SAC ’23, pp. 1154–1162. Association for Computing Machinery, New York, NY, USA (2023). https://doi.org/10.1145/3555776.3577663 . https://doi.org/10.1145/3555776.3577663 Hilmi et al. (2020) Hilmi, M.A.A., Cahyanto, K.A., Mustamiin, M.: Apache Web Server - Access Log Pre-processing for Web Intrusion Detection. IEEE Dataport (2020). https://doi.org/10.21227/vvvq-6w47 . https://dx.doi.org/10.21227/vvvq-6w47 Giménez et al. (2010) Giménez, C.T., Villegas, A.P., Marañón, G. .: HTTP Dataset CSIC 2010. https://doi.org/10.7910/DVN/3QBYB5 . https://www.tic.itefi.csic.es/dataset/ ECML/PKDD (2021) ECML/PKDD: ECML/PKDD 2007 Discovery Challenge. https://gitlab.fing.edu.uy/gsi/web-application-attacks-datasets/-/tree/master/ecml_pkdd (2021) Oliner and Stearley (2007) Oliner, A.J., Stearley, J.: What supercomputers say: A study of five system logs. 37th Annual IEEE/IFIP International Conference on Dependable Systems and Networks (DSN’07), 575–584 (2007) Vaswani et al. (2017) Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, L., Polosukhin, I.: Attention is all you need. CoRR abs/1706.03762 (2017) 1706.03762 Schuster and Nakajima (2012) Schuster, M., Nakajima, K.: Japanese and korean voice search. In: 2012 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), pp. 5149–5152 (2012). https://doi.org/10.1109/ICASSP.2012.6289079 Liu et al. (2019) Liu, Y., Ott, M., Goyal, N., Du, J., Joshi, M., Chen, D., Levy, O., Lewis, M., Zettlemoyer, L., Stoyanov, V.: Roberta: A robustly optimized BERT pretraining approach. CoRR abs/1907.11692 (2019) 1907.11692 Sanh et al. (2019) Sanh, V., Debut, L., Chaumond, J., Wolf, T.: Distilbert, a distilled version of bert: smaller, faster, cheaper and lighter. ArXiv abs/1910.01108 (2019) Radford et al. (2019) Radford, A., Wu, J., Child, R., Luan, D., Amodei, D., Sutskever, I.: Language models are unsupervised multitask learners (2019) Black et al. (2021) Black, S., Leo, G., Wang, P., Leahy, C., Biderman, S.: GPT-Neo: Large Scale Autoregressive Language Modeling with Mesh-Tensorflow. https://doi.org/10.5281/zenodo.5297715 . If you use this software, please cite it using these metadata. https://doi.org/10.5281/zenodo.5297715 Kokalj et al. (2021) Kokalj, E., Škrlj, B., Lavrač, N., Pollak, S., Robnik-Šikonja, M.: BERT meets shapley: Extending SHAP explanations to transformer-based classifiers. In: Toivonen, H., Boggia, M. (eds.) Proceedings of the EACL Hackashop on News Media Content Analysis and Automated Report Generation, pp. 16–21. Association for Computational Linguistics, Online (2021). https://aclanthology.org/2021.hackashop-1.3 van der Maaten and Hinton (2008) Maaten, L., Hinton, G.: Visualizing data using t-SNE. Journal of Machine Learning Research 9, 2579–2605 (2008) Mohri et al. (2012) Mohri, M., Rostamizadeh, A., Talwalkar, A.: Foundations of Machine Learning. The MIT Press, ??? (2012) Adhikari, A., Bal, B.K.: Machine learning technique for intrusion detection in the field of the intrusion detection system. (2023) Copstein et al. (2022) Copstein, R., Karlsen, E., Schwartzentruber, J., Zincir-Heywood, N., Heywood, M.: Exploring syntactical features for anomaly detection in application logs. it - Information Technology 64(1-2), 15–27 (2022) https://doi.org/10.1515/itit-2021-0064 Nam et al. 
(2022) Nam, S., Yoo, J.-H., Hong, J.W.-K.: Vm failure prediction with log analysis using bert-cnn model, 331–337 (2022) https://doi.org/10.23919/CNSM55787.2022.9965187 Wang et al. (2018) Wang, M., Xu, L., Guo, L.: Anomaly detection of system logs based on natural language processing and deep learning. In: 2018 4th International Conference on Frontiers of Signal Processing (ICFSP), pp. 140–144 (2018). https://doi.org/10.1109/ICFSP.2018.8552075 Qi et al. (2022) Qi, J., Luan, Z., Huang, S., Wang, Y., Fung, C., Yang, H., Qian, D.: Adanomaly: Adaptive anomaly detection for system logs with adversarial learning. In: NOMS 2022-2022 IEEE/IFIP Network Operations and Management Symposium, pp. 1–5. IEEE Press, ??? (2022). https://doi.org/10.1109/NOMS54207.2022.9789917 . https://doi.org/10.1109/NOMS54207.2022.9789917 Seyyar et al. (2022) Seyyar, Y.E., Yavuz, A.G., Ünver, H.M.: Detection of web attacks using the bert model. In: 2022 30th Signal Processing and Communications Applications Conference (SIU), pp. 1–4 (2022). https://doi.org/10.1109/SIU55565.2022.9864721 Guo et al. (2021) Guo, H., Yuan, S., Wu, X.: Logbert: Log anomaly detection via bert, 1–8 (2021). IEEE Shao et al. (2022) Shao, Y., Zhang, W., Liu, P., Huyue, R., Tang, R., Yin, Q., Li, Q.: Log anomaly detection method based on bert model optimization. In: 2022 7th International Conference on Cloud Computing and Big Data Analytics (ICCCBDA), pp. 161–166 (2022). https://doi.org/10.1109/ICCCBDA55098.2022.9778900 Guo et al. (2022) Guo, H., Lin, X., Yang, J., Zhuang, Y., Bai, J., Zheng, T., Zhang, B., Li, Z.: TransLog: A Unified Transformer-based Framework for Log Anomaly Detection (2022) Le and Zhang (2021) Le, V.-H., Zhang, H.: Log-based Anomaly Detection Without Log Parsing (2021) Gniewkowski et al. (2023) Gniewkowski, M., Maciejewski, H., Surmacz, T., Walentynowicz, W.: Sec2vec: Anomaly detection in http traffic and malicious urls. In: Proceedings of the 38th ACM/SIGAPP Symposium on Applied Computing. SAC ’23, pp. 1154–1162. Association for Computing Machinery, New York, NY, USA (2023). https://doi.org/10.1145/3555776.3577663 . https://doi.org/10.1145/3555776.3577663 Hilmi et al. (2020) Hilmi, M.A.A., Cahyanto, K.A., Mustamiin, M.: Apache Web Server - Access Log Pre-processing for Web Intrusion Detection. IEEE Dataport (2020). https://doi.org/10.21227/vvvq-6w47 . https://dx.doi.org/10.21227/vvvq-6w47 Giménez et al. (2010) Giménez, C.T., Villegas, A.P., Marañón, G. .: HTTP Dataset CSIC 2010. https://doi.org/10.7910/DVN/3QBYB5 . https://www.tic.itefi.csic.es/dataset/ ECML/PKDD (2021) ECML/PKDD: ECML/PKDD 2007 Discovery Challenge. https://gitlab.fing.edu.uy/gsi/web-application-attacks-datasets/-/tree/master/ecml_pkdd (2021) Oliner and Stearley (2007) Oliner, A.J., Stearley, J.: What supercomputers say: A study of five system logs. 37th Annual IEEE/IFIP International Conference on Dependable Systems and Networks (DSN’07), 575–584 (2007) Vaswani et al. (2017) Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, L., Polosukhin, I.: Attention is all you need. CoRR abs/1706.03762 (2017) 1706.03762 Schuster and Nakajima (2012) Schuster, M., Nakajima, K.: Japanese and korean voice search. In: 2012 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), pp. 5149–5152 (2012). https://doi.org/10.1109/ICASSP.2012.6289079 Liu et al. 
(2019) Liu, Y., Ott, M., Goyal, N., Du, J., Joshi, M., Chen, D., Levy, O., Lewis, M., Zettlemoyer, L., Stoyanov, V.: Roberta: A robustly optimized BERT pretraining approach. CoRR abs/1907.11692 (2019) 1907.11692 Sanh et al. (2019) Sanh, V., Debut, L., Chaumond, J., Wolf, T.: Distilbert, a distilled version of bert: smaller, faster, cheaper and lighter. ArXiv abs/1910.01108 (2019) Radford et al. (2019) Radford, A., Wu, J., Child, R., Luan, D., Amodei, D., Sutskever, I.: Language models are unsupervised multitask learners (2019) Black et al. (2021) Black, S., Leo, G., Wang, P., Leahy, C., Biderman, S.: GPT-Neo: Large Scale Autoregressive Language Modeling with Mesh-Tensorflow. https://doi.org/10.5281/zenodo.5297715 . If you use this software, please cite it using these metadata. https://doi.org/10.5281/zenodo.5297715 Kokalj et al. (2021) Kokalj, E., Škrlj, B., Lavrač, N., Pollak, S., Robnik-Šikonja, M.: BERT meets shapley: Extending SHAP explanations to transformer-based classifiers. In: Toivonen, H., Boggia, M. (eds.) Proceedings of the EACL Hackashop on News Media Content Analysis and Automated Report Generation, pp. 16–21. Association for Computational Linguistics, Online (2021). https://aclanthology.org/2021.hackashop-1.3 van der Maaten and Hinton (2008) Maaten, L., Hinton, G.: Visualizing data using t-SNE. Journal of Machine Learning Research 9, 2579–2605 (2008) Mohri et al. (2012) Mohri, M., Rostamizadeh, A., Talwalkar, A.: Foundations of Machine Learning. The MIT Press, ??? (2012) Copstein, R., Karlsen, E., Schwartzentruber, J., Zincir-Heywood, N., Heywood, M.: Exploring syntactical features for anomaly detection in application logs. it - Information Technology 64(1-2), 15–27 (2022) https://doi.org/10.1515/itit-2021-0064 Nam et al. (2022) Nam, S., Yoo, J.-H., Hong, J.W.-K.: Vm failure prediction with log analysis using bert-cnn model, 331–337 (2022) https://doi.org/10.23919/CNSM55787.2022.9965187 Wang et al. (2018) Wang, M., Xu, L., Guo, L.: Anomaly detection of system logs based on natural language processing and deep learning. In: 2018 4th International Conference on Frontiers of Signal Processing (ICFSP), pp. 140–144 (2018). https://doi.org/10.1109/ICFSP.2018.8552075 Qi et al. (2022) Qi, J., Luan, Z., Huang, S., Wang, Y., Fung, C., Yang, H., Qian, D.: Adanomaly: Adaptive anomaly detection for system logs with adversarial learning. In: NOMS 2022-2022 IEEE/IFIP Network Operations and Management Symposium, pp. 1–5. IEEE Press, ??? (2022). https://doi.org/10.1109/NOMS54207.2022.9789917 . https://doi.org/10.1109/NOMS54207.2022.9789917 Seyyar et al. (2022) Seyyar, Y.E., Yavuz, A.G., Ünver, H.M.: Detection of web attacks using the bert model. In: 2022 30th Signal Processing and Communications Applications Conference (SIU), pp. 1–4 (2022). https://doi.org/10.1109/SIU55565.2022.9864721 Guo et al. (2021) Guo, H., Yuan, S., Wu, X.: Logbert: Log anomaly detection via bert, 1–8 (2021). IEEE Shao et al. (2022) Shao, Y., Zhang, W., Liu, P., Huyue, R., Tang, R., Yin, Q., Li, Q.: Log anomaly detection method based on bert model optimization. In: 2022 7th International Conference on Cloud Computing and Big Data Analytics (ICCCBDA), pp. 161–166 (2022). https://doi.org/10.1109/ICCCBDA55098.2022.9778900 Guo et al. (2022) Guo, H., Lin, X., Yang, J., Zhuang, Y., Bai, J., Zheng, T., Zhang, B., Li, Z.: TransLog: A Unified Transformer-based Framework for Log Anomaly Detection (2022) Le and Zhang (2021) Le, V.-H., Zhang, H.: Log-based Anomaly Detection Without Log Parsing (2021) Gniewkowski et al. 
Journal of Machine Learning Research 9, 2579–2605 (2008) Mohri et al. (2012) Mohri, M., Rostamizadeh, A., Talwalkar, A.: Foundations of Machine Learning. The MIT Press, ??? (2012) Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, L., Polosukhin, I.: Attention is all you need. CoRR abs/1706.03762 (2017) 1706.03762 Schuster and Nakajima (2012) Schuster, M., Nakajima, K.: Japanese and korean voice search. In: 2012 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), pp. 5149–5152 (2012). https://doi.org/10.1109/ICASSP.2012.6289079 Liu et al. (2019) Liu, Y., Ott, M., Goyal, N., Du, J., Joshi, M., Chen, D., Levy, O., Lewis, M., Zettlemoyer, L., Stoyanov, V.: Roberta: A robustly optimized BERT pretraining approach. CoRR abs/1907.11692 (2019) 1907.11692 Sanh et al. (2019) Sanh, V., Debut, L., Chaumond, J., Wolf, T.: Distilbert, a distilled version of bert: smaller, faster, cheaper and lighter. ArXiv abs/1910.01108 (2019) Radford et al. (2019) Radford, A., Wu, J., Child, R., Luan, D., Amodei, D., Sutskever, I.: Language models are unsupervised multitask learners (2019) Black et al. (2021) Black, S., Leo, G., Wang, P., Leahy, C., Biderman, S.: GPT-Neo: Large Scale Autoregressive Language Modeling with Mesh-Tensorflow. https://doi.org/10.5281/zenodo.5297715 . If you use this software, please cite it using these metadata. https://doi.org/10.5281/zenodo.5297715 Kokalj et al. (2021) Kokalj, E., Škrlj, B., Lavrač, N., Pollak, S., Robnik-Šikonja, M.: BERT meets shapley: Extending SHAP explanations to transformer-based classifiers. In: Toivonen, H., Boggia, M. (eds.) Proceedings of the EACL Hackashop on News Media Content Analysis and Automated Report Generation, pp. 16–21. Association for Computational Linguistics, Online (2021). https://aclanthology.org/2021.hackashop-1.3 van der Maaten and Hinton (2008) Maaten, L., Hinton, G.: Visualizing data using t-SNE. Journal of Machine Learning Research 9, 2579–2605 (2008) Mohri et al. (2012) Mohri, M., Rostamizadeh, A., Talwalkar, A.: Foundations of Machine Learning. The MIT Press, ??? (2012) Schuster, M., Nakajima, K.: Japanese and korean voice search. In: 2012 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), pp. 5149–5152 (2012). https://doi.org/10.1109/ICASSP.2012.6289079 Liu et al. (2019) Liu, Y., Ott, M., Goyal, N., Du, J., Joshi, M., Chen, D., Levy, O., Lewis, M., Zettlemoyer, L., Stoyanov, V.: Roberta: A robustly optimized BERT pretraining approach. CoRR abs/1907.11692 (2019) 1907.11692 Sanh et al. (2019) Sanh, V., Debut, L., Chaumond, J., Wolf, T.: Distilbert, a distilled version of bert: smaller, faster, cheaper and lighter. ArXiv abs/1910.01108 (2019) Radford et al. (2019) Radford, A., Wu, J., Child, R., Luan, D., Amodei, D., Sutskever, I.: Language models are unsupervised multitask learners (2019) Black et al. (2021) Black, S., Leo, G., Wang, P., Leahy, C., Biderman, S.: GPT-Neo: Large Scale Autoregressive Language Modeling with Mesh-Tensorflow. https://doi.org/10.5281/zenodo.5297715 . If you use this software, please cite it using these metadata. https://doi.org/10.5281/zenodo.5297715 Kokalj et al. (2021) Kokalj, E., Škrlj, B., Lavrač, N., Pollak, S., Robnik-Šikonja, M.: BERT meets shapley: Extending SHAP explanations to transformer-based classifiers. In: Toivonen, H., Boggia, M. (eds.) Proceedings of the EACL Hackashop on News Media Content Analysis and Automated Report Generation, pp. 16–21. Association for Computational Linguistics, Online (2021). 
https://aclanthology.org/2021.hackashop-1.3 van der Maaten and Hinton (2008) Maaten, L., Hinton, G.: Visualizing data using t-SNE. Journal of Machine Learning Research 9, 2579–2605 (2008) Mohri et al. (2012) Mohri, M., Rostamizadeh, A., Talwalkar, A.: Foundations of Machine Learning. The MIT Press, ??? (2012) Liu, Y., Ott, M., Goyal, N., Du, J., Joshi, M., Chen, D., Levy, O., Lewis, M., Zettlemoyer, L., Stoyanov, V.: Roberta: A robustly optimized BERT pretraining approach. CoRR abs/1907.11692 (2019) 1907.11692 Sanh et al. (2019) Sanh, V., Debut, L., Chaumond, J., Wolf, T.: Distilbert, a distilled version of bert: smaller, faster, cheaper and lighter. ArXiv abs/1910.01108 (2019) Radford et al. (2019) Radford, A., Wu, J., Child, R., Luan, D., Amodei, D., Sutskever, I.: Language models are unsupervised multitask learners (2019) Black et al. (2021) Black, S., Leo, G., Wang, P., Leahy, C., Biderman, S.: GPT-Neo: Large Scale Autoregressive Language Modeling with Mesh-Tensorflow. https://doi.org/10.5281/zenodo.5297715 . If you use this software, please cite it using these metadata. https://doi.org/10.5281/zenodo.5297715 Kokalj et al. (2021) Kokalj, E., Škrlj, B., Lavrač, N., Pollak, S., Robnik-Šikonja, M.: BERT meets shapley: Extending SHAP explanations to transformer-based classifiers. In: Toivonen, H., Boggia, M. (eds.) Proceedings of the EACL Hackashop on News Media Content Analysis and Automated Report Generation, pp. 16–21. Association for Computational Linguistics, Online (2021). https://aclanthology.org/2021.hackashop-1.3 van der Maaten and Hinton (2008) Maaten, L., Hinton, G.: Visualizing data using t-SNE. Journal of Machine Learning Research 9, 2579–2605 (2008) Mohri et al. (2012) Mohri, M., Rostamizadeh, A., Talwalkar, A.: Foundations of Machine Learning. The MIT Press, ??? (2012) Sanh, V., Debut, L., Chaumond, J., Wolf, T.: Distilbert, a distilled version of bert: smaller, faster, cheaper and lighter. ArXiv abs/1910.01108 (2019) Radford et al. (2019) Radford, A., Wu, J., Child, R., Luan, D., Amodei, D., Sutskever, I.: Language models are unsupervised multitask learners (2019) Black et al. (2021) Black, S., Leo, G., Wang, P., Leahy, C., Biderman, S.: GPT-Neo: Large Scale Autoregressive Language Modeling with Mesh-Tensorflow. https://doi.org/10.5281/zenodo.5297715 . If you use this software, please cite it using these metadata. https://doi.org/10.5281/zenodo.5297715 Kokalj et al. (2021) Kokalj, E., Škrlj, B., Lavrač, N., Pollak, S., Robnik-Šikonja, M.: BERT meets shapley: Extending SHAP explanations to transformer-based classifiers. In: Toivonen, H., Boggia, M. (eds.) Proceedings of the EACL Hackashop on News Media Content Analysis and Automated Report Generation, pp. 16–21. Association for Computational Linguistics, Online (2021). https://aclanthology.org/2021.hackashop-1.3 van der Maaten and Hinton (2008) Maaten, L., Hinton, G.: Visualizing data using t-SNE. Journal of Machine Learning Research 9, 2579–2605 (2008) Mohri et al. (2012) Mohri, M., Rostamizadeh, A., Talwalkar, A.: Foundations of Machine Learning. The MIT Press, ??? (2012) Radford, A., Wu, J., Child, R., Luan, D., Amodei, D., Sutskever, I.: Language models are unsupervised multitask learners (2019) Black et al. (2021) Black, S., Leo, G., Wang, P., Leahy, C., Biderman, S.: GPT-Neo: Large Scale Autoregressive Language Modeling with Mesh-Tensorflow. https://doi.org/10.5281/zenodo.5297715 . If you use this software, please cite it using these metadata. https://doi.org/10.5281/zenodo.5297715 Kokalj et al. 
(2021) Kokalj, E., Škrlj, B., Lavrač, N., Pollak, S., Robnik-Šikonja, M.: BERT meets shapley: Extending SHAP explanations to transformer-based classifiers. In: Toivonen, H., Boggia, M. (eds.) Proceedings of the EACL Hackashop on News Media Content Analysis and Automated Report Generation, pp. 16–21. Association for Computational Linguistics, Online (2021). https://aclanthology.org/2021.hackashop-1.3 van der Maaten and Hinton (2008) Maaten, L., Hinton, G.: Visualizing data using t-SNE. Journal of Machine Learning Research 9, 2579–2605 (2008) Mohri et al. (2012) Mohri, M., Rostamizadeh, A., Talwalkar, A.: Foundations of Machine Learning. The MIT Press, ??? (2012) Black, S., Leo, G., Wang, P., Leahy, C., Biderman, S.: GPT-Neo: Large Scale Autoregressive Language Modeling with Mesh-Tensorflow. https://doi.org/10.5281/zenodo.5297715 . If you use this software, please cite it using these metadata. https://doi.org/10.5281/zenodo.5297715 Kokalj et al. (2021) Kokalj, E., Škrlj, B., Lavrač, N., Pollak, S., Robnik-Šikonja, M.: BERT meets shapley: Extending SHAP explanations to transformer-based classifiers. In: Toivonen, H., Boggia, M. (eds.) Proceedings of the EACL Hackashop on News Media Content Analysis and Automated Report Generation, pp. 16–21. Association for Computational Linguistics, Online (2021). https://aclanthology.org/2021.hackashop-1.3 van der Maaten and Hinton (2008) Maaten, L., Hinton, G.: Visualizing data using t-SNE. Journal of Machine Learning Research 9, 2579–2605 (2008) Mohri et al. (2012) Mohri, M., Rostamizadeh, A., Talwalkar, A.: Foundations of Machine Learning. The MIT Press, ??? (2012) Kokalj, E., Škrlj, B., Lavrač, N., Pollak, S., Robnik-Šikonja, M.: BERT meets shapley: Extending SHAP explanations to transformer-based classifiers. In: Toivonen, H., Boggia, M. (eds.) Proceedings of the EACL Hackashop on News Media Content Analysis and Automated Report Generation, pp. 16–21. Association for Computational Linguistics, Online (2021). https://aclanthology.org/2021.hackashop-1.3 van der Maaten and Hinton (2008) Maaten, L., Hinton, G.: Visualizing data using t-SNE. Journal of Machine Learning Research 9, 2579–2605 (2008) Mohri et al. (2012) Mohri, M., Rostamizadeh, A., Talwalkar, A.: Foundations of Machine Learning. The MIT Press, ??? (2012) Maaten, L., Hinton, G.: Visualizing data using t-SNE. Journal of Machine Learning Research 9, 2579–2605 (2008) Mohri et al. (2012) Mohri, M., Rostamizadeh, A., Talwalkar, A.: Foundations of Machine Learning. The MIT Press, ??? (2012) Mohri, M., Rostamizadeh, A., Talwalkar, A.: Foundations of Machine Learning. The MIT Press, ??? (2012)
  8. Tümer Sivri, T., Pervan Akman, N., Berkol, A., Peker, C.: Web intrusion detection using character level machine learning approaches with upsampled data, pp. 269–274 (2022). https://doi.org/10.15439/2022F147 Adhikari and Bal (2023) Adhikari, A., Bal, B.K.: Machine learning technique for intrusion detection in the field of the intrusion detection system. (2023) Copstein et al. (2022) Copstein, R., Karlsen, E., Schwartzentruber, J., Zincir-Heywood, N., Heywood, M.: Exploring syntactical features for anomaly detection in application logs. it - Information Technology 64(1-2), 15–27 (2022) https://doi.org/10.1515/itit-2021-0064 Nam et al. (2022) Nam, S., Yoo, J.-H., Hong, J.W.-K.: Vm failure prediction with log analysis using bert-cnn model, 331–337 (2022) https://doi.org/10.23919/CNSM55787.2022.9965187 Wang et al. (2018) Wang, M., Xu, L., Guo, L.: Anomaly detection of system logs based on natural language processing and deep learning. In: 2018 4th International Conference on Frontiers of Signal Processing (ICFSP), pp. 140–144 (2018). https://doi.org/10.1109/ICFSP.2018.8552075 Qi et al. (2022) Qi, J., Luan, Z., Huang, S., Wang, Y., Fung, C., Yang, H., Qian, D.: Adanomaly: Adaptive anomaly detection for system logs with adversarial learning. In: NOMS 2022-2022 IEEE/IFIP Network Operations and Management Symposium, pp. 1–5. IEEE Press, ??? (2022). https://doi.org/10.1109/NOMS54207.2022.9789917 . https://doi.org/10.1109/NOMS54207.2022.9789917 Seyyar et al. (2022) Seyyar, Y.E., Yavuz, A.G., Ünver, H.M.: Detection of web attacks using the bert model. In: 2022 30th Signal Processing and Communications Applications Conference (SIU), pp. 1–4 (2022). https://doi.org/10.1109/SIU55565.2022.9864721 Guo et al. (2021) Guo, H., Yuan, S., Wu, X.: Logbert: Log anomaly detection via bert, 1–8 (2021). IEEE Shao et al. (2022) Shao, Y., Zhang, W., Liu, P., Huyue, R., Tang, R., Yin, Q., Li, Q.: Log anomaly detection method based on bert model optimization. In: 2022 7th International Conference on Cloud Computing and Big Data Analytics (ICCCBDA), pp. 161–166 (2022). https://doi.org/10.1109/ICCCBDA55098.2022.9778900 Guo et al. (2022) Guo, H., Lin, X., Yang, J., Zhuang, Y., Bai, J., Zheng, T., Zhang, B., Li, Z.: TransLog: A Unified Transformer-based Framework for Log Anomaly Detection (2022) Le and Zhang (2021) Le, V.-H., Zhang, H.: Log-based Anomaly Detection Without Log Parsing (2021) Gniewkowski et al. (2023) Gniewkowski, M., Maciejewski, H., Surmacz, T., Walentynowicz, W.: Sec2vec: Anomaly detection in http traffic and malicious urls. In: Proceedings of the 38th ACM/SIGAPP Symposium on Applied Computing. SAC ’23, pp. 1154–1162. Association for Computing Machinery, New York, NY, USA (2023). https://doi.org/10.1145/3555776.3577663 . https://doi.org/10.1145/3555776.3577663 Hilmi et al. (2020) Hilmi, M.A.A., Cahyanto, K.A., Mustamiin, M.: Apache Web Server - Access Log Pre-processing for Web Intrusion Detection. IEEE Dataport (2020). https://doi.org/10.21227/vvvq-6w47 . https://dx.doi.org/10.21227/vvvq-6w47 Giménez et al. (2010) Giménez, C.T., Villegas, A.P., Marañón, G. .: HTTP Dataset CSIC 2010. https://doi.org/10.7910/DVN/3QBYB5 . https://www.tic.itefi.csic.es/dataset/ ECML/PKDD (2021) ECML/PKDD: ECML/PKDD 2007 Discovery Challenge. https://gitlab.fing.edu.uy/gsi/web-application-attacks-datasets/-/tree/master/ecml_pkdd (2021) Oliner and Stearley (2007) Oliner, A.J., Stearley, J.: What supercomputers say: A study of five system logs. 
37th Annual IEEE/IFIP International Conference on Dependable Systems and Networks (DSN’07), 575–584 (2007) Vaswani et al. (2017) Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, L., Polosukhin, I.: Attention is all you need. CoRR abs/1706.03762 (2017) 1706.03762 Schuster and Nakajima (2012) Schuster, M., Nakajima, K.: Japanese and korean voice search. In: 2012 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), pp. 5149–5152 (2012). https://doi.org/10.1109/ICASSP.2012.6289079 Liu et al. (2019) Liu, Y., Ott, M., Goyal, N., Du, J., Joshi, M., Chen, D., Levy, O., Lewis, M., Zettlemoyer, L., Stoyanov, V.: Roberta: A robustly optimized BERT pretraining approach. CoRR abs/1907.11692 (2019) 1907.11692 Sanh et al. (2019) Sanh, V., Debut, L., Chaumond, J., Wolf, T.: Distilbert, a distilled version of bert: smaller, faster, cheaper and lighter. ArXiv abs/1910.01108 (2019) Radford et al. (2019) Radford, A., Wu, J., Child, R., Luan, D., Amodei, D., Sutskever, I.: Language models are unsupervised multitask learners (2019) Black et al. (2021) Black, S., Leo, G., Wang, P., Leahy, C., Biderman, S.: GPT-Neo: Large Scale Autoregressive Language Modeling with Mesh-Tensorflow. https://doi.org/10.5281/zenodo.5297715 . If you use this software, please cite it using these metadata. https://doi.org/10.5281/zenodo.5297715 Kokalj et al. (2021) Kokalj, E., Škrlj, B., Lavrač, N., Pollak, S., Robnik-Šikonja, M.: BERT meets shapley: Extending SHAP explanations to transformer-based classifiers. In: Toivonen, H., Boggia, M. (eds.) Proceedings of the EACL Hackashop on News Media Content Analysis and Automated Report Generation, pp. 16–21. Association for Computational Linguistics, Online (2021). https://aclanthology.org/2021.hackashop-1.3 van der Maaten and Hinton (2008) Maaten, L., Hinton, G.: Visualizing data using t-SNE. Journal of Machine Learning Research 9, 2579–2605 (2008) Mohri et al. (2012) Mohri, M., Rostamizadeh, A., Talwalkar, A.: Foundations of Machine Learning. The MIT Press, ??? (2012) Adhikari, A., Bal, B.K.: Machine learning technique for intrusion detection in the field of the intrusion detection system. (2023) Copstein et al. (2022) Copstein, R., Karlsen, E., Schwartzentruber, J., Zincir-Heywood, N., Heywood, M.: Exploring syntactical features for anomaly detection in application logs. it - Information Technology 64(1-2), 15–27 (2022) https://doi.org/10.1515/itit-2021-0064 Nam et al. (2022) Nam, S., Yoo, J.-H., Hong, J.W.-K.: Vm failure prediction with log analysis using bert-cnn model, 331–337 (2022) https://doi.org/10.23919/CNSM55787.2022.9965187 Wang et al. (2018) Wang, M., Xu, L., Guo, L.: Anomaly detection of system logs based on natural language processing and deep learning. In: 2018 4th International Conference on Frontiers of Signal Processing (ICFSP), pp. 140–144 (2018). https://doi.org/10.1109/ICFSP.2018.8552075 Qi et al. (2022) Qi, J., Luan, Z., Huang, S., Wang, Y., Fung, C., Yang, H., Qian, D.: Adanomaly: Adaptive anomaly detection for system logs with adversarial learning. In: NOMS 2022-2022 IEEE/IFIP Network Operations and Management Symposium, pp. 1–5. IEEE Press, ??? (2022). https://doi.org/10.1109/NOMS54207.2022.9789917 . https://doi.org/10.1109/NOMS54207.2022.9789917 Seyyar et al. (2022) Seyyar, Y.E., Yavuz, A.G., Ünver, H.M.: Detection of web attacks using the bert model. In: 2022 30th Signal Processing and Communications Applications Conference (SIU), pp. 1–4 (2022). 
https://doi.org/10.1109/SIU55565.2022.9864721 Guo et al. (2021) Guo, H., Yuan, S., Wu, X.: Logbert: Log anomaly detection via bert, 1–8 (2021). IEEE Shao et al. (2022) Shao, Y., Zhang, W., Liu, P., Huyue, R., Tang, R., Yin, Q., Li, Q.: Log anomaly detection method based on bert model optimization. In: 2022 7th International Conference on Cloud Computing and Big Data Analytics (ICCCBDA), pp. 161–166 (2022). https://doi.org/10.1109/ICCCBDA55098.2022.9778900 Guo et al. (2022) Guo, H., Lin, X., Yang, J., Zhuang, Y., Bai, J., Zheng, T., Zhang, B., Li, Z.: TransLog: A Unified Transformer-based Framework for Log Anomaly Detection (2022) Le and Zhang (2021) Le, V.-H., Zhang, H.: Log-based Anomaly Detection Without Log Parsing (2021) Gniewkowski et al. (2023) Gniewkowski, M., Maciejewski, H., Surmacz, T., Walentynowicz, W.: Sec2vec: Anomaly detection in http traffic and malicious urls. In: Proceedings of the 38th ACM/SIGAPP Symposium on Applied Computing. SAC ’23, pp. 1154–1162. Association for Computing Machinery, New York, NY, USA (2023). https://doi.org/10.1145/3555776.3577663 . https://doi.org/10.1145/3555776.3577663 Hilmi et al. (2020) Hilmi, M.A.A., Cahyanto, K.A., Mustamiin, M.: Apache Web Server - Access Log Pre-processing for Web Intrusion Detection. IEEE Dataport (2020). https://doi.org/10.21227/vvvq-6w47 . https://dx.doi.org/10.21227/vvvq-6w47 Giménez et al. (2010) Giménez, C.T., Villegas, A.P., Marañón, G. .: HTTP Dataset CSIC 2010. https://doi.org/10.7910/DVN/3QBYB5 . https://www.tic.itefi.csic.es/dataset/ ECML/PKDD (2021) ECML/PKDD: ECML/PKDD 2007 Discovery Challenge. https://gitlab.fing.edu.uy/gsi/web-application-attacks-datasets/-/tree/master/ecml_pkdd (2021) Oliner and Stearley (2007) Oliner, A.J., Stearley, J.: What supercomputers say: A study of five system logs. 37th Annual IEEE/IFIP International Conference on Dependable Systems and Networks (DSN’07), 575–584 (2007) Vaswani et al. (2017) Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, L., Polosukhin, I.: Attention is all you need. CoRR abs/1706.03762 (2017) 1706.03762 Schuster and Nakajima (2012) Schuster, M., Nakajima, K.: Japanese and korean voice search. In: 2012 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), pp. 5149–5152 (2012). https://doi.org/10.1109/ICASSP.2012.6289079 Liu et al. (2019) Liu, Y., Ott, M., Goyal, N., Du, J., Joshi, M., Chen, D., Levy, O., Lewis, M., Zettlemoyer, L., Stoyanov, V.: Roberta: A robustly optimized BERT pretraining approach. CoRR abs/1907.11692 (2019) 1907.11692 Sanh et al. (2019) Sanh, V., Debut, L., Chaumond, J., Wolf, T.: Distilbert, a distilled version of bert: smaller, faster, cheaper and lighter. ArXiv abs/1910.01108 (2019) Radford et al. (2019) Radford, A., Wu, J., Child, R., Luan, D., Amodei, D., Sutskever, I.: Language models are unsupervised multitask learners (2019) Black et al. (2021) Black, S., Leo, G., Wang, P., Leahy, C., Biderman, S.: GPT-Neo: Large Scale Autoregressive Language Modeling with Mesh-Tensorflow. https://doi.org/10.5281/zenodo.5297715 . If you use this software, please cite it using these metadata. https://doi.org/10.5281/zenodo.5297715 Kokalj et al. (2021) Kokalj, E., Škrlj, B., Lavrač, N., Pollak, S., Robnik-Šikonja, M.: BERT meets shapley: Extending SHAP explanations to transformer-based classifiers. In: Toivonen, H., Boggia, M. (eds.) Proceedings of the EACL Hackashop on News Media Content Analysis and Automated Report Generation, pp. 16–21. 
Association for Computational Linguistics, Online (2021). https://aclanthology.org/2021.hackashop-1.3 van der Maaten and Hinton (2008) Maaten, L., Hinton, G.: Visualizing data using t-SNE. Journal of Machine Learning Research 9, 2579–2605 (2008) Mohri et al. (2012) Mohri, M., Rostamizadeh, A., Talwalkar, A.: Foundations of Machine Learning. The MIT Press, ??? (2012) Copstein, R., Karlsen, E., Schwartzentruber, J., Zincir-Heywood, N., Heywood, M.: Exploring syntactical features for anomaly detection in application logs. it - Information Technology 64(1-2), 15–27 (2022) https://doi.org/10.1515/itit-2021-0064 Nam et al. (2022) Nam, S., Yoo, J.-H., Hong, J.W.-K.: Vm failure prediction with log analysis using bert-cnn model, 331–337 (2022) https://doi.org/10.23919/CNSM55787.2022.9965187 Wang et al. (2018) Wang, M., Xu, L., Guo, L.: Anomaly detection of system logs based on natural language processing and deep learning. In: 2018 4th International Conference on Frontiers of Signal Processing (ICFSP), pp. 140–144 (2018). https://doi.org/10.1109/ICFSP.2018.8552075 Qi et al. (2022) Qi, J., Luan, Z., Huang, S., Wang, Y., Fung, C., Yang, H., Qian, D.: Adanomaly: Adaptive anomaly detection for system logs with adversarial learning. In: NOMS 2022-2022 IEEE/IFIP Network Operations and Management Symposium, pp. 1–5. IEEE Press, ??? (2022). https://doi.org/10.1109/NOMS54207.2022.9789917 . https://doi.org/10.1109/NOMS54207.2022.9789917 Seyyar et al. (2022) Seyyar, Y.E., Yavuz, A.G., Ünver, H.M.: Detection of web attacks using the bert model. In: 2022 30th Signal Processing and Communications Applications Conference (SIU), pp. 1–4 (2022). https://doi.org/10.1109/SIU55565.2022.9864721 Guo et al. (2021) Guo, H., Yuan, S., Wu, X.: Logbert: Log anomaly detection via bert, 1–8 (2021). IEEE Shao et al. (2022) Shao, Y., Zhang, W., Liu, P., Huyue, R., Tang, R., Yin, Q., Li, Q.: Log anomaly detection method based on bert model optimization. In: 2022 7th International Conference on Cloud Computing and Big Data Analytics (ICCCBDA), pp. 161–166 (2022). https://doi.org/10.1109/ICCCBDA55098.2022.9778900 Guo et al. (2022) Guo, H., Lin, X., Yang, J., Zhuang, Y., Bai, J., Zheng, T., Zhang, B., Li, Z.: TransLog: A Unified Transformer-based Framework for Log Anomaly Detection (2022) Le and Zhang (2021) Le, V.-H., Zhang, H.: Log-based Anomaly Detection Without Log Parsing (2021) Gniewkowski et al. (2023) Gniewkowski, M., Maciejewski, H., Surmacz, T., Walentynowicz, W.: Sec2vec: Anomaly detection in http traffic and malicious urls. In: Proceedings of the 38th ACM/SIGAPP Symposium on Applied Computing. SAC ’23, pp. 1154–1162. Association for Computing Machinery, New York, NY, USA (2023). https://doi.org/10.1145/3555776.3577663 . https://doi.org/10.1145/3555776.3577663 Hilmi et al. (2020) Hilmi, M.A.A., Cahyanto, K.A., Mustamiin, M.: Apache Web Server - Access Log Pre-processing for Web Intrusion Detection. IEEE Dataport (2020). https://doi.org/10.21227/vvvq-6w47 . https://dx.doi.org/10.21227/vvvq-6w47 Giménez et al. (2010) Giménez, C.T., Villegas, A.P., Marañón, G. .: HTTP Dataset CSIC 2010. https://doi.org/10.7910/DVN/3QBYB5 . https://www.tic.itefi.csic.es/dataset/ ECML/PKDD (2021) ECML/PKDD: ECML/PKDD 2007 Discovery Challenge. https://gitlab.fing.edu.uy/gsi/web-application-attacks-datasets/-/tree/master/ecml_pkdd (2021) Oliner and Stearley (2007) Oliner, A.J., Stearley, J.: What supercomputers say: A study of five system logs. 
37th Annual IEEE/IFIP International Conference on Dependable Systems and Networks (DSN’07), 575–584 (2007) Vaswani et al. (2017) Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, L., Polosukhin, I.: Attention is all you need. CoRR abs/1706.03762 (2017) 1706.03762 Schuster and Nakajima (2012) Schuster, M., Nakajima, K.: Japanese and korean voice search. In: 2012 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), pp. 5149–5152 (2012). https://doi.org/10.1109/ICASSP.2012.6289079 Liu et al. (2019) Liu, Y., Ott, M., Goyal, N., Du, J., Joshi, M., Chen, D., Levy, O., Lewis, M., Zettlemoyer, L., Stoyanov, V.: Roberta: A robustly optimized BERT pretraining approach. CoRR abs/1907.11692 (2019) 1907.11692 Sanh et al. (2019) Sanh, V., Debut, L., Chaumond, J., Wolf, T.: Distilbert, a distilled version of bert: smaller, faster, cheaper and lighter. ArXiv abs/1910.01108 (2019) Radford et al. (2019) Radford, A., Wu, J., Child, R., Luan, D., Amodei, D., Sutskever, I.: Language models are unsupervised multitask learners (2019) Black et al. (2021) Black, S., Leo, G., Wang, P., Leahy, C., Biderman, S.: GPT-Neo: Large Scale Autoregressive Language Modeling with Mesh-Tensorflow. https://doi.org/10.5281/zenodo.5297715 . If you use this software, please cite it using these metadata. https://doi.org/10.5281/zenodo.5297715 Kokalj et al. (2021) Kokalj, E., Škrlj, B., Lavrač, N., Pollak, S., Robnik-Šikonja, M.: BERT meets shapley: Extending SHAP explanations to transformer-based classifiers. In: Toivonen, H., Boggia, M. (eds.) Proceedings of the EACL Hackashop on News Media Content Analysis and Automated Report Generation, pp. 16–21. Association for Computational Linguistics, Online (2021). https://aclanthology.org/2021.hackashop-1.3 van der Maaten and Hinton (2008) Maaten, L., Hinton, G.: Visualizing data using t-SNE. Journal of Machine Learning Research 9, 2579–2605 (2008) Mohri et al. (2012) Mohri, M., Rostamizadeh, A., Talwalkar, A.: Foundations of Machine Learning. The MIT Press, ??? (2012) Nam, S., Yoo, J.-H., Hong, J.W.-K.: Vm failure prediction with log analysis using bert-cnn model, 331–337 (2022) https://doi.org/10.23919/CNSM55787.2022.9965187 Wang et al. (2018) Wang, M., Xu, L., Guo, L.: Anomaly detection of system logs based on natural language processing and deep learning. In: 2018 4th International Conference on Frontiers of Signal Processing (ICFSP), pp. 140–144 (2018). https://doi.org/10.1109/ICFSP.2018.8552075 Qi et al. (2022) Qi, J., Luan, Z., Huang, S., Wang, Y., Fung, C., Yang, H., Qian, D.: Adanomaly: Adaptive anomaly detection for system logs with adversarial learning. In: NOMS 2022-2022 IEEE/IFIP Network Operations and Management Symposium, pp. 1–5. IEEE Press, ??? (2022). https://doi.org/10.1109/NOMS54207.2022.9789917 . https://doi.org/10.1109/NOMS54207.2022.9789917 Seyyar et al. (2022) Seyyar, Y.E., Yavuz, A.G., Ünver, H.M.: Detection of web attacks using the bert model. In: 2022 30th Signal Processing and Communications Applications Conference (SIU), pp. 1–4 (2022). https://doi.org/10.1109/SIU55565.2022.9864721 Guo et al. (2021) Guo, H., Yuan, S., Wu, X.: Logbert: Log anomaly detection via bert, 1–8 (2021). IEEE Shao et al. (2022) Shao, Y., Zhang, W., Liu, P., Huyue, R., Tang, R., Yin, Q., Li, Q.: Log anomaly detection method based on bert model optimization. In: 2022 7th International Conference on Cloud Computing and Big Data Analytics (ICCCBDA), pp. 161–166 (2022). 
https://doi.org/10.1109/ICCCBDA55098.2022.9778900 Guo et al. (2022) Guo, H., Lin, X., Yang, J., Zhuang, Y., Bai, J., Zheng, T., Zhang, B., Li, Z.: TransLog: A Unified Transformer-based Framework for Log Anomaly Detection (2022) Le and Zhang (2021) Le, V.-H., Zhang, H.: Log-based Anomaly Detection Without Log Parsing (2021) Gniewkowski et al. (2023) Gniewkowski, M., Maciejewski, H., Surmacz, T., Walentynowicz, W.: Sec2vec: Anomaly detection in http traffic and malicious urls. In: Proceedings of the 38th ACM/SIGAPP Symposium on Applied Computing. SAC ’23, pp. 1154–1162. Association for Computing Machinery, New York, NY, USA (2023). https://doi.org/10.1145/3555776.3577663 . https://doi.org/10.1145/3555776.3577663 Hilmi et al. (2020) Hilmi, M.A.A., Cahyanto, K.A., Mustamiin, M.: Apache Web Server - Access Log Pre-processing for Web Intrusion Detection. IEEE Dataport (2020). https://doi.org/10.21227/vvvq-6w47 . https://dx.doi.org/10.21227/vvvq-6w47 Giménez et al. (2010) Giménez, C.T., Villegas, A.P., Marañón, G. .: HTTP Dataset CSIC 2010. https://doi.org/10.7910/DVN/3QBYB5 . https://www.tic.itefi.csic.es/dataset/ ECML/PKDD (2021) ECML/PKDD: ECML/PKDD 2007 Discovery Challenge. https://gitlab.fing.edu.uy/gsi/web-application-attacks-datasets/-/tree/master/ecml_pkdd (2021) Oliner and Stearley (2007) Oliner, A.J., Stearley, J.: What supercomputers say: A study of five system logs. 37th Annual IEEE/IFIP International Conference on Dependable Systems and Networks (DSN’07), 575–584 (2007) Vaswani et al. (2017) Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, L., Polosukhin, I.: Attention is all you need. CoRR abs/1706.03762 (2017) 1706.03762 Schuster and Nakajima (2012) Schuster, M., Nakajima, K.: Japanese and korean voice search. In: 2012 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), pp. 5149–5152 (2012). https://doi.org/10.1109/ICASSP.2012.6289079 Liu et al. (2019) Liu, Y., Ott, M., Goyal, N., Du, J., Joshi, M., Chen, D., Levy, O., Lewis, M., Zettlemoyer, L., Stoyanov, V.: Roberta: A robustly optimized BERT pretraining approach. CoRR abs/1907.11692 (2019) 1907.11692 Sanh et al. (2019) Sanh, V., Debut, L., Chaumond, J., Wolf, T.: Distilbert, a distilled version of bert: smaller, faster, cheaper and lighter. ArXiv abs/1910.01108 (2019) Radford et al. (2019) Radford, A., Wu, J., Child, R., Luan, D., Amodei, D., Sutskever, I.: Language models are unsupervised multitask learners (2019) Black et al. (2021) Black, S., Leo, G., Wang, P., Leahy, C., Biderman, S.: GPT-Neo: Large Scale Autoregressive Language Modeling with Mesh-Tensorflow. https://doi.org/10.5281/zenodo.5297715 . If you use this software, please cite it using these metadata. https://doi.org/10.5281/zenodo.5297715 Kokalj et al. (2021) Kokalj, E., Škrlj, B., Lavrač, N., Pollak, S., Robnik-Šikonja, M.: BERT meets shapley: Extending SHAP explanations to transformer-based classifiers. In: Toivonen, H., Boggia, M. (eds.) Proceedings of the EACL Hackashop on News Media Content Analysis and Automated Report Generation, pp. 16–21. Association for Computational Linguistics, Online (2021). https://aclanthology.org/2021.hackashop-1.3 van der Maaten and Hinton (2008) Maaten, L., Hinton, G.: Visualizing data using t-SNE. Journal of Machine Learning Research 9, 2579–2605 (2008) Mohri et al. (2012) Mohri, M., Rostamizadeh, A., Talwalkar, A.: Foundations of Machine Learning. The MIT Press, ??? 
(2012) Wang, M., Xu, L., Guo, L.: Anomaly detection of system logs based on natural language processing and deep learning. In: 2018 4th International Conference on Frontiers of Signal Processing (ICFSP), pp. 140–144 (2018). https://doi.org/10.1109/ICFSP.2018.8552075 Qi et al. (2022) Qi, J., Luan, Z., Huang, S., Wang, Y., Fung, C., Yang, H., Qian, D.: Adanomaly: Adaptive anomaly detection for system logs with adversarial learning. In: NOMS 2022-2022 IEEE/IFIP Network Operations and Management Symposium, pp. 1–5. IEEE Press, ??? (2022). https://doi.org/10.1109/NOMS54207.2022.9789917 . https://doi.org/10.1109/NOMS54207.2022.9789917 Seyyar et al. (2022) Seyyar, Y.E., Yavuz, A.G., Ünver, H.M.: Detection of web attacks using the bert model. In: 2022 30th Signal Processing and Communications Applications Conference (SIU), pp. 1–4 (2022). https://doi.org/10.1109/SIU55565.2022.9864721 Guo et al. (2021) Guo, H., Yuan, S., Wu, X.: Logbert: Log anomaly detection via bert, 1–8 (2021). IEEE Shao et al. (2022) Shao, Y., Zhang, W., Liu, P., Huyue, R., Tang, R., Yin, Q., Li, Q.: Log anomaly detection method based on bert model optimization. In: 2022 7th International Conference on Cloud Computing and Big Data Analytics (ICCCBDA), pp. 161–166 (2022). https://doi.org/10.1109/ICCCBDA55098.2022.9778900 Guo et al. (2022) Guo, H., Lin, X., Yang, J., Zhuang, Y., Bai, J., Zheng, T., Zhang, B., Li, Z.: TransLog: A Unified Transformer-based Framework for Log Anomaly Detection (2022) Le and Zhang (2021) Le, V.-H., Zhang, H.: Log-based Anomaly Detection Without Log Parsing (2021) Gniewkowski et al. (2023) Gniewkowski, M., Maciejewski, H., Surmacz, T., Walentynowicz, W.: Sec2vec: Anomaly detection in http traffic and malicious urls. In: Proceedings of the 38th ACM/SIGAPP Symposium on Applied Computing. SAC ’23, pp. 1154–1162. Association for Computing Machinery, New York, NY, USA (2023). https://doi.org/10.1145/3555776.3577663 . https://doi.org/10.1145/3555776.3577663 Hilmi et al. (2020) Hilmi, M.A.A., Cahyanto, K.A., Mustamiin, M.: Apache Web Server - Access Log Pre-processing for Web Intrusion Detection. IEEE Dataport (2020). https://doi.org/10.21227/vvvq-6w47 . https://dx.doi.org/10.21227/vvvq-6w47 Giménez et al. (2010) Giménez, C.T., Villegas, A.P., Marañón, G. .: HTTP Dataset CSIC 2010. https://doi.org/10.7910/DVN/3QBYB5 . https://www.tic.itefi.csic.es/dataset/ ECML/PKDD (2021) ECML/PKDD: ECML/PKDD 2007 Discovery Challenge. https://gitlab.fing.edu.uy/gsi/web-application-attacks-datasets/-/tree/master/ecml_pkdd (2021) Oliner and Stearley (2007) Oliner, A.J., Stearley, J.: What supercomputers say: A study of five system logs. 37th Annual IEEE/IFIP International Conference on Dependable Systems and Networks (DSN’07), 575–584 (2007) Vaswani et al. (2017) Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, L., Polosukhin, I.: Attention is all you need. CoRR abs/1706.03762 (2017) 1706.03762 Schuster and Nakajima (2012) Schuster, M., Nakajima, K.: Japanese and korean voice search. In: 2012 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), pp. 5149–5152 (2012). https://doi.org/10.1109/ICASSP.2012.6289079 Liu et al. (2019) Liu, Y., Ott, M., Goyal, N., Du, J., Joshi, M., Chen, D., Levy, O., Lewis, M., Zettlemoyer, L., Stoyanov, V.: Roberta: A robustly optimized BERT pretraining approach. CoRR abs/1907.11692 (2019) 1907.11692 Sanh et al. 
(2019) Sanh, V., Debut, L., Chaumond, J., Wolf, T.: Distilbert, a distilled version of bert: smaller, faster, cheaper and lighter. ArXiv abs/1910.01108 (2019) Radford et al. (2019) Radford, A., Wu, J., Child, R., Luan, D., Amodei, D., Sutskever, I.: Language models are unsupervised multitask learners (2019) Black et al. (2021) Black, S., Leo, G., Wang, P., Leahy, C., Biderman, S.: GPT-Neo: Large Scale Autoregressive Language Modeling with Mesh-Tensorflow. https://doi.org/10.5281/zenodo.5297715 . If you use this software, please cite it using these metadata. https://doi.org/10.5281/zenodo.5297715 Kokalj et al. (2021) Kokalj, E., Škrlj, B., Lavrač, N., Pollak, S., Robnik-Šikonja, M.: BERT meets shapley: Extending SHAP explanations to transformer-based classifiers. In: Toivonen, H., Boggia, M. (eds.) Proceedings of the EACL Hackashop on News Media Content Analysis and Automated Report Generation, pp. 16–21. Association for Computational Linguistics, Online (2021). https://aclanthology.org/2021.hackashop-1.3 van der Maaten and Hinton (2008) Maaten, L., Hinton, G.: Visualizing data using t-SNE. Journal of Machine Learning Research 9, 2579–2605 (2008) Mohri et al. (2012) Mohri, M., Rostamizadeh, A., Talwalkar, A.: Foundations of Machine Learning. The MIT Press, ??? (2012) Qi, J., Luan, Z., Huang, S., Wang, Y., Fung, C., Yang, H., Qian, D.: Adanomaly: Adaptive anomaly detection for system logs with adversarial learning. In: NOMS 2022-2022 IEEE/IFIP Network Operations and Management Symposium, pp. 1–5. IEEE Press, ??? (2022). https://doi.org/10.1109/NOMS54207.2022.9789917 . https://doi.org/10.1109/NOMS54207.2022.9789917 Seyyar et al. (2022) Seyyar, Y.E., Yavuz, A.G., Ünver, H.M.: Detection of web attacks using the bert model. In: 2022 30th Signal Processing and Communications Applications Conference (SIU), pp. 1–4 (2022). https://doi.org/10.1109/SIU55565.2022.9864721 Guo et al. (2021) Guo, H., Yuan, S., Wu, X.: Logbert: Log anomaly detection via bert, 1–8 (2021). IEEE Shao et al. (2022) Shao, Y., Zhang, W., Liu, P., Huyue, R., Tang, R., Yin, Q., Li, Q.: Log anomaly detection method based on bert model optimization. In: 2022 7th International Conference on Cloud Computing and Big Data Analytics (ICCCBDA), pp. 161–166 (2022). https://doi.org/10.1109/ICCCBDA55098.2022.9778900 Guo et al. (2022) Guo, H., Lin, X., Yang, J., Zhuang, Y., Bai, J., Zheng, T., Zhang, B., Li, Z.: TransLog: A Unified Transformer-based Framework for Log Anomaly Detection (2022) Le and Zhang (2021) Le, V.-H., Zhang, H.: Log-based Anomaly Detection Without Log Parsing (2021) Gniewkowski et al. (2023) Gniewkowski, M., Maciejewski, H., Surmacz, T., Walentynowicz, W.: Sec2vec: Anomaly detection in http traffic and malicious urls. In: Proceedings of the 38th ACM/SIGAPP Symposium on Applied Computing. SAC ’23, pp. 1154–1162. Association for Computing Machinery, New York, NY, USA (2023). https://doi.org/10.1145/3555776.3577663 . https://doi.org/10.1145/3555776.3577663 Hilmi et al. (2020) Hilmi, M.A.A., Cahyanto, K.A., Mustamiin, M.: Apache Web Server - Access Log Pre-processing for Web Intrusion Detection. IEEE Dataport (2020). https://doi.org/10.21227/vvvq-6w47 . https://dx.doi.org/10.21227/vvvq-6w47 Giménez et al. (2010) Giménez, C.T., Villegas, A.P., Marañón, G. .: HTTP Dataset CSIC 2010. https://doi.org/10.7910/DVN/3QBYB5 . https://www.tic.itefi.csic.es/dataset/ ECML/PKDD (2021) ECML/PKDD: ECML/PKDD 2007 Discovery Challenge. 
https://gitlab.fing.edu.uy/gsi/web-application-attacks-datasets/-/tree/master/ecml_pkdd (2021) Oliner and Stearley (2007) Oliner, A.J., Stearley, J.: What supercomputers say: A study of five system logs. 37th Annual IEEE/IFIP International Conference on Dependable Systems and Networks (DSN’07), 575–584 (2007) Vaswani et al. (2017) Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, L., Polosukhin, I.: Attention is all you need. CoRR abs/1706.03762 (2017) 1706.03762 Schuster and Nakajima (2012) Schuster, M., Nakajima, K.: Japanese and korean voice search. In: 2012 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), pp. 5149–5152 (2012). https://doi.org/10.1109/ICASSP.2012.6289079 Liu et al. (2019) Liu, Y., Ott, M., Goyal, N., Du, J., Joshi, M., Chen, D., Levy, O., Lewis, M., Zettlemoyer, L., Stoyanov, V.: Roberta: A robustly optimized BERT pretraining approach. CoRR abs/1907.11692 (2019) 1907.11692 Sanh et al. (2019) Sanh, V., Debut, L., Chaumond, J., Wolf, T.: Distilbert, a distilled version of bert: smaller, faster, cheaper and lighter. ArXiv abs/1910.01108 (2019) Radford et al. (2019) Radford, A., Wu, J., Child, R., Luan, D., Amodei, D., Sutskever, I.: Language models are unsupervised multitask learners (2019) Black et al. (2021) Black, S., Leo, G., Wang, P., Leahy, C., Biderman, S.: GPT-Neo: Large Scale Autoregressive Language Modeling with Mesh-Tensorflow. https://doi.org/10.5281/zenodo.5297715 . If you use this software, please cite it using these metadata. https://doi.org/10.5281/zenodo.5297715 Kokalj et al. (2021) Kokalj, E., Škrlj, B., Lavrač, N., Pollak, S., Robnik-Šikonja, M.: BERT meets shapley: Extending SHAP explanations to transformer-based classifiers. In: Toivonen, H., Boggia, M. (eds.) Proceedings of the EACL Hackashop on News Media Content Analysis and Automated Report Generation, pp. 16–21. Association for Computational Linguistics, Online (2021). https://aclanthology.org/2021.hackashop-1.3 van der Maaten and Hinton (2008) Maaten, L., Hinton, G.: Visualizing data using t-SNE. Journal of Machine Learning Research 9, 2579–2605 (2008) Mohri et al. (2012) Mohri, M., Rostamizadeh, A., Talwalkar, A.: Foundations of Machine Learning. The MIT Press, ??? (2012) Seyyar, Y.E., Yavuz, A.G., Ünver, H.M.: Detection of web attacks using the bert model. In: 2022 30th Signal Processing and Communications Applications Conference (SIU), pp. 1–4 (2022). https://doi.org/10.1109/SIU55565.2022.9864721 Guo et al. (2021) Guo, H., Yuan, S., Wu, X.: Logbert: Log anomaly detection via bert, 1–8 (2021). IEEE Shao et al. (2022) Shao, Y., Zhang, W., Liu, P., Huyue, R., Tang, R., Yin, Q., Li, Q.: Log anomaly detection method based on bert model optimization. In: 2022 7th International Conference on Cloud Computing and Big Data Analytics (ICCCBDA), pp. 161–166 (2022). https://doi.org/10.1109/ICCCBDA55098.2022.9778900 Guo et al. (2022) Guo, H., Lin, X., Yang, J., Zhuang, Y., Bai, J., Zheng, T., Zhang, B., Li, Z.: TransLog: A Unified Transformer-based Framework for Log Anomaly Detection (2022) Le and Zhang (2021) Le, V.-H., Zhang, H.: Log-based Anomaly Detection Without Log Parsing (2021) Gniewkowski et al. (2023) Gniewkowski, M., Maciejewski, H., Surmacz, T., Walentynowicz, W.: Sec2vec: Anomaly detection in http traffic and malicious urls. In: Proceedings of the 38th ACM/SIGAPP Symposium on Applied Computing. SAC ’23, pp. 1154–1162. Association for Computing Machinery, New York, NY, USA (2023). https://doi.org/10.1145/3555776.3577663 . 
https://doi.org/10.1145/3555776.3577663 Hilmi et al. (2020) Hilmi, M.A.A., Cahyanto, K.A., Mustamiin, M.: Apache Web Server - Access Log Pre-processing for Web Intrusion Detection. IEEE Dataport (2020). https://doi.org/10.21227/vvvq-6w47 . https://dx.doi.org/10.21227/vvvq-6w47 Giménez et al. (2010) Giménez, C.T., Villegas, A.P., Marañón, G. .: HTTP Dataset CSIC 2010. https://doi.org/10.7910/DVN/3QBYB5 . https://www.tic.itefi.csic.es/dataset/ ECML/PKDD (2021) ECML/PKDD: ECML/PKDD 2007 Discovery Challenge. https://gitlab.fing.edu.uy/gsi/web-application-attacks-datasets/-/tree/master/ecml_pkdd (2021) Oliner and Stearley (2007) Oliner, A.J., Stearley, J.: What supercomputers say: A study of five system logs. 37th Annual IEEE/IFIP International Conference on Dependable Systems and Networks (DSN’07), 575–584 (2007) Vaswani et al. (2017) Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, L., Polosukhin, I.: Attention is all you need. CoRR abs/1706.03762 (2017) 1706.03762 Schuster and Nakajima (2012) Schuster, M., Nakajima, K.: Japanese and korean voice search. In: 2012 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), pp. 5149–5152 (2012). https://doi.org/10.1109/ICASSP.2012.6289079 Liu et al. (2019) Liu, Y., Ott, M., Goyal, N., Du, J., Joshi, M., Chen, D., Levy, O., Lewis, M., Zettlemoyer, L., Stoyanov, V.: Roberta: A robustly optimized BERT pretraining approach. CoRR abs/1907.11692 (2019) 1907.11692 Sanh et al. (2019) Sanh, V., Debut, L., Chaumond, J., Wolf, T.: Distilbert, a distilled version of bert: smaller, faster, cheaper and lighter. ArXiv abs/1910.01108 (2019) Radford et al. (2019) Radford, A., Wu, J., Child, R., Luan, D., Amodei, D., Sutskever, I.: Language models are unsupervised multitask learners (2019) Black et al. (2021) Black, S., Leo, G., Wang, P., Leahy, C., Biderman, S.: GPT-Neo: Large Scale Autoregressive Language Modeling with Mesh-Tensorflow. https://doi.org/10.5281/zenodo.5297715 . If you use this software, please cite it using these metadata. https://doi.org/10.5281/zenodo.5297715 Kokalj et al. (2021) Kokalj, E., Škrlj, B., Lavrač, N., Pollak, S., Robnik-Šikonja, M.: BERT meets shapley: Extending SHAP explanations to transformer-based classifiers. In: Toivonen, H., Boggia, M. (eds.) Proceedings of the EACL Hackashop on News Media Content Analysis and Automated Report Generation, pp. 16–21. Association for Computational Linguistics, Online (2021). https://aclanthology.org/2021.hackashop-1.3 van der Maaten and Hinton (2008) Maaten, L., Hinton, G.: Visualizing data using t-SNE. Journal of Machine Learning Research 9, 2579–2605 (2008) Mohri et al. (2012) Mohri, M., Rostamizadeh, A., Talwalkar, A.: Foundations of Machine Learning. The MIT Press, ??? (2012) Guo, H., Yuan, S., Wu, X.: Logbert: Log anomaly detection via bert, 1–8 (2021). IEEE Shao et al. (2022) Shao, Y., Zhang, W., Liu, P., Huyue, R., Tang, R., Yin, Q., Li, Q.: Log anomaly detection method based on bert model optimization. In: 2022 7th International Conference on Cloud Computing and Big Data Analytics (ICCCBDA), pp. 161–166 (2022). https://doi.org/10.1109/ICCCBDA55098.2022.9778900 Guo et al. (2022) Guo, H., Lin, X., Yang, J., Zhuang, Y., Bai, J., Zheng, T., Zhang, B., Li, Z.: TransLog: A Unified Transformer-based Framework for Log Anomaly Detection (2022) Le and Zhang (2021) Le, V.-H., Zhang, H.: Log-based Anomaly Detection Without Log Parsing (2021) Gniewkowski et al. 
(2023) Gniewkowski, M., Maciejewski, H., Surmacz, T., Walentynowicz, W.: Sec2vec: Anomaly detection in HTTP traffic and malicious URLs. In: Proceedings of the 38th ACM/SIGAPP Symposium on Applied Computing (SAC '23), pp. 1154–1162. Association for Computing Machinery, New York, NY, USA (2023). https://doi.org/10.1145/3555776.3577663
Hilmi et al. (2020) Hilmi, M.A.A., Cahyanto, K.A., Mustamiin, M.: Apache Web Server - Access Log Pre-processing for Web Intrusion Detection. IEEE Dataport (2020). https://doi.org/10.21227/vvvq-6w47
Giménez et al. (2010) Giménez, C.T., Villegas, A.P., Marañón, G.: HTTP Dataset CSIC 2010 (2010). https://doi.org/10.7910/DVN/3QBYB5 . https://www.tic.itefi.csic.es/dataset/
ECML/PKDD (2021) ECML/PKDD: ECML/PKDD 2007 Discovery Challenge. https://gitlab.fing.edu.uy/gsi/web-application-attacks-datasets/-/tree/master/ecml_pkdd (2021)
Oliner and Stearley (2007) Oliner, A.J., Stearley, J.: What supercomputers say: A study of five system logs. In: 37th Annual IEEE/IFIP International Conference on Dependable Systems and Networks (DSN'07), pp. 575–584 (2007)
Vaswani et al. (2017) Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, L., Polosukhin, I.: Attention is all you need. CoRR abs/1706.03762 (2017)
Schuster and Nakajima (2012) Schuster, M., Nakajima, K.: Japanese and Korean voice search. In: 2012 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), pp. 5149–5152 (2012). https://doi.org/10.1109/ICASSP.2012.6289079
Liu et al. (2019) Liu, Y., Ott, M., Goyal, N., Du, J., Joshi, M., Chen, D., Levy, O., Lewis, M., Zettlemoyer, L., Stoyanov, V.: RoBERTa: A robustly optimized BERT pretraining approach. CoRR abs/1907.11692 (2019)
Sanh et al. (2019) Sanh, V., Debut, L., Chaumond, J., Wolf, T.: DistilBERT, a distilled version of BERT: smaller, faster, cheaper and lighter. ArXiv abs/1910.01108 (2019)
Radford et al. (2019) Radford, A., Wu, J., Child, R., Luan, D., Amodei, D., Sutskever, I.: Language models are unsupervised multitask learners (2019)
Black et al. (2021) Black, S., Leo, G., Wang, P., Leahy, C., Biderman, S.: GPT-Neo: Large Scale Autoregressive Language Modeling with Mesh-Tensorflow (2021). https://doi.org/10.5281/zenodo.5297715
Kokalj et al. (2021) Kokalj, E., Škrlj, B., Lavrač, N., Pollak, S., Robnik-Šikonja, M.: BERT meets Shapley: Extending SHAP explanations to transformer-based classifiers. In: Toivonen, H., Boggia, M. (eds.) Proceedings of the EACL Hackashop on News Media Content Analysis and Automated Report Generation, pp. 16–21. Association for Computational Linguistics, Online (2021). https://aclanthology.org/2021.hackashop-1.3
van der Maaten and Hinton (2008) van der Maaten, L., Hinton, G.: Visualizing data using t-SNE. Journal of Machine Learning Research 9, 2579–2605 (2008)
Mohri et al. (2012) Mohri, M., Rostamizadeh, A., Talwalkar, A.: Foundations of Machine Learning. The MIT Press, Cambridge, MA (2012)
ArXiv abs/1910.01108 (2019) Radford et al. (2019) Radford, A., Wu, J., Child, R., Luan, D., Amodei, D., Sutskever, I.: Language models are unsupervised multitask learners (2019) Black et al. (2021) Black, S., Leo, G., Wang, P., Leahy, C., Biderman, S.: GPT-Neo: Large Scale Autoregressive Language Modeling with Mesh-Tensorflow. https://doi.org/10.5281/zenodo.5297715 . If you use this software, please cite it using these metadata. https://doi.org/10.5281/zenodo.5297715 Kokalj et al. (2021) Kokalj, E., Škrlj, B., Lavrač, N., Pollak, S., Robnik-Šikonja, M.: BERT meets shapley: Extending SHAP explanations to transformer-based classifiers. In: Toivonen, H., Boggia, M. (eds.) Proceedings of the EACL Hackashop on News Media Content Analysis and Automated Report Generation, pp. 16–21. Association for Computational Linguistics, Online (2021). https://aclanthology.org/2021.hackashop-1.3 van der Maaten and Hinton (2008) Maaten, L., Hinton, G.: Visualizing data using t-SNE. Journal of Machine Learning Research 9, 2579–2605 (2008) Mohri et al. (2012) Mohri, M., Rostamizadeh, A., Talwalkar, A.: Foundations of Machine Learning. The MIT Press, ??? (2012) Le, V.-H., Zhang, H.: Log-based Anomaly Detection Without Log Parsing (2021) Gniewkowski et al. (2023) Gniewkowski, M., Maciejewski, H., Surmacz, T., Walentynowicz, W.: Sec2vec: Anomaly detection in http traffic and malicious urls. In: Proceedings of the 38th ACM/SIGAPP Symposium on Applied Computing. SAC ’23, pp. 1154–1162. Association for Computing Machinery, New York, NY, USA (2023). https://doi.org/10.1145/3555776.3577663 . https://doi.org/10.1145/3555776.3577663 Hilmi et al. (2020) Hilmi, M.A.A., Cahyanto, K.A., Mustamiin, M.: Apache Web Server - Access Log Pre-processing for Web Intrusion Detection. IEEE Dataport (2020). https://doi.org/10.21227/vvvq-6w47 . https://dx.doi.org/10.21227/vvvq-6w47 Giménez et al. (2010) Giménez, C.T., Villegas, A.P., Marañón, G. .: HTTP Dataset CSIC 2010. https://doi.org/10.7910/DVN/3QBYB5 . https://www.tic.itefi.csic.es/dataset/ ECML/PKDD (2021) ECML/PKDD: ECML/PKDD 2007 Discovery Challenge. https://gitlab.fing.edu.uy/gsi/web-application-attacks-datasets/-/tree/master/ecml_pkdd (2021) Oliner and Stearley (2007) Oliner, A.J., Stearley, J.: What supercomputers say: A study of five system logs. 37th Annual IEEE/IFIP International Conference on Dependable Systems and Networks (DSN’07), 575–584 (2007) Vaswani et al. (2017) Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, L., Polosukhin, I.: Attention is all you need. CoRR abs/1706.03762 (2017) 1706.03762 Schuster and Nakajima (2012) Schuster, M., Nakajima, K.: Japanese and korean voice search. In: 2012 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), pp. 5149–5152 (2012). https://doi.org/10.1109/ICASSP.2012.6289079 Liu et al. (2019) Liu, Y., Ott, M., Goyal, N., Du, J., Joshi, M., Chen, D., Levy, O., Lewis, M., Zettlemoyer, L., Stoyanov, V.: Roberta: A robustly optimized BERT pretraining approach. CoRR abs/1907.11692 (2019) 1907.11692 Sanh et al. (2019) Sanh, V., Debut, L., Chaumond, J., Wolf, T.: Distilbert, a distilled version of bert: smaller, faster, cheaper and lighter. ArXiv abs/1910.01108 (2019) Radford et al. (2019) Radford, A., Wu, J., Child, R., Luan, D., Amodei, D., Sutskever, I.: Language models are unsupervised multitask learners (2019) Black et al. (2021) Black, S., Leo, G., Wang, P., Leahy, C., Biderman, S.: GPT-Neo: Large Scale Autoregressive Language Modeling with Mesh-Tensorflow. 
https://doi.org/10.5281/zenodo.5297715 . If you use this software, please cite it using these metadata. https://doi.org/10.5281/zenodo.5297715 Kokalj et al. (2021) Kokalj, E., Škrlj, B., Lavrač, N., Pollak, S., Robnik-Šikonja, M.: BERT meets shapley: Extending SHAP explanations to transformer-based classifiers. In: Toivonen, H., Boggia, M. (eds.) Proceedings of the EACL Hackashop on News Media Content Analysis and Automated Report Generation, pp. 16–21. Association for Computational Linguistics, Online (2021). https://aclanthology.org/2021.hackashop-1.3 van der Maaten and Hinton (2008) Maaten, L., Hinton, G.: Visualizing data using t-SNE. Journal of Machine Learning Research 9, 2579–2605 (2008) Mohri et al. (2012) Mohri, M., Rostamizadeh, A., Talwalkar, A.: Foundations of Machine Learning. The MIT Press, ??? (2012) Gniewkowski, M., Maciejewski, H., Surmacz, T., Walentynowicz, W.: Sec2vec: Anomaly detection in http traffic and malicious urls. In: Proceedings of the 38th ACM/SIGAPP Symposium on Applied Computing. SAC ’23, pp. 1154–1162. Association for Computing Machinery, New York, NY, USA (2023). https://doi.org/10.1145/3555776.3577663 . https://doi.org/10.1145/3555776.3577663 Hilmi et al. (2020) Hilmi, M.A.A., Cahyanto, K.A., Mustamiin, M.: Apache Web Server - Access Log Pre-processing for Web Intrusion Detection. IEEE Dataport (2020). https://doi.org/10.21227/vvvq-6w47 . https://dx.doi.org/10.21227/vvvq-6w47 Giménez et al. (2010) Giménez, C.T., Villegas, A.P., Marañón, G. .: HTTP Dataset CSIC 2010. https://doi.org/10.7910/DVN/3QBYB5 . https://www.tic.itefi.csic.es/dataset/ ECML/PKDD (2021) ECML/PKDD: ECML/PKDD 2007 Discovery Challenge. https://gitlab.fing.edu.uy/gsi/web-application-attacks-datasets/-/tree/master/ecml_pkdd (2021) Oliner and Stearley (2007) Oliner, A.J., Stearley, J.: What supercomputers say: A study of five system logs. 37th Annual IEEE/IFIP International Conference on Dependable Systems and Networks (DSN’07), 575–584 (2007) Vaswani et al. (2017) Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, L., Polosukhin, I.: Attention is all you need. CoRR abs/1706.03762 (2017) 1706.03762 Schuster and Nakajima (2012) Schuster, M., Nakajima, K.: Japanese and korean voice search. In: 2012 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), pp. 5149–5152 (2012). https://doi.org/10.1109/ICASSP.2012.6289079 Liu et al. (2019) Liu, Y., Ott, M., Goyal, N., Du, J., Joshi, M., Chen, D., Levy, O., Lewis, M., Zettlemoyer, L., Stoyanov, V.: Roberta: A robustly optimized BERT pretraining approach. CoRR abs/1907.11692 (2019) 1907.11692 Sanh et al. (2019) Sanh, V., Debut, L., Chaumond, J., Wolf, T.: Distilbert, a distilled version of bert: smaller, faster, cheaper and lighter. ArXiv abs/1910.01108 (2019) Radford et al. (2019) Radford, A., Wu, J., Child, R., Luan, D., Amodei, D., Sutskever, I.: Language models are unsupervised multitask learners (2019) Black et al. (2021) Black, S., Leo, G., Wang, P., Leahy, C., Biderman, S.: GPT-Neo: Large Scale Autoregressive Language Modeling with Mesh-Tensorflow. https://doi.org/10.5281/zenodo.5297715 . If you use this software, please cite it using these metadata. https://doi.org/10.5281/zenodo.5297715 Kokalj et al. (2021) Kokalj, E., Škrlj, B., Lavrač, N., Pollak, S., Robnik-Šikonja, M.: BERT meets shapley: Extending SHAP explanations to transformer-based classifiers. In: Toivonen, H., Boggia, M. (eds.) 
Proceedings of the EACL Hackashop on News Media Content Analysis and Automated Report Generation, pp. 16–21. Association for Computational Linguistics, Online (2021). https://aclanthology.org/2021.hackashop-1.3 van der Maaten and Hinton (2008) Maaten, L., Hinton, G.: Visualizing data using t-SNE. Journal of Machine Learning Research 9, 2579–2605 (2008) Mohri et al. (2012) Mohri, M., Rostamizadeh, A., Talwalkar, A.: Foundations of Machine Learning. The MIT Press, ??? (2012) Hilmi, M.A.A., Cahyanto, K.A., Mustamiin, M.: Apache Web Server - Access Log Pre-processing for Web Intrusion Detection. IEEE Dataport (2020). https://doi.org/10.21227/vvvq-6w47 . https://dx.doi.org/10.21227/vvvq-6w47 Giménez et al. (2010) Giménez, C.T., Villegas, A.P., Marañón, G. .: HTTP Dataset CSIC 2010. https://doi.org/10.7910/DVN/3QBYB5 . https://www.tic.itefi.csic.es/dataset/ ECML/PKDD (2021) ECML/PKDD: ECML/PKDD 2007 Discovery Challenge. https://gitlab.fing.edu.uy/gsi/web-application-attacks-datasets/-/tree/master/ecml_pkdd (2021) Oliner and Stearley (2007) Oliner, A.J., Stearley, J.: What supercomputers say: A study of five system logs. 37th Annual IEEE/IFIP International Conference on Dependable Systems and Networks (DSN’07), 575–584 (2007) Vaswani et al. (2017) Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, L., Polosukhin, I.: Attention is all you need. CoRR abs/1706.03762 (2017) 1706.03762 Schuster and Nakajima (2012) Schuster, M., Nakajima, K.: Japanese and korean voice search. In: 2012 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), pp. 5149–5152 (2012). https://doi.org/10.1109/ICASSP.2012.6289079 Liu et al. (2019) Liu, Y., Ott, M., Goyal, N., Du, J., Joshi, M., Chen, D., Levy, O., Lewis, M., Zettlemoyer, L., Stoyanov, V.: Roberta: A robustly optimized BERT pretraining approach. CoRR abs/1907.11692 (2019) 1907.11692 Sanh et al. (2019) Sanh, V., Debut, L., Chaumond, J., Wolf, T.: Distilbert, a distilled version of bert: smaller, faster, cheaper and lighter. ArXiv abs/1910.01108 (2019) Radford et al. (2019) Radford, A., Wu, J., Child, R., Luan, D., Amodei, D., Sutskever, I.: Language models are unsupervised multitask learners (2019) Black et al. (2021) Black, S., Leo, G., Wang, P., Leahy, C., Biderman, S.: GPT-Neo: Large Scale Autoregressive Language Modeling with Mesh-Tensorflow. https://doi.org/10.5281/zenodo.5297715 . If you use this software, please cite it using these metadata. https://doi.org/10.5281/zenodo.5297715 Kokalj et al. (2021) Kokalj, E., Škrlj, B., Lavrač, N., Pollak, S., Robnik-Šikonja, M.: BERT meets shapley: Extending SHAP explanations to transformer-based classifiers. In: Toivonen, H., Boggia, M. (eds.) Proceedings of the EACL Hackashop on News Media Content Analysis and Automated Report Generation, pp. 16–21. Association for Computational Linguistics, Online (2021). https://aclanthology.org/2021.hackashop-1.3 van der Maaten and Hinton (2008) Maaten, L., Hinton, G.: Visualizing data using t-SNE. Journal of Machine Learning Research 9, 2579–2605 (2008) Mohri et al. (2012) Mohri, M., Rostamizadeh, A., Talwalkar, A.: Foundations of Machine Learning. The MIT Press, ??? (2012) Giménez, C.T., Villegas, A.P., Marañón, G. .: HTTP Dataset CSIC 2010. https://doi.org/10.7910/DVN/3QBYB5 . https://www.tic.itefi.csic.es/dataset/ ECML/PKDD (2021) ECML/PKDD: ECML/PKDD 2007 Discovery Challenge. 
https://gitlab.fing.edu.uy/gsi/web-application-attacks-datasets/-/tree/master/ecml_pkdd (2021) Oliner and Stearley (2007) Oliner, A.J., Stearley, J.: What supercomputers say: A study of five system logs. 37th Annual IEEE/IFIP International Conference on Dependable Systems and Networks (DSN’07), 575–584 (2007) Vaswani et al. (2017) Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, L., Polosukhin, I.: Attention is all you need. CoRR abs/1706.03762 (2017) 1706.03762 Schuster and Nakajima (2012) Schuster, M., Nakajima, K.: Japanese and korean voice search. In: 2012 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), pp. 5149–5152 (2012). https://doi.org/10.1109/ICASSP.2012.6289079 Liu et al. (2019) Liu, Y., Ott, M., Goyal, N., Du, J., Joshi, M., Chen, D., Levy, O., Lewis, M., Zettlemoyer, L., Stoyanov, V.: Roberta: A robustly optimized BERT pretraining approach. CoRR abs/1907.11692 (2019) 1907.11692 Sanh et al. (2019) Sanh, V., Debut, L., Chaumond, J., Wolf, T.: Distilbert, a distilled version of bert: smaller, faster, cheaper and lighter. ArXiv abs/1910.01108 (2019) Radford et al. (2019) Radford, A., Wu, J., Child, R., Luan, D., Amodei, D., Sutskever, I.: Language models are unsupervised multitask learners (2019) Black et al. (2021) Black, S., Leo, G., Wang, P., Leahy, C., Biderman, S.: GPT-Neo: Large Scale Autoregressive Language Modeling with Mesh-Tensorflow. https://doi.org/10.5281/zenodo.5297715 . If you use this software, please cite it using these metadata. https://doi.org/10.5281/zenodo.5297715 Kokalj et al. (2021) Kokalj, E., Škrlj, B., Lavrač, N., Pollak, S., Robnik-Šikonja, M.: BERT meets shapley: Extending SHAP explanations to transformer-based classifiers. In: Toivonen, H., Boggia, M. (eds.) Proceedings of the EACL Hackashop on News Media Content Analysis and Automated Report Generation, pp. 16–21. Association for Computational Linguistics, Online (2021). https://aclanthology.org/2021.hackashop-1.3 van der Maaten and Hinton (2008) Maaten, L., Hinton, G.: Visualizing data using t-SNE. Journal of Machine Learning Research 9, 2579–2605 (2008) Mohri et al. (2012) Mohri, M., Rostamizadeh, A., Talwalkar, A.: Foundations of Machine Learning. The MIT Press, ??? (2012) ECML/PKDD: ECML/PKDD 2007 Discovery Challenge. https://gitlab.fing.edu.uy/gsi/web-application-attacks-datasets/-/tree/master/ecml_pkdd (2021) Oliner and Stearley (2007) Oliner, A.J., Stearley, J.: What supercomputers say: A study of five system logs. 37th Annual IEEE/IFIP International Conference on Dependable Systems and Networks (DSN’07), 575–584 (2007) Vaswani et al. (2017) Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, L., Polosukhin, I.: Attention is all you need. CoRR abs/1706.03762 (2017) 1706.03762 Schuster and Nakajima (2012) Schuster, M., Nakajima, K.: Japanese and korean voice search. In: 2012 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), pp. 5149–5152 (2012). https://doi.org/10.1109/ICASSP.2012.6289079 Liu et al. (2019) Liu, Y., Ott, M., Goyal, N., Du, J., Joshi, M., Chen, D., Levy, O., Lewis, M., Zettlemoyer, L., Stoyanov, V.: Roberta: A robustly optimized BERT pretraining approach. CoRR abs/1907.11692 (2019) 1907.11692 Sanh et al. (2019) Sanh, V., Debut, L., Chaumond, J., Wolf, T.: Distilbert, a distilled version of bert: smaller, faster, cheaper and lighter. ArXiv abs/1910.01108 (2019) Radford et al. 
(2019) Radford, A., Wu, J., Child, R., Luan, D., Amodei, D., Sutskever, I.: Language models are unsupervised multitask learners (2019) Black et al. (2021) Black, S., Leo, G., Wang, P., Leahy, C., Biderman, S.: GPT-Neo: Large Scale Autoregressive Language Modeling with Mesh-Tensorflow. https://doi.org/10.5281/zenodo.5297715 . If you use this software, please cite it using these metadata. https://doi.org/10.5281/zenodo.5297715 Kokalj et al. (2021) Kokalj, E., Škrlj, B., Lavrač, N., Pollak, S., Robnik-Šikonja, M.: BERT meets shapley: Extending SHAP explanations to transformer-based classifiers. In: Toivonen, H., Boggia, M. (eds.) Proceedings of the EACL Hackashop on News Media Content Analysis and Automated Report Generation, pp. 16–21. Association for Computational Linguistics, Online (2021). https://aclanthology.org/2021.hackashop-1.3 van der Maaten and Hinton (2008) Maaten, L., Hinton, G.: Visualizing data using t-SNE. Journal of Machine Learning Research 9, 2579–2605 (2008) Mohri et al. (2012) Mohri, M., Rostamizadeh, A., Talwalkar, A.: Foundations of Machine Learning. The MIT Press, ??? (2012) Oliner, A.J., Stearley, J.: What supercomputers say: A study of five system logs. 37th Annual IEEE/IFIP International Conference on Dependable Systems and Networks (DSN’07), 575–584 (2007) Vaswani et al. (2017) Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, L., Polosukhin, I.: Attention is all you need. CoRR abs/1706.03762 (2017) 1706.03762 Schuster and Nakajima (2012) Schuster, M., Nakajima, K.: Japanese and korean voice search. In: 2012 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), pp. 5149–5152 (2012). https://doi.org/10.1109/ICASSP.2012.6289079 Liu et al. (2019) Liu, Y., Ott, M., Goyal, N., Du, J., Joshi, M., Chen, D., Levy, O., Lewis, M., Zettlemoyer, L., Stoyanov, V.: Roberta: A robustly optimized BERT pretraining approach. CoRR abs/1907.11692 (2019) 1907.11692 Sanh et al. (2019) Sanh, V., Debut, L., Chaumond, J., Wolf, T.: Distilbert, a distilled version of bert: smaller, faster, cheaper and lighter. ArXiv abs/1910.01108 (2019) Radford et al. (2019) Radford, A., Wu, J., Child, R., Luan, D., Amodei, D., Sutskever, I.: Language models are unsupervised multitask learners (2019) Black et al. (2021) Black, S., Leo, G., Wang, P., Leahy, C., Biderman, S.: GPT-Neo: Large Scale Autoregressive Language Modeling with Mesh-Tensorflow. https://doi.org/10.5281/zenodo.5297715 . If you use this software, please cite it using these metadata. https://doi.org/10.5281/zenodo.5297715 Kokalj et al. (2021) Kokalj, E., Škrlj, B., Lavrač, N., Pollak, S., Robnik-Šikonja, M.: BERT meets shapley: Extending SHAP explanations to transformer-based classifiers. In: Toivonen, H., Boggia, M. (eds.) Proceedings of the EACL Hackashop on News Media Content Analysis and Automated Report Generation, pp. 16–21. Association for Computational Linguistics, Online (2021). https://aclanthology.org/2021.hackashop-1.3 van der Maaten and Hinton (2008) Maaten, L., Hinton, G.: Visualizing data using t-SNE. Journal of Machine Learning Research 9, 2579–2605 (2008) Mohri et al. (2012) Mohri, M., Rostamizadeh, A., Talwalkar, A.: Foundations of Machine Learning. The MIT Press, ??? (2012) Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, L., Polosukhin, I.: Attention is all you need. CoRR abs/1706.03762 (2017) 1706.03762 Schuster and Nakajima (2012) Schuster, M., Nakajima, K.: Japanese and korean voice search. 
In: 2012 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), pp. 5149–5152 (2012). https://doi.org/10.1109/ICASSP.2012.6289079 Liu et al. (2019) Liu, Y., Ott, M., Goyal, N., Du, J., Joshi, M., Chen, D., Levy, O., Lewis, M., Zettlemoyer, L., Stoyanov, V.: Roberta: A robustly optimized BERT pretraining approach. CoRR abs/1907.11692 (2019) 1907.11692 Sanh et al. (2019) Sanh, V., Debut, L., Chaumond, J., Wolf, T.: Distilbert, a distilled version of bert: smaller, faster, cheaper and lighter. ArXiv abs/1910.01108 (2019) Radford et al. (2019) Radford, A., Wu, J., Child, R., Luan, D., Amodei, D., Sutskever, I.: Language models are unsupervised multitask learners (2019) Black et al. (2021) Black, S., Leo, G., Wang, P., Leahy, C., Biderman, S.: GPT-Neo: Large Scale Autoregressive Language Modeling with Mesh-Tensorflow. https://doi.org/10.5281/zenodo.5297715 . If you use this software, please cite it using these metadata. https://doi.org/10.5281/zenodo.5297715 Kokalj et al. (2021) Kokalj, E., Škrlj, B., Lavrač, N., Pollak, S., Robnik-Šikonja, M.: BERT meets shapley: Extending SHAP explanations to transformer-based classifiers. In: Toivonen, H., Boggia, M. (eds.) Proceedings of the EACL Hackashop on News Media Content Analysis and Automated Report Generation, pp. 16–21. Association for Computational Linguistics, Online (2021). https://aclanthology.org/2021.hackashop-1.3 van der Maaten and Hinton (2008) Maaten, L., Hinton, G.: Visualizing data using t-SNE. Journal of Machine Learning Research 9, 2579–2605 (2008) Mohri et al. (2012) Mohri, M., Rostamizadeh, A., Talwalkar, A.: Foundations of Machine Learning. The MIT Press, ??? (2012) Schuster, M., Nakajima, K.: Japanese and korean voice search. In: 2012 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), pp. 5149–5152 (2012). https://doi.org/10.1109/ICASSP.2012.6289079 Liu et al. (2019) Liu, Y., Ott, M., Goyal, N., Du, J., Joshi, M., Chen, D., Levy, O., Lewis, M., Zettlemoyer, L., Stoyanov, V.: Roberta: A robustly optimized BERT pretraining approach. CoRR abs/1907.11692 (2019) 1907.11692 Sanh et al. (2019) Sanh, V., Debut, L., Chaumond, J., Wolf, T.: Distilbert, a distilled version of bert: smaller, faster, cheaper and lighter. ArXiv abs/1910.01108 (2019) Radford et al. (2019) Radford, A., Wu, J., Child, R., Luan, D., Amodei, D., Sutskever, I.: Language models are unsupervised multitask learners (2019) Black et al. (2021) Black, S., Leo, G., Wang, P., Leahy, C., Biderman, S.: GPT-Neo: Large Scale Autoregressive Language Modeling with Mesh-Tensorflow. https://doi.org/10.5281/zenodo.5297715 . If you use this software, please cite it using these metadata. https://doi.org/10.5281/zenodo.5297715 Kokalj et al. (2021) Kokalj, E., Škrlj, B., Lavrač, N., Pollak, S., Robnik-Šikonja, M.: BERT meets shapley: Extending SHAP explanations to transformer-based classifiers. In: Toivonen, H., Boggia, M. (eds.) Proceedings of the EACL Hackashop on News Media Content Analysis and Automated Report Generation, pp. 16–21. Association for Computational Linguistics, Online (2021). https://aclanthology.org/2021.hackashop-1.3 van der Maaten and Hinton (2008) Maaten, L., Hinton, G.: Visualizing data using t-SNE. Journal of Machine Learning Research 9, 2579–2605 (2008) Mohri et al. (2012) Mohri, M., Rostamizadeh, A., Talwalkar, A.: Foundations of Machine Learning. The MIT Press, ??? 
(2012) Liu, Y., Ott, M., Goyal, N., Du, J., Joshi, M., Chen, D., Levy, O., Lewis, M., Zettlemoyer, L., Stoyanov, V.: Roberta: A robustly optimized BERT pretraining approach. CoRR abs/1907.11692 (2019) 1907.11692 Sanh et al. (2019) Sanh, V., Debut, L., Chaumond, J., Wolf, T.: Distilbert, a distilled version of bert: smaller, faster, cheaper and lighter. ArXiv abs/1910.01108 (2019) Radford et al. (2019) Radford, A., Wu, J., Child, R., Luan, D., Amodei, D., Sutskever, I.: Language models are unsupervised multitask learners (2019) Black et al. (2021) Black, S., Leo, G., Wang, P., Leahy, C., Biderman, S.: GPT-Neo: Large Scale Autoregressive Language Modeling with Mesh-Tensorflow. https://doi.org/10.5281/zenodo.5297715 . If you use this software, please cite it using these metadata. https://doi.org/10.5281/zenodo.5297715 Kokalj et al. (2021) Kokalj, E., Škrlj, B., Lavrač, N., Pollak, S., Robnik-Šikonja, M.: BERT meets shapley: Extending SHAP explanations to transformer-based classifiers. In: Toivonen, H., Boggia, M. (eds.) Proceedings of the EACL Hackashop on News Media Content Analysis and Automated Report Generation, pp. 16–21. Association for Computational Linguistics, Online (2021). https://aclanthology.org/2021.hackashop-1.3 van der Maaten and Hinton (2008) Maaten, L., Hinton, G.: Visualizing data using t-SNE. Journal of Machine Learning Research 9, 2579–2605 (2008) Mohri et al. (2012) Mohri, M., Rostamizadeh, A., Talwalkar, A.: Foundations of Machine Learning. The MIT Press, ??? (2012) Sanh, V., Debut, L., Chaumond, J., Wolf, T.: Distilbert, a distilled version of bert: smaller, faster, cheaper and lighter. ArXiv abs/1910.01108 (2019) Radford et al. (2019) Radford, A., Wu, J., Child, R., Luan, D., Amodei, D., Sutskever, I.: Language models are unsupervised multitask learners (2019) Black et al. (2021) Black, S., Leo, G., Wang, P., Leahy, C., Biderman, S.: GPT-Neo: Large Scale Autoregressive Language Modeling with Mesh-Tensorflow. https://doi.org/10.5281/zenodo.5297715 . If you use this software, please cite it using these metadata. https://doi.org/10.5281/zenodo.5297715 Kokalj et al. (2021) Kokalj, E., Škrlj, B., Lavrač, N., Pollak, S., Robnik-Šikonja, M.: BERT meets shapley: Extending SHAP explanations to transformer-based classifiers. In: Toivonen, H., Boggia, M. (eds.) Proceedings of the EACL Hackashop on News Media Content Analysis and Automated Report Generation, pp. 16–21. Association for Computational Linguistics, Online (2021). https://aclanthology.org/2021.hackashop-1.3 van der Maaten and Hinton (2008) Maaten, L., Hinton, G.: Visualizing data using t-SNE. Journal of Machine Learning Research 9, 2579–2605 (2008) Mohri et al. (2012) Mohri, M., Rostamizadeh, A., Talwalkar, A.: Foundations of Machine Learning. The MIT Press, ??? (2012) Radford, A., Wu, J., Child, R., Luan, D., Amodei, D., Sutskever, I.: Language models are unsupervised multitask learners (2019) Black et al. (2021) Black, S., Leo, G., Wang, P., Leahy, C., Biderman, S.: GPT-Neo: Large Scale Autoregressive Language Modeling with Mesh-Tensorflow. https://doi.org/10.5281/zenodo.5297715 . If you use this software, please cite it using these metadata. https://doi.org/10.5281/zenodo.5297715 Kokalj et al. (2021) Kokalj, E., Škrlj, B., Lavrač, N., Pollak, S., Robnik-Šikonja, M.: BERT meets shapley: Extending SHAP explanations to transformer-based classifiers. In: Toivonen, H., Boggia, M. (eds.) Proceedings of the EACL Hackashop on News Media Content Analysis and Automated Report Generation, pp. 16–21. 
Association for Computational Linguistics, Online (2021). https://aclanthology.org/2021.hackashop-1.3 van der Maaten and Hinton (2008) Maaten, L., Hinton, G.: Visualizing data using t-SNE. Journal of Machine Learning Research 9, 2579–2605 (2008) Mohri et al. (2012) Mohri, M., Rostamizadeh, A., Talwalkar, A.: Foundations of Machine Learning. The MIT Press, ??? (2012) Black, S., Leo, G., Wang, P., Leahy, C., Biderman, S.: GPT-Neo: Large Scale Autoregressive Language Modeling with Mesh-Tensorflow. https://doi.org/10.5281/zenodo.5297715 . If you use this software, please cite it using these metadata. https://doi.org/10.5281/zenodo.5297715 Kokalj et al. (2021) Kokalj, E., Škrlj, B., Lavrač, N., Pollak, S., Robnik-Šikonja, M.: BERT meets shapley: Extending SHAP explanations to transformer-based classifiers. In: Toivonen, H., Boggia, M. (eds.) Proceedings of the EACL Hackashop on News Media Content Analysis and Automated Report Generation, pp. 16–21. Association for Computational Linguistics, Online (2021). https://aclanthology.org/2021.hackashop-1.3 van der Maaten and Hinton (2008) Maaten, L., Hinton, G.: Visualizing data using t-SNE. Journal of Machine Learning Research 9, 2579–2605 (2008) Mohri et al. (2012) Mohri, M., Rostamizadeh, A., Talwalkar, A.: Foundations of Machine Learning. The MIT Press, ??? (2012) Kokalj, E., Škrlj, B., Lavrač, N., Pollak, S., Robnik-Šikonja, M.: BERT meets shapley: Extending SHAP explanations to transformer-based classifiers. In: Toivonen, H., Boggia, M. (eds.) Proceedings of the EACL Hackashop on News Media Content Analysis and Automated Report Generation, pp. 16–21. Association for Computational Linguistics, Online (2021). https://aclanthology.org/2021.hackashop-1.3 van der Maaten and Hinton (2008) Maaten, L., Hinton, G.: Visualizing data using t-SNE. Journal of Machine Learning Research 9, 2579–2605 (2008) Mohri et al. (2012) Mohri, M., Rostamizadeh, A., Talwalkar, A.: Foundations of Machine Learning. The MIT Press, ??? (2012) Maaten, L., Hinton, G.: Visualizing data using t-SNE. Journal of Machine Learning Research 9, 2579–2605 (2008) Mohri et al. (2012) Mohri, M., Rostamizadeh, A., Talwalkar, A.: Foundations of Machine Learning. The MIT Press, ??? (2012) Mohri, M., Rostamizadeh, A., Talwalkar, A.: Foundations of Machine Learning. The MIT Press, ??? (2012)
Hilmi et al. (2020) Hilmi, M.A.A., Cahyanto, K.A., Mustamiin, M.: Apache Web Server - Access Log Pre-processing for Web Intrusion Detection. IEEE Dataport (2020). 
https://doi.org/10.21227/vvvq-6w47 . https://dx.doi.org/10.21227/vvvq-6w47 Giménez et al. (2010) Giménez, C.T., Villegas, A.P., Marañón, G. .: HTTP Dataset CSIC 2010. https://doi.org/10.7910/DVN/3QBYB5 . https://www.tic.itefi.csic.es/dataset/ ECML/PKDD (2021) ECML/PKDD: ECML/PKDD 2007 Discovery Challenge. https://gitlab.fing.edu.uy/gsi/web-application-attacks-datasets/-/tree/master/ecml_pkdd (2021) Oliner and Stearley (2007) Oliner, A.J., Stearley, J.: What supercomputers say: A study of five system logs. 37th Annual IEEE/IFIP International Conference on Dependable Systems and Networks (DSN’07), 575–584 (2007) Vaswani et al. (2017) Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, L., Polosukhin, I.: Attention is all you need. CoRR abs/1706.03762 (2017) 1706.03762 Schuster and Nakajima (2012) Schuster, M., Nakajima, K.: Japanese and korean voice search. In: 2012 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), pp. 5149–5152 (2012). https://doi.org/10.1109/ICASSP.2012.6289079 Liu et al. (2019) Liu, Y., Ott, M., Goyal, N., Du, J., Joshi, M., Chen, D., Levy, O., Lewis, M., Zettlemoyer, L., Stoyanov, V.: Roberta: A robustly optimized BERT pretraining approach. CoRR abs/1907.11692 (2019) 1907.11692 Sanh et al. (2019) Sanh, V., Debut, L., Chaumond, J., Wolf, T.: Distilbert, a distilled version of bert: smaller, faster, cheaper and lighter. ArXiv abs/1910.01108 (2019) Radford et al. (2019) Radford, A., Wu, J., Child, R., Luan, D., Amodei, D., Sutskever, I.: Language models are unsupervised multitask learners (2019) Black et al. (2021) Black, S., Leo, G., Wang, P., Leahy, C., Biderman, S.: GPT-Neo: Large Scale Autoregressive Language Modeling with Mesh-Tensorflow. https://doi.org/10.5281/zenodo.5297715 . If you use this software, please cite it using these metadata. https://doi.org/10.5281/zenodo.5297715 Kokalj et al. (2021) Kokalj, E., Škrlj, B., Lavrač, N., Pollak, S., Robnik-Šikonja, M.: BERT meets shapley: Extending SHAP explanations to transformer-based classifiers. In: Toivonen, H., Boggia, M. (eds.) Proceedings of the EACL Hackashop on News Media Content Analysis and Automated Report Generation, pp. 16–21. Association for Computational Linguistics, Online (2021). https://aclanthology.org/2021.hackashop-1.3 van der Maaten and Hinton (2008) Maaten, L., Hinton, G.: Visualizing data using t-SNE. Journal of Machine Learning Research 9, 2579–2605 (2008) Mohri et al. (2012) Mohri, M., Rostamizadeh, A., Talwalkar, A.: Foundations of Machine Learning. The MIT Press, ??? (2012) Le, V.-H., Zhang, H.: Log-based Anomaly Detection Without Log Parsing (2021) Gniewkowski et al. (2023) Gniewkowski, M., Maciejewski, H., Surmacz, T., Walentynowicz, W.: Sec2vec: Anomaly detection in http traffic and malicious urls. In: Proceedings of the 38th ACM/SIGAPP Symposium on Applied Computing. SAC ’23, pp. 1154–1162. Association for Computing Machinery, New York, NY, USA (2023). https://doi.org/10.1145/3555776.3577663 . https://doi.org/10.1145/3555776.3577663 Hilmi et al. (2020) Hilmi, M.A.A., Cahyanto, K.A., Mustamiin, M.: Apache Web Server - Access Log Pre-processing for Web Intrusion Detection. IEEE Dataport (2020). https://doi.org/10.21227/vvvq-6w47 . https://dx.doi.org/10.21227/vvvq-6w47 Giménez et al. (2010) Giménez, C.T., Villegas, A.P., Marañón, G. .: HTTP Dataset CSIC 2010. https://doi.org/10.7910/DVN/3QBYB5 . https://www.tic.itefi.csic.es/dataset/ ECML/PKDD (2021) ECML/PKDD: ECML/PKDD 2007 Discovery Challenge. 
https://gitlab.fing.edu.uy/gsi/web-application-attacks-datasets/-/tree/master/ecml_pkdd (2021) Oliner and Stearley (2007) Oliner, A.J., Stearley, J.: What supercomputers say: A study of five system logs. 37th Annual IEEE/IFIP International Conference on Dependable Systems and Networks (DSN’07), 575–584 (2007) Vaswani et al. (2017) Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, L., Polosukhin, I.: Attention is all you need. CoRR abs/1706.03762 (2017) 1706.03762 Schuster and Nakajima (2012) Schuster, M., Nakajima, K.: Japanese and korean voice search. In: 2012 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), pp. 5149–5152 (2012). https://doi.org/10.1109/ICASSP.2012.6289079 Liu et al. (2019) Liu, Y., Ott, M., Goyal, N., Du, J., Joshi, M., Chen, D., Levy, O., Lewis, M., Zettlemoyer, L., Stoyanov, V.: Roberta: A robustly optimized BERT pretraining approach. CoRR abs/1907.11692 (2019) 1907.11692 Sanh et al. (2019) Sanh, V., Debut, L., Chaumond, J., Wolf, T.: Distilbert, a distilled version of bert: smaller, faster, cheaper and lighter. ArXiv abs/1910.01108 (2019) Radford et al. (2019) Radford, A., Wu, J., Child, R., Luan, D., Amodei, D., Sutskever, I.: Language models are unsupervised multitask learners (2019) Black et al. (2021) Black, S., Leo, G., Wang, P., Leahy, C., Biderman, S.: GPT-Neo: Large Scale Autoregressive Language Modeling with Mesh-Tensorflow. https://doi.org/10.5281/zenodo.5297715 . If you use this software, please cite it using these metadata. https://doi.org/10.5281/zenodo.5297715 Kokalj et al. (2021) Kokalj, E., Škrlj, B., Lavrač, N., Pollak, S., Robnik-Šikonja, M.: BERT meets shapley: Extending SHAP explanations to transformer-based classifiers. In: Toivonen, H., Boggia, M. (eds.) Proceedings of the EACL Hackashop on News Media Content Analysis and Automated Report Generation, pp. 16–21. Association for Computational Linguistics, Online (2021). https://aclanthology.org/2021.hackashop-1.3 van der Maaten and Hinton (2008) Maaten, L., Hinton, G.: Visualizing data using t-SNE. Journal of Machine Learning Research 9, 2579–2605 (2008) Mohri et al. (2012) Mohri, M., Rostamizadeh, A., Talwalkar, A.: Foundations of Machine Learning. The MIT Press, ??? (2012) Gniewkowski, M., Maciejewski, H., Surmacz, T., Walentynowicz, W.: Sec2vec: Anomaly detection in http traffic and malicious urls. In: Proceedings of the 38th ACM/SIGAPP Symposium on Applied Computing. SAC ’23, pp. 1154–1162. Association for Computing Machinery, New York, NY, USA (2023). https://doi.org/10.1145/3555776.3577663 . https://doi.org/10.1145/3555776.3577663 Hilmi et al. (2020) Hilmi, M.A.A., Cahyanto, K.A., Mustamiin, M.: Apache Web Server - Access Log Pre-processing for Web Intrusion Detection. IEEE Dataport (2020). https://doi.org/10.21227/vvvq-6w47 . https://dx.doi.org/10.21227/vvvq-6w47 Giménez et al. (2010) Giménez, C.T., Villegas, A.P., Marañón, G. .: HTTP Dataset CSIC 2010. https://doi.org/10.7910/DVN/3QBYB5 . https://www.tic.itefi.csic.es/dataset/ ECML/PKDD (2021) ECML/PKDD: ECML/PKDD 2007 Discovery Challenge. https://gitlab.fing.edu.uy/gsi/web-application-attacks-datasets/-/tree/master/ecml_pkdd (2021) Oliner and Stearley (2007) Oliner, A.J., Stearley, J.: What supercomputers say: A study of five system logs. 37th Annual IEEE/IFIP International Conference on Dependable Systems and Networks (DSN’07), 575–584 (2007) Vaswani et al. 
(2017) Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, L., Polosukhin, I.: Attention is all you need. CoRR abs/1706.03762 (2017) 1706.03762 Schuster and Nakajima (2012) Schuster, M., Nakajima, K.: Japanese and korean voice search. In: 2012 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), pp. 5149–5152 (2012). https://doi.org/10.1109/ICASSP.2012.6289079 Liu et al. (2019) Liu, Y., Ott, M., Goyal, N., Du, J., Joshi, M., Chen, D., Levy, O., Lewis, M., Zettlemoyer, L., Stoyanov, V.: Roberta: A robustly optimized BERT pretraining approach. CoRR abs/1907.11692 (2019) 1907.11692 Sanh et al. (2019) Sanh, V., Debut, L., Chaumond, J., Wolf, T.: Distilbert, a distilled version of bert: smaller, faster, cheaper and lighter. ArXiv abs/1910.01108 (2019) Radford et al. (2019) Radford, A., Wu, J., Child, R., Luan, D., Amodei, D., Sutskever, I.: Language models are unsupervised multitask learners (2019) Black et al. (2021) Black, S., Leo, G., Wang, P., Leahy, C., Biderman, S.: GPT-Neo: Large Scale Autoregressive Language Modeling with Mesh-Tensorflow. https://doi.org/10.5281/zenodo.5297715 . If you use this software, please cite it using these metadata. https://doi.org/10.5281/zenodo.5297715 Kokalj et al. (2021) Kokalj, E., Škrlj, B., Lavrač, N., Pollak, S., Robnik-Šikonja, M.: BERT meets shapley: Extending SHAP explanations to transformer-based classifiers. In: Toivonen, H., Boggia, M. (eds.) Proceedings of the EACL Hackashop on News Media Content Analysis and Automated Report Generation, pp. 16–21. Association for Computational Linguistics, Online (2021). https://aclanthology.org/2021.hackashop-1.3 van der Maaten and Hinton (2008) Maaten, L., Hinton, G.: Visualizing data using t-SNE. Journal of Machine Learning Research 9, 2579–2605 (2008) Mohri et al. (2012) Mohri, M., Rostamizadeh, A., Talwalkar, A.: Foundations of Machine Learning. The MIT Press, ??? (2012) Hilmi, M.A.A., Cahyanto, K.A., Mustamiin, M.: Apache Web Server - Access Log Pre-processing for Web Intrusion Detection. IEEE Dataport (2020). https://doi.org/10.21227/vvvq-6w47 . https://dx.doi.org/10.21227/vvvq-6w47 Giménez et al. (2010) Giménez, C.T., Villegas, A.P., Marañón, G. .: HTTP Dataset CSIC 2010. https://doi.org/10.7910/DVN/3QBYB5 . https://www.tic.itefi.csic.es/dataset/ ECML/PKDD (2021) ECML/PKDD: ECML/PKDD 2007 Discovery Challenge. https://gitlab.fing.edu.uy/gsi/web-application-attacks-datasets/-/tree/master/ecml_pkdd (2021) Oliner and Stearley (2007) Oliner, A.J., Stearley, J.: What supercomputers say: A study of five system logs. 37th Annual IEEE/IFIP International Conference on Dependable Systems and Networks (DSN’07), 575–584 (2007) Vaswani et al. (2017) Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, L., Polosukhin, I.: Attention is all you need. CoRR abs/1706.03762 (2017) 1706.03762 Schuster and Nakajima (2012) Schuster, M., Nakajima, K.: Japanese and korean voice search. In: 2012 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), pp. 5149–5152 (2012). https://doi.org/10.1109/ICASSP.2012.6289079 Liu et al. (2019) Liu, Y., Ott, M., Goyal, N., Du, J., Joshi, M., Chen, D., Levy, O., Lewis, M., Zettlemoyer, L., Stoyanov, V.: Roberta: A robustly optimized BERT pretraining approach. CoRR abs/1907.11692 (2019) 1907.11692 Sanh et al. (2019) Sanh, V., Debut, L., Chaumond, J., Wolf, T.: Distilbert, a distilled version of bert: smaller, faster, cheaper and lighter. 
ArXiv abs/1910.01108 (2019) Radford et al. (2019) Radford, A., Wu, J., Child, R., Luan, D., Amodei, D., Sutskever, I.: Language models are unsupervised multitask learners (2019) Black et al. (2021) Black, S., Leo, G., Wang, P., Leahy, C., Biderman, S.: GPT-Neo: Large Scale Autoregressive Language Modeling with Mesh-Tensorflow. https://doi.org/10.5281/zenodo.5297715 . If you use this software, please cite it using these metadata. https://doi.org/10.5281/zenodo.5297715 Kokalj et al. (2021) Kokalj, E., Škrlj, B., Lavrač, N., Pollak, S., Robnik-Šikonja, M.: BERT meets shapley: Extending SHAP explanations to transformer-based classifiers. In: Toivonen, H., Boggia, M. (eds.) Proceedings of the EACL Hackashop on News Media Content Analysis and Automated Report Generation, pp. 16–21. Association for Computational Linguistics, Online (2021). https://aclanthology.org/2021.hackashop-1.3 van der Maaten and Hinton (2008) Maaten, L., Hinton, G.: Visualizing data using t-SNE. Journal of Machine Learning Research 9, 2579–2605 (2008) Mohri et al. (2012) Mohri, M., Rostamizadeh, A., Talwalkar, A.: Foundations of Machine Learning. The MIT Press, ??? (2012) Giménez, C.T., Villegas, A.P., Marañón, G. .: HTTP Dataset CSIC 2010. https://doi.org/10.7910/DVN/3QBYB5 . https://www.tic.itefi.csic.es/dataset/ ECML/PKDD (2021) ECML/PKDD: ECML/PKDD 2007 Discovery Challenge. https://gitlab.fing.edu.uy/gsi/web-application-attacks-datasets/-/tree/master/ecml_pkdd (2021) Oliner and Stearley (2007) Oliner, A.J., Stearley, J.: What supercomputers say: A study of five system logs. 37th Annual IEEE/IFIP International Conference on Dependable Systems and Networks (DSN’07), 575–584 (2007) Vaswani et al. (2017) Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, L., Polosukhin, I.: Attention is all you need. CoRR abs/1706.03762 (2017) 1706.03762 Schuster and Nakajima (2012) Schuster, M., Nakajima, K.: Japanese and korean voice search. In: 2012 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), pp. 5149–5152 (2012). https://doi.org/10.1109/ICASSP.2012.6289079 Liu et al. (2019) Liu, Y., Ott, M., Goyal, N., Du, J., Joshi, M., Chen, D., Levy, O., Lewis, M., Zettlemoyer, L., Stoyanov, V.: Roberta: A robustly optimized BERT pretraining approach. CoRR abs/1907.11692 (2019) 1907.11692 Sanh et al. (2019) Sanh, V., Debut, L., Chaumond, J., Wolf, T.: Distilbert, a distilled version of bert: smaller, faster, cheaper and lighter. ArXiv abs/1910.01108 (2019) Radford et al. (2019) Radford, A., Wu, J., Child, R., Luan, D., Amodei, D., Sutskever, I.: Language models are unsupervised multitask learners (2019) Black et al. (2021) Black, S., Leo, G., Wang, P., Leahy, C., Biderman, S.: GPT-Neo: Large Scale Autoregressive Language Modeling with Mesh-Tensorflow. https://doi.org/10.5281/zenodo.5297715 . If you use this software, please cite it using these metadata. https://doi.org/10.5281/zenodo.5297715 Kokalj et al. (2021) Kokalj, E., Škrlj, B., Lavrač, N., Pollak, S., Robnik-Šikonja, M.: BERT meets shapley: Extending SHAP explanations to transformer-based classifiers. In: Toivonen, H., Boggia, M. (eds.) Proceedings of the EACL Hackashop on News Media Content Analysis and Automated Report Generation, pp. 16–21. Association for Computational Linguistics, Online (2021). https://aclanthology.org/2021.hackashop-1.3 van der Maaten and Hinton (2008) Maaten, L., Hinton, G.: Visualizing data using t-SNE. Journal of Machine Learning Research 9, 2579–2605 (2008) Mohri et al. 
(2012) Mohri, M., Rostamizadeh, A., Talwalkar, A.: Foundations of Machine Learning. The MIT Press, ??? (2012) ECML/PKDD: ECML/PKDD 2007 Discovery Challenge. https://gitlab.fing.edu.uy/gsi/web-application-attacks-datasets/-/tree/master/ecml_pkdd (2021) Oliner and Stearley (2007) Oliner, A.J., Stearley, J.: What supercomputers say: A study of five system logs. 37th Annual IEEE/IFIP International Conference on Dependable Systems and Networks (DSN’07), 575–584 (2007) Vaswani et al. (2017) Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, L., Polosukhin, I.: Attention is all you need. CoRR abs/1706.03762 (2017) 1706.03762 Schuster and Nakajima (2012) Schuster, M., Nakajima, K.: Japanese and korean voice search. In: 2012 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), pp. 5149–5152 (2012). https://doi.org/10.1109/ICASSP.2012.6289079 Liu et al. (2019) Liu, Y., Ott, M., Goyal, N., Du, J., Joshi, M., Chen, D., Levy, O., Lewis, M., Zettlemoyer, L., Stoyanov, V.: Roberta: A robustly optimized BERT pretraining approach. CoRR abs/1907.11692 (2019) 1907.11692 Sanh et al. (2019) Sanh, V., Debut, L., Chaumond, J., Wolf, T.: Distilbert, a distilled version of bert: smaller, faster, cheaper and lighter. ArXiv abs/1910.01108 (2019) Radford et al. (2019) Radford, A., Wu, J., Child, R., Luan, D., Amodei, D., Sutskever, I.: Language models are unsupervised multitask learners (2019) Black et al. (2021) Black, S., Leo, G., Wang, P., Leahy, C., Biderman, S.: GPT-Neo: Large Scale Autoregressive Language Modeling with Mesh-Tensorflow. https://doi.org/10.5281/zenodo.5297715 . If you use this software, please cite it using these metadata. https://doi.org/10.5281/zenodo.5297715 Kokalj et al. (2021) Kokalj, E., Škrlj, B., Lavrač, N., Pollak, S., Robnik-Šikonja, M.: BERT meets shapley: Extending SHAP explanations to transformer-based classifiers. In: Toivonen, H., Boggia, M. (eds.) Proceedings of the EACL Hackashop on News Media Content Analysis and Automated Report Generation, pp. 16–21. Association for Computational Linguistics, Online (2021). https://aclanthology.org/2021.hackashop-1.3 van der Maaten and Hinton (2008) Maaten, L., Hinton, G.: Visualizing data using t-SNE. Journal of Machine Learning Research 9, 2579–2605 (2008) Mohri et al. (2012) Mohri, M., Rostamizadeh, A., Talwalkar, A.: Foundations of Machine Learning. The MIT Press, ??? (2012) Oliner, A.J., Stearley, J.: What supercomputers say: A study of five system logs. 37th Annual IEEE/IFIP International Conference on Dependable Systems and Networks (DSN’07), 575–584 (2007) Vaswani et al. (2017) Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, L., Polosukhin, I.: Attention is all you need. CoRR abs/1706.03762 (2017) 1706.03762 Schuster and Nakajima (2012) Schuster, M., Nakajima, K.: Japanese and korean voice search. In: 2012 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), pp. 5149–5152 (2012). https://doi.org/10.1109/ICASSP.2012.6289079 Liu et al. (2019) Liu, Y., Ott, M., Goyal, N., Du, J., Joshi, M., Chen, D., Levy, O., Lewis, M., Zettlemoyer, L., Stoyanov, V.: Roberta: A robustly optimized BERT pretraining approach. CoRR abs/1907.11692 (2019) 1907.11692 Sanh et al. (2019) Sanh, V., Debut, L., Chaumond, J., Wolf, T.: Distilbert, a distilled version of bert: smaller, faster, cheaper and lighter. ArXiv abs/1910.01108 (2019) Radford et al. 
(2019) Radford, A., Wu, J., Child, R., Luan, D., Amodei, D., Sutskever, I.: Language models are unsupervised multitask learners (2019) Black et al. (2021) Black, S., Leo, G., Wang, P., Leahy, C., Biderman, S.: GPT-Neo: Large Scale Autoregressive Language Modeling with Mesh-Tensorflow. https://doi.org/10.5281/zenodo.5297715 . If you use this software, please cite it using these metadata. https://doi.org/10.5281/zenodo.5297715 Kokalj et al. (2021) Kokalj, E., Škrlj, B., Lavrač, N., Pollak, S., Robnik-Šikonja, M.: BERT meets shapley: Extending SHAP explanations to transformer-based classifiers. In: Toivonen, H., Boggia, M. (eds.) Proceedings of the EACL Hackashop on News Media Content Analysis and Automated Report Generation, pp. 16–21. Association for Computational Linguistics, Online (2021). https://aclanthology.org/2021.hackashop-1.3 van der Maaten and Hinton (2008) Maaten, L., Hinton, G.: Visualizing data using t-SNE. Journal of Machine Learning Research 9, 2579–2605 (2008) Mohri et al. (2012) Mohri, M., Rostamizadeh, A., Talwalkar, A.: Foundations of Machine Learning. The MIT Press, ??? (2012) Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, L., Polosukhin, I.: Attention is all you need. CoRR abs/1706.03762 (2017) 1706.03762 Schuster and Nakajima (2012) Schuster, M., Nakajima, K.: Japanese and korean voice search. In: 2012 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), pp. 5149–5152 (2012). https://doi.org/10.1109/ICASSP.2012.6289079 Liu et al. (2019) Liu, Y., Ott, M., Goyal, N., Du, J., Joshi, M., Chen, D., Levy, O., Lewis, M., Zettlemoyer, L., Stoyanov, V.: Roberta: A robustly optimized BERT pretraining approach. CoRR abs/1907.11692 (2019) 1907.11692 Sanh et al. (2019) Sanh, V., Debut, L., Chaumond, J., Wolf, T.: Distilbert, a distilled version of bert: smaller, faster, cheaper and lighter. ArXiv abs/1910.01108 (2019) Radford et al. (2019) Radford, A., Wu, J., Child, R., Luan, D., Amodei, D., Sutskever, I.: Language models are unsupervised multitask learners (2019) Black et al. (2021) Black, S., Leo, G., Wang, P., Leahy, C., Biderman, S.: GPT-Neo: Large Scale Autoregressive Language Modeling with Mesh-Tensorflow. https://doi.org/10.5281/zenodo.5297715 . If you use this software, please cite it using these metadata. https://doi.org/10.5281/zenodo.5297715 Kokalj et al. (2021) Kokalj, E., Škrlj, B., Lavrač, N., Pollak, S., Robnik-Šikonja, M.: BERT meets shapley: Extending SHAP explanations to transformer-based classifiers. In: Toivonen, H., Boggia, M. (eds.) Proceedings of the EACL Hackashop on News Media Content Analysis and Automated Report Generation, pp. 16–21. Association for Computational Linguistics, Online (2021). https://aclanthology.org/2021.hackashop-1.3 van der Maaten and Hinton (2008) Maaten, L., Hinton, G.: Visualizing data using t-SNE. Journal of Machine Learning Research 9, 2579–2605 (2008) Mohri et al. (2012) Mohri, M., Rostamizadeh, A., Talwalkar, A.: Foundations of Machine Learning. The MIT Press, ??? (2012) Schuster, M., Nakajima, K.: Japanese and korean voice search. In: 2012 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), pp. 5149–5152 (2012). https://doi.org/10.1109/ICASSP.2012.6289079 Liu et al. (2019) Liu, Y., Ott, M., Goyal, N., Du, J., Joshi, M., Chen, D., Levy, O., Lewis, M., Zettlemoyer, L., Stoyanov, V.: Roberta: A robustly optimized BERT pretraining approach. CoRR abs/1907.11692 (2019) 1907.11692 Sanh et al. 
(2019) Sanh, V., Debut, L., Chaumond, J., Wolf, T.: Distilbert, a distilled version of bert: smaller, faster, cheaper and lighter. ArXiv abs/1910.01108 (2019) Radford et al. (2019) Radford, A., Wu, J., Child, R., Luan, D., Amodei, D., Sutskever, I.: Language models are unsupervised multitask learners (2019) Black et al. (2021) Black, S., Leo, G., Wang, P., Leahy, C., Biderman, S.: GPT-Neo: Large Scale Autoregressive Language Modeling with Mesh-Tensorflow. https://doi.org/10.5281/zenodo.5297715 . If you use this software, please cite it using these metadata. https://doi.org/10.5281/zenodo.5297715 Kokalj et al. (2021) Kokalj, E., Škrlj, B., Lavrač, N., Pollak, S., Robnik-Šikonja, M.: BERT meets shapley: Extending SHAP explanations to transformer-based classifiers. In: Toivonen, H., Boggia, M. (eds.) Proceedings of the EACL Hackashop on News Media Content Analysis and Automated Report Generation, pp. 16–21. Association for Computational Linguistics, Online (2021). https://aclanthology.org/2021.hackashop-1.3 van der Maaten and Hinton (2008) Maaten, L., Hinton, G.: Visualizing data using t-SNE. Journal of Machine Learning Research 9, 2579–2605 (2008) Mohri et al. (2012) Mohri, M., Rostamizadeh, A., Talwalkar, A.: Foundations of Machine Learning. The MIT Press, ??? (2012) Liu, Y., Ott, M., Goyal, N., Du, J., Joshi, M., Chen, D., Levy, O., Lewis, M., Zettlemoyer, L., Stoyanov, V.: Roberta: A robustly optimized BERT pretraining approach. CoRR abs/1907.11692 (2019) 1907.11692 Sanh et al. (2019) Sanh, V., Debut, L., Chaumond, J., Wolf, T.: Distilbert, a distilled version of bert: smaller, faster, cheaper and lighter. ArXiv abs/1910.01108 (2019) Radford et al. (2019) Radford, A., Wu, J., Child, R., Luan, D., Amodei, D., Sutskever, I.: Language models are unsupervised multitask learners (2019) Black et al. (2021) Black, S., Leo, G., Wang, P., Leahy, C., Biderman, S.: GPT-Neo: Large Scale Autoregressive Language Modeling with Mesh-Tensorflow. https://doi.org/10.5281/zenodo.5297715 . If you use this software, please cite it using these metadata. https://doi.org/10.5281/zenodo.5297715 Kokalj et al. (2021) Kokalj, E., Škrlj, B., Lavrač, N., Pollak, S., Robnik-Šikonja, M.: BERT meets shapley: Extending SHAP explanations to transformer-based classifiers. In: Toivonen, H., Boggia, M. (eds.) Proceedings of the EACL Hackashop on News Media Content Analysis and Automated Report Generation, pp. 16–21. Association for Computational Linguistics, Online (2021). https://aclanthology.org/2021.hackashop-1.3 van der Maaten and Hinton (2008) Maaten, L., Hinton, G.: Visualizing data using t-SNE. Journal of Machine Learning Research 9, 2579–2605 (2008) Mohri et al. (2012) Mohri, M., Rostamizadeh, A., Talwalkar, A.: Foundations of Machine Learning. The MIT Press, ??? (2012) Sanh, V., Debut, L., Chaumond, J., Wolf, T.: Distilbert, a distilled version of bert: smaller, faster, cheaper and lighter. ArXiv abs/1910.01108 (2019) Radford et al. (2019) Radford, A., Wu, J., Child, R., Luan, D., Amodei, D., Sutskever, I.: Language models are unsupervised multitask learners (2019) Black et al. (2021) Black, S., Leo, G., Wang, P., Leahy, C., Biderman, S.: GPT-Neo: Large Scale Autoregressive Language Modeling with Mesh-Tensorflow. https://doi.org/10.5281/zenodo.5297715 . If you use this software, please cite it using these metadata. https://doi.org/10.5281/zenodo.5297715 Kokalj et al. 
(2021) Kokalj, E., Škrlj, B., Lavrač, N., Pollak, S., Robnik-Šikonja, M.: BERT meets shapley: Extending SHAP explanations to transformer-based classifiers. In: Toivonen, H., Boggia, M. (eds.) Proceedings of the EACL Hackashop on News Media Content Analysis and Automated Report Generation, pp. 16–21. Association for Computational Linguistics, Online (2021). https://aclanthology.org/2021.hackashop-1.3 van der Maaten and Hinton (2008) Maaten, L., Hinton, G.: Visualizing data using t-SNE. Journal of Machine Learning Research 9, 2579–2605 (2008) Mohri et al. (2012) Mohri, M., Rostamizadeh, A., Talwalkar, A.: Foundations of Machine Learning. The MIT Press, ??? (2012) Radford, A., Wu, J., Child, R., Luan, D., Amodei, D., Sutskever, I.: Language models are unsupervised multitask learners (2019) Black et al. (2021) Black, S., Leo, G., Wang, P., Leahy, C., Biderman, S.: GPT-Neo: Large Scale Autoregressive Language Modeling with Mesh-Tensorflow. https://doi.org/10.5281/zenodo.5297715 . If you use this software, please cite it using these metadata. https://doi.org/10.5281/zenodo.5297715 Kokalj et al. (2021) Kokalj, E., Škrlj, B., Lavrač, N., Pollak, S., Robnik-Šikonja, M.: BERT meets shapley: Extending SHAP explanations to transformer-based classifiers. In: Toivonen, H., Boggia, M. (eds.) Proceedings of the EACL Hackashop on News Media Content Analysis and Automated Report Generation, pp. 16–21. Association for Computational Linguistics, Online (2021). https://aclanthology.org/2021.hackashop-1.3 van der Maaten and Hinton (2008) Maaten, L., Hinton, G.: Visualizing data using t-SNE. Journal of Machine Learning Research 9, 2579–2605 (2008) Mohri et al. (2012) Mohri, M., Rostamizadeh, A., Talwalkar, A.: Foundations of Machine Learning. The MIT Press, ??? (2012) Black, S., Leo, G., Wang, P., Leahy, C., Biderman, S.: GPT-Neo: Large Scale Autoregressive Language Modeling with Mesh-Tensorflow. https://doi.org/10.5281/zenodo.5297715 . If you use this software, please cite it using these metadata. https://doi.org/10.5281/zenodo.5297715 Kokalj et al. (2021) Kokalj, E., Škrlj, B., Lavrač, N., Pollak, S., Robnik-Šikonja, M.: BERT meets shapley: Extending SHAP explanations to transformer-based classifiers. In: Toivonen, H., Boggia, M. (eds.) Proceedings of the EACL Hackashop on News Media Content Analysis and Automated Report Generation, pp. 16–21. Association for Computational Linguistics, Online (2021). https://aclanthology.org/2021.hackashop-1.3 van der Maaten and Hinton (2008) Maaten, L., Hinton, G.: Visualizing data using t-SNE. Journal of Machine Learning Research 9, 2579–2605 (2008) Mohri et al. (2012) Mohri, M., Rostamizadeh, A., Talwalkar, A.: Foundations of Machine Learning. The MIT Press, ??? (2012) Kokalj, E., Škrlj, B., Lavrač, N., Pollak, S., Robnik-Šikonja, M.: BERT meets shapley: Extending SHAP explanations to transformer-based classifiers. In: Toivonen, H., Boggia, M. (eds.) Proceedings of the EACL Hackashop on News Media Content Analysis and Automated Report Generation, pp. 16–21. Association for Computational Linguistics, Online (2021). https://aclanthology.org/2021.hackashop-1.3 van der Maaten and Hinton (2008) Maaten, L., Hinton, G.: Visualizing data using t-SNE. Journal of Machine Learning Research 9, 2579–2605 (2008) Mohri et al. (2012) Mohri, M., Rostamizadeh, A., Talwalkar, A.: Foundations of Machine Learning. The MIT Press, ??? (2012) Maaten, L., Hinton, G.: Visualizing data using t-SNE. Journal of Machine Learning Research 9, 2579–2605 (2008) Mohri et al. 
(2012) Mohri, M., Rostamizadeh, A., Talwalkar, A.: Foundations of Machine Learning. The MIT Press, ??? (2012) Mohri, M., Rostamizadeh, A., Talwalkar, A.: Foundations of Machine Learning. The MIT Press, ??? (2012)
  11. Nam, S., Yoo, J.-H., Hong, J.W.-K.: Vm failure prediction with log analysis using bert-cnn model, 331–337 (2022) https://doi.org/10.23919/CNSM55787.2022.9965187 Wang et al. (2018) Wang, M., Xu, L., Guo, L.: Anomaly detection of system logs based on natural language processing and deep learning. In: 2018 4th International Conference on Frontiers of Signal Processing (ICFSP), pp. 140–144 (2018). https://doi.org/10.1109/ICFSP.2018.8552075 Qi et al. (2022) Qi, J., Luan, Z., Huang, S., Wang, Y., Fung, C., Yang, H., Qian, D.: Adanomaly: Adaptive anomaly detection for system logs with adversarial learning. In: NOMS 2022-2022 IEEE/IFIP Network Operations and Management Symposium, pp. 1–5. IEEE Press, ??? (2022). https://doi.org/10.1109/NOMS54207.2022.9789917 . https://doi.org/10.1109/NOMS54207.2022.9789917 Seyyar et al. (2022) Seyyar, Y.E., Yavuz, A.G., Ünver, H.M.: Detection of web attacks using the bert model. In: 2022 30th Signal Processing and Communications Applications Conference (SIU), pp. 1–4 (2022). https://doi.org/10.1109/SIU55565.2022.9864721 Guo et al. (2021) Guo, H., Yuan, S., Wu, X.: Logbert: Log anomaly detection via bert, 1–8 (2021). IEEE Shao et al. (2022) Shao, Y., Zhang, W., Liu, P., Huyue, R., Tang, R., Yin, Q., Li, Q.: Log anomaly detection method based on bert model optimization. In: 2022 7th International Conference on Cloud Computing and Big Data Analytics (ICCCBDA), pp. 161–166 (2022). https://doi.org/10.1109/ICCCBDA55098.2022.9778900 Guo et al. (2022) Guo, H., Lin, X., Yang, J., Zhuang, Y., Bai, J., Zheng, T., Zhang, B., Li, Z.: TransLog: A Unified Transformer-based Framework for Log Anomaly Detection (2022) Le and Zhang (2021) Le, V.-H., Zhang, H.: Log-based Anomaly Detection Without Log Parsing (2021) Gniewkowski et al. (2023) Gniewkowski, M., Maciejewski, H., Surmacz, T., Walentynowicz, W.: Sec2vec: Anomaly detection in http traffic and malicious urls. In: Proceedings of the 38th ACM/SIGAPP Symposium on Applied Computing. SAC ’23, pp. 1154–1162. Association for Computing Machinery, New York, NY, USA (2023). https://doi.org/10.1145/3555776.3577663 . https://doi.org/10.1145/3555776.3577663 Hilmi et al. (2020) Hilmi, M.A.A., Cahyanto, K.A., Mustamiin, M.: Apache Web Server - Access Log Pre-processing for Web Intrusion Detection. IEEE Dataport (2020). https://doi.org/10.21227/vvvq-6w47 . https://dx.doi.org/10.21227/vvvq-6w47 Giménez et al. (2010) Giménez, C.T., Villegas, A.P., Marañón, G. .: HTTP Dataset CSIC 2010. https://doi.org/10.7910/DVN/3QBYB5 . https://www.tic.itefi.csic.es/dataset/ ECML/PKDD (2021) ECML/PKDD: ECML/PKDD 2007 Discovery Challenge. https://gitlab.fing.edu.uy/gsi/web-application-attacks-datasets/-/tree/master/ecml_pkdd (2021) Oliner and Stearley (2007) Oliner, A.J., Stearley, J.: What supercomputers say: A study of five system logs. 37th Annual IEEE/IFIP International Conference on Dependable Systems and Networks (DSN’07), 575–584 (2007) Vaswani et al. (2017) Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, L., Polosukhin, I.: Attention is all you need. CoRR abs/1706.03762 (2017) 1706.03762 Schuster and Nakajima (2012) Schuster, M., Nakajima, K.: Japanese and korean voice search. In: 2012 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), pp. 5149–5152 (2012). https://doi.org/10.1109/ICASSP.2012.6289079 Liu et al. 
(2019) Liu, Y., Ott, M., Goyal, N., Du, J., Joshi, M., Chen, D., Levy, O., Lewis, M., Zettlemoyer, L., Stoyanov, V.: Roberta: A robustly optimized BERT pretraining approach. CoRR abs/1907.11692 (2019) 1907.11692 Sanh et al. (2019) Sanh, V., Debut, L., Chaumond, J., Wolf, T.: Distilbert, a distilled version of bert: smaller, faster, cheaper and lighter. ArXiv abs/1910.01108 (2019) Radford et al. (2019) Radford, A., Wu, J., Child, R., Luan, D., Amodei, D., Sutskever, I.: Language models are unsupervised multitask learners (2019) Black et al. (2021) Black, S., Leo, G., Wang, P., Leahy, C., Biderman, S.: GPT-Neo: Large Scale Autoregressive Language Modeling with Mesh-Tensorflow. https://doi.org/10.5281/zenodo.5297715 . If you use this software, please cite it using these metadata. https://doi.org/10.5281/zenodo.5297715 Kokalj et al. (2021) Kokalj, E., Škrlj, B., Lavrač, N., Pollak, S., Robnik-Šikonja, M.: BERT meets shapley: Extending SHAP explanations to transformer-based classifiers. In: Toivonen, H., Boggia, M. (eds.) Proceedings of the EACL Hackashop on News Media Content Analysis and Automated Report Generation, pp. 16–21. Association for Computational Linguistics, Online (2021). https://aclanthology.org/2021.hackashop-1.3 van der Maaten and Hinton (2008) Maaten, L., Hinton, G.: Visualizing data using t-SNE. Journal of Machine Learning Research 9, 2579–2605 (2008) Mohri et al. (2012) Mohri, M., Rostamizadeh, A., Talwalkar, A.: Foundations of Machine Learning. The MIT Press, ??? (2012) Wang, M., Xu, L., Guo, L.: Anomaly detection of system logs based on natural language processing and deep learning. In: 2018 4th International Conference on Frontiers of Signal Processing (ICFSP), pp. 140–144 (2018). https://doi.org/10.1109/ICFSP.2018.8552075 Qi et al. (2022) Qi, J., Luan, Z., Huang, S., Wang, Y., Fung, C., Yang, H., Qian, D.: Adanomaly: Adaptive anomaly detection for system logs with adversarial learning. In: NOMS 2022-2022 IEEE/IFIP Network Operations and Management Symposium, pp. 1–5. IEEE Press, ??? (2022). https://doi.org/10.1109/NOMS54207.2022.9789917 . https://doi.org/10.1109/NOMS54207.2022.9789917 Seyyar et al. (2022) Seyyar, Y.E., Yavuz, A.G., Ünver, H.M.: Detection of web attacks using the bert model. In: 2022 30th Signal Processing and Communications Applications Conference (SIU), pp. 1–4 (2022). https://doi.org/10.1109/SIU55565.2022.9864721 Guo et al. (2021) Guo, H., Yuan, S., Wu, X.: Logbert: Log anomaly detection via bert, 1–8 (2021). IEEE Shao et al. (2022) Shao, Y., Zhang, W., Liu, P., Huyue, R., Tang, R., Yin, Q., Li, Q.: Log anomaly detection method based on bert model optimization. In: 2022 7th International Conference on Cloud Computing and Big Data Analytics (ICCCBDA), pp. 161–166 (2022). https://doi.org/10.1109/ICCCBDA55098.2022.9778900 Guo et al. (2022) Guo, H., Lin, X., Yang, J., Zhuang, Y., Bai, J., Zheng, T., Zhang, B., Li, Z.: TransLog: A Unified Transformer-based Framework for Log Anomaly Detection (2022) Le and Zhang (2021) Le, V.-H., Zhang, H.: Log-based Anomaly Detection Without Log Parsing (2021) Gniewkowski et al. (2023) Gniewkowski, M., Maciejewski, H., Surmacz, T., Walentynowicz, W.: Sec2vec: Anomaly detection in http traffic and malicious urls. In: Proceedings of the 38th ACM/SIGAPP Symposium on Applied Computing. SAC ’23, pp. 1154–1162. Association for Computing Machinery, New York, NY, USA (2023). https://doi.org/10.1145/3555776.3577663 . https://doi.org/10.1145/3555776.3577663 Hilmi et al. 
(2020) Hilmi, M.A.A., Cahyanto, K.A., Mustamiin, M.: Apache Web Server - Access Log Pre-processing for Web Intrusion Detection. IEEE Dataport (2020). https://doi.org/10.21227/vvvq-6w47 . https://dx.doi.org/10.21227/vvvq-6w47 Giménez et al. (2010) Giménez, C.T., Villegas, A.P., Marañón, G. .: HTTP Dataset CSIC 2010. https://doi.org/10.7910/DVN/3QBYB5 . https://www.tic.itefi.csic.es/dataset/ ECML/PKDD (2021) ECML/PKDD: ECML/PKDD 2007 Discovery Challenge. https://gitlab.fing.edu.uy/gsi/web-application-attacks-datasets/-/tree/master/ecml_pkdd (2021) Oliner and Stearley (2007) Oliner, A.J., Stearley, J.: What supercomputers say: A study of five system logs. 37th Annual IEEE/IFIP International Conference on Dependable Systems and Networks (DSN’07), 575–584 (2007) Vaswani et al. (2017) Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, L., Polosukhin, I.: Attention is all you need. CoRR abs/1706.03762 (2017) 1706.03762 Schuster and Nakajima (2012) Schuster, M., Nakajima, K.: Japanese and korean voice search. In: 2012 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), pp. 5149–5152 (2012). https://doi.org/10.1109/ICASSP.2012.6289079 Liu et al. (2019) Liu, Y., Ott, M., Goyal, N., Du, J., Joshi, M., Chen, D., Levy, O., Lewis, M., Zettlemoyer, L., Stoyanov, V.: Roberta: A robustly optimized BERT pretraining approach. CoRR abs/1907.11692 (2019) 1907.11692 Sanh et al. (2019) Sanh, V., Debut, L., Chaumond, J., Wolf, T.: Distilbert, a distilled version of bert: smaller, faster, cheaper and lighter. ArXiv abs/1910.01108 (2019) Radford et al. (2019) Radford, A., Wu, J., Child, R., Luan, D., Amodei, D., Sutskever, I.: Language models are unsupervised multitask learners (2019) Black et al. (2021) Black, S., Leo, G., Wang, P., Leahy, C., Biderman, S.: GPT-Neo: Large Scale Autoregressive Language Modeling with Mesh-Tensorflow. https://doi.org/10.5281/zenodo.5297715 . If you use this software, please cite it using these metadata. https://doi.org/10.5281/zenodo.5297715 Kokalj et al. (2021) Kokalj, E., Škrlj, B., Lavrač, N., Pollak, S., Robnik-Šikonja, M.: BERT meets shapley: Extending SHAP explanations to transformer-based classifiers. In: Toivonen, H., Boggia, M. (eds.) Proceedings of the EACL Hackashop on News Media Content Analysis and Automated Report Generation, pp. 16–21. Association for Computational Linguistics, Online (2021). https://aclanthology.org/2021.hackashop-1.3 van der Maaten and Hinton (2008) Maaten, L., Hinton, G.: Visualizing data using t-SNE. Journal of Machine Learning Research 9, 2579–2605 (2008) Mohri et al. (2012) Mohri, M., Rostamizadeh, A., Talwalkar, A.: Foundations of Machine Learning. The MIT Press, ??? (2012) Qi, J., Luan, Z., Huang, S., Wang, Y., Fung, C., Yang, H., Qian, D.: Adanomaly: Adaptive anomaly detection for system logs with adversarial learning. In: NOMS 2022-2022 IEEE/IFIP Network Operations and Management Symposium, pp. 1–5. IEEE Press, ??? (2022). https://doi.org/10.1109/NOMS54207.2022.9789917 . https://doi.org/10.1109/NOMS54207.2022.9789917 Seyyar et al. (2022) Seyyar, Y.E., Yavuz, A.G., Ünver, H.M.: Detection of web attacks using the bert model. In: 2022 30th Signal Processing and Communications Applications Conference (SIU), pp. 1–4 (2022). https://doi.org/10.1109/SIU55565.2022.9864721 Guo et al. (2021) Guo, H., Yuan, S., Wu, X.: Logbert: Log anomaly detection via bert, 1–8 (2021). IEEE Shao et al. 
(2022) Shao, Y., Zhang, W., Liu, P., Huyue, R., Tang, R., Yin, Q., Li, Q.: Log anomaly detection method based on bert model optimization. In: 2022 7th International Conference on Cloud Computing and Big Data Analytics (ICCCBDA), pp. 161–166 (2022). https://doi.org/10.1109/ICCCBDA55098.2022.9778900 Guo et al. (2022) Guo, H., Lin, X., Yang, J., Zhuang, Y., Bai, J., Zheng, T., Zhang, B., Li, Z.: TransLog: A Unified Transformer-based Framework for Log Anomaly Detection (2022) Le and Zhang (2021) Le, V.-H., Zhang, H.: Log-based Anomaly Detection Without Log Parsing (2021) Gniewkowski et al. (2023) Gniewkowski, M., Maciejewski, H., Surmacz, T., Walentynowicz, W.: Sec2vec: Anomaly detection in http traffic and malicious urls. In: Proceedings of the 38th ACM/SIGAPP Symposium on Applied Computing. SAC ’23, pp. 1154–1162. Association for Computing Machinery, New York, NY, USA (2023). https://doi.org/10.1145/3555776.3577663 . https://doi.org/10.1145/3555776.3577663 Hilmi et al. (2020) Hilmi, M.A.A., Cahyanto, K.A., Mustamiin, M.: Apache Web Server - Access Log Pre-processing for Web Intrusion Detection. IEEE Dataport (2020). https://doi.org/10.21227/vvvq-6w47 . https://dx.doi.org/10.21227/vvvq-6w47 Giménez et al. (2010) Giménez, C.T., Villegas, A.P., Marañón, G. .: HTTP Dataset CSIC 2010. https://doi.org/10.7910/DVN/3QBYB5 . https://www.tic.itefi.csic.es/dataset/ ECML/PKDD (2021) ECML/PKDD: ECML/PKDD 2007 Discovery Challenge. https://gitlab.fing.edu.uy/gsi/web-application-attacks-datasets/-/tree/master/ecml_pkdd (2021) Oliner and Stearley (2007) Oliner, A.J., Stearley, J.: What supercomputers say: A study of five system logs. 37th Annual IEEE/IFIP International Conference on Dependable Systems and Networks (DSN’07), 575–584 (2007) Vaswani et al. (2017) Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, L., Polosukhin, I.: Attention is all you need. CoRR abs/1706.03762 (2017) 1706.03762 Schuster and Nakajima (2012) Schuster, M., Nakajima, K.: Japanese and korean voice search. In: 2012 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), pp. 5149–5152 (2012). https://doi.org/10.1109/ICASSP.2012.6289079 Liu et al. (2019) Liu, Y., Ott, M., Goyal, N., Du, J., Joshi, M., Chen, D., Levy, O., Lewis, M., Zettlemoyer, L., Stoyanov, V.: Roberta: A robustly optimized BERT pretraining approach. CoRR abs/1907.11692 (2019) 1907.11692 Sanh et al. (2019) Sanh, V., Debut, L., Chaumond, J., Wolf, T.: Distilbert, a distilled version of bert: smaller, faster, cheaper and lighter. ArXiv abs/1910.01108 (2019) Radford et al. (2019) Radford, A., Wu, J., Child, R., Luan, D., Amodei, D., Sutskever, I.: Language models are unsupervised multitask learners (2019) Black et al. (2021) Black, S., Leo, G., Wang, P., Leahy, C., Biderman, S.: GPT-Neo: Large Scale Autoregressive Language Modeling with Mesh-Tensorflow. https://doi.org/10.5281/zenodo.5297715 . If you use this software, please cite it using these metadata. https://doi.org/10.5281/zenodo.5297715 Kokalj et al. (2021) Kokalj, E., Škrlj, B., Lavrač, N., Pollak, S., Robnik-Šikonja, M.: BERT meets shapley: Extending SHAP explanations to transformer-based classifiers. In: Toivonen, H., Boggia, M. (eds.) Proceedings of the EACL Hackashop on News Media Content Analysis and Automated Report Generation, pp. 16–21. Association for Computational Linguistics, Online (2021). https://aclanthology.org/2021.hackashop-1.3 van der Maaten and Hinton (2008) Maaten, L., Hinton, G.: Visualizing data using t-SNE. 
Journal of Machine Learning Research 9, 2579–2605 (2008) Mohri et al. (2012) Mohri, M., Rostamizadeh, A., Talwalkar, A.: Foundations of Machine Learning. The MIT Press, ??? (2012) Seyyar, Y.E., Yavuz, A.G., Ünver, H.M.: Detection of web attacks using the bert model. In: 2022 30th Signal Processing and Communications Applications Conference (SIU), pp. 1–4 (2022). https://doi.org/10.1109/SIU55565.2022.9864721 Guo et al. (2021) Guo, H., Yuan, S., Wu, X.: Logbert: Log anomaly detection via bert, 1–8 (2021). IEEE Shao et al. (2022) Shao, Y., Zhang, W., Liu, P., Huyue, R., Tang, R., Yin, Q., Li, Q.: Log anomaly detection method based on bert model optimization. In: 2022 7th International Conference on Cloud Computing and Big Data Analytics (ICCCBDA), pp. 161–166 (2022). https://doi.org/10.1109/ICCCBDA55098.2022.9778900 Guo et al. (2022) Guo, H., Lin, X., Yang, J., Zhuang, Y., Bai, J., Zheng, T., Zhang, B., Li, Z.: TransLog: A Unified Transformer-based Framework for Log Anomaly Detection (2022) Le and Zhang (2021) Le, V.-H., Zhang, H.: Log-based Anomaly Detection Without Log Parsing (2021) Gniewkowski et al. (2023) Gniewkowski, M., Maciejewski, H., Surmacz, T., Walentynowicz, W.: Sec2vec: Anomaly detection in http traffic and malicious urls. In: Proceedings of the 38th ACM/SIGAPP Symposium on Applied Computing. SAC ’23, pp. 1154–1162. Association for Computing Machinery, New York, NY, USA (2023). https://doi.org/10.1145/3555776.3577663 . https://doi.org/10.1145/3555776.3577663 Hilmi et al. (2020) Hilmi, M.A.A., Cahyanto, K.A., Mustamiin, M.: Apache Web Server - Access Log Pre-processing for Web Intrusion Detection. IEEE Dataport (2020). https://doi.org/10.21227/vvvq-6w47 . https://dx.doi.org/10.21227/vvvq-6w47 Giménez et al. (2010) Giménez, C.T., Villegas, A.P., Marañón, G. .: HTTP Dataset CSIC 2010. https://doi.org/10.7910/DVN/3QBYB5 . https://www.tic.itefi.csic.es/dataset/ ECML/PKDD (2021) ECML/PKDD: ECML/PKDD 2007 Discovery Challenge. https://gitlab.fing.edu.uy/gsi/web-application-attacks-datasets/-/tree/master/ecml_pkdd (2021) Oliner and Stearley (2007) Oliner, A.J., Stearley, J.: What supercomputers say: A study of five system logs. 37th Annual IEEE/IFIP International Conference on Dependable Systems and Networks (DSN’07), 575–584 (2007) Vaswani et al. (2017) Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, L., Polosukhin, I.: Attention is all you need. CoRR abs/1706.03762 (2017) 1706.03762 Schuster and Nakajima (2012) Schuster, M., Nakajima, K.: Japanese and korean voice search. In: 2012 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), pp. 5149–5152 (2012). https://doi.org/10.1109/ICASSP.2012.6289079 Liu et al. (2019) Liu, Y., Ott, M., Goyal, N., Du, J., Joshi, M., Chen, D., Levy, O., Lewis, M., Zettlemoyer, L., Stoyanov, V.: Roberta: A robustly optimized BERT pretraining approach. CoRR abs/1907.11692 (2019) 1907.11692 Sanh et al. (2019) Sanh, V., Debut, L., Chaumond, J., Wolf, T.: Distilbert, a distilled version of bert: smaller, faster, cheaper and lighter. ArXiv abs/1910.01108 (2019) Radford et al. (2019) Radford, A., Wu, J., Child, R., Luan, D., Amodei, D., Sutskever, I.: Language models are unsupervised multitask learners (2019) Black et al. (2021) Black, S., Leo, G., Wang, P., Leahy, C., Biderman, S.: GPT-Neo: Large Scale Autoregressive Language Modeling with Mesh-Tensorflow. https://doi.org/10.5281/zenodo.5297715 . If you use this software, please cite it using these metadata. 
https://doi.org/10.5281/zenodo.5297715
12. Wang, M., Xu, L., Guo, L.: Anomaly detection of system logs based on natural language processing and deep learning. In: 2018 4th International Conference on Frontiers of Signal Processing (ICFSP), pp. 140–144 (2018). https://doi.org/10.1109/ICFSP.2018.8552075
13. Qi, J., Luan, Z., Huang, S., Wang, Y., Fung, C., Yang, H., Qian, D.: AdAnomaly: Adaptive anomaly detection for system logs with adversarial learning. In: NOMS 2022 – IEEE/IFIP Network Operations and Management Symposium, pp. 1–5. IEEE Press (2022). https://doi.org/10.1109/NOMS54207.2022.9789917
14. Seyyar, Y.E., Yavuz, A.G., Ünver, H.M.: Detection of web attacks using the BERT model. In: 2022 30th Signal Processing and Communications Applications Conference (SIU), pp. 1–4 (2022). https://doi.org/10.1109/SIU55565.2022.9864721
15. Guo, H., Yuan, S., Wu, X.: LogBERT: Log anomaly detection via BERT, pp. 1–8. IEEE (2021)
16. Shao, Y., Zhang, W., Liu, P., Huyue, R., Tang, R., Yin, Q., Li, Q.: Log anomaly detection method based on BERT model optimization. In: 2022 7th International Conference on Cloud Computing and Big Data Analytics (ICCCBDA), pp. 161–166 (2022). https://doi.org/10.1109/ICCCBDA55098.2022.9778900
17. Guo, H., Lin, X., Yang, J., Zhuang, Y., Bai, J., Zheng, T., Zhang, B., Li, Z.: TransLog: A Unified Transformer-based Framework for Log Anomaly Detection (2022)
18. Le, V.-H., Zhang, H.: Log-based Anomaly Detection Without Log Parsing (2021)
19. Gniewkowski, M., Maciejewski, H., Surmacz, T., Walentynowicz, W.: Sec2vec: Anomaly detection in HTTP traffic and malicious URLs. In: Proceedings of the 38th ACM/SIGAPP Symposium on Applied Computing (SAC '23), pp. 1154–1162. Association for Computing Machinery, New York, NY, USA (2023). https://doi.org/10.1145/3555776.3577663
20. Hilmi, M.A.A., Cahyanto, K.A., Mustamiin, M.: Apache Web Server – Access Log Pre-processing for Web Intrusion Detection. IEEE Dataport (2020). https://doi.org/10.21227/vvvq-6w47
21. Giménez, C.T., Villegas, A.P., Marañón, G.: HTTP Dataset CSIC 2010 (2010). https://doi.org/10.7910/DVN/3QBYB5 . https://www.tic.itefi.csic.es/dataset/
22. ECML/PKDD: ECML/PKDD 2007 Discovery Challenge (2021). https://gitlab.fing.edu.uy/gsi/web-application-attacks-datasets/-/tree/master/ecml_pkdd
23. Oliner, A.J., Stearley, J.: What supercomputers say: A study of five system logs. In: 37th Annual IEEE/IFIP International Conference on Dependable Systems and Networks (DSN'07), pp. 575–584 (2007)
24. Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, L., Polosukhin, I.: Attention is all you need. CoRR abs/1706.03762 (2017)
25. Schuster, M., Nakajima, K.: Japanese and Korean voice search. In: 2012 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), pp. 5149–5152 (2012). https://doi.org/10.1109/ICASSP.2012.6289079
26. Liu, Y., Ott, M., Goyal, N., Du, J., Joshi, M., Chen, D., Levy, O., Lewis, M., Zettlemoyer, L., Stoyanov, V.: RoBERTa: A robustly optimized BERT pretraining approach. CoRR abs/1907.11692 (2019)
27. Sanh, V., Debut, L., Chaumond, J., Wolf, T.: DistilBERT, a distilled version of BERT: smaller, faster, cheaper and lighter. ArXiv abs/1910.01108 (2019)
28. Radford, A., Wu, J., Child, R., Luan, D., Amodei, D., Sutskever, I.: Language models are unsupervised multitask learners (2019)
29. Black, S., Leo, G., Wang, P., Leahy, C., Biderman, S.: GPT-Neo: Large Scale Autoregressive Language Modeling with Mesh-Tensorflow (2021). https://doi.org/10.5281/zenodo.5297715
30. Kokalj, E., Škrlj, B., Lavrač, N., Pollak, S., Robnik-Šikonja, M.: BERT meets Shapley: Extending SHAP explanations to transformer-based classifiers. In: Toivonen, H., Boggia, M. (eds.) Proceedings of the EACL Hackashop on News Media Content Analysis and Automated Report Generation, pp. 16–21. Association for Computational Linguistics, Online (2021). https://aclanthology.org/2021.hackashop-1.3
31. van der Maaten, L., Hinton, G.: Visualizing data using t-SNE. Journal of Machine Learning Research 9, 2579–2605 (2008)
32. Mohri, M., Rostamizadeh, A., Talwalkar, A.: Foundations of Machine Learning. The MIT Press (2012)
(2021) Black, S., Leo, G., Wang, P., Leahy, C., Biderman, S.: GPT-Neo: Large Scale Autoregressive Language Modeling with Mesh-Tensorflow. https://doi.org/10.5281/zenodo.5297715 . If you use this software, please cite it using these metadata. https://doi.org/10.5281/zenodo.5297715 Kokalj et al. (2021) Kokalj, E., Škrlj, B., Lavrač, N., Pollak, S., Robnik-Šikonja, M.: BERT meets shapley: Extending SHAP explanations to transformer-based classifiers. In: Toivonen, H., Boggia, M. (eds.) Proceedings of the EACL Hackashop on News Media Content Analysis and Automated Report Generation, pp. 16–21. Association for Computational Linguistics, Online (2021). https://aclanthology.org/2021.hackashop-1.3 van der Maaten and Hinton (2008) Maaten, L., Hinton, G.: Visualizing data using t-SNE. Journal of Machine Learning Research 9, 2579–2605 (2008) Mohri et al. (2012) Mohri, M., Rostamizadeh, A., Talwalkar, A.: Foundations of Machine Learning. The MIT Press, ??? (2012) ECML/PKDD: ECML/PKDD 2007 Discovery Challenge. https://gitlab.fing.edu.uy/gsi/web-application-attacks-datasets/-/tree/master/ecml_pkdd (2021) Oliner and Stearley (2007) Oliner, A.J., Stearley, J.: What supercomputers say: A study of five system logs. 37th Annual IEEE/IFIP International Conference on Dependable Systems and Networks (DSN’07), 575–584 (2007) Vaswani et al. (2017) Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, L., Polosukhin, I.: Attention is all you need. CoRR abs/1706.03762 (2017) 1706.03762 Schuster and Nakajima (2012) Schuster, M., Nakajima, K.: Japanese and korean voice search. In: 2012 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), pp. 5149–5152 (2012). https://doi.org/10.1109/ICASSP.2012.6289079 Liu et al. (2019) Liu, Y., Ott, M., Goyal, N., Du, J., Joshi, M., Chen, D., Levy, O., Lewis, M., Zettlemoyer, L., Stoyanov, V.: Roberta: A robustly optimized BERT pretraining approach. CoRR abs/1907.11692 (2019) 1907.11692 Sanh et al. (2019) Sanh, V., Debut, L., Chaumond, J., Wolf, T.: Distilbert, a distilled version of bert: smaller, faster, cheaper and lighter. ArXiv abs/1910.01108 (2019) Radford et al. (2019) Radford, A., Wu, J., Child, R., Luan, D., Amodei, D., Sutskever, I.: Language models are unsupervised multitask learners (2019) Black et al. (2021) Black, S., Leo, G., Wang, P., Leahy, C., Biderman, S.: GPT-Neo: Large Scale Autoregressive Language Modeling with Mesh-Tensorflow. https://doi.org/10.5281/zenodo.5297715 . If you use this software, please cite it using these metadata. https://doi.org/10.5281/zenodo.5297715 Kokalj et al. (2021) Kokalj, E., Škrlj, B., Lavrač, N., Pollak, S., Robnik-Šikonja, M.: BERT meets shapley: Extending SHAP explanations to transformer-based classifiers. In: Toivonen, H., Boggia, M. (eds.) Proceedings of the EACL Hackashop on News Media Content Analysis and Automated Report Generation, pp. 16–21. Association for Computational Linguistics, Online (2021). https://aclanthology.org/2021.hackashop-1.3 van der Maaten and Hinton (2008) Maaten, L., Hinton, G.: Visualizing data using t-SNE. Journal of Machine Learning Research 9, 2579–2605 (2008) Mohri et al. (2012) Mohri, M., Rostamizadeh, A., Talwalkar, A.: Foundations of Machine Learning. The MIT Press, ??? (2012) Oliner, A.J., Stearley, J.: What supercomputers say: A study of five system logs. 37th Annual IEEE/IFIP International Conference on Dependable Systems and Networks (DSN’07), 575–584 (2007) Vaswani et al. 
(2017) Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, L., Polosukhin, I.: Attention is all you need. CoRR abs/1706.03762 (2017) 1706.03762 Schuster and Nakajima (2012) Schuster, M., Nakajima, K.: Japanese and korean voice search. In: 2012 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), pp. 5149–5152 (2012). https://doi.org/10.1109/ICASSP.2012.6289079 Liu et al. (2019) Liu, Y., Ott, M., Goyal, N., Du, J., Joshi, M., Chen, D., Levy, O., Lewis, M., Zettlemoyer, L., Stoyanov, V.: Roberta: A robustly optimized BERT pretraining approach. CoRR abs/1907.11692 (2019) 1907.11692 Sanh et al. (2019) Sanh, V., Debut, L., Chaumond, J., Wolf, T.: Distilbert, a distilled version of bert: smaller, faster, cheaper and lighter. ArXiv abs/1910.01108 (2019) Radford et al. (2019) Radford, A., Wu, J., Child, R., Luan, D., Amodei, D., Sutskever, I.: Language models are unsupervised multitask learners (2019) Black et al. (2021) Black, S., Leo, G., Wang, P., Leahy, C., Biderman, S.: GPT-Neo: Large Scale Autoregressive Language Modeling with Mesh-Tensorflow. https://doi.org/10.5281/zenodo.5297715 . If you use this software, please cite it using these metadata. https://doi.org/10.5281/zenodo.5297715 Kokalj et al. (2021) Kokalj, E., Škrlj, B., Lavrač, N., Pollak, S., Robnik-Šikonja, M.: BERT meets shapley: Extending SHAP explanations to transformer-based classifiers. In: Toivonen, H., Boggia, M. (eds.) Proceedings of the EACL Hackashop on News Media Content Analysis and Automated Report Generation, pp. 16–21. Association for Computational Linguistics, Online (2021). https://aclanthology.org/2021.hackashop-1.3 van der Maaten and Hinton (2008) Maaten, L., Hinton, G.: Visualizing data using t-SNE. Journal of Machine Learning Research 9, 2579–2605 (2008) Mohri et al. (2012) Mohri, M., Rostamizadeh, A., Talwalkar, A.: Foundations of Machine Learning. The MIT Press, ??? (2012) Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, L., Polosukhin, I.: Attention is all you need. CoRR abs/1706.03762 (2017) 1706.03762 Schuster and Nakajima (2012) Schuster, M., Nakajima, K.: Japanese and korean voice search. In: 2012 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), pp. 5149–5152 (2012). https://doi.org/10.1109/ICASSP.2012.6289079 Liu et al. (2019) Liu, Y., Ott, M., Goyal, N., Du, J., Joshi, M., Chen, D., Levy, O., Lewis, M., Zettlemoyer, L., Stoyanov, V.: Roberta: A robustly optimized BERT pretraining approach. CoRR abs/1907.11692 (2019) 1907.11692 Sanh et al. (2019) Sanh, V., Debut, L., Chaumond, J., Wolf, T.: Distilbert, a distilled version of bert: smaller, faster, cheaper and lighter. ArXiv abs/1910.01108 (2019) Radford et al. (2019) Radford, A., Wu, J., Child, R., Luan, D., Amodei, D., Sutskever, I.: Language models are unsupervised multitask learners (2019) Black et al. (2021) Black, S., Leo, G., Wang, P., Leahy, C., Biderman, S.: GPT-Neo: Large Scale Autoregressive Language Modeling with Mesh-Tensorflow. https://doi.org/10.5281/zenodo.5297715 . If you use this software, please cite it using these metadata. https://doi.org/10.5281/zenodo.5297715 Kokalj et al. (2021) Kokalj, E., Škrlj, B., Lavrač, N., Pollak, S., Robnik-Šikonja, M.: BERT meets shapley: Extending SHAP explanations to transformer-based classifiers. In: Toivonen, H., Boggia, M. (eds.) Proceedings of the EACL Hackashop on News Media Content Analysis and Automated Report Generation, pp. 16–21. 
Association for Computational Linguistics, Online (2021). https://aclanthology.org/2021.hackashop-1.3 van der Maaten and Hinton (2008) Maaten, L., Hinton, G.: Visualizing data using t-SNE. Journal of Machine Learning Research 9, 2579–2605 (2008) Mohri et al. (2012) Mohri, M., Rostamizadeh, A., Talwalkar, A.: Foundations of Machine Learning. The MIT Press, ??? (2012) Schuster, M., Nakajima, K.: Japanese and korean voice search. In: 2012 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), pp. 5149–5152 (2012). https://doi.org/10.1109/ICASSP.2012.6289079 Liu et al. (2019) Liu, Y., Ott, M., Goyal, N., Du, J., Joshi, M., Chen, D., Levy, O., Lewis, M., Zettlemoyer, L., Stoyanov, V.: Roberta: A robustly optimized BERT pretraining approach. CoRR abs/1907.11692 (2019) 1907.11692 Sanh et al. (2019) Sanh, V., Debut, L., Chaumond, J., Wolf, T.: Distilbert, a distilled version of bert: smaller, faster, cheaper and lighter. ArXiv abs/1910.01108 (2019) Radford et al. (2019) Radford, A., Wu, J., Child, R., Luan, D., Amodei, D., Sutskever, I.: Language models are unsupervised multitask learners (2019) Black et al. (2021) Black, S., Leo, G., Wang, P., Leahy, C., Biderman, S.: GPT-Neo: Large Scale Autoregressive Language Modeling with Mesh-Tensorflow. https://doi.org/10.5281/zenodo.5297715 . If you use this software, please cite it using these metadata. https://doi.org/10.5281/zenodo.5297715 Kokalj et al. (2021) Kokalj, E., Škrlj, B., Lavrač, N., Pollak, S., Robnik-Šikonja, M.: BERT meets shapley: Extending SHAP explanations to transformer-based classifiers. In: Toivonen, H., Boggia, M. (eds.) Proceedings of the EACL Hackashop on News Media Content Analysis and Automated Report Generation, pp. 16–21. Association for Computational Linguistics, Online (2021). https://aclanthology.org/2021.hackashop-1.3 van der Maaten and Hinton (2008) Maaten, L., Hinton, G.: Visualizing data using t-SNE. Journal of Machine Learning Research 9, 2579–2605 (2008) Mohri et al. (2012) Mohri, M., Rostamizadeh, A., Talwalkar, A.: Foundations of Machine Learning. The MIT Press, ??? (2012) Liu, Y., Ott, M., Goyal, N., Du, J., Joshi, M., Chen, D., Levy, O., Lewis, M., Zettlemoyer, L., Stoyanov, V.: Roberta: A robustly optimized BERT pretraining approach. CoRR abs/1907.11692 (2019) 1907.11692 Sanh et al. (2019) Sanh, V., Debut, L., Chaumond, J., Wolf, T.: Distilbert, a distilled version of bert: smaller, faster, cheaper and lighter. ArXiv abs/1910.01108 (2019) Radford et al. (2019) Radford, A., Wu, J., Child, R., Luan, D., Amodei, D., Sutskever, I.: Language models are unsupervised multitask learners (2019) Black et al. (2021) Black, S., Leo, G., Wang, P., Leahy, C., Biderman, S.: GPT-Neo: Large Scale Autoregressive Language Modeling with Mesh-Tensorflow. https://doi.org/10.5281/zenodo.5297715 . If you use this software, please cite it using these metadata. https://doi.org/10.5281/zenodo.5297715 Kokalj et al. (2021) Kokalj, E., Škrlj, B., Lavrač, N., Pollak, S., Robnik-Šikonja, M.: BERT meets shapley: Extending SHAP explanations to transformer-based classifiers. In: Toivonen, H., Boggia, M. (eds.) Proceedings of the EACL Hackashop on News Media Content Analysis and Automated Report Generation, pp. 16–21. Association for Computational Linguistics, Online (2021). https://aclanthology.org/2021.hackashop-1.3 van der Maaten and Hinton (2008) Maaten, L., Hinton, G.: Visualizing data using t-SNE. Journal of Machine Learning Research 9, 2579–2605 (2008) Mohri et al. 
(2012) Mohri, M., Rostamizadeh, A., Talwalkar, A.: Foundations of Machine Learning. The MIT Press, ??? (2012) Sanh, V., Debut, L., Chaumond, J., Wolf, T.: Distilbert, a distilled version of bert: smaller, faster, cheaper and lighter. ArXiv abs/1910.01108 (2019) Radford et al. (2019) Radford, A., Wu, J., Child, R., Luan, D., Amodei, D., Sutskever, I.: Language models are unsupervised multitask learners (2019) Black et al. (2021) Black, S., Leo, G., Wang, P., Leahy, C., Biderman, S.: GPT-Neo: Large Scale Autoregressive Language Modeling with Mesh-Tensorflow. https://doi.org/10.5281/zenodo.5297715 . If you use this software, please cite it using these metadata. https://doi.org/10.5281/zenodo.5297715 Kokalj et al. (2021) Kokalj, E., Škrlj, B., Lavrač, N., Pollak, S., Robnik-Šikonja, M.: BERT meets shapley: Extending SHAP explanations to transformer-based classifiers. In: Toivonen, H., Boggia, M. (eds.) Proceedings of the EACL Hackashop on News Media Content Analysis and Automated Report Generation, pp. 16–21. Association for Computational Linguistics, Online (2021). https://aclanthology.org/2021.hackashop-1.3 van der Maaten and Hinton (2008) Maaten, L., Hinton, G.: Visualizing data using t-SNE. Journal of Machine Learning Research 9, 2579–2605 (2008) Mohri et al. (2012) Mohri, M., Rostamizadeh, A., Talwalkar, A.: Foundations of Machine Learning. The MIT Press, ??? (2012) Radford, A., Wu, J., Child, R., Luan, D., Amodei, D., Sutskever, I.: Language models are unsupervised multitask learners (2019) Black et al. (2021) Black, S., Leo, G., Wang, P., Leahy, C., Biderman, S.: GPT-Neo: Large Scale Autoregressive Language Modeling with Mesh-Tensorflow. https://doi.org/10.5281/zenodo.5297715 . If you use this software, please cite it using these metadata. https://doi.org/10.5281/zenodo.5297715 Kokalj et al. (2021) Kokalj, E., Škrlj, B., Lavrač, N., Pollak, S., Robnik-Šikonja, M.: BERT meets shapley: Extending SHAP explanations to transformer-based classifiers. In: Toivonen, H., Boggia, M. (eds.) Proceedings of the EACL Hackashop on News Media Content Analysis and Automated Report Generation, pp. 16–21. Association for Computational Linguistics, Online (2021). https://aclanthology.org/2021.hackashop-1.3 van der Maaten and Hinton (2008) Maaten, L., Hinton, G.: Visualizing data using t-SNE. Journal of Machine Learning Research 9, 2579–2605 (2008) Mohri et al. (2012) Mohri, M., Rostamizadeh, A., Talwalkar, A.: Foundations of Machine Learning. The MIT Press, ??? (2012) Black, S., Leo, G., Wang, P., Leahy, C., Biderman, S.: GPT-Neo: Large Scale Autoregressive Language Modeling with Mesh-Tensorflow. https://doi.org/10.5281/zenodo.5297715 . If you use this software, please cite it using these metadata. https://doi.org/10.5281/zenodo.5297715 Kokalj et al. (2021) Kokalj, E., Škrlj, B., Lavrač, N., Pollak, S., Robnik-Šikonja, M.: BERT meets shapley: Extending SHAP explanations to transformer-based classifiers. In: Toivonen, H., Boggia, M. (eds.) Proceedings of the EACL Hackashop on News Media Content Analysis and Automated Report Generation, pp. 16–21. Association for Computational Linguistics, Online (2021). https://aclanthology.org/2021.hackashop-1.3 van der Maaten and Hinton (2008) Maaten, L., Hinton, G.: Visualizing data using t-SNE. Journal of Machine Learning Research 9, 2579–2605 (2008) Mohri et al. (2012) Mohri, M., Rostamizadeh, A., Talwalkar, A.: Foundations of Machine Learning. The MIT Press, ??? 
(2012) Kokalj, E., Škrlj, B., Lavrač, N., Pollak, S., Robnik-Šikonja, M.: BERT meets shapley: Extending SHAP explanations to transformer-based classifiers. In: Toivonen, H., Boggia, M. (eds.) Proceedings of the EACL Hackashop on News Media Content Analysis and Automated Report Generation, pp. 16–21. Association for Computational Linguistics, Online (2021). https://aclanthology.org/2021.hackashop-1.3 van der Maaten and Hinton (2008) Maaten, L., Hinton, G.: Visualizing data using t-SNE. Journal of Machine Learning Research 9, 2579–2605 (2008) Mohri et al. (2012) Mohri, M., Rostamizadeh, A., Talwalkar, A.: Foundations of Machine Learning. The MIT Press, ??? (2012) Maaten, L., Hinton, G.: Visualizing data using t-SNE. Journal of Machine Learning Research 9, 2579–2605 (2008) Mohri et al. (2012) Mohri, M., Rostamizadeh, A., Talwalkar, A.: Foundations of Machine Learning. The MIT Press, ??? (2012) Mohri, M., Rostamizadeh, A., Talwalkar, A.: Foundations of Machine Learning. The MIT Press, ??? (2012)
  13. Qi, J., Luan, Z., Huang, S., Wang, Y., Fung, C., Yang, H., Qian, D.: Adanomaly: Adaptive anomaly detection for system logs with adversarial learning. In: NOMS 2022-2022 IEEE/IFIP Network Operations and Management Symposium, pp. 1–5. IEEE Press, ??? (2022). https://doi.org/10.1109/NOMS54207.2022.9789917 . https://doi.org/10.1109/NOMS54207.2022.9789917 Seyyar et al. (2022) Seyyar, Y.E., Yavuz, A.G., Ünver, H.M.: Detection of web attacks using the bert model. In: 2022 30th Signal Processing and Communications Applications Conference (SIU), pp. 1–4 (2022). https://doi.org/10.1109/SIU55565.2022.9864721 Guo et al. (2021) Guo, H., Yuan, S., Wu, X.: Logbert: Log anomaly detection via bert, 1–8 (2021). IEEE Shao et al. (2022) Shao, Y., Zhang, W., Liu, P., Huyue, R., Tang, R., Yin, Q., Li, Q.: Log anomaly detection method based on bert model optimization. In: 2022 7th International Conference on Cloud Computing and Big Data Analytics (ICCCBDA), pp. 161–166 (2022). https://doi.org/10.1109/ICCCBDA55098.2022.9778900 Guo et al. (2022) Guo, H., Lin, X., Yang, J., Zhuang, Y., Bai, J., Zheng, T., Zhang, B., Li, Z.: TransLog: A Unified Transformer-based Framework for Log Anomaly Detection (2022) Le and Zhang (2021) Le, V.-H., Zhang, H.: Log-based Anomaly Detection Without Log Parsing (2021) Gniewkowski et al. (2023) Gniewkowski, M., Maciejewski, H., Surmacz, T., Walentynowicz, W.: Sec2vec: Anomaly detection in http traffic and malicious urls. In: Proceedings of the 38th ACM/SIGAPP Symposium on Applied Computing. SAC ’23, pp. 1154–1162. Association for Computing Machinery, New York, NY, USA (2023). https://doi.org/10.1145/3555776.3577663 . https://doi.org/10.1145/3555776.3577663 Hilmi et al. (2020) Hilmi, M.A.A., Cahyanto, K.A., Mustamiin, M.: Apache Web Server - Access Log Pre-processing for Web Intrusion Detection. IEEE Dataport (2020). https://doi.org/10.21227/vvvq-6w47 . https://dx.doi.org/10.21227/vvvq-6w47 Giménez et al. (2010) Giménez, C.T., Villegas, A.P., Marañón, G. .: HTTP Dataset CSIC 2010. https://doi.org/10.7910/DVN/3QBYB5 . https://www.tic.itefi.csic.es/dataset/ ECML/PKDD (2021) ECML/PKDD: ECML/PKDD 2007 Discovery Challenge. https://gitlab.fing.edu.uy/gsi/web-application-attacks-datasets/-/tree/master/ecml_pkdd (2021) Oliner and Stearley (2007) Oliner, A.J., Stearley, J.: What supercomputers say: A study of five system logs. 37th Annual IEEE/IFIP International Conference on Dependable Systems and Networks (DSN’07), 575–584 (2007) Vaswani et al. (2017) Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, L., Polosukhin, I.: Attention is all you need. CoRR abs/1706.03762 (2017) 1706.03762 Schuster and Nakajima (2012) Schuster, M., Nakajima, K.: Japanese and korean voice search. In: 2012 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), pp. 5149–5152 (2012). https://doi.org/10.1109/ICASSP.2012.6289079 Liu et al. (2019) Liu, Y., Ott, M., Goyal, N., Du, J., Joshi, M., Chen, D., Levy, O., Lewis, M., Zettlemoyer, L., Stoyanov, V.: Roberta: A robustly optimized BERT pretraining approach. CoRR abs/1907.11692 (2019) 1907.11692 Sanh et al. (2019) Sanh, V., Debut, L., Chaumond, J., Wolf, T.: Distilbert, a distilled version of bert: smaller, faster, cheaper and lighter. ArXiv abs/1910.01108 (2019) Radford et al. (2019) Radford, A., Wu, J., Child, R., Luan, D., Amodei, D., Sutskever, I.: Language models are unsupervised multitask learners (2019) Black et al. 
(2021) Black, S., Leo, G., Wang, P., Leahy, C., Biderman, S.: GPT-Neo: Large Scale Autoregressive Language Modeling with Mesh-Tensorflow. https://doi.org/10.5281/zenodo.5297715 . If you use this software, please cite it using these metadata. https://doi.org/10.5281/zenodo.5297715 Kokalj et al. (2021) Kokalj, E., Škrlj, B., Lavrač, N., Pollak, S., Robnik-Šikonja, M.: BERT meets shapley: Extending SHAP explanations to transformer-based classifiers. In: Toivonen, H., Boggia, M. (eds.) Proceedings of the EACL Hackashop on News Media Content Analysis and Automated Report Generation, pp. 16–21. Association for Computational Linguistics, Online (2021). https://aclanthology.org/2021.hackashop-1.3 van der Maaten and Hinton (2008) Maaten, L., Hinton, G.: Visualizing data using t-SNE. Journal of Machine Learning Research 9, 2579–2605 (2008) Mohri et al. (2012) Mohri, M., Rostamizadeh, A., Talwalkar, A.: Foundations of Machine Learning. The MIT Press, ??? (2012) Seyyar, Y.E., Yavuz, A.G., Ünver, H.M.: Detection of web attacks using the bert model. In: 2022 30th Signal Processing and Communications Applications Conference (SIU), pp. 1–4 (2022). https://doi.org/10.1109/SIU55565.2022.9864721 Guo et al. (2021) Guo, H., Yuan, S., Wu, X.: Logbert: Log anomaly detection via bert, 1–8 (2021). IEEE Shao et al. (2022) Shao, Y., Zhang, W., Liu, P., Huyue, R., Tang, R., Yin, Q., Li, Q.: Log anomaly detection method based on bert model optimization. In: 2022 7th International Conference on Cloud Computing and Big Data Analytics (ICCCBDA), pp. 161–166 (2022). https://doi.org/10.1109/ICCCBDA55098.2022.9778900 Guo et al. (2022) Guo, H., Lin, X., Yang, J., Zhuang, Y., Bai, J., Zheng, T., Zhang, B., Li, Z.: TransLog: A Unified Transformer-based Framework for Log Anomaly Detection (2022) Le and Zhang (2021) Le, V.-H., Zhang, H.: Log-based Anomaly Detection Without Log Parsing (2021) Gniewkowski et al. (2023) Gniewkowski, M., Maciejewski, H., Surmacz, T., Walentynowicz, W.: Sec2vec: Anomaly detection in http traffic and malicious urls. In: Proceedings of the 38th ACM/SIGAPP Symposium on Applied Computing. SAC ’23, pp. 1154–1162. Association for Computing Machinery, New York, NY, USA (2023). https://doi.org/10.1145/3555776.3577663 . https://doi.org/10.1145/3555776.3577663 Hilmi et al. (2020) Hilmi, M.A.A., Cahyanto, K.A., Mustamiin, M.: Apache Web Server - Access Log Pre-processing for Web Intrusion Detection. IEEE Dataport (2020). https://doi.org/10.21227/vvvq-6w47 . https://dx.doi.org/10.21227/vvvq-6w47 Giménez et al. (2010) Giménez, C.T., Villegas, A.P., Marañón, G. .: HTTP Dataset CSIC 2010. https://doi.org/10.7910/DVN/3QBYB5 . https://www.tic.itefi.csic.es/dataset/ ECML/PKDD (2021) ECML/PKDD: ECML/PKDD 2007 Discovery Challenge. https://gitlab.fing.edu.uy/gsi/web-application-attacks-datasets/-/tree/master/ecml_pkdd (2021) Oliner and Stearley (2007) Oliner, A.J., Stearley, J.: What supercomputers say: A study of five system logs. 37th Annual IEEE/IFIP International Conference on Dependable Systems and Networks (DSN’07), 575–584 (2007) Vaswani et al. (2017) Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, L., Polosukhin, I.: Attention is all you need. CoRR abs/1706.03762 (2017) 1706.03762 Schuster and Nakajima (2012) Schuster, M., Nakajima, K.: Japanese and korean voice search. In: 2012 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), pp. 5149–5152 (2012). https://doi.org/10.1109/ICASSP.2012.6289079 Liu et al. 
(2019) Liu, Y., Ott, M., Goyal, N., Du, J., Joshi, M., Chen, D., Levy, O., Lewis, M., Zettlemoyer, L., Stoyanov, V.: Roberta: A robustly optimized BERT pretraining approach. CoRR abs/1907.11692 (2019) 1907.11692 Sanh et al. (2019) Sanh, V., Debut, L., Chaumond, J., Wolf, T.: Distilbert, a distilled version of bert: smaller, faster, cheaper and lighter. ArXiv abs/1910.01108 (2019) Radford et al. (2019) Radford, A., Wu, J., Child, R., Luan, D., Amodei, D., Sutskever, I.: Language models are unsupervised multitask learners (2019) Black et al. (2021) Black, S., Leo, G., Wang, P., Leahy, C., Biderman, S.: GPT-Neo: Large Scale Autoregressive Language Modeling with Mesh-Tensorflow. https://doi.org/10.5281/zenodo.5297715 . If you use this software, please cite it using these metadata. https://doi.org/10.5281/zenodo.5297715 Kokalj et al. (2021) Kokalj, E., Škrlj, B., Lavrač, N., Pollak, S., Robnik-Šikonja, M.: BERT meets shapley: Extending SHAP explanations to transformer-based classifiers. In: Toivonen, H., Boggia, M. (eds.) Proceedings of the EACL Hackashop on News Media Content Analysis and Automated Report Generation, pp. 16–21. Association for Computational Linguistics, Online (2021). https://aclanthology.org/2021.hackashop-1.3 van der Maaten and Hinton (2008) Maaten, L., Hinton, G.: Visualizing data using t-SNE. Journal of Machine Learning Research 9, 2579–2605 (2008) Mohri et al. (2012) Mohri, M., Rostamizadeh, A., Talwalkar, A.: Foundations of Machine Learning. The MIT Press, ??? (2012) Guo, H., Yuan, S., Wu, X.: Logbert: Log anomaly detection via bert, 1–8 (2021). IEEE Shao et al. (2022) Shao, Y., Zhang, W., Liu, P., Huyue, R., Tang, R., Yin, Q., Li, Q.: Log anomaly detection method based on bert model optimization. In: 2022 7th International Conference on Cloud Computing and Big Data Analytics (ICCCBDA), pp. 161–166 (2022). https://doi.org/10.1109/ICCCBDA55098.2022.9778900 Guo et al. (2022) Guo, H., Lin, X., Yang, J., Zhuang, Y., Bai, J., Zheng, T., Zhang, B., Li, Z.: TransLog: A Unified Transformer-based Framework for Log Anomaly Detection (2022) Le and Zhang (2021) Le, V.-H., Zhang, H.: Log-based Anomaly Detection Without Log Parsing (2021) Gniewkowski et al. (2023) Gniewkowski, M., Maciejewski, H., Surmacz, T., Walentynowicz, W.: Sec2vec: Anomaly detection in http traffic and malicious urls. In: Proceedings of the 38th ACM/SIGAPP Symposium on Applied Computing. SAC ’23, pp. 1154–1162. Association for Computing Machinery, New York, NY, USA (2023). https://doi.org/10.1145/3555776.3577663 . https://doi.org/10.1145/3555776.3577663 Hilmi et al. (2020) Hilmi, M.A.A., Cahyanto, K.A., Mustamiin, M.: Apache Web Server - Access Log Pre-processing for Web Intrusion Detection. IEEE Dataport (2020). https://doi.org/10.21227/vvvq-6w47 . https://dx.doi.org/10.21227/vvvq-6w47 Giménez et al. (2010) Giménez, C.T., Villegas, A.P., Marañón, G. .: HTTP Dataset CSIC 2010. https://doi.org/10.7910/DVN/3QBYB5 . https://www.tic.itefi.csic.es/dataset/ ECML/PKDD (2021) ECML/PKDD: ECML/PKDD 2007 Discovery Challenge. https://gitlab.fing.edu.uy/gsi/web-application-attacks-datasets/-/tree/master/ecml_pkdd (2021) Oliner and Stearley (2007) Oliner, A.J., Stearley, J.: What supercomputers say: A study of five system logs. 37th Annual IEEE/IFIP International Conference on Dependable Systems and Networks (DSN’07), 575–584 (2007) Vaswani et al. (2017) Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, L., Polosukhin, I.: Attention is all you need. 
CoRR abs/1706.03762 (2017) 1706.03762 Schuster and Nakajima (2012) Schuster, M., Nakajima, K.: Japanese and korean voice search. In: 2012 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), pp. 5149–5152 (2012). https://doi.org/10.1109/ICASSP.2012.6289079 Liu et al. (2019) Liu, Y., Ott, M., Goyal, N., Du, J., Joshi, M., Chen, D., Levy, O., Lewis, M., Zettlemoyer, L., Stoyanov, V.: Roberta: A robustly optimized BERT pretraining approach. CoRR abs/1907.11692 (2019) 1907.11692 Sanh et al. (2019) Sanh, V., Debut, L., Chaumond, J., Wolf, T.: Distilbert, a distilled version of bert: smaller, faster, cheaper and lighter. ArXiv abs/1910.01108 (2019) Radford et al. (2019) Radford, A., Wu, J., Child, R., Luan, D., Amodei, D., Sutskever, I.: Language models are unsupervised multitask learners (2019) Black et al. (2021) Black, S., Leo, G., Wang, P., Leahy, C., Biderman, S.: GPT-Neo: Large Scale Autoregressive Language Modeling with Mesh-Tensorflow. https://doi.org/10.5281/zenodo.5297715 . If you use this software, please cite it using these metadata. https://doi.org/10.5281/zenodo.5297715 Kokalj et al. (2021) Kokalj, E., Škrlj, B., Lavrač, N., Pollak, S., Robnik-Šikonja, M.: BERT meets shapley: Extending SHAP explanations to transformer-based classifiers. In: Toivonen, H., Boggia, M. (eds.) Proceedings of the EACL Hackashop on News Media Content Analysis and Automated Report Generation, pp. 16–21. Association for Computational Linguistics, Online (2021). https://aclanthology.org/2021.hackashop-1.3 van der Maaten and Hinton (2008) Maaten, L., Hinton, G.: Visualizing data using t-SNE. Journal of Machine Learning Research 9, 2579–2605 (2008) Mohri et al. (2012) Mohri, M., Rostamizadeh, A., Talwalkar, A.: Foundations of Machine Learning. The MIT Press, ??? (2012) Shao, Y., Zhang, W., Liu, P., Huyue, R., Tang, R., Yin, Q., Li, Q.: Log anomaly detection method based on bert model optimization. In: 2022 7th International Conference on Cloud Computing and Big Data Analytics (ICCCBDA), pp. 161–166 (2022). https://doi.org/10.1109/ICCCBDA55098.2022.9778900 Guo et al. (2022) Guo, H., Lin, X., Yang, J., Zhuang, Y., Bai, J., Zheng, T., Zhang, B., Li, Z.: TransLog: A Unified Transformer-based Framework for Log Anomaly Detection (2022) Le and Zhang (2021) Le, V.-H., Zhang, H.: Log-based Anomaly Detection Without Log Parsing (2021) Gniewkowski et al. (2023) Gniewkowski, M., Maciejewski, H., Surmacz, T., Walentynowicz, W.: Sec2vec: Anomaly detection in http traffic and malicious urls. In: Proceedings of the 38th ACM/SIGAPP Symposium on Applied Computing. SAC ’23, pp. 1154–1162. Association for Computing Machinery, New York, NY, USA (2023). https://doi.org/10.1145/3555776.3577663 . https://doi.org/10.1145/3555776.3577663 Hilmi et al. (2020) Hilmi, M.A.A., Cahyanto, K.A., Mustamiin, M.: Apache Web Server - Access Log Pre-processing for Web Intrusion Detection. IEEE Dataport (2020). https://doi.org/10.21227/vvvq-6w47 . https://dx.doi.org/10.21227/vvvq-6w47 Giménez et al. (2010) Giménez, C.T., Villegas, A.P., Marañón, G. .: HTTP Dataset CSIC 2010. https://doi.org/10.7910/DVN/3QBYB5 . https://www.tic.itefi.csic.es/dataset/ ECML/PKDD (2021) ECML/PKDD: ECML/PKDD 2007 Discovery Challenge. https://gitlab.fing.edu.uy/gsi/web-application-attacks-datasets/-/tree/master/ecml_pkdd (2021) Oliner and Stearley (2007) Oliner, A.J., Stearley, J.: What supercomputers say: A study of five system logs. 
37th Annual IEEE/IFIP International Conference on Dependable Systems and Networks (DSN’07), 575–584 (2007) Vaswani et al. (2017) Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, L., Polosukhin, I.: Attention is all you need. CoRR abs/1706.03762 (2017) 1706.03762 Schuster and Nakajima (2012) Schuster, M., Nakajima, K.: Japanese and korean voice search. In: 2012 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), pp. 5149–5152 (2012). https://doi.org/10.1109/ICASSP.2012.6289079 Liu et al. (2019) Liu, Y., Ott, M., Goyal, N., Du, J., Joshi, M., Chen, D., Levy, O., Lewis, M., Zettlemoyer, L., Stoyanov, V.: Roberta: A robustly optimized BERT pretraining approach. CoRR abs/1907.11692 (2019) 1907.11692 Sanh et al. (2019) Sanh, V., Debut, L., Chaumond, J., Wolf, T.: Distilbert, a distilled version of bert: smaller, faster, cheaper and lighter. ArXiv abs/1910.01108 (2019) Radford et al. (2019) Radford, A., Wu, J., Child, R., Luan, D., Amodei, D., Sutskever, I.: Language models are unsupervised multitask learners (2019) Black et al. (2021) Black, S., Leo, G., Wang, P., Leahy, C., Biderman, S.: GPT-Neo: Large Scale Autoregressive Language Modeling with Mesh-Tensorflow. https://doi.org/10.5281/zenodo.5297715 . If you use this software, please cite it using these metadata. https://doi.org/10.5281/zenodo.5297715 Kokalj et al. (2021) Kokalj, E., Škrlj, B., Lavrač, N., Pollak, S., Robnik-Šikonja, M.: BERT meets shapley: Extending SHAP explanations to transformer-based classifiers. In: Toivonen, H., Boggia, M. (eds.) Proceedings of the EACL Hackashop on News Media Content Analysis and Automated Report Generation, pp. 16–21. Association for Computational Linguistics, Online (2021). https://aclanthology.org/2021.hackashop-1.3 van der Maaten and Hinton (2008) Maaten, L., Hinton, G.: Visualizing data using t-SNE. Journal of Machine Learning Research 9, 2579–2605 (2008) Mohri et al. (2012) Mohri, M., Rostamizadeh, A., Talwalkar, A.: Foundations of Machine Learning. The MIT Press, ??? (2012) Guo, H., Lin, X., Yang, J., Zhuang, Y., Bai, J., Zheng, T., Zhang, B., Li, Z.: TransLog: A Unified Transformer-based Framework for Log Anomaly Detection (2022) Le and Zhang (2021) Le, V.-H., Zhang, H.: Log-based Anomaly Detection Without Log Parsing (2021) Gniewkowski et al. (2023) Gniewkowski, M., Maciejewski, H., Surmacz, T., Walentynowicz, W.: Sec2vec: Anomaly detection in http traffic and malicious urls. In: Proceedings of the 38th ACM/SIGAPP Symposium on Applied Computing. SAC ’23, pp. 1154–1162. Association for Computing Machinery, New York, NY, USA (2023). https://doi.org/10.1145/3555776.3577663 . https://doi.org/10.1145/3555776.3577663 Hilmi et al. (2020) Hilmi, M.A.A., Cahyanto, K.A., Mustamiin, M.: Apache Web Server - Access Log Pre-processing for Web Intrusion Detection. IEEE Dataport (2020). https://doi.org/10.21227/vvvq-6w47 . https://dx.doi.org/10.21227/vvvq-6w47 Giménez et al. (2010) Giménez, C.T., Villegas, A.P., Marañón, G. .: HTTP Dataset CSIC 2010. https://doi.org/10.7910/DVN/3QBYB5 . https://www.tic.itefi.csic.es/dataset/ ECML/PKDD (2021) ECML/PKDD: ECML/PKDD 2007 Discovery Challenge. https://gitlab.fing.edu.uy/gsi/web-application-attacks-datasets/-/tree/master/ecml_pkdd (2021) Oliner and Stearley (2007) Oliner, A.J., Stearley, J.: What supercomputers say: A study of five system logs. 37th Annual IEEE/IFIP International Conference on Dependable Systems and Networks (DSN’07), 575–584 (2007) Vaswani et al. 
(2017) Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, L., Polosukhin, I.: Attention is all you need. CoRR abs/1706.03762 (2017) 1706.03762 Schuster and Nakajima (2012) Schuster, M., Nakajima, K.: Japanese and korean voice search. In: 2012 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), pp. 5149–5152 (2012). https://doi.org/10.1109/ICASSP.2012.6289079 Liu et al. (2019) Liu, Y., Ott, M., Goyal, N., Du, J., Joshi, M., Chen, D., Levy, O., Lewis, M., Zettlemoyer, L., Stoyanov, V.: Roberta: A robustly optimized BERT pretraining approach. CoRR abs/1907.11692 (2019) 1907.11692 Sanh et al. (2019) Sanh, V., Debut, L., Chaumond, J., Wolf, T.: Distilbert, a distilled version of bert: smaller, faster, cheaper and lighter. ArXiv abs/1910.01108 (2019) Radford et al. (2019) Radford, A., Wu, J., Child, R., Luan, D., Amodei, D., Sutskever, I.: Language models are unsupervised multitask learners (2019) Black et al. (2021) Black, S., Leo, G., Wang, P., Leahy, C., Biderman, S.: GPT-Neo: Large Scale Autoregressive Language Modeling with Mesh-Tensorflow. https://doi.org/10.5281/zenodo.5297715 . If you use this software, please cite it using these metadata. https://doi.org/10.5281/zenodo.5297715 Kokalj et al. (2021) Kokalj, E., Škrlj, B., Lavrač, N., Pollak, S., Robnik-Šikonja, M.: BERT meets shapley: Extending SHAP explanations to transformer-based classifiers. In: Toivonen, H., Boggia, M. (eds.) Proceedings of the EACL Hackashop on News Media Content Analysis and Automated Report Generation, pp. 16–21. Association for Computational Linguistics, Online (2021). https://aclanthology.org/2021.hackashop-1.3 van der Maaten and Hinton (2008) Maaten, L., Hinton, G.: Visualizing data using t-SNE. Journal of Machine Learning Research 9, 2579–2605 (2008) Mohri et al. (2012) Mohri, M., Rostamizadeh, A., Talwalkar, A.: Foundations of Machine Learning. The MIT Press, ??? (2012) Le, V.-H., Zhang, H.: Log-based Anomaly Detection Without Log Parsing (2021) Gniewkowski et al. (2023) Gniewkowski, M., Maciejewski, H., Surmacz, T., Walentynowicz, W.: Sec2vec: Anomaly detection in http traffic and malicious urls. In: Proceedings of the 38th ACM/SIGAPP Symposium on Applied Computing. SAC ’23, pp. 1154–1162. Association for Computing Machinery, New York, NY, USA (2023). https://doi.org/10.1145/3555776.3577663 . https://doi.org/10.1145/3555776.3577663 Hilmi et al. (2020) Hilmi, M.A.A., Cahyanto, K.A., Mustamiin, M.: Apache Web Server - Access Log Pre-processing for Web Intrusion Detection. IEEE Dataport (2020). https://doi.org/10.21227/vvvq-6w47 . https://dx.doi.org/10.21227/vvvq-6w47 Giménez et al. (2010) Giménez, C.T., Villegas, A.P., Marañón, G. .: HTTP Dataset CSIC 2010. https://doi.org/10.7910/DVN/3QBYB5 . https://www.tic.itefi.csic.es/dataset/ ECML/PKDD (2021) ECML/PKDD: ECML/PKDD 2007 Discovery Challenge. https://gitlab.fing.edu.uy/gsi/web-application-attacks-datasets/-/tree/master/ecml_pkdd (2021) Oliner and Stearley (2007) Oliner, A.J., Stearley, J.: What supercomputers say: A study of five system logs. 37th Annual IEEE/IFIP International Conference on Dependable Systems and Networks (DSN’07), 575–584 (2007) Vaswani et al. (2017) Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, L., Polosukhin, I.: Attention is all you need. CoRR abs/1706.03762 (2017) 1706.03762 Schuster and Nakajima (2012) Schuster, M., Nakajima, K.: Japanese and korean voice search. 
In: 2012 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), pp. 5149–5152 (2012). https://doi.org/10.1109/ICASSP.2012.6289079 Liu et al. (2019) Liu, Y., Ott, M., Goyal, N., Du, J., Joshi, M., Chen, D., Levy, O., Lewis, M., Zettlemoyer, L., Stoyanov, V.: Roberta: A robustly optimized BERT pretraining approach. CoRR abs/1907.11692 (2019) 1907.11692 Sanh et al. (2019) Sanh, V., Debut, L., Chaumond, J., Wolf, T.: Distilbert, a distilled version of bert: smaller, faster, cheaper and lighter. ArXiv abs/1910.01108 (2019) Radford et al. (2019) Radford, A., Wu, J., Child, R., Luan, D., Amodei, D., Sutskever, I.: Language models are unsupervised multitask learners (2019) Black et al. (2021) Black, S., Leo, G., Wang, P., Leahy, C., Biderman, S.: GPT-Neo: Large Scale Autoregressive Language Modeling with Mesh-Tensorflow. https://doi.org/10.5281/zenodo.5297715 . If you use this software, please cite it using these metadata. https://doi.org/10.5281/zenodo.5297715 Kokalj et al. (2021) Kokalj, E., Škrlj, B., Lavrač, N., Pollak, S., Robnik-Šikonja, M.: BERT meets shapley: Extending SHAP explanations to transformer-based classifiers. In: Toivonen, H., Boggia, M. (eds.) Proceedings of the EACL Hackashop on News Media Content Analysis and Automated Report Generation, pp. 16–21. Association for Computational Linguistics, Online (2021). https://aclanthology.org/2021.hackashop-1.3 van der Maaten and Hinton (2008) Maaten, L., Hinton, G.: Visualizing data using t-SNE. Journal of Machine Learning Research 9, 2579–2605 (2008) Mohri et al. (2012) Mohri, M., Rostamizadeh, A., Talwalkar, A.: Foundations of Machine Learning. The MIT Press, ??? (2012) Gniewkowski, M., Maciejewski, H., Surmacz, T., Walentynowicz, W.: Sec2vec: Anomaly detection in http traffic and malicious urls. In: Proceedings of the 38th ACM/SIGAPP Symposium on Applied Computing. SAC ’23, pp. 1154–1162. Association for Computing Machinery, New York, NY, USA (2023). https://doi.org/10.1145/3555776.3577663 . https://doi.org/10.1145/3555776.3577663 Hilmi et al. (2020) Hilmi, M.A.A., Cahyanto, K.A., Mustamiin, M.: Apache Web Server - Access Log Pre-processing for Web Intrusion Detection. IEEE Dataport (2020). https://doi.org/10.21227/vvvq-6w47 . https://dx.doi.org/10.21227/vvvq-6w47 Giménez et al. (2010) Giménez, C.T., Villegas, A.P., Marañón, G. .: HTTP Dataset CSIC 2010. https://doi.org/10.7910/DVN/3QBYB5 . https://www.tic.itefi.csic.es/dataset/ ECML/PKDD (2021) ECML/PKDD: ECML/PKDD 2007 Discovery Challenge. https://gitlab.fing.edu.uy/gsi/web-application-attacks-datasets/-/tree/master/ecml_pkdd (2021) Oliner and Stearley (2007) Oliner, A.J., Stearley, J.: What supercomputers say: A study of five system logs. 37th Annual IEEE/IFIP International Conference on Dependable Systems and Networks (DSN’07), 575–584 (2007) Vaswani et al. (2017) Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, L., Polosukhin, I.: Attention is all you need. CoRR abs/1706.03762 (2017) 1706.03762 Schuster and Nakajima (2012) Schuster, M., Nakajima, K.: Japanese and korean voice search. In: 2012 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), pp. 5149–5152 (2012). https://doi.org/10.1109/ICASSP.2012.6289079 Liu et al. (2019) Liu, Y., Ott, M., Goyal, N., Du, J., Joshi, M., Chen, D., Levy, O., Lewis, M., Zettlemoyer, L., Stoyanov, V.: Roberta: A robustly optimized BERT pretraining approach. CoRR abs/1907.11692 (2019) 1907.11692 Sanh et al. 
(2019) Sanh, V., Debut, L., Chaumond, J., Wolf, T.: Distilbert, a distilled version of bert: smaller, faster, cheaper and lighter. ArXiv abs/1910.01108 (2019) Radford et al. (2019) Radford, A., Wu, J., Child, R., Luan, D., Amodei, D., Sutskever, I.: Language models are unsupervised multitask learners (2019) Black et al. (2021) Black, S., Leo, G., Wang, P., Leahy, C., Biderman, S.: GPT-Neo: Large Scale Autoregressive Language Modeling with Mesh-Tensorflow. https://doi.org/10.5281/zenodo.5297715 . If you use this software, please cite it using these metadata. https://doi.org/10.5281/zenodo.5297715 Kokalj et al. (2021) Kokalj, E., Škrlj, B., Lavrač, N., Pollak, S., Robnik-Šikonja, M.: BERT meets shapley: Extending SHAP explanations to transformer-based classifiers. In: Toivonen, H., Boggia, M. (eds.) Proceedings of the EACL Hackashop on News Media Content Analysis and Automated Report Generation, pp. 16–21. Association for Computational Linguistics, Online (2021). https://aclanthology.org/2021.hackashop-1.3 van der Maaten and Hinton (2008) Maaten, L., Hinton, G.: Visualizing data using t-SNE. Journal of Machine Learning Research 9, 2579–2605 (2008) Mohri et al. (2012) Mohri, M., Rostamizadeh, A., Talwalkar, A.: Foundations of Machine Learning. The MIT Press, ??? (2012) Hilmi, M.A.A., Cahyanto, K.A., Mustamiin, M.: Apache Web Server - Access Log Pre-processing for Web Intrusion Detection. IEEE Dataport (2020). https://doi.org/10.21227/vvvq-6w47 . https://dx.doi.org/10.21227/vvvq-6w47 Giménez et al. (2010) Giménez, C.T., Villegas, A.P., Marañón, G. .: HTTP Dataset CSIC 2010. https://doi.org/10.7910/DVN/3QBYB5 . https://www.tic.itefi.csic.es/dataset/ ECML/PKDD (2021) ECML/PKDD: ECML/PKDD 2007 Discovery Challenge. https://gitlab.fing.edu.uy/gsi/web-application-attacks-datasets/-/tree/master/ecml_pkdd (2021) Oliner and Stearley (2007) Oliner, A.J., Stearley, J.: What supercomputers say: A study of five system logs. 37th Annual IEEE/IFIP International Conference on Dependable Systems and Networks (DSN’07), 575–584 (2007) Vaswani et al. (2017) Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, L., Polosukhin, I.: Attention is all you need. CoRR abs/1706.03762 (2017) 1706.03762 Schuster and Nakajima (2012) Schuster, M., Nakajima, K.: Japanese and korean voice search. In: 2012 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), pp. 5149–5152 (2012). https://doi.org/10.1109/ICASSP.2012.6289079 Liu et al. (2019) Liu, Y., Ott, M., Goyal, N., Du, J., Joshi, M., Chen, D., Levy, O., Lewis, M., Zettlemoyer, L., Stoyanov, V.: Roberta: A robustly optimized BERT pretraining approach. CoRR abs/1907.11692 (2019) 1907.11692 Sanh et al. (2019) Sanh, V., Debut, L., Chaumond, J., Wolf, T.: Distilbert, a distilled version of bert: smaller, faster, cheaper and lighter. ArXiv abs/1910.01108 (2019) Radford et al. (2019) Radford, A., Wu, J., Child, R., Luan, D., Amodei, D., Sutskever, I.: Language models are unsupervised multitask learners (2019) Black et al. (2021) Black, S., Leo, G., Wang, P., Leahy, C., Biderman, S.: GPT-Neo: Large Scale Autoregressive Language Modeling with Mesh-Tensorflow. https://doi.org/10.5281/zenodo.5297715 . If you use this software, please cite it using these metadata. https://doi.org/10.5281/zenodo.5297715 Kokalj et al. (2021) Kokalj, E., Škrlj, B., Lavrač, N., Pollak, S., Robnik-Šikonja, M.: BERT meets shapley: Extending SHAP explanations to transformer-based classifiers. In: Toivonen, H., Boggia, M. (eds.) 
  14. Seyyar, Y.E., Yavuz, A.G., Ünver, H.M.: Detection of web attacks using the BERT model. In: 2022 30th Signal Processing and Communications Applications Conference (SIU), pp. 1–4 (2022). https://doi.org/10.1109/SIU55565.2022.9864721
  15. Guo, H., Yuan, S., Wu, X.: Logbert: Log anomaly detection via bert, 1–8 (2021). IEEE Shao et al. (2022) Shao, Y., Zhang, W., Liu, P., Huyue, R., Tang, R., Yin, Q., Li, Q.: Log anomaly detection method based on bert model optimization. In: 2022 7th International Conference on Cloud Computing and Big Data Analytics (ICCCBDA), pp. 161–166 (2022). https://doi.org/10.1109/ICCCBDA55098.2022.9778900 Guo et al. (2022) Guo, H., Lin, X., Yang, J., Zhuang, Y., Bai, J., Zheng, T., Zhang, B., Li, Z.: TransLog: A Unified Transformer-based Framework for Log Anomaly Detection (2022) Le and Zhang (2021) Le, V.-H., Zhang, H.: Log-based Anomaly Detection Without Log Parsing (2021) Gniewkowski et al. (2023) Gniewkowski, M., Maciejewski, H., Surmacz, T., Walentynowicz, W.: Sec2vec: Anomaly detection in http traffic and malicious urls. In: Proceedings of the 38th ACM/SIGAPP Symposium on Applied Computing. SAC ’23, pp. 1154–1162. Association for Computing Machinery, New York, NY, USA (2023). https://doi.org/10.1145/3555776.3577663 . https://doi.org/10.1145/3555776.3577663 Hilmi et al. (2020) Hilmi, M.A.A., Cahyanto, K.A., Mustamiin, M.: Apache Web Server - Access Log Pre-processing for Web Intrusion Detection. IEEE Dataport (2020). https://doi.org/10.21227/vvvq-6w47 . https://dx.doi.org/10.21227/vvvq-6w47 Giménez et al. (2010) Giménez, C.T., Villegas, A.P., Marañón, G. .: HTTP Dataset CSIC 2010. https://doi.org/10.7910/DVN/3QBYB5 . https://www.tic.itefi.csic.es/dataset/ ECML/PKDD (2021) ECML/PKDD: ECML/PKDD 2007 Discovery Challenge. https://gitlab.fing.edu.uy/gsi/web-application-attacks-datasets/-/tree/master/ecml_pkdd (2021) Oliner and Stearley (2007) Oliner, A.J., Stearley, J.: What supercomputers say: A study of five system logs. 37th Annual IEEE/IFIP International Conference on Dependable Systems and Networks (DSN’07), 575–584 (2007) Vaswani et al. (2017) Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, L., Polosukhin, I.: Attention is all you need. CoRR abs/1706.03762 (2017) 1706.03762 Schuster and Nakajima (2012) Schuster, M., Nakajima, K.: Japanese and korean voice search. In: 2012 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), pp. 5149–5152 (2012). https://doi.org/10.1109/ICASSP.2012.6289079 Liu et al. (2019) Liu, Y., Ott, M., Goyal, N., Du, J., Joshi, M., Chen, D., Levy, O., Lewis, M., Zettlemoyer, L., Stoyanov, V.: Roberta: A robustly optimized BERT pretraining approach. CoRR abs/1907.11692 (2019) 1907.11692 Sanh et al. (2019) Sanh, V., Debut, L., Chaumond, J., Wolf, T.: Distilbert, a distilled version of bert: smaller, faster, cheaper and lighter. ArXiv abs/1910.01108 (2019) Radford et al. (2019) Radford, A., Wu, J., Child, R., Luan, D., Amodei, D., Sutskever, I.: Language models are unsupervised multitask learners (2019) Black et al. (2021) Black, S., Leo, G., Wang, P., Leahy, C., Biderman, S.: GPT-Neo: Large Scale Autoregressive Language Modeling with Mesh-Tensorflow. https://doi.org/10.5281/zenodo.5297715 . If you use this software, please cite it using these metadata. https://doi.org/10.5281/zenodo.5297715 Kokalj et al. (2021) Kokalj, E., Škrlj, B., Lavrač, N., Pollak, S., Robnik-Šikonja, M.: BERT meets shapley: Extending SHAP explanations to transformer-based classifiers. In: Toivonen, H., Boggia, M. (eds.) Proceedings of the EACL Hackashop on News Media Content Analysis and Automated Report Generation, pp. 16–21. Association for Computational Linguistics, Online (2021). 
  16. Shao, Y., Zhang, W., Liu, P., Huyue, R., Tang, R., Yin, Q., Li, Q.: Log anomaly detection method based on BERT model optimization. In: 2022 7th International Conference on Cloud Computing and Big Data Analytics (ICCCBDA), pp. 161–166 (2022). https://doi.org/10.1109/ICCCBDA55098.2022.9778900
  17. Guo, H., Lin, X., Yang, J., Zhuang, Y., Bai, J., Zheng, T., Zhang, B., Li, Z.: TransLog: A Unified Transformer-based Framework for Log Anomaly Detection (2022)
  18. Le, V.-H., Zhang, H.: Log-based Anomaly Detection Without Log Parsing (2021)
  19. Gniewkowski, M., Maciejewski, H., Surmacz, T., Walentynowicz, W.: Sec2vec: Anomaly detection in HTTP traffic and malicious URLs. In: Proceedings of the 38th ACM/SIGAPP Symposium on Applied Computing (SAC '23), pp. 1154–1162. Association for Computing Machinery, New York, NY, USA (2023). https://doi.org/10.1145/3555776.3577663
  20. Hilmi, M.A.A., Cahyanto, K.A., Mustamiin, M.: Apache Web Server - Access Log Pre-processing for Web Intrusion Detection. IEEE Dataport (2020). https://doi.org/10.21227/vvvq-6w47
  21. Giménez, C.T., Villegas, A.P., Marañón, G.Á.: HTTP Dataset CSIC 2010 (2010). https://doi.org/10.7910/DVN/3QBYB5 . https://www.tic.itefi.csic.es/dataset/
  22. ECML/PKDD: ECML/PKDD 2007 Discovery Challenge (2021). https://gitlab.fing.edu.uy/gsi/web-application-attacks-datasets/-/tree/master/ecml_pkdd
  23. Oliner, A.J., Stearley, J.: What supercomputers say: A study of five system logs. In: 37th Annual IEEE/IFIP International Conference on Dependable Systems and Networks (DSN'07), pp. 575–584 (2007)
  24. Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, L., Polosukhin, I.: Attention is all you need. CoRR abs/1706.03762 (2017)
  25. Schuster, M., Nakajima, K.: Japanese and Korean voice search. In: 2012 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), pp. 5149–5152 (2012). https://doi.org/10.1109/ICASSP.2012.6289079
  26. Liu, Y., Ott, M., Goyal, N., Du, J., Joshi, M., Chen, D., Levy, O., Lewis, M., Zettlemoyer, L., Stoyanov, V.: RoBERTa: A robustly optimized BERT pretraining approach. CoRR abs/1907.11692 (2019)
  27. Sanh, V., Debut, L., Chaumond, J., Wolf, T.: DistilBERT, a distilled version of BERT: smaller, faster, cheaper and lighter. ArXiv abs/1910.01108 (2019)
  28. Radford, A., Wu, J., Child, R., Luan, D., Amodei, D., Sutskever, I.: Language models are unsupervised multitask learners (2019)
  29. Black, S., Leo, G., Wang, P., Leahy, C., Biderman, S.: GPT-Neo: Large Scale Autoregressive Language Modeling with Mesh-Tensorflow (2021). https://doi.org/10.5281/zenodo.5297715
  30. Kokalj, E., Škrlj, B., Lavrač, N., Pollak, S., Robnik-Šikonja, M.: BERT meets Shapley: Extending SHAP explanations to transformer-based classifiers. In: Toivonen, H., Boggia, M. (eds.) Proceedings of the EACL Hackashop on News Media Content Analysis and Automated Report Generation, pp. 16–21. Association for Computational Linguistics, Online (2021). https://aclanthology.org/2021.hackashop-1.3
  31. van der Maaten, L., Hinton, G.: Visualizing data using t-SNE. Journal of Machine Learning Research 9, 2579–2605 (2008)
  32. Mohri, M., Rostamizadeh, A., Talwalkar, A.: Foundations of Machine Learning. The MIT Press, Cambridge, MA (2012)
  17. Guo, H., Lin, X., Yang, J., Zhuang, Y., Bai, J., Zheng, T., Zhang, B., Li, Z.: TransLog: A Unified Transformer-based Framework for Log Anomaly Detection (2022)
  18. Le, V.-H., Zhang, H.: Log-based Anomaly Detection Without Log Parsing (2021)
  19. Gniewkowski, M., Maciejewski, H., Surmacz, T., Walentynowicz, W.: Sec2vec: Anomaly detection in http traffic and malicious urls. In: Proceedings of the 38th ACM/SIGAPP Symposium on Applied Computing. SAC ’23, pp. 1154–1162. Association for Computing Machinery, New York, NY, USA (2023). https://doi.org/10.1145/3555776.3577663
  20. Hilmi, M.A.A., Cahyanto, K.A., Mustamiin, M.: Apache Web Server - Access Log Pre-processing for Web Intrusion Detection. IEEE Dataport (2020). https://doi.org/10.21227/vvvq-6w47
  21. Giménez, C.T., Villegas, A.P., Marañón, G.: HTTP Dataset CSIC 2010 (2010). https://doi.org/10.7910/DVN/3QBYB5. https://www.tic.itefi.csic.es/dataset/
  22. ECML/PKDD: ECML/PKDD 2007 Discovery Challenge. https://gitlab.fing.edu.uy/gsi/web-application-attacks-datasets/-/tree/master/ecml_pkdd (2021)
  23. Oliner, A.J., Stearley, J.: What supercomputers say: A study of five system logs. In: 37th Annual IEEE/IFIP International Conference on Dependable Systems and Networks (DSN’07), pp. 575–584 (2007)
  24. Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, L., Polosukhin, I.: Attention is all you need. CoRR abs/1706.03762 (2017)
  25. Schuster, M., Nakajima, K.: Japanese and Korean voice search. In: 2012 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), pp. 5149–5152 (2012). https://doi.org/10.1109/ICASSP.2012.6289079
  26. Liu, Y., Ott, M., Goyal, N., Du, J., Joshi, M., Chen, D., Levy, O., Lewis, M., Zettlemoyer, L., Stoyanov, V.: RoBERTa: A robustly optimized BERT pretraining approach. CoRR abs/1907.11692 (2019)
  27. Sanh, V., Debut, L., Chaumond, J., Wolf, T.: DistilBERT, a distilled version of BERT: smaller, faster, cheaper and lighter. ArXiv abs/1910.01108 (2019)
  28. Radford, A., Wu, J., Child, R., Luan, D., Amodei, D., Sutskever, I.: Language models are unsupervised multitask learners (2019)
  29. Black, S., Leo, G., Wang, P., Leahy, C., Biderman, S.: GPT-Neo: Large Scale Autoregressive Language Modeling with Mesh-Tensorflow (2021). https://doi.org/10.5281/zenodo.5297715
  30. Kokalj, E., Škrlj, B., Lavrač, N., Pollak, S., Robnik-Šikonja, M.: BERT meets Shapley: Extending SHAP explanations to transformer-based classifiers. In: Toivonen, H., Boggia, M. (eds.) Proceedings of the EACL Hackashop on News Media Content Analysis and Automated Report Generation, pp. 16–21. Association for Computational Linguistics, Online (2021). https://aclanthology.org/2021.hackashop-1.3
  31. van der Maaten, L., Hinton, G.: Visualizing data using t-SNE. Journal of Machine Learning Research 9, 2579–2605 (2008)
  32. Mohri, M., Rostamizadeh, A., Talwalkar, A.: Foundations of Machine Learning. The MIT Press (2012)