
Ticket-BERT: Labeling Incident Management Tickets with Language Models (2307.00108v1)

Published 30 Jun 2023 in cs.CL, cs.AI, and cs.LG

Abstract: An essential aspect of prioritizing incident tickets for resolution is efficiently labeling tickets with fine-grained categories. However, ticket data is often complex and poses several unique challenges for modern machine learning methods: (1) tickets are created and updated either by machines with pre-defined algorithms or by engineers with domain expertise that share different protocols, (2) tickets receive frequent revisions that update ticket status by modifying all or parts of ticket descriptions, and (3) ticket labeling is time-sensitive and requires knowledge updates and new labels per the rapid software and hardware improvement lifecycle. To handle these issues, we introduce Ticket-BERT which trains a simple yet robust LLM for labeling tickets using our proposed ticket datasets. Experiments demonstrate the superiority of Ticket-BERT over baselines and state-of-the-art text classifiers on Azure Cognitive Services. We further encapsulate Ticket-BERT with an active learning cycle and deploy it on the Microsoft IcM system, which enables the model to quickly finetune on newly-collected tickets with a few annotations.


Summary

  • The paper introduces Ticket-BERT, a specialized BERT model fine-tuned for high-accuracy labeling of complex incident management tickets using a novel dataset.
  • The study created a unique dataset of 76,000 tickets and employed prompt-prefix strategies, achieving F1 scores up to 98.96 and outperforming baseline text classification methods.
  • This research enhances operational efficiency in IT incident management and extends transformer model application through domain-specific adaptation and an active learning pipeline for evolving data.

Ticket-BERT: Labeling Incident Management Tickets with Language Models

The paper "Ticket-BERT: Labeling Incident Management Tickets with Language Models," authored by Zhexiong Liu, Cris Benge, and Siduo Jiang, addresses the problem of efficiently labeling incident management tickets in enterprise-level systems. It develops Ticket-BERT, a specialized BERT-based model that assigns incident tickets to fine-grained categories corresponding to specific operational issues.

Research Contributions and Methodology

Incident tickets may be generated and modified by both automated systems and domain experts, and they are revised frequently. This variability and update frequency pose significant challenges for standard machine learning approaches. The authors address these issues with three contributions:

  1. Dataset Creation: The paper introduces a novel dataset, comprising 76,000 raw tickets sourced from Microsoft Azure's Kusto system, categorized into ten distinct incident labels. The dataset is further subdivided into human-written, machine-generated, and mixed datasets, providing a robust foundation for model training and validation.
  2. Model Development: Ticket-BERT, a BERT-based model, is fine-tuned on the curated dataset. It uses prompt-prefix strategies that add auxiliary ticket data, such as titles and summaries, to the model input to provide more context during classification.
  3. Active Learning Pipeline: To maintain relevance with evolving incident data, an active learning approach is deployed, allowing Ticket-BERT to iteratively learn from newly annotated data, ensuring adaptability to emerging incident scenarios.
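One way to picture the active learning step in item 3 is as a selection problem: given the model's predicted class probabilities for newly collected tickets, choose a small batch for human annotation. The summary does not say which selection criterion the authors use, so the sketch below assumes least-confidence sampling, a common strategy; the function name `select_for_annotation` and the `budget` parameter are hypothetical.

```python
from typing import List, Sequence


def select_for_annotation(probabilities: Sequence[Sequence[float]], budget: int) -> List[int]:
    """Return indices of the `budget` tickets the model is least sure about.

    ASSUMPTION: least-confidence sampling. The paper summary does not state
    the actual selection criterion used in the Ticket-BERT pipeline.
    """
    # Confidence of a prediction = probability of its most likely label.
    confidence = [max(p) for p in probabilities]
    # Rank ticket indices from least to most confident.
    ranked = sorted(range(len(probabilities)), key=lambda i: confidence[i])
    return ranked[:budget]


# Hypothetical predicted distributions for three new tickets over two labels.
probs = [[0.90, 0.10], [0.55, 0.45], [0.60, 0.40]]
print(select_for_annotation(probs, budget=2))  # [1, 2]
```

The selected tickets would then be labeled by annotators and added to the fine-tuning set, and the cycle repeats as new tickets arrive.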

Experimental Findings

The paper compares Ticket-BERT with traditional baselines, including Naive Bayes and Logistic Regression models using TF-IDF and bag-of-words (BoW) features, as well as state-of-the-art text classifiers from Azure Cognitive Services. Ticket-BERT outperforms these methods across the datasets, notably on human-written tickets, which are characterized by higher linguistic variability. Its F1 scores reach up to 98.96.
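For readers less familiar with the metric, the following is a minimal sketch of how an F1 score can be computed for a multi-class labeling task. The summary does not say whether the reported figure is macro- or micro-averaged, so macro averaging here is an assumption, and the label names are made up for illustration. The sketch returns values in the range 0 to 1, while the reported 98.96 appears to be on a 0 to 100 scale.

```python
from typing import Dict, Sequence


def f1_per_class(y_true: Sequence[str], y_pred: Sequence[str]) -> Dict[str, float]:
    """Compute F1 = 2*TP / (2*TP + FP + FN) separately for each label."""
    scores = {}
    for label in sorted(set(y_true) | set(y_pred)):
        pairs = list(zip(y_true, y_pred))
        tp = sum(1 for t, p in pairs if t == label and p == label)
        fp = sum(1 for t, p in pairs if t != label and p == label)
        fn = sum(1 for t, p in pairs if t == label and p != label)
        denom = 2 * tp + fp + fn
        scores[label] = 2 * tp / denom if denom else 0.0
    return scores


def macro_f1(y_true: Sequence[str], y_pred: Sequence[str]) -> float:
    """Unweighted mean of per-class F1 (ASSUMPTION: macro averaging)."""
    scores = f1_per_class(y_true, y_pred)
    return sum(scores.values()) / len(scores) if scores else 0.0


# Hypothetical gold labels and predictions for four tickets.
gold = ["network", "network", "disk", "disk"]
pred = ["network", "disk", "disk", "disk"]
print(round(macro_f1(gold, pred), 3))  # 0.733
```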

A notable contribution of this work is the prompt-prefix enhancement, which significantly improves ticket labeling performance by incorporating additional context from ticket metadata.
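The idea of adding metadata context to the input can be illustrated with a short sketch that builds a single text string from a ticket's fields before it is passed to a classifier. The exact prefix format used by the authors is not given in this summary, so the `title:` and `summary:` markers, the ` | ` separator, and the function name `build_prefixed_input` are all hypothetical.

```python
from typing import Optional


def build_prefixed_input(
    description: str,
    title: Optional[str] = None,
    summary: Optional[str] = None,
) -> str:
    """Prepend available metadata fields to the ticket description.

    ASSUMPTION: the field markers and separator are illustrative only;
    the paper's actual prompt-prefix format is not specified here.
    """
    parts = []
    if title:
        parts.append(f"title: {title.strip()}")
    if summary:
        parts.append(f"summary: {summary.strip()}")
    # The description always comes last so the prefix supplies context first.
    parts.append(f"description: {description.strip()}")
    return " | ".join(parts)


print(build_prefixed_input("Disk full on node 7", title="Storage alert"))
# title: Storage alert | description: Disk full on node 7
```

Keeping missing fields out of the string, rather than emitting empty markers, avoids feeding the model prefixes that carry no information.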

Implications and Future Directions

This research has both practical and theoretical implications. Practically, it streamlines the ticket labeling process in incident management systems, which is crucial for reducing downtime and improving service reliability. Theoretically, it extends the application scope of transformer-based models, showing that domain-specific adaptation can achieve state-of-the-art performance in specialized, high-stakes areas such as IT infrastructure management.

Future research could explore further generalization of Ticket-BERT to handle diverse ticketing environments beyond the Microsoft ecosystem, thus expanding its applicability across various industries. Additionally, enhancing its active learning capabilities to dynamically incorporate emerging issue categories and leveraging unsupervised learning techniques for improved scalability could be promising avenues for continued development.

In summary, this paper underscores the potential of tailored LLMs in addressing complex operational challenges within incident management domains, emphasizing the role of nuanced dataset structuring and active learning in achieving robust, adaptive systems.
