Ticket-BERT: Labeling Incident Management Tickets with Language Models (2307.00108v1)

Published 30 Jun 2023 in cs.CL, cs.AI, and cs.LG

Abstract: An essential aspect of prioritizing incident tickets for resolution is efficiently labeling tickets with fine-grained categories. However, ticket data is often complex and poses several unique challenges for modern machine learning methods: (1) tickets are created and updated either by machines with pre-defined algorithms or by engineers with domain expertise that share different protocols, (2) tickets receive frequent revisions that update ticket status by modifying all or parts of ticket descriptions, and (3) ticket labeling is time-sensitive and requires knowledge updates and new labels per the rapid software and hardware improvement lifecycle. To handle these issues, we introduce Ticket-BERT which trains a simple yet robust language model for labeling tickets using our proposed ticket datasets. Experiments demonstrate the superiority of Ticket-BERT over baselines and state-of-the-art text classifiers on Azure Cognitive Services. We further encapsulate Ticket-BERT with an active learning cycle and deploy it on the Microsoft IcM system, which enables the model to quickly finetune on newly-collected tickets with a few annotations.

Summary

The paper introduces Ticket-BERT, a specialized BERT model fine-tuned for high-accuracy labeling of complex incident management tickets using a novel dataset.
The study created a unique dataset of 76,000 tickets and employed prompt-prefix strategies, achieving F1 scores up to 98.96 and outperforming baseline text classification methods.
The work targets faster ticket triage in IT incident management and shows that a transformer model adapted to a specific domain, combined with an active learning pipeline, can keep pace with evolving ticket data.

Ticket-BERT: Labeling Incident Management Tickets with Language Models

The paper "Ticket-BERT: Labeling Incident Management Tickets with Language Models," by Zhexiong Liu, Cris Benge, and Siduo Jiang, addresses the problem of efficiently labeling incident management tickets with a language model. It develops Ticket-BERT, a BERT-based model that assigns incident tickets to fine-grained categories corresponding to specific operational issues in enterprise-level systems.

Research Contributions and Methodology

Incident tickets may be created and modified both by automated systems and by domain experts, and they are revised frequently. This variability and update frequency make the data difficult for standard machine learning approaches. The authors respond with three contributions:

Dataset Creation: The paper introduces a dataset of 76,000 raw tickets sourced from Microsoft Azure's Kusto system and categorized into ten incident labels. The data is further divided into human-written, machine-generated, and mixed subsets, which are used for model training and validation.
Model Development: Ticket-BERT is a BERT model fine-tuned on the curated dataset. It uses prompt-prefix strategies, which supply auxiliary ticket fields such as titles and summaries as additional context for classification.
Active Learning Pipeline: To keep the model current as incident data evolves, Ticket-BERT is wrapped in an active learning cycle in which it is iteratively fine-tuned on newly annotated tickets, so that it can adapt to emerging incident scenarios.
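
The selection step of such an active learning cycle can be sketched as follows. This is a minimal illustration, not the authors' implementation: it assumes entropy-based uncertainty sampling, which the summary does not confirm as the paper's acquisition strategy, and the function names, ticket ids, and probabilities are all hypothetical.

```python
import math
from typing import Dict, List, Sequence


def entropy(probs: Sequence[float]) -> float:
    """Shannon entropy (in nats) of a predicted label distribution."""
    return -sum(p * math.log(p) for p in probs if p > 0.0)


def select_for_annotation(
    predictions: Dict[str, Sequence[float]], budget: int
) -> List[str]:
    """Return the ids of the `budget` tickets with the most uncertain predictions.

    Assumption: uncertainty is measured by entropy; other criteria
    (margin, least confidence) would slot in the same way.
    """
    ranked = sorted(
        predictions, key=lambda tid: entropy(predictions[tid]), reverse=True
    )
    return ranked[:budget]


# Hypothetical classifier outputs over three labels for four unlabeled tickets.
unlabeled_predictions = {
    "T1": [0.98, 0.01, 0.01],  # confident prediction
    "T2": [0.40, 0.35, 0.25],  # uncertain
    "T3": [0.70, 0.20, 0.10],  # moderately confident
    "T4": [0.34, 0.33, 0.33],  # close to uniform, most uncertain
}

# Tickets to send to annotators before the next fine-tuning round.
to_annotate = select_for_annotation(unlabeled_predictions, budget=2)
```

In a full cycle, the selected tickets would be labeled by engineers, added to the training set, and used to fine-tune the model again before the next selection round.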

Experimental Findings

The paper compares Ticket-BERT with traditional baselines, namely Naive Bayes and Logistic Regression models using TF-IDF and bag-of-words (BoW) features, and with state-of-the-art text classifiers from Azure Cognitive Services. Ticket-BERT outperforms these methods across the datasets, including on human-written tickets, which show higher linguistic variability. Its best reported F1 score is 98.96.
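
To make the baseline family concrete, here is a small sketch of a bag-of-words multinomial Naive Bayes classifier with Laplace smoothing, written in plain Python. It illustrates the general technique only; the class name, the whitespace tokenizer, the two labels, and the toy tickets are assumptions and do not reproduce the paper's baseline configuration or data.

```python
import math
from collections import Counter, defaultdict
from typing import Dict, List, Optional, Set, Tuple


def tokenize(text: str) -> List[str]:
    """Lowercase whitespace tokenizer (an assumed, deliberately simple choice)."""
    return text.lower().split()


class BowNaiveBayes:
    """Multinomial Naive Bayes over bag-of-words counts with Laplace smoothing."""

    def __init__(self, alpha: float = 1.0) -> None:
        self.alpha = alpha
        self.class_counts: Counter = Counter()
        self.word_counts: Dict[str, Counter] = defaultdict(Counter)
        self.vocab: Set[str] = set()

    def fit(self, examples: List[Tuple[str, str]]) -> "BowNaiveBayes":
        for text, label in examples:
            tokens = tokenize(text)
            self.class_counts[label] += 1
            self.word_counts[label].update(tokens)
            self.vocab.update(tokens)
        return self

    def predict(self, text: str) -> Optional[str]:
        total_docs = sum(self.class_counts.values())
        best_label, best_score = None, -math.inf
        for label, n_docs in self.class_counts.items():
            # Log prior plus smoothed log likelihood of each known token.
            score = math.log(n_docs / total_docs)
            total_words = sum(self.word_counts[label].values())
            denom = total_words + self.alpha * len(self.vocab)
            for tok in tokenize(text):
                if tok in self.vocab:  # tokens unseen in training are skipped
                    count = self.word_counts[label][tok]
                    score += math.log((count + self.alpha) / denom)
            if score > best_score:
                best_label, best_score = label, score
        return best_label


# Hypothetical toy tickets with two made-up labels.
train = [
    ("switch port down packet loss on rack", "network"),
    ("dns resolution timeout for gateway", "network"),
    ("disk latency high on storage node", "storage"),
    ("volume full disk quota exceeded", "storage"),
]
clf = BowNaiveBayes().fit(train)
prediction = clf.predict("packet loss reported at gateway")
```

A classifier of this kind treats each ticket as an unordered bag of tokens, which is one reason it can struggle with the linguistic variability of human-written tickets.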

A notable result is the effect of the prompt-prefix enhancement: prepending additional context from ticket metadata, such as titles and summaries, to the model input improves labeling performance.
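
One way to picture the prompt-prefix idea is as a function that prepends metadata fields to the ticket body before the text is passed to the classifier. The sketch below is an assumption-laden illustration: the field names, the `field: value` layout, and the ` [SEP] ` separator string are hypothetical and are not taken from the paper.

```python
from typing import List, Mapping, Sequence


def build_prefixed_input(
    ticket: Mapping[str, str],
    prefix_fields: Sequence[str] = ("title", "summary"),
    body_field: str = "description",
    sep: str = " [SEP] ",
) -> str:
    """Prepend selected metadata fields to the ticket body.

    Assumptions: the field names and the separator are illustrative;
    empty or missing prefix fields are skipped.
    """
    parts: List[str] = []
    for field in prefix_fields:
        value = ticket.get(field, "").strip()
        if value:
            parts.append(f"{field}: {value}")
    parts.append(ticket.get(body_field, "").strip())
    return sep.join(parts)


# Hypothetical ticket record.
ticket = {
    "title": "Storage node unreachable",
    "summary": "Heartbeat lost for 5 minutes",
    "description": "Node stopped responding to health probes.",
}
model_input = build_prefixed_input(ticket)
```

The resulting string would then be tokenized and classified like any other input, so the prefix adds context without changing the model architecture.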

Implications and Future Directions

The research has practical and theoretical implications. Practically, automating ticket labeling can make incident management more efficient, which matters for reducing downtime and improving service reliability. Theoretically, it extends the application of transformer-based models by showing that domain-specific adaptation can reach state-of-the-art performance in a specialized, high-stakes area such as IT infrastructure management.

Future research could test whether Ticket-BERT generalizes to ticketing environments outside the Microsoft ecosystem, which would broaden its applicability across industries. Other promising directions include extending the active learning cycle to incorporate emerging issue categories dynamically and using unsupervised learning techniques to improve scalability.

In summary, the paper shows how a language model tailored to incident management, supported by carefully structured datasets and an active learning cycle, can yield a robust and adaptive ticket labeling system.
