
Knowledgeable Prompt-tuning: Incorporating Knowledge into Prompt Verbalizer for Text Classification (2108.02035v2)

Published 4 Aug 2021 in cs.CL

Abstract: Tuning pre-trained language models (PLMs) with task-specific prompts has been a promising approach for text classification. Particularly, previous studies suggest that prompt-tuning has remarkable superiority in the low-data scenario over the generic fine-tuning methods with extra classifiers. The core idea of prompt-tuning is to insert text pieces, i.e., template, to the input and transform a classification problem into a masked language modeling problem, where a crucial step is to construct a projection, i.e., verbalizer, between a label space and a label word space. A verbalizer is usually handcrafted or searched by gradient descent, which may lack coverage and bring considerable bias and high variances to the results. In this work, we focus on incorporating external knowledge into the verbalizer, forming a knowledgeable prompt-tuning (KPT), to improve and stabilize prompt-tuning. Specifically, we expand the label word space of the verbalizer using external knowledge bases (KBs) and refine the expanded label word space with the PLM itself before predicting with the expanded label word space. Extensive experiments on zero and few-shot text classification tasks demonstrate the effectiveness of knowledgeable prompt-tuning.

Authors (8)
  1. Shengding Hu
  2. Ning Ding
  3. Huadong Wang
  4. Zhiyuan Liu
  5. Jingang Wang
  6. Juanzi Li
  7. Wei Wu
  8. Maosong Sun
Citations (320)

Summary

Incorporating Knowledge into Prompt Verbalizer for Text Classification

The paper presents an approach to tuning pre-trained language models (PLMs) for text classification that targets the deficiencies of the verbalizer component in prompt-tuning. The core idea is to expand the label word space by incorporating external knowledge bases (KBs), thereby improving the verbalizer and enabling knowledgeable prompt-tuning (KPT).

Prompt-tuning has shown remarkable potential in classification tasks, particularly under data-scarce conditions. However, constructing the verbalizer, which maps the label space to the label word space, has been a bottleneck: it typically relies on either manual crafting or data-intensive gradient-based search. These methods often introduce bias and result in high variance, limiting their robustness, especially in zero-shot and few-shot scenarios.
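To make the verbalizer's role concrete, below is a minimal sketch of prompt-based classification with a handcrafted, one-word-per-class verbalizer. The model choice, template, and label words are illustrative assumptions, not the paper's exact setup.

```python
# Minimal sketch of prompt-based classification with a handcrafted verbalizer.
# Assumptions: model, template, and label words are illustrative, not the paper's setup.
import torch
from transformers import AutoTokenizer, AutoModelForMaskedLM

tokenizer = AutoTokenizer.from_pretrained("bert-base-uncased")
model = AutoModelForMaskedLM.from_pretrained("bert-base-uncased")

# Handcrafted verbalizer: a single label word per class.
verbalizer = {"science": "science", "sports": "sports", "business": "business"}

text = "The physicists observed a new particle at the collider."
# The template turns classification into a masked language modeling problem.
prompt = f"A {tokenizer.mask_token} news: {text}"
inputs = tokenizer(prompt, return_tensors="pt")

with torch.no_grad():
    logits = model(**inputs).logits

# Probability distribution over the vocabulary at the [MASK] position.
mask_pos = (inputs["input_ids"] == tokenizer.mask_token_id).nonzero()[0, 1]
probs = logits[0, mask_pos].softmax(dim=-1)

# Score each class by the probability of its single label word.
scores = {label: probs[tokenizer.convert_tokens_to_ids(word)].item()
          for label, word in verbalizer.items()}
print(max(scores, key=scores.get))
```

With only one label word per class, the prediction hinges entirely on how well that single word is handled by the PLM, which is exactly the coverage and bias problem KPT sets out to fix.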

The proposed KPT addresses these challenges by expanding the verbalizer using external KBs, thereby enlarging the label word space far beyond the constraints of handcrafted or gradient-searched approaches. The methodology is structured into three pivotal stages: construction, refinement, and utilization.

  1. Construction: The expansion leverages KBs to cover a broader range of label words, capturing different granularities and perspectives. For example, in a topic classification task, KPT extends single-word mappings to sets of words (e.g., "science" might encompass "physics," "chemistry," and "biology"), thereby creating a richer vocabulary available for classification.
  2. Refinement: Since the expansion potentially introduces noise, four refinement strategies are employed:
    • Frequency Refinement filters low-frequency words using contextualized priors to maintain high-quality predictions.
    • Relevance Refinement assesses and retains label words that are significantly more relevant to their intended classes.
    • Contextualized Calibration adjusts for the inherent bias some label words possess due to frequency effects.
    • Learnable Refinement further tunes these probabilities using labeled data in few-shot settings to adjust the averaging weights for label word contributions.
  3. Utilization: The scores of the refined label words are then averaged (zero-shot) or weighted-averaged (few-shot) to determine the final classification label, with the weights learned during the training phase (a minimal sketch of calibration and averaging follows this list).
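The refinement and utilization steps can be summarized in a short sketch. The function below assumes the PLM's probabilities at the [MASK] position and the contextualized priors have already been computed; the expanded label word sets, prior values, and weights are placeholders, not the paper's actual data.

```python
import numpy as np

# Expanded verbalizer: each class maps to many KB-derived label words (illustrative subset).
expanded_verbalizer = {
    "science": ["science", "physics", "chemistry", "biology"],
    "sports":  ["sports", "football", "tennis", "athletics"],
}

def kpt_score(mask_probs, priors, weights=None):
    """Score classes from masked-LM probabilities over expanded label words.

    mask_probs: dict word -> P(word at [MASK] | prompt), from the PLM.
    priors:     dict word -> contextualized prior of the word (placeholder values here;
                the paper estimates them with the PLM over a small support set).
    weights:    optional dict word -> learned weight (few-shot); uniform if None.
    """
    scores = {}
    for label, words in expanded_verbalizer.items():
        # Contextualized calibration: divide each word's probability by its prior
        # to correct for frequency bias.
        calibrated = np.array([mask_probs[word] / priors[word] for word in words])
        # Learnable refinement reduces to a weighted average; uniform in zero-shot.
        wts = (np.array([weights[word] for word in words])
               if weights is not None else np.ones(len(words)))
        wts = wts / wts.sum()
        scores[label] = float(np.dot(wts, calibrated))
    return max(scores, key=scores.get)

# Zero-shot usage with placeholder probabilities and a placeholder uniform prior.
mask_probs = {"science": 0.020, "physics": 0.010, "chemistry": 0.008, "biology": 0.006,
              "sports": 0.004, "football": 0.003, "tennis": 0.001, "athletics": 0.001}
priors = {w: 0.005 for w in mask_probs}
print(kpt_score(mask_probs, priors))  # -> "science"
```

The sketch omits frequency and relevance refinement, which in the paper prune label words before scoring; here they would simply shrink each class's word list.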

Empirically, KPT demonstrates significant performance gains across various datasets and classification tasks, reducing error rates by 16% and 18% on zero-shot tasks and showing marked improvements in few-shot scenarios. The approach also consistently reduces variance, yielding more stable predictions than traditional prompt-tuning methods. This robustness stems from the incorporated external knowledge, which compensates for the scarcity of labeled data in few- and zero-shot settings.

In summary, this work is a meaningful step toward harnessing external knowledge to mitigate the limitations intrinsic to prompting PLMs. Future investigations could extend KPT beyond text classification, for example to text generation or a wider array of NLP applications. Furthermore, as KBs evolve, the label word mappings can be refined and enriched, further improving model robustness and performance.