Few-shot Slot Tagging with Collapsed Dependency Transfer and Label-enhanced Task-adaptive Projection Network (2006.05702v1)

Published 10 Jun 2020 in cs.CL and cs.LG

Abstract: In this paper, we explore the slot tagging with only a few labeled support sentences (a.k.a. few-shot). Few-shot slot tagging faces a unique challenge compared to the other few-shot classification problems as it calls for modeling the dependencies between labels. But it is hard to apply previously learned label dependencies to an unseen domain, due to the discrepancy of label sets. To tackle this, we introduce a collapsed dependency transfer mechanism into the conditional random field (CRF) to transfer abstract label dependency patterns as transition scores. In the few-shot setting, the emission score of CRF can be calculated as a word's similarity to the representation of each label. To calculate such similarity, we propose a Label-enhanced Task-Adaptive Projection Network (L-TapNet) based on the state-of-the-art few-shot classification model -- TapNet, by leveraging label name semantics in representing labels. Experimental results show that our model significantly outperforms the strongest few-shot learning baseline by 14.64 F1 scores in the one-shot setting.

Authors (7)
  1. Yutai Hou (23 papers)
  2. Wanxiang Che (152 papers)
  3. Yongkui Lai (4 papers)
  4. Zhihan Zhou (17 papers)
  5. Yijia Liu (19 papers)
  6. Han Liu (340 papers)
  7. Ting Liu (329 papers)
Citations (182)

Summary

The paper "Few-shot Slot Tagging with Collapsed Dependency Transfer and Label-enhanced Task-adaptive Projection Network" studies few-shot learning for slot tagging in dialogue systems. Slot tagging assigns a label to each word in a sentence to extract its semantic content, a step that task-oriented dialogue systems depend on to understand user requests and produce appropriate responses. The main challenge the paper addresses is the scarcity of labeled data in new domains, which limits how quickly slot tagging models can be developed and adapted.

Core Contributions

  1. Collapsed Dependency Transfer Mechanism: This mechanism addresses the difficulty of transferring learned label dependencies to domains whose label sets are new and unseen. It collapses domain-specific labels into abstract labels and models the dependencies between those abstract labels, which are then used as transition scores in a conditional random field (CRF). Because the abstract patterns do not depend on any particular label set, they can be shared across domains even when the label sets differ.
  2. Label-enhanced Task-Adaptive Projection Network (L-TapNet): Building on the TapNet model, L-TapNet improves label representations by incorporating the semantics of label names. The model projects embeddings into a space where words belonging to different labels are well separated, which reduces misclassification caused by label representations that lie close together. In the few-shot setting, the CRF emission score is then computed as a word's similarity to each label's representation.
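
The two contributions above can be sketched together in a few lines of Python. This is a simplified illustration under stated assumptions, not the authors' implementation: the BIO label format, the abstract pattern names (`sB`, `dI`, and so on), the score values in `COLLAPSED`, the interpolation weight `alpha`, and all function names are hypothetical. TapNet's projection step is omitted, so similarity here is a plain dot product.

```python
# Hypothetical abstract transition scores (values are made up for illustration).
# "s"/"d" mean the current label has the same / a different slot type as the
# previous one, so one small table can serve any domain's label set.
COLLAPSED = {
    ("O", "O"): 0.5, ("O", "B"): 0.3, ("O", "I"): -2.0,
    ("B", "O"): 0.2, ("B", "sB"): -0.5, ("B", "dB"): 0.1,
    ("B", "sI"): 0.8, ("B", "dI"): -2.0,
    ("I", "O"): 0.2, ("I", "sB"): -0.5, ("I", "dB"): 0.1,
    ("I", "sI"): 0.6, ("I", "dI"): -2.0,
}


def collapse(prev, curr):
    """Map a concrete label pair (assumed BIO format) to an abstract pattern."""
    p_tag, _, p_slot = prev.partition("-")
    c_tag, _, c_slot = curr.partition("-")
    if c_tag == "O":
        return (p_tag, "O")
    if p_tag == "O":
        return ("O", c_tag)
    relation = "s" if p_slot == c_slot else "d"
    return (p_tag, relation + c_tag)


def expand_transitions(labels, collapsed=COLLAPSED):
    """Fill a domain-specific transition table from the abstract scores."""
    return {(p, c): collapsed[collapse(p, c)] for p in labels for c in labels}


def label_representation(prototype, name_embedding, alpha=0.5):
    """Mix a support-set prototype with a label-name embedding (assumed form)."""
    return [(1 - alpha) * p + alpha * n for p, n in zip(prototype, name_embedding)]


def emission_scores(word_vec, label_reps):
    """Score one word against every label by dot-product similarity."""
    return {
        label: sum(w * r for w, r in zip(word_vec, rep))
        for label, rep in label_reps.items()
    }


def sequence_score(word_vecs, tags, label_reps, transitions):
    """Unnormalized CRF score: emissions plus transitions along the tag path."""
    score = 0.0
    for i, (vec, tag) in enumerate(zip(word_vecs, tags)):
        score += emission_scores(vec, label_reps)[tag]
        if i > 0:
            score += transitions[(tags[i - 1], tag)]
    return score


# Usage: labels from an unseen domain reuse the same abstract scores.
labels = ["O", "B-city", "I-city", "B-time", "I-time"]
transitions = expand_transitions(labels)
same_slot = transitions[("B-city", "I-city")]   # looked up via ("B", "sI")
cross_slot = transitions[("B-city", "I-time")]  # looked up via ("B", "dI")
```

In this sketch, `B-city -> I-city` and `B-time -> I-time` share one abstract score, which is what lets transition knowledge carry over to a domain whose slot names were never seen in training. A full model would learn the abstract scores and decode with Viterbi rather than score a single fixed path.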

Experimental Results

The experiments cover one-shot and five-shot settings across several domains and show substantial improvements over previous methods. Notably, the proposed model outperforms the strongest few-shot learning baseline by 14.64 F1 points in the one-shot setting. This suggests that the proposed techniques generalize across domains with very few labeled examples.

Implications and Future Directions

The work has both theoretical and practical implications. Theoretically, it advances few-shot learning by introducing mechanisms that handle cross-domain label dependencies and that enrich label representations with semantic information. Practically, it is relevant to task-oriented dialogue agents, which often must adapt to rapidly evolving and diverse domains with scarce labeled data.

Future work may extend the collapsed dependency transfer mechanism to more complex label structures or explore other kinds of semantic augmentation. Applying these methods at scale in real-world systems may also require reducing the computational cost of the projection and embedding steps.

The techniques introduced in this paper provide a promising basis for further research into few-shot learning, especially for tasks that require rapid adaptation, such as slot tagging in natural language processing.