PIEClass: Weakly-Supervised Text Classification with Prompting and Noise-Robust Iterative Ensemble Training (2305.13723v2)

Published 23 May 2023 in cs.CL

Abstract: Weakly-supervised text classification trains a classifier using the label name of each target class as the only supervision, which largely reduces human annotation efforts. Most existing methods first use the label names as static keyword-based features to generate pseudo labels, which are then used for final classifier training. While reasonable, such a commonly adopted framework suffers from two limitations: (1) keywords can have different meanings in different contexts and some text may not have any keyword, so keyword matching can induce noisy and inadequate pseudo labels; (2) the errors made in the pseudo label generation stage will directly propagate to the classifier training stage without a chance of being corrected. In this paper, we propose a new method, PIEClass, consisting of two modules: (1) a pseudo label acquisition module that uses zero-shot prompting of pre-trained language models (PLMs) to get pseudo labels based on contextualized text understanding beyond static keyword matching, and (2) a noise-robust iterative ensemble training module that iteratively trains classifiers and updates pseudo labels by utilizing two PLM fine-tuning methods that regularize each other. Extensive experiments show that PIEClass achieves overall better performance than existing strong baselines on seven benchmark datasets and even achieves similar performance to fully-supervised classifiers on sentiment classification tasks.
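The pseudo-label acquisition idea described above can be illustrated with a minimal sketch: a cloze-style prompt turns classification into masked-word prediction, and the PLM's probabilities over label words act as pseudo-label confidence. This is not the authors' code; the model, prompt template, and verbalizer words below are assumptions chosen for illustration.

```python
# Illustrative sketch (not the PIEClass implementation): zero-shot prompting
# of a masked PLM to assign pseudo labels, as the abstract's first module
# describes. Model name, template, and label words are assumptions.
import torch
from transformers import AutoModelForMaskedLM, AutoTokenizer

model_name = "roberta-base"  # assumption: any masked-LM PLM could be used
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForMaskedLM.from_pretrained(model_name).eval()

# Hypothetical verbalizer: one label word per class (sentiment example).
label_words = {"positive": " great", "negative": " terrible"}
label_ids = {
    cls: tokenizer.convert_tokens_to_ids(tokenizer.tokenize(word))[0]
    for cls, word in label_words.items()
}

def pseudo_label(text: str):
    """Return (predicted class, confidence) from a cloze-style prompt."""
    prompt = f"{text} It was {tokenizer.mask_token}."
    inputs = tokenizer(prompt, return_tensors="pt", truncation=True)
    with torch.no_grad():
        logits = model(**inputs).logits
    # Locate the mask position and read the vocabulary distribution there.
    mask_pos = (inputs["input_ids"] == tokenizer.mask_token_id).nonzero()[0, 1]
    probs = logits[0, mask_pos].softmax(dim=-1)
    scores = {cls: probs[idx].item() for cls, idx in label_ids.items()}
    best = max(scores, key=scores.get)
    return best, scores[best]

print(pseudo_label("The movie was a waste of two hours."))
```

In the framework sketched by the abstract, only confident pseudo labels of this kind would seed classifier training, and the second module then iteratively retrains classifiers and updates the pseudo labels using two PLM fine-tuning methods that regularize each other.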

Authors (5)
  1. Yunyi Zhang (39 papers)
  2. Minhao Jiang (10 papers)
  3. Yu Meng (92 papers)
  4. Yu Zhang (1399 papers)
  5. Jiawei Han (263 papers)
Citations (10)