Prompt Tuned Embedding Classification for Multi-Label Industry Sector Allocation (2309.12075v3)

Published 21 Sep 2023 in cs.CL and cs.AI

Abstract: Prompt Tuning is emerging as a scalable and cost-effective method to fine-tune Pretrained Language Models (PLMs), often referred to as Large Language Models (LLMs). This study benchmarks the performance and computational efficiency of Prompt Tuning and baselines for multi-label text classification, applied to the challenging task of classifying companies into an investment firm's proprietary industry taxonomy in support of its thematic investment strategy. Text-to-text classification is frequently reported to outperform task-specific classification heads, but it has several limitations when applied to a multi-label classification problem in which each label consists of multiple tokens: (a) generated labels may not match any label in the taxonomy; (b) the fine-tuning process lacks permutation invariance and is sensitive to the order of the provided labels; (c) the model provides binary decisions rather than appropriate confidence scores. Limitation (a) is addressed by applying constrained decoding with Trie Search, which slightly improves classification performance. All three limitations are addressed by replacing the PLM's language head with a classification head, an approach referred to as Prompt Tuned Embedding Classification (PTEC). This improves performance significantly while also reducing computational costs at inference time. In our industrial application, the training data is skewed towards well-known companies; we confirm that the model's performance is consistent across both well-known and less-known companies. Our overall results indicate the continuing need to adapt state-of-the-art methods to domain-specific tasks, even in the era of PLMs with strong generalization abilities. We release our codebase and a benchmarking dataset at https://github.com/EQTPartners/PTEC.
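
To make the core idea concrete, below is a minimal PyTorch sketch of prompt-tuned embedding classification: a trainable soft prompt is prepended to a frozen backbone's input embeddings, and a classification head over the pooled hidden states emits an independent sigmoid score per label. The class name `PTECSketch`, the tiny randomly initialized encoder standing in for the frozen PLM, the mean pooling, and all dimensions are illustrative assumptions, not the authors' implementation (see the released codebase for that).

```python
# Minimal sketch of the PTEC idea: soft-prompt tuning + classification head.
# The backbone here is a small random encoder standing in for a frozen PLM.
import torch
import torch.nn as nn
import torch.nn.functional as F


class PTECSketch(nn.Module):
    def __init__(self, vocab_size=32000, hidden=512, n_prompt_tokens=20, n_labels=100):
        super().__init__()
        # Stand-in for the frozen pretrained backbone (kept frozen during training).
        self.token_emb = nn.Embedding(vocab_size, hidden)
        layer = nn.TransformerEncoderLayer(d_model=hidden, nhead=8, batch_first=True)
        self.backbone = nn.TransformerEncoder(layer, num_layers=2)
        for p in list(self.token_emb.parameters()) + list(self.backbone.parameters()):
            p.requires_grad = False

        # Trainable soft prompt prepended to the input embeddings (prompt tuning).
        self.soft_prompt = nn.Parameter(torch.randn(n_prompt_tokens, hidden) * 0.02)

        # Classification head replacing the language-modelling head:
        # one independent score per taxonomy label instead of generated label text.
        self.cls_head = nn.Linear(hidden, n_labels)

    def forward(self, input_ids):                       # input_ids: (batch, seq_len)
        x = self.token_emb(input_ids)                   # (batch, seq_len, hidden)
        prompt = self.soft_prompt.unsqueeze(0).expand(x.size(0), -1, -1)
        x = torch.cat([prompt, x], dim=1)               # prepend the soft prompt
        h = self.backbone(x)                            # frozen encoder forward pass
        pooled = h.mean(dim=1)                          # simple pooling of hidden states
        return torch.sigmoid(self.cls_head(pooled))     # per-label confidence scores


# Toy usage: multi-label targets trained with binary cross-entropy;
# only the soft prompt and the classification head receive updates.
model = PTECSketch()
ids = torch.randint(0, 32000, (4, 64))
targets = torch.randint(0, 2, (4, 100)).float()
loss = F.binary_cross_entropy(model(ids), targets)
loss.backward()
```

Scoring each label independently sidesteps the label-ordering sensitivity and out-of-taxonomy generations of text-to-text decoding, and the sigmoid outputs double as per-label confidence scores.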
