PromptBoosting: Black-Box Text Classification with Ten Forward Passes (2212.09257v2)
Abstract: We describe PromptBoosting, a query-efficient procedure for building a text classifier from a neural language model (LM) without access to the LM's parameters, gradients, or hidden representations. This form of "black-box" classifier training has become increasingly important as the cost of training and inference in large-scale LMs grows. But existing black-box LM classifier learning approaches are themselves computationally inefficient, typically specializing LMs to the target task by searching in a large space of (discrete or continuous) prompts using zeroth-order optimization methods. Instead of directly optimizing in prompt space, PromptBoosting obtains a small pool of prompts via a gradient-free approach and then constructs a large pool of weak learners by pairing these prompts with different elements of the LM's output distribution. These weak learners are then ensembled using the AdaBoost algorithm. The entire learning process requires only a small number of forward passes and no backward pass. Experiments show that PromptBoosting achieves state-of-the-art performance in multiple black-box few-shot classification tasks, and matches or outperforms full fine-tuning in both few-shot and standard learning paradigms, while training 10x faster than existing black-box methods.
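The ensembling step the abstract describes can be illustrated with a small sketch. This is not the paper's implementation: the weak learners below are hard-coded ±1 prediction tables standing in for (prompt, verbalizer) pairs whose decisions would, in PromptBoosting, come from cached LM forward passes. The AdaBoost loop itself (weighted-error learner selection, learner weight α, and exponential reweighting of examples) follows the standard binary algorithm.

```python
import math

# Toy binary labels (+1/-1) for 8 training examples.
y = [1, 1, 1, 1, -1, -1, -1, -1]

# Each "weak learner" stands in for a (prompt, verbalizer) pair; in
# PromptBoosting its predictions come from cached LM output distributions,
# so no extra forward passes are needed per boosting round.
weak_preds = [
    [1, 1, -1, 1, -1, -1, 1, -1],   # learner 0
    [1, -1, 1, 1, -1, 1, -1, -1],   # learner 1
    [-1, 1, 1, 1, 1, -1, -1, -1],   # learner 2
    [1, 1, 1, -1, -1, -1, -1, 1],   # learner 3
]

def adaboost(y, learners, rounds):
    n = len(y)
    w = [1.0 / n] * n              # example weights, initially uniform
    ensemble = []                  # list of (alpha, learner_index)
    for _ in range(rounds):
        # Select the learner with the lowest weighted training error.
        errs = [sum(wi for wi, yi, pi in zip(w, y, p) if yi != pi)
                for p in learners]
        j = min(range(len(learners)), key=errs.__getitem__)
        err = max(errs[j], 1e-10)
        if err >= 0.5:             # no learner better than chance: stop
            break
        alpha = 0.5 * math.log((1 - err) / err)
        ensemble.append((alpha, j))
        # Reweight: misclassified examples gain weight, correct ones lose it.
        w = [wi * math.exp(-alpha * yi * pi)
             for wi, yi, pi in zip(w, y, learners[j])]
        z = sum(w)
        w = [wi / z for wi in w]
    return ensemble

def predict(ensemble, learners, i):
    score = sum(alpha * learners[j][i] for alpha, j in ensemble)
    return 1 if score >= 0 else -1

ensemble = adaboost(y, weak_preds, rounds=3)
acc = sum(predict(ensemble, weak_preds, i) == y[i]
          for i in range(len(y))) / len(y)
```

Here each individual learner gets only 6 of 8 examples right, but after three boosting rounds the weighted vote classifies all 8 correctly, which is the mechanism that lets PromptBoosting turn many individually weak prompt/verbalizer pairings into a strong classifier.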
Authors: Bairu Hou, Joe O'Connor, Jacob Andreas, Shiyu Chang, Yang Zhang