Helping Language Models Learn More: Multi-dimensional Task Prompt for Few-shot Tuning (2312.08027v1)
Abstract: LLMs can be used as accessible and intelligent chatbots by constructing natural language queries and directly inputting the prompt into the LLM. However, different prompt constructions often lead to uncertainty in the answers, making it hard to exploit the specific knowledge of LLMs (like ChatGPT). To alleviate this, we use an interpretable structure to explain the prompt-learning principle in LLMs, which shows that the effectiveness of LLMs is determined by position changes of the task-related tokens. We therefore propose MTPrompt, a multi-dimensional task prompt learning method based on task-related object, summary, and task description information. By automatically building and searching for appropriate prompts, MTPrompt achieves the best results in few-shot settings on five different datasets. In addition, we demonstrate the effectiveness and stability of our method across different experimental settings and in ablation experiments. When interacting with LLMs, embedding more task-related information into prompts makes it easier to elicit the knowledge embedded in LLMs.
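As a rough illustration of the idea in the abstract, the sketch below assembles a prompt from the three task dimensions (task description, task-related objects, and a task summary) around an input text and a mask slot for a masked-LM-style few-shot classifier. The function name, template wording, and label words are illustrative assumptions, not the paper's actual templates.

```python
# Minimal sketch (not the paper's implementation) of a multi-dimensional
# task prompt: it concatenates a task description, task-related objects
# (e.g., candidate label words), and a short task summary with the input
# and a [MASK] slot that a masked language model can fill with a label word.

def build_mt_prompt(text: str,
                    task_description: str,
                    task_objects: list[str],
                    task_summary: str,
                    mask_token: str = "[MASK]") -> str:
    """Assemble one prompt string from the three task dimensions and the input."""
    objects = ", ".join(task_objects)
    return (
        f"Task: {task_description} "
        f"Relevant objects: {objects}. "
        f"Summary: {task_summary} "
        f"Input: {text} "
        f"It was {mask_token}."
    )


if __name__ == "__main__":
    # Hypothetical sentiment-classification example.
    prompt = build_mt_prompt(
        text="The movie was a delightful surprise from start to finish.",
        task_description="classify the sentiment of a movie review.",
        task_objects=["great", "terrible"],
        task_summary="a short review expressing an overall opinion.",
    )
    print(prompt)
```

In a prompt-tuning setup, a string like this would be tokenized and fed to the language model, and the model's prediction at the mask position would be mapped back to a class label; the paper additionally searches over such constructions automatically rather than fixing one template by hand.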
- Jinta Weng (4 papers)
- Jiarui Zhang (43 papers)
- Yue Hu (220 papers)
- Daidong Fa (1 paper)
- Xiaofeng Xu (1 paper)
- Heyan Huang (107 papers)