Extensible Prompts for Language Models on Zero-shot Language Style Customization (2212.00616v2)
Abstract: We propose eXtensible Prompt (X-Prompt) for prompting an LLM beyond natural language (NL). X-Prompt instructs an LLM with not only NL but also an extensible vocabulary of imaginary words. Registering new imaginary words allows us to instruct the LLM to comprehend concepts that are difficult to describe with NL words, thereby making a prompt more descriptive. These imaginary words are also designed to be out-of-distribution (OOD) robust so that they can be (re)used like NL words across diverse prompts, distinguishing X-Prompt from soft prompts, which are fitted to in-distribution data. We propose context-augmented learning (CAL) to learn imaginary words for general usability, enabling them to work properly in OOD (unseen) prompts. We experiment with X-Prompt on zero-shot language style customization as a case study. The promising results of X-Prompt demonstrate its potential to facilitate advanced interaction beyond the natural language interface, bridging the communication gap between humans and LLMs.
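To make the idea concrete, below is a minimal sketch (not the authors' released code) of how an "imaginary word" could be registered as a new, trainable token embedding in a frozen causal LM, with a simplified stand-in for context-augmented learning that varies the NL context around the imaginary word. The model name, token name, prompt templates, training text, and hyperparameters are illustrative assumptions.

```python
# Sketch only: register an imaginary word as a new trainable embedding in a frozen LM,
# in the spirit of X-Prompt. All names and hyperparameters below are assumptions.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_name = "facebook/opt-125m"            # assumption: any causal LM could be used
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(model_name)

# 1) Register the imaginary word as a brand-new vocabulary item.
imaginary_word = "<style-x>"                 # hypothetical token name
tokenizer.add_tokens([imaginary_word])
model.resize_token_embeddings(len(tokenizer))
new_id = tokenizer.convert_tokens_to_ids(imaginary_word)

# 2) Freeze the LM; only the new token's embedding row will receive gradient updates.
for p in model.parameters():
    p.requires_grad = False
emb = model.get_input_embeddings()
emb.weight.requires_grad = True              # re-enable the embedding matrix ...

def zero_other_rows(grad):                   # ... but mask gradients for all old rows
    mask = torch.zeros_like(grad)
    mask[new_id] = 1.0
    return grad * mask

emb.weight.register_hook(zero_other_rows)
optimizer = torch.optim.Adam([emb.weight], lr=3e-4)

# 3) Simplified stand-in for context-augmented learning (CAL): pair the imaginary
#    word with varied natural-language contexts so it remains usable in unseen prompts.
contexts = [                                  # illustrative prompt templates
    f"Rewrite this in the style of {imaginary_word}: ",
    f"A sentence written like {imaginary_word}: ",
]
example_target = "Thou art more lovely and more temperate."  # illustrative training text

model.train()
for step in range(100):
    for ctx in contexts:
        batch = tokenizer(ctx + example_target, return_tensors="pt")
        out = model(**batch, labels=batch["input_ids"])
        out.loss.backward()                   # gradient flows only into the new row
        optimizer.step()
        optimizer.zero_grad()
```

The gradient-masking hook keeps the pretrained LM and all existing word embeddings intact, so the learned imaginary word can later be dropped into arbitrary NL prompts; varying the surrounding context during training is a rough proxy for the OOD robustness that CAL targets.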