Say More with Less: Understanding Prompt Learning Behaviors through Gist Compression (2402.16058v1)
Abstract: LLMs require lengthy prompts as the input context to produce output aligned with user intentions, a process that incurs extra costs during inference. In this paper, we propose the Gist COnditioned deCOding (Gist-COCO) model, introducing a novel method for compressing prompts that can also assist prompt interpretation and engineering. Gist-COCO employs an encoder-decoder based LLM and incorporates an additional encoder as a plugin module to compress prompts together with their inputs using gist tokens. It finetunes the compression plugin module and uses the representations of the gist tokens to emulate the raw prompts in the vanilla LLM. By verbalizing the gist token representations into gist prompts, the compression ability of Gist-COCO can be generalized to different LLMs at high compression rates. Our experiments demonstrate that Gist-COCO outperforms previous prompt compression models on both passage and instruction compression tasks. Further analysis of the gist verbalization results shows that gist prompts serve different functions in aiding LLMs: they may directly provide potential answers, generate a chain of thought, or simply repeat the inputs. All data and code are available at https://github.com/OpenMatch/Gist-COCO .
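The sketch below is a rough, inference-only illustration of the compression idea described in the abstract: gist tokens are appended to the prompt and input, the encoder's hidden states at the gist positions are kept as the compressed prompt, and the decoder is then conditioned on those states instead of the raw prompt. The backbone choice (google/flan-t5-base), the number of gist tokens, and the helper names (`compress_prompt`, `decode_with_gist`) are illustrative assumptions, not the authors' implementation, and the finetuning of the plugin encoder is omitted.

```python
# Minimal sketch of gist-style prompt compression with an encoder-decoder model.
# Assumptions: backbone, gist-token count, and helper names are illustrative only;
# the actual Gist-COCO plugin encoder is finetuned, which this sketch omits.
import torch
from transformers import AutoTokenizer, T5ForConditionalGeneration
from transformers.modeling_outputs import BaseModelOutput

MODEL_NAME = "google/flan-t5-base"   # assumed backbone for illustration
N_GIST = 16                          # assumed number of gist tokens

tokenizer = AutoTokenizer.from_pretrained(MODEL_NAME)
model = T5ForConditionalGeneration.from_pretrained(MODEL_NAME)

# Register gist tokens as special tokens and grow the embedding table accordingly.
gist_tokens = [f"<gist_{i}>" for i in range(N_GIST)]
tokenizer.add_special_tokens({"additional_special_tokens": gist_tokens})
model.resize_token_embeddings(len(tokenizer))


def compress_prompt(prompt: str, user_input: str) -> torch.Tensor:
    """Encode prompt + input + gist tokens; keep only the hidden states at the
    gist positions as the compressed representation of the prompt."""
    text = f"{prompt} {user_input} " + " ".join(gist_tokens)
    enc = tokenizer(text, return_tensors="pt")
    hidden = model.encoder(
        input_ids=enc.input_ids, attention_mask=enc.attention_mask
    ).last_hidden_state                      # (1, seq_len, d_model)
    # The tokenizer appends </s>; the gist tokens sit just before it.
    return hidden[:, -(N_GIST + 1):-1, :]    # (1, N_GIST, d_model)


def decode_with_gist(gist_states: torch.Tensor, user_input: str) -> str:
    """Condition the (frozen) decoder on gist representations in place of the raw prompt."""
    enc = tokenizer(user_input, return_tensors="pt")
    input_states = model.encoder(
        input_ids=enc.input_ids, attention_mask=enc.attention_mask
    ).last_hidden_state
    # Prepend the compressed prompt (gist states) to the encoded user input.
    fused = torch.cat([gist_states, input_states], dim=1)
    out = model.generate(
        encoder_outputs=BaseModelOutput(last_hidden_state=fused),
        max_new_tokens=64,
    )
    return tokenizer.decode(out[0], skip_special_tokens=True)


if __name__ == "__main__":
    prompt = "Answer the question using only a short phrase."
    question = "What is the capital of France?"
    gist = compress_prompt(prompt, question)
    print(decode_with_gist(gist, question))
```

In this sketch the gist states stand in for the full prompt at decoding time; the paper's verbalization step, which turns such representations back into textual gist prompts so that other LLMs can consume them, is not shown here.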
Authors: Xinze Li, Zhenghao Liu, Chenyan Xiong, Shi Yu, Yukun Yan, Shuo Wang, Ge Yu