CoGenesis: A Framework Collaborating Large and Small Language Models for Secure Context-Aware Instruction Following (2403.03129v2)
Abstract: As language models advance, their exposure to private data is increasingly unavoidable, and deploying them (especially smaller ones) on personal devices such as PCs and smartphones has become a prevailing trend. In contexts rich with user information, enabling models to both safeguard user privacy and execute instructions effectively is an essential research goal. In this paper, we propose CoGenesis, a collaborative generation framework that integrates large models (hosted on cloud infrastructure) and small models (deployed on local devices) to address privacy concerns in a principled way. We first design a pipeline to create personalized writing-instruction datasets enriched with extensive context details as a testbed for this research problem. We then introduce two variants of CoGenesis, based on sketches and logits respectively. Our experimental findings, drawn from our synthesized dataset and two additional open-source datasets, indicate that: 1) large-scale models perform well when provided with user context but struggle in its absence; 2) specialized smaller models fine-tuned on the synthetic dataset show promise but still lag behind their larger counterparts; and 3) our CoGenesis framework, utilizing mixed-scale models, achieves competitive performance, offering a feasible solution to these privacy issues.
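To make the logits-based variant concrete, the sketch below illustrates one plausible form of mixed-scale collaboration: at each decoding step, the cloud model (which sees only a privacy-scrubbed prompt) and the on-device model (which sees the full private context) each produce next-token logits, and the device fuses them before sampling. The weighted-sum fusion rule, the `alpha` parameter, and the function names here are illustrative assumptions, not the paper's actual formulation.

```python
import numpy as np

def fused_next_token_logits(cloud_logits, local_logits, alpha=0.5):
    """Fuse next-token logits from a cloud model (no private context)
    with logits from an on-device model (full private context).

    NOTE: a weighted sum is one simple fusion rule used here for
    illustration; CoGenesis's exact combination may differ.
    """
    cloud = np.asarray(cloud_logits, dtype=float)
    local = np.asarray(local_logits, dtype=float)
    return alpha * cloud + (1.0 - alpha) * local

def greedy_token(logits):
    """Pick the highest-scoring token id from fused logits."""
    return int(np.argmax(logits))

# Toy 3-token vocabulary: the local model's private context shifts
# the decision away from the cloud model's context-free preference.
cloud = [2.0, 0.5, 0.1]   # cloud model favors token 0
local = [0.0, 3.0, 0.1]   # on-device model favors token 1
fused = fused_next_token_logits(cloud, local, alpha=0.4)
token = greedy_token(fused)  # token 1: the private context dominates
```

Only the fused choice (and, at most, the sanitized prompt) involves the cloud; the raw private context never leaves the device, which is the privacy property the framework targets.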