Chain-of-Specificity: An Iteratively Refining Method for Eliciting Knowledge from Large Language Models (2402.15526v1)

Published 20 Feb 2024 in cs.AI and cs.LG

Abstract: LLMs exhibit remarkable generative capabilities, enabling the generation of valuable information. Despite these advancements, previous research found that LLMs sometimes struggle with adhering to specific constraints (e.g., in a specific place or at a specific time), at times even overlooking them, which leads to responses that are either too generic or not fully satisfactory. Existing approaches attempted to address this issue by decomposing or rewriting input instructions, yet they fall short in adequately emphasizing specific constraints and in unlocking the underlying knowledge (e.g., programming within the context of software development). In response, this paper proposes a simple yet effective method named Chain-of-Specificity (CoS). Specifically, CoS iteratively emphasizes the specific constraints in the input instructions, unlocks knowledge within LLMs, and refines responses. Experiments conducted on publicly available and self-built complex datasets demonstrate that CoS outperforms existing methods in enhancing generated content, especially in terms of specificity. In addition, as the number of specific constraints increases, other baselines falter while CoS still performs well. Moreover, the authors show that distilling responses generated by CoS effectively enhances the ability of smaller models to follow constrained instructions. The resources of this paper will be released for further research.
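The abstract describes CoS only at a high level. Below is a minimal, hypothetical sketch of what an iterative constraint-emphasizing refinement loop in the spirit of CoS could look like; the prompt wording, the `chat` callable, and the function names are illustrative assumptions, not the paper's actual implementation.

```python
# Hypothetical sketch of a Chain-of-Specificity-style loop: start from an
# initial answer, then re-emphasize each specific constraint in turn and ask
# the model to refine the answer so it satisfies that constraint.
# The `chat` callable stands in for any LLM completion API.

from typing import Callable, List


def chain_of_specificity(
    instruction: str,
    constraints: List[str],
    chat: Callable[[str], str],
) -> str:
    """Iteratively emphasize each specific constraint and refine the response."""
    # Step 1: obtain an initial, possibly generic response.
    response = chat(f"Instruction: {instruction}\nAnswer:")

    # Step 2: for each specific constraint, ask the model to check the current
    # answer against that constraint and revise it accordingly.
    for constraint in constraints:
        refine_prompt = (
            f"Instruction: {instruction}\n"
            f"Current answer: {response}\n"
            f"Specific constraint to emphasize: {constraint}\n"
            "Revise the answer so it fully satisfies this constraint while "
            "remaining consistent with the instruction. "
            "Return only the revised answer."
        )
        response = chat(refine_prompt)

    return response


if __name__ == "__main__":
    # Placeholder chat function for demonstration; replace with a real LLM call.
    def fake_chat(prompt: str) -> str:
        return "Plan: visit a museum in Kyoto on a rainy afternoon within budget."

    answer = chain_of_specificity(
        instruction="Plan a one-day trip.",
        constraints=["in Kyoto", "on a rainy day", "with a budget under $100"],
        chat=fake_chat,
    )
    print(answer)
```

In this reading, each pass restates one constraint explicitly so the model cannot overlook it, which matches the abstract's claim that CoS "iteratively emphasizes the specific constraints" and refines the response; the exact prompting strategy may differ in the paper.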

Authors (7)
  1. Kaiwen Wei (6 papers)
  2. Jingyuan Zhang (50 papers)
  3. Hongzhi Zhang (33 papers)
  4. Fuzheng Zhang (60 papers)
  5. Di Zhang (231 papers)
  6. Li Jin (69 papers)
  7. Yue Yu (343 papers)