Conv-CoA: Improving Open-domain Question Answering in Large Language Models via Conversational Chain-of-Action (2405.17822v1)
Abstract: We present a Conversational Chain-of-Action (Conv-CoA) framework for Open-domain Conversational Question Answering (OCQA). Compared with the literature, Conv-CoA addresses three major challenges: (i) unfaithful hallucination that is inconsistent with real-time or domain facts, (ii) weak reasoning performance in conversational scenarios, and (iii) unsatisfactory performance in conversational information retrieval. Our key contribution is a dynamic reasoning-retrieval mechanism that extracts the intent of a question and decomposes it into a reasoning chain, which is then solved via systematic prompting, pre-designed actions, updates to a Contextual Knowledge Set (CKS), and a novel Hopfield-based retriever. Methodologically, we propose a resource-efficient Hopfield retriever to improve the efficiency and accuracy of conversational information retrieval within our actions. Additionally, we propose a conversational multi-reference faith score (Conv-MRFS) to verify and resolve conflicts between retrieved knowledge and answers across conversation turns. Empirically, we compare our framework against 23 state-of-the-art methods spanning five research directions on two public benchmarks; Conv-CoA outperforms these methods in both accuracy and efficiency.
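The retriever builds on modern Hopfield networks, whose one-step retrieval update is ξ ← Xᵀ softmax(β X ξ) for a matrix X of stored patterns (Ramsauer et al., 2020). Below is a minimal numpy sketch of that update, offered only to make the mechanism concrete: the function names, the plain-numpy setting, and the choice of β are illustrative assumptions, not the paper's implementation.

```python
# Minimal sketch of modern-Hopfield-style retrieval (assumptions noted above).
import numpy as np

def softmax(x: np.ndarray) -> np.ndarray:
    e = np.exp(x - np.max(x))  # shift for numerical stability
    return e / e.sum()

def hopfield_retrieve(memories: np.ndarray, query: np.ndarray,
                      beta: float = 8.0, steps: int = 1) -> np.ndarray:
    """One-or-few-step retrieval: xi <- X^T softmax(beta * X xi),
    where rows of `memories` (X) are stored passage embeddings."""
    xi = query
    for _ in range(steps):
        xi = memories.T @ softmax(beta * (memories @ xi))
    return xi

# Usage: recover the stored pattern closest to a noisy query embedding.
rng = np.random.default_rng(0)
X = rng.normal(size=(100, 64))          # 100 stored passage embeddings
q = X[3] + 0.1 * rng.normal(size=64)    # noisy version of pattern 3
out = hopfield_retrieve(X, q)
best = int(np.argmax(X @ out))          # index of the recalled memory (3)
```

With a large inverse temperature β, a single update already lands close to the nearest stored pattern, which is what makes this retrieval step cheap relative to an iterative search.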
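The abstract does not spell out how Conv-MRFS is computed. One plausible reading, given purely as an illustrative assumption, is a word-overlap precision/recall blend between a candidate answer and each retrieved reference, aggregated over references; the actual formula, weights, and aggregation in the paper may differ.

```python
def faith_score(answer: str, reference: str) -> float:
    """Word-overlap precision/recall blend between an answer and one
    reference. The 50/50 weighting is an illustrative assumption."""
    a, r = answer.lower().split(), reference.lower().split()
    if not a or not r:
        return 0.0
    overlap = len(set(a) & set(r))
    precision, recall = overlap / len(a), overlap / len(r)
    return 0.5 * precision + 0.5 * recall

def conv_mrfs(answer: str, references: list[str]) -> float:
    """Multi-reference aggregation: taking the max over references is an
    assumption here; an average over references is equally plausible."""
    if not references:
        return 0.0
    return max(faith_score(answer, ref) for ref in references)
```

Under this reading, an answer whose score falls below a threshold would be flagged as conflicting with the retrieved knowledge and sent back through the action chain for revision.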