Conv-CoA: Improving Open-domain Question Answering in Large Language Models via Conversational Chain-of-Action (2405.17822v1)

Published 28 May 2024 in cs.CL and cs.AI

Abstract: We present a Conversational Chain-of-Action (Conv-CoA) framework for Open-domain Conversational Question Answering (OCQA). Compared with the literature, Conv-CoA addresses three major challenges: (i) unfaithful hallucinations that are inconsistent with real-time or domain facts, (ii) weak reasoning performance in conversational scenarios, and (iii) unsatisfactory performance in conversational information retrieval. Our key contribution is a dynamic reasoning-retrieval mechanism that extracts the intent of the question and decomposes it into a reasoning chain, which is solved through systematic prompting, pre-designed actions, updates to the Contextual Knowledge Set (CKS), and a novel Hopfield-based retriever. Methodologically, we propose a resource-efficient Hopfield retriever to enhance the efficiency and accuracy of conversational information retrieval within our actions. Additionally, we propose a conversational multi-reference faith score (Conv-MRFS) to verify and resolve conflicts between retrieved knowledge and answers in conversations. Empirically, we compare our framework against 23 state-of-the-art methods across five research directions and two public benchmarks. These comparisons demonstrate that Conv-CoA outperforms the other methods in both accuracy and efficiency.
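To make the mechanism concrete, the Python sketch below gives one plausible reading of the loop the abstract describes: intent extraction, chain decomposition, action-driven retrieval into the CKS, and a faith-score check before answering. Every name here (`answer_turn`, `ContextualKnowledgeSet`, `conv_mrfs`, the `llm` and `retriever` callables) is a hypothetical placeholder; the abstract names these components but does not specify their interfaces, and the stand-in faith score is not the published Conv-MRFS formula.

```python
# Illustrative skeleton of a Conv-CoA-style answering loop.
# Every name below is a hypothetical stand-in, not the paper's API.
from dataclasses import dataclass, field


@dataclass
class ContextualKnowledgeSet:
    """Running store of facts retrieved over the conversation (the CKS)."""
    facts: list = field(default_factory=list)

    def update(self, new_facts):
        self.facts.extend(new_facts)


def conv_mrfs(answer: str, facts: list) -> float:
    """Placeholder faith score: fraction of answer tokens found in the
    retrieved facts. The paper's Conv-MRFS is defined differently."""
    tokens = set(answer.lower().split())
    support = set(" ".join(facts).lower().split())
    return len(tokens & support) / max(len(tokens), 1)


def answer_turn(question, history, cks, llm, retriever, threshold=0.5):
    # 1. Extract the question's intent and decompose it into a reasoning
    #    chain (the abstract's dynamic reasoning-retrieval mechanism).
    intent = llm(f"Extract the intent of: {question}\nHistory: {history}")
    steps = llm(f"Decompose into a reasoning chain:\n{intent}").splitlines()

    # 2. Resolve each step with a pre-designed action; retrieved knowledge
    #    is folded back into the CKS so later steps can reuse it.
    for step in steps:
        cks.update(retriever(step, cks))  # e.g. the Hopfield-based retriever

    # 3. Draft an answer grounded only in the accumulated CKS.
    draft = llm(f"Answer {question!r} using only these facts: {cks.facts}")

    # 4. Verify against the retrieved knowledge; revise on conflict.
    if conv_mrfs(draft, cks.facts) < threshold:
        draft = llm(f"Revise this answer to agree with the facts: {draft}")
    return draft
```

The design point the abstract emphasizes is the feedback loop: each resolved step enriches the CKS, so later steps and the final answer draw on knowledge accumulated across the whole chain, and a low faith score triggers revision rather than a silently hallucinated answer.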

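As for what a "Hopfield-based retriever" means in practice: modern continuous Hopfield networks (Ramsauer et al., 2020, "Hopfield networks is all you need") retrieve stored patterns with a single softmax-attention update. It is plausible, though the abstract does not say, that Conv-CoA's retriever builds on this rule; the resource-efficiency modifications the paper claims are not described in the abstract and are not reflected below. This NumPy sketch shows only the standard one-step retrieval update.

```python
import numpy as np


def hopfield_retrieve(queries, memories, beta=8.0):
    """One-step modern Hopfield retrieval (Ramsauer et al., 2020):
        Z = softmax(beta * Q @ M^T) @ M
    Each query is pulled toward the stored pattern(s) it most resembles.

    queries:  (n_q, d) query embeddings, e.g. encoded sub-questions
    memories: (n_m, d) stored pattern embeddings, e.g. CKS entries
    beta:     inverse temperature; larger beta -> sharper, near one-hot recall
    """
    scores = beta * queries @ memories.T           # (n_q, n_m) similarities
    scores -= scores.max(axis=1, keepdims=True)    # numerical stability
    attn = np.exp(scores)
    attn /= attn.sum(axis=1, keepdims=True)        # row-wise softmax
    return attn @ memories                         # retrieved patterns


# Toy usage: three stored facts, one noisy query close to the second.
rng = np.random.default_rng(0)
memories = rng.normal(size=(3, 16))
query = memories[1] + 0.1 * rng.normal(size=16)
out = hopfield_retrieve(query[None, :], memories)
print(np.argmax(memories @ out[0]))  # -> 1: recalls the nearest stored fact
```

Note that this update is exactly the computation of a single unprojected attention head, which is why modern Hopfield layers drop cleanly into transformer-based retrieval stacks; beta controls how sharply a query snaps to one stored fact rather than a blend of several.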