Multi-hop Question Answering under Temporal Knowledge Editing (2404.00492v1)

Published 30 Mar 2024 in cs.CL, cs.AI, and cs.LG

Abstract: Multi-hop question answering (MQA) under knowledge editing (KE) has garnered significant attention in the era of LLMs. However, existing models for MQA under KE exhibit poor performance when dealing with questions containing explicit temporal contexts. To address this limitation, we propose a novel framework, namely TEMPoral knowLEdge augmented Multi-hop Question Answering (TEMPLE-MQA). Unlike previous methods, TEMPLE-MQA first constructs a time-aware graph (TAG) to store edit knowledge in a structured manner. Then, through our proposed inference path, structural retrieval, and joint reasoning stages, TEMPLE-MQA effectively discerns temporal contexts within the question query. Experiments on benchmark datasets demonstrate that TEMPLE-MQA significantly outperforms baseline models. Additionally, we contribute a new dataset, namely TKEMQA, which serves as the inaugural benchmark tailored specifically for MQA with temporal scopes.

Enhancing Multi-Hop Question Answering with Temporal Knowledge Using Temple-MQA

Introduction

The paper focuses on multi-hop question answering (MQA) under knowledge editing (KE), particularly in scenarios that require managing temporal knowledge edits efficiently. Existing methods struggle with questions that demand awareness of temporal context. The proposed Temple-MQA framework addresses this by integrating a time-aware graph (TAG), which tracks the ripple effects of knowledge edits over time, preserving temporal context and reducing the risk of LLM hallucination.

Addressing the Limitations of Existing Approaches

The primary challenge addressed by Temple-MQA is the ineffective handling of temporal information in existing MQA models that rely on knowledge editing. The conventional dense retrieval systems used in KE do not structure information temporally, so mismatched or outdated edits are often retrieved. The problem is amplified for questions that explicitly reference a temporal context, where the retrieval mechanism's limitations are particularly evident, as the paper's comparative experiments illustrate.
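
To make the failure mode concrete, here is a toy sketch (hypothetical entities and a stand-in encoder, not the paper's setup) of why a purely similarity-based retriever cannot distinguish two edits about the same subject and relation that differ only in temporal validity:

```python
import hashlib
import numpy as np

def embed(text: str) -> np.ndarray:
    # Stand-in for a real sentence encoder: a deterministic toy embedding.
    seed = int(hashlib.md5(text.encode()).hexdigest(), 16) % (2**32)
    return np.random.default_rng(seed).normal(size=8)

def cosine(a: np.ndarray, b: np.ndarray) -> float:
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))

edits = [
    {"fact": "Alice was the CEO of AcmeCorp from 2010 to 2018"},
    {"fact": "Bob has been the CEO of AcmeCorp since 2018"},
]

query = "Who was the CEO of AcmeCorp in 2015?"
q = embed(query)

# Similarity alone carries no notion of temporal validity: whichever edit's
# phrasing happens to embed closer to the query is returned, even if its
# time span does not cover 2015.
best = max(edits, key=lambda e: cosine(q, embed(e["fact"])))
print("retrieved (time-agnostic):", best["fact"])
```

Because the score depends only on embedding similarity, the retriever has no principled way to prefer the edit whose validity span covers 2015; this is the gap the time-aware graph described in the next section is designed to close.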

Temple-MQA Framework

Temple-MQA introduces several innovative components to tackle these issues:

  1. Time-Aware Graph (TAG): A structured graph that stores knowledge edits together with their temporal contexts, enabling more precise retrieval (a simplified sketch follows this list).
  2. Improved Retrieval Process: Includes data augmentation techniques for better entity recognition and disambiguation, alongside the use of context-dependent filters to enhance retrieval accuracy.
  3. Joint Reasoning and Inference Path Planning: Uses LLMs to plan an inference path for querying the graph, enabling coherent, step-by-step reasoning that respects the structure of the TAG.
  4. Evaluation and Dataset Contribution: Extensive tests on benchmark datasets validate Temple-MQA's superior performance. In addition, the paper introduces TKEMQA, a new benchmark dataset tailored for MQA with temporal scopes.
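
The following minimal sketch (in Python, with hypothetical entities; not the authors' implementation) illustrates how these pieces can fit together: edits are stored as time-scoped edges keyed by (subject, relation), structural retrieval filters candidates by temporal scope, and a planned inference path is followed hop by hop:

```python
from collections import defaultdict
from typing import Optional

class TimeAwareGraph:
    def __init__(self):
        # (subject, relation) -> list of (object, start_year, end_year)
        self.edges = defaultdict(list)

    def add_edit(self, subj: str, rel: str, obj: str, start: int, end: int):
        self.edges[(subj, rel)].append((obj, start, end))

    def retrieve(self, subj: str, rel: str, year: int) -> Optional[str]:
        # Structural retrieval: match on (subject, relation), then filter
        # the candidate edits by their temporal scope.
        for obj, start, end in self.edges.get((subj, rel), []):
            if start <= year <= end:
                return obj
        return None

tag = TimeAwareGraph()
tag.add_edit("AcmeCorp", "CEO", "Alice", 2010, 2018)
tag.add_edit("AcmeCorp", "CEO", "Bob", 2018, 2024)
tag.add_edit("Alice", "citizen_of", "Canada", 1970, 2024)

# Inference path for "What is the citizenship of AcmeCorp's CEO in 2015?"
# (in the paper an LLM plans this relation chain; here it is hard-coded).
path = [("AcmeCorp", "CEO"), (None, "citizen_of")]
entity, year = None, 2015
for subj, rel in path:
    entity = tag.retrieve(subj if subj is not None else entity, rel, year)
print(entity)  # -> "Canada", via Alice, the CEO whose tenure covers 2015
```

In Temple-MQA the inference path is produced by an LLM rather than hard-coded, and retrieval operates over the full TAG rather than exact string keys, but temporal filtering at each hop is the core idea.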

Experimental Validation

Temple-MQA demonstrates significant improvements over seven existing baseline models across different evaluation metrics. The gains are most evident in scenarios involving complex temporal constraints and large volumes of edits, where traditional models struggle. The newly proposed TKEMQA dataset also serves as a robust platform for testing how well MQA models handle explicit temporal knowledge.

Conclusions and Future Work

The research delineates a clear path for integrating structured temporal knowledge handling into LLM-driven MQA frameworks. The TAG component not only improves retrieval of edited knowledge but also points toward more context-aware question answering systems. Future work could explore automated optimization of TAG construction and real-time adaptation to new knowledge edits, extending the approach to dynamically changing information domains.

Authors (8)
  1. Keyuan Cheng (9 papers)
  2. Gang Lin (3 papers)
  3. Haoyang Fei (2 papers)
  4. Yuxuan Zhai (2 papers)
  5. Lu Yu (87 papers)
  6. Muhammad Asif Ali (18 papers)
  7. Lijie Hu (50 papers)
  8. Di Wang (407 papers)
Citations (12)