
Plan-on-Graph: Self-Correcting Adaptive Planning of Large Language Model on Knowledge Graphs (2410.23875v1)

Published 31 Oct 2024 in cs.AI

Abstract: LLMs have shown remarkable reasoning capabilities on complex tasks, but they still suffer from out-of-date knowledge, hallucinations, and opaque decision-making. In contrast, Knowledge Graphs (KGs) can provide explicit and editable knowledge for LLMs to alleviate these issues. Existing paradigm of KG-augmented LLM manually predefines the breadth of exploration space and requires flawless navigation in KGs. However, this paradigm cannot adaptively explore reasoning paths in KGs based on the question semantics and self-correct erroneous reasoning paths, resulting in a bottleneck in efficiency and effect. To address these limitations, we propose a novel self-correcting adaptive planning paradigm for KG-augmented LLM named Plan-on-Graph (PoG), which first decomposes the question into several sub-objectives and then repeats the process of adaptively exploring reasoning paths, updating memory, and reflecting on the need to self-correct erroneous reasoning paths until arriving at the answer. Specifically, three important mechanisms of Guidance, Memory, and Reflection are designed to work together, to guarantee the adaptive breadth of self-correcting planning for graph reasoning. Finally, extensive experiments on three real-world datasets demonstrate the effectiveness and efficiency of PoG.

Plan-on-Graph: Self-Correcting Adaptive Planning of LLMs on Knowledge Graphs

The paper introduces "Plan-on-Graph" (PoG), a paradigm shift for enhancing LLMs with Knowledge Graphs (KGs). The integration of KGs is highlighted as a promising approach to address persistent limitations in LLMs, such as outdated knowledge, hallucinatory outputs, and opaque decision-making. Conventional approaches rely on manually predefined KG exploration spaces, which may lead to inefficiencies and errors when the requisite breadth or paths are misaligned with the query semantics. The authors present a self-correcting adaptive planning mechanism designed to iterate over reasoning paths with flexibility and responsiveness to improve the reliability and effectiveness of LLM-KG interactions.

Core Contributions

The paper introduces three mechanisms designed to work in concert:

  • Guidance Mechanism: PoG begins by decomposing the user question into sub-objectives, each representing conditions derived from the query. This approach guides the reasoning process with a structured pathway, allowing dynamic adjustment of the exploration breadth in KGs.
  • Memory Mechanism: Information retrieval and reasoning paths are continuously recorded to facilitate historical analysis and reflection. The memory captures three aspects: subgraph structures, computed reasoning paths, and current statuses of sub-objectives. This aids in maintaining a coherent state across iterations, allowing the LLM to understand and adapt based on evolving data.
  • Reflection Mechanism: Reflection is key to PoG's self-correction capability. The LLM is prompted to evaluate whether the currently explored paths adequately address the sub-objectives. If they are deemed insufficient, reflection directs the LLM to correct course by revisiting structures recorded in memory and exploring alternative paths, as in the sketch after this list.
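
To make the interplay of these mechanisms concrete, the sketch below shows one way the loop described in the abstract (decompose, then repeatedly explore, update memory, and reflect) could be organized. It is a minimal illustration under assumed interfaces: the `llm` callable, the `kg.neighbors` accessor, the prompt wording, and the `BACKTRACK`/`CONTINUE` protocol are hypothetical placeholders, not the paper's implementation.

```python
from dataclasses import dataclass, field


@dataclass
class Memory:
    """Tracks the three aspects recorded by PoG's Memory mechanism."""
    subgraph: list = field(default_factory=list)           # explored KG triples
    reasoning_paths: list = field(default_factory=list)    # reasoning paths built so far
    sub_objective_status: dict = field(default_factory=dict)  # sub-objective -> resolved?


def plan_on_graph(question, llm, kg, max_iters=10):
    """Minimal sketch of the PoG loop: decompose, then explore / memorize / reflect.

    `llm(prompt) -> str` and `kg.neighbors(entity) -> [(head, relation, tail), ...]`
    are assumed interfaces, not the authors' actual APIs.
    """
    # Guidance: decompose the question into sub-objectives that steer exploration.
    sub_objectives = [s for s in llm(f"Decompose into sub-objectives: {question}").splitlines() if s]
    memory = Memory(sub_objective_status={s: False for s in sub_objectives})
    frontier = [e for e in llm(f"List the topic entities in: {question}").splitlines() if e]

    for _ in range(max_iters):
        # Adaptive exploration: the LLM selects which candidate triples to follow,
        # so the breadth of each step depends on the unresolved sub-objectives.
        unresolved = [s for s, done in memory.sub_objective_status.items() if not done]
        candidates = [t for e in frontier for t in kg.neighbors(e)]
        picks = llm(f"Sub-objectives: {unresolved}. Candidate triples: "
                    f"{list(enumerate(candidates))}. Reply with relevant indices, comma-separated.")
        chosen = [candidates[int(i)] for i in picks.split(",") if i.strip().isdigit()]

        # Memory update: record the newly explored subgraph and extend the reasoning paths.
        memory.subgraph.extend(chosen)
        memory.reasoning_paths.append(chosen)

        # Reflection: decide whether the accumulated paths answer the question, keep
        # exploring, or self-correct by backtracking to an entity recorded in memory.
        verdict = llm(f"Question: {question}\nPaths so far: {memory.reasoning_paths}\n"
                      "Reply with a final answer, CONTINUE, or 'BACKTRACK <entity>'.")
        if verdict.startswith("BACKTRACK"):
            frontier = [verdict.split(maxsplit=1)[1]]   # resume from a previously seen entity
        elif verdict.startswith("CONTINUE"):
            frontier = [tail for (_head, _rel, tail) in chosen]  # advance to the new tail entities
        else:
            return verdict                              # the reflection step produced the answer
    return None  # no answer within the iteration budget
```

The point of contrast with fixed-breadth paradigms is that both the number of triples followed at each step and the decision to backtrack are delegated to the LLM, conditioned on the unresolved sub-objectives and the recorded memory.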

Experimental Validation and Numerical Results

Extensive experiments evaluate PoG on three datasets: ComplexWebQuestions (CWQ), WebQuestionsSP (WebQSP), and GrailQA, all built on the Freebase knowledge base. PoG performs particularly well on non-I.I.D. tasks, suggesting robustness under distribution shift. On CWQ, for instance, PoG achieves a notable accuracy improvement over existing KG-augmented methods. These results suggest that adaptive, self-correcting reasoning frameworks substantially improve LLM performance when interfacing with large-scale structured knowledge sources such as KGs.

Potential Implications and Future Directions

Practically, PoG's framework could be transformative in domains requiring complex reasoning over rapidly evolving datasets, such as real-time data processing and dynamic decision-making systems. Theoretically, it advances the field by bridging semantic reasoning in LLMs and structured data navigation, setting a precedent for future explorations in hybrid AI systems.

Future directions could include refining confidence assessments in LLM evaluations to better anticipate and manage uncertainty during path exploration. Another avenue is reducing the number of exploration and correction steps required, improving efficiency without sacrificing answer accuracy. Moreover, handling non-standardized or incomplete queries remains a fertile ground for improving the adaptability of such systems.

In conclusion, "Plan-on-Graph" marks a significant advance in addressing the adaptability and reliability issues of LLMs interfacing with KGs, ensuring that both theoretical frameworks and practical applications are poised for enhanced performance in complex reasoning tasks.

Authors (6)
  1. Liyi Chen
  2. Panrong Tong
  3. Zhongming Jin
  4. Ying Sun
  5. Jieping Ye
  6. Hui Xiong