
Don't Forget to Connect! Improving RAG with Graph-based Reranking (2405.18414v1)

Published 28 May 2024 in cs.CL, cs.AI, cs.LG, and cs.SI

Abstract: Retrieval Augmented Generation (RAG) has greatly improved the performance of LLM responses by grounding generation with context from existing documents. These systems work well when documents are clearly relevant to a question context. But what about when a document has partial information, or less obvious connections to the context? And how should we reason about connections between documents? In this work, we seek to answer these two core questions about RAG generation. We introduce G-RAG, a reranker based on graph neural networks (GNNs) between the retriever and reader in RAG. Our method combines both connections between documents and semantic information (via Abstract Meaning Representation graphs) to provide a context-informed ranker for RAG. G-RAG outperforms state-of-the-art approaches while having a smaller computational footprint. Additionally, we assess the performance of PaLM 2 as a reranker and find it to significantly underperform G-RAG. This result emphasizes the importance of reranking for RAG even when using LLMs.

An Analysis of Retrieval Augmented Generation with GNN Rerankers in Open-Domain Question Answering

This paper introduces G-RAG, an advanced reranking mechanism aimed at enhancing Retrieval Augmented Generation (RAG) systems for Open-Domain Question Answering (ODQA). RAG enhances LLM outputs by integrating context from retrieved documents. However, traditional RAG struggles when dealing with documents that partially relate to the query or possess implicit connections requiring a sophisticated understanding of document interrelationships. G-RAG addresses these shortcomings by leveraging Graph Neural Networks (GNNs) to model and exploit connections across documents, incorporating Abstract Meaning Representation (AMR) graphs for semantic depth.

Methodology: Graph-Based Reranking

The core of this paper lies in its novel use of document graphs and AMR graphs to inform reranking processes within RAG frameworks. The approach involves several key steps:

  1. Document Graph Construction: Document nodes are connected in an undirected graph based on shared concepts parsed through AMR. Edges between documents represent shared semantic content, ensuring that the graph reflects deeper inter-document relationships.
  2. AMR Parsing and Node Features: AMR graphs are generated using AMRBART, capturing the semantic connections between query-document pairs. Document embeddings are supplemented by AMR-derived paths to encode richer context.
  3. GNN Architecture: Node and edge features in the document graph are updated with a Graph Convolutional Network (GCN) using a mean aggregator, with hyperparameters tuned for reranking performance.
  4. Ranking Mechanism: The model applies a pairwise ranking loss to recalibrate document ranks, improving retrieval accuracy. This expressly addresses the challenge of identifying partially relevant documents that other methods overlook; a minimal sketch of the pipeline follows this list.
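
To make these steps concrete, the following minimal sketch in PyTorch illustrates the three moving parts described above: building the document graph from shared AMR concepts, scoring nodes with a two-layer GCN, and training with a pairwise ranking loss. All names and hyperparameters (layer sizes, the hinge margin) are illustrative assumptions, not the authors' implementation, which uses AMRBART-derived features and its own tuned configuration.

```python
# Hypothetical sketch of a G-RAG-style reranker; not the authors' code.
# Each retrieved document is assumed to come with a set of AMR concept
# strings extracted from its query-document AMR graph.
import torch
import torch.nn as nn

def build_document_graph(doc_concepts):
    """Connect documents sharing at least one AMR concept (undirected),
    then symmetrically normalize the adjacency as in a standard GCN."""
    n = len(doc_concepts)
    adj = torch.eye(n)                             # self-loops
    for i in range(n):
        for j in range(i + 1, n):
            if doc_concepts[i] & doc_concepts[j]:  # shared AMR concepts -> edge
                adj[i, j] = adj[j, i] = 1.0
    d_inv_sqrt = adj.sum(dim=1).pow(-0.5)          # degrees >= 1 due to self-loops
    return d_inv_sqrt.unsqueeze(1) * adj * d_inv_sqrt.unsqueeze(0)

class GCNReranker(nn.Module):
    """Two-layer GCN mapping document node features to scalar relevance scores."""
    def __init__(self, in_dim, hidden_dim=128):
        super().__init__()
        self.w1 = nn.Linear(in_dim, hidden_dim)
        self.w2 = nn.Linear(hidden_dim, hidden_dim)
        self.score = nn.Linear(hidden_dim, 1)

    def forward(self, x, adj_norm):
        h = torch.relu(adj_norm @ self.w1(x))      # aggregate neighbor features
        h = torch.relu(adj_norm @ self.w2(h))
        return self.score(h).squeeze(-1)           # one relevance score per document

def pairwise_ranking_loss(scores, labels, margin=1.0):
    """Hinge loss pushing answer-containing documents above the rest.
    Assumes the candidate set has at least one positive and one negative."""
    pos = scores[labels == 1].unsqueeze(1)         # shape (P, 1)
    neg = scores[labels == 0].unsqueeze(0)         # shape (1, N)
    return torch.clamp(margin - pos + neg, min=0).mean()
```

In this sketch, the node features x would be query-aware document embeddings (optionally augmented with AMR-path statistics, as the paper describes), and the reranked order is simply the documents sorted by their predicted scores.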

Evaluation Metrics

The paper introduces Mean Tied Reciprocal Ranking (MTRR) and Tied Mean Hits@10 (TMHits@10), tie-aware variants of the standard MRR and Hits@10 metrics, to better evaluate reranking accuracy when a reranker assigns identical relevance scores to multiple documents. Handling ties explicitly yields a more realistic assessment of reranker performance.
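
For intuition, here is one plausible tie-aware variant of reciprocal rank, averaging the optimistic and pessimistic ranks of the best relevant document within its tie group. This is an illustrative assumption; the paper's exact MTRR definition may differ in detail.

```python
def tied_reciprocal_rank(scores, labels):
    """Tie-aware reciprocal rank: average the best and worst rank the top
    relevant document could receive among documents tied at its score."""
    best = max(s for s, l in zip(scores, labels) if l == 1)
    higher = sum(s > best for s in scores)  # documents scored strictly above it
    tied = sum(s == best for s in scores)   # documents tied with it (itself included)
    optimistic = higher + 1                 # ranked first within the tie group
    pessimistic = higher + tied             # ranked last within the tie group
    return 0.5 * (1.0 / optimistic + 1.0 / pessimistic)
```

With no ties, this reduces to the standard reciprocal rank, 1/rank of the top-scored relevant document.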

Experimental Results

Experiments conducted on the Natural Questions (NQ) and TriviaQA (TQA) datasets demonstrate G-RAG's capabilities. Key findings include:

  • Performance: G-RAG-RL, which incorporates the pairwise ranking loss, outperforms state-of-the-art methods such as BART-GST and various baseline LLMs, with improvements of up to 7 percentage points on some evaluation metrics.
  • Embedding Models: Using recent embedding models such as Ember significantly improved reranking results, and hyperparameter tuning yielded further gains, highlighting the importance of optimizing model parameters (a hedged example of this embedding step follows the list).
  • LLMs as Rerankers: Experiments reveal that LLMs like PaLM 2 underperform in reranking tasks when applied naively without fine-tuning. The frequent occurrence of tied relevance scores from LLMs further underscores the need for specialized reranking strategies.
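
As a hedged illustration of the embedding step, an off-the-shelf sentence-embedding model can supply the initial node features. The model identifier below is an assumption (Ember's commonly listed Hugging Face id), and the paper's exact embedding configuration may differ.

```python
from sentence_transformers import SentenceTransformer

# "llmrails/ember-v1" is the assumed Hugging Face id for the Ember model
# mentioned above; any embedding model can be swapped in.
model = SentenceTransformer("llmrails/ember-v1")
docs = ["First retrieved passage ...", "Second retrieved passage ..."]
features = model.encode(docs)  # (num_docs, embedding_dim) array of node features
```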

Theoretical and Practical Implications

The theoretical novelty of this research lies in merging graph-based document interrelations with reranking methods in ODQA. Practically, G-RAG has potential applications across various systems requiring precise information retrieval and context-aware generation.

Future Directions

Promising directions for future work include:

  • Refinement of GNN Architectures: Exploring more sophisticated GNN variants or hybrid models could further enhance reranking capabilities.
  • Advanced AMR Utilization: More efficient integration of AMR information could optimize computational footprints while retaining semantic accuracy.
  • LLM Fine-Tuning: Dedicated fine-tuning protocols for LLMs in reranking contexts could harness their generative capabilities more effectively.

In conclusion, G-RAG introduces a robust, GNN-based reranking method that significantly advances the effectiveness of RAG systems in ODQA. By combining document interrelations with nuanced semantic representations, this research takes a pivotal step toward more accurate and context-aware LLM outputs. Future advancements in this domain are likely to build upon and refine these innovative strategies, promising further improvements in information retrieval and generation systems.

References (47)
  1. Learning to retrieve reasoning paths over wikipedia graph for question answering. In 8th International Conference on Learning Representations, ICLR 2020, Addis Ababa, Ethiopia, April 26-30, 2020. OpenReview.net, 2020. URL https://openreview.net/forum?id=SJgVHkrYDH.
  2. Semantic representation for dialogue modeling. In Proceedings of the 59th Annual Meeting of the Association for Computational Linguistics and the 11th International Joint Conference on Natural Language Processing (Volume 1: Long Papers), pages 4430–4445, Online, August 2021. Association for Computational Linguistics. doi: 10.18653/v1/2021.acl-long.342. URL https://aclanthology.org/2021.acl-long.342.
  3. Graph pre-training for AMR parsing and generation. In Proceedings of the 60th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), pages 6001–6015, Dublin, Ireland, May 2022. Association for Computational Linguistics. URL https://aclanthology.org/2022.acl-long.415.
  4. Abstract Meaning Representation for sembanking. In Proceedings of the 7th Linguistic Annotation Workshop and Interoperability with Discourse, pages 178–186, Sofia, Bulgaria, August 2013. Association for Computational Linguistics. URL https://aclanthology.org/W13-2322.
  5. Leveraging knowledge graph for open-domain question answering. In 2018 IEEE/WIC/ACM International Conference on Web Intelligence (WI), pages 389–394. IEEE, 2018.
  6. BERT: Pre-training of deep bidirectional transformers for language understanding. In Proceedings of the 2019 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, Volume 1 (Long and Short Papers), pages 4171–4186, Minneapolis, Minnesota, June 2019. Association for Computational Linguistics. doi: 10.18653/v1/N19-1423. URL https://www.aclweb.org/anthology/N19-1423.
  7. Re2G: Retrieve, rerank, generate. In Proceedings of the 2022 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, pages 2701–2715, Seattle, United States, July 2022. Association for Computational Linguistics. doi: 10.18653/v1/2022.naacl-main.194. URL https://aclanthology.org/2022.naacl-main.194.
  8. Exploiting edge features for graph neural networks. In Proceedings of the IEEE/CVF conference on computer vision and pattern recognition, pages 9211–9219, 2019.
  9. Palm 2 technical report, 2023.
  10. Inductive representation learning on large graphs. Advances in neural information processing systems, 30, 2017.
  11. Large language models are zero-shot rankers for recommender systems. In European Conference on Information Retrieval, pages 364–381. Springer, 2024.
  12. Dsqa-llm: Domain-specific intelligent question answering based on large language model. In International Conference on AI-generated Content, pages 170–180. Springer, 2023.
  13. Leveraging passage retrieval with generative models for open domain question answering. In Proceedings of the 16th Conference of the European Chapter of the Association for Computational Linguistics: Main Volume, pages 874–880, 2021.
  14. TriviaQA: A large scale distantly supervised challenge dataset for reading comprehension. In Regina Barzilay and Min-Yen Kan, editors, Proceedings of the 55th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), pages 1601–1611, Vancouver, Canada, July 2017. Association for Computational Linguistics. doi: 10.18653/v1/P17-1147. URL https://aclanthology.org/P17-1147.
  15. Grape: Knowledge graph enhanced passage reader for open-domain question answering. In Findings of Empirical Methods in Natural Language Processing, 2022.
  16. Dense passage retrieval for open-domain question answering. In Proceedings of the 2020 Conference on Empirical Methods in Natural Language Processing (EMNLP), pages 6769–6781, Online, 2020. Association for Computational Linguistics. doi: 10.18653/v1/2020.emnlp-main.550. URL https://aclanthology.org/2020.emnlp-main.550.
  17. Semi-supervised classification with graph convolutional networks. In ICLR, 2017.
  18. Natural questions: a benchmark for question answering research. Transactions of the Association for Computational Linguistics, 7:453–466, 2019.
  19. BART: Denoising sequence-to-sequence pre-training for natural language generation, translation, and comprehension. In Proceedings of the 58th Annual Meeting of the Association for Computational Linguistics, pages 7871–7880, Online, July 2020a. Association for Computational Linguistics. doi: 10.18653/v1/2020.acl-main.703. URL https://www.aclweb.org/anthology/2020.acl-main.703.
  20. Retrieval-augmented generation for knowledge-intensive nlp tasks. NeurIPS, 33:9459–9474, 2020b.
  21. Improving pairwise ranking for multi-label image classification. In Proceedings of the IEEE conference on computer vision and pattern recognition, pages 3617–3625, 2017.
  22. Towards general text embeddings with multi-stage contrastive learning, 2023.
  23. Roberta: A robustly optimized bert pretraining approach. arXiv preprint arXiv:1907.11692, 2019.
  24. RetroMAE: Pre-training retrieval-oriented transformers via masked auto-encoder. arXiv preprint arXiv:2205.12035, 2022.
  25. Decoupled weight decay regularization. In 7th International Conference on Learning Representations, ICLR 2019, New Orleans, LA, USA, May 6-9, 2019. OpenReview.net, 2019. URL https://openreview.net/forum?id=Bkg6RiCqY7.
  26. Fine-tuning llama for multi-stage text retrieval. arXiv preprint arXiv:2310.08319, 2023a.
  27. Zero-shot listwise document reranking with a large language model. arXiv preprint arXiv:2305.02156, 2023b.
  28. Large language model is not a good few-shot information extractor, but a good reranker for hard samples! In Findings of the Association for Computational Linguistics: EMNLP 2023, pages 10572–10601, 2023c.
  29. Docamr: Multi-sentence amr representation and evaluation. In North American Chapter of the Association for Computational Linguistics, 2021.
  30. Document ranking with a pretrained sequence-to-sequence model. In Findings of the Association for Computational Linguistics: EMNLP 2020, pages 708–718, 2020.
  31. OpenAI. ChatGPT. https://openai.com/research/chatgpt, a.
  32. OpenAI. GPT-4. https://openai.com/gpt-4, b.
  33. Rink: reader-inherited evidence reranker for table-and-text open domain question answering. In Proceedings of the AAAI Conference on Artificial Intelligence, volume 37, pages 13446–13456, 2023.
  34. Squeezing water from a stone: a bag of tricks for further improving cross-encoder effectiveness for reranking. In European Conference on Information Retrieval, pages 655–670. Springer, 2022.
  35. Improving the domain adaptation of retrieval augmented generation (rag) models for open domain question answering. Transactions of the Association for Computational Linguistics, 11:1–17, 2023.
  36. Is chatgpt good at search? investigating large language models as re-ranking agents. In Proceedings of the 2023 Conference on Empirical Methods in Natural Language Processing, pages 14918–14937, 2023.
  37. Can chatgpt replace traditional kbqa models? an in-depth analysis of the question answering performance of the gpt llm family. In International Semantic Web Conference, pages 348–367. Springer, 2023.
  38. Llama: Open and efficient foundation language models. arXiv preprint arXiv:2302.13971, 2023.
  39. Graph attention networks. In International Conference on Learning Representations, 2018.
  40. The TREC-8 question answering track. In M. Gavrilidou, G. Carayannis, S. Markantonatou, S. Piperidis, and G. Stainhauer, editors, Proceedings of the Second International Conference on Language Resources and Evaluation (LREC’00), Athens, Greece, May 2000. European Language Resources Association (ELRA). URL http://www.lrec-conf.org/proceedings/lrec2000/pdf/26.pdf.
  41. Evaluating open question answering evaluation. arXiv preprint arXiv:2305.12421, 2023a.
  42. Exploiting Abstract Meaning Representation for open-domain question answering. In Anna Rogers, Jordan Boyd-Graber, and Naoaki Okazaki, editors, Findings of the Association for Computational Linguistics: ACL 2023, pages 2083–2096, Toronto, Canada, July 2023b. Association for Computational Linguistics. doi: 10.18653/v1/2023.findings-acl.131. URL https://aclanthology.org/2023.findings-acl.131.
  43. Huggingface’s transformers: State-of-the-art natural language processing. arXiv preprint arXiv:1910.03771, 2019. URL https://arxiv.org/abs/1910.03771.
  44. C-pack: Packaged resources to advance general chinese embedding, 2023.
  45. How powerful are graph neural networks? In International Conference on Learning Representations, 2018.
  46. KG-FiD: Infusing knowledge graph in fusion-in-decoder for open-domain question answering. In ACL, pages 4961–4974, Dublin, Ireland, May 2022. Association for Computational Linguistics. doi: 10.18653/v1/2022.acl-long.340. URL https://aclanthology.org/2022.acl-long.340.
  47. Rankt5: Fine-tuning t5 for text ranking with ranking losses. In SIGIR, pages 2308–2313, 2023.
Authors (5)
  1. Jialin Dong
  2. Bahare Fatemi
  3. Bryan Perozzi
  4. Lin F. Yang
  5. Anton Tsitsulin