Knowledge-Aware Diverse Reranking for Cross-Source Question Answering (2506.20476v1)
Published 25 Jun 2025 in cs.CL and cs.IR
Abstract: This paper presents Team Marikarp's solution for the SIGIR 2025 LiveRAG competition. The competition's evaluation set, automatically generated by DataMorgana from internet corpora, encompassed a wide range of target topics, question types, question formulations, audience types, and knowledge organization methods. It offered a fair evaluation of retrieving question-relevant supporting documents from a 15M-document subset of the FineWeb corpus. Our proposed knowledge-aware diverse reranking RAG pipeline achieved first place in the competition.