POQD: Performance-Oriented Query Decomposer for Multi-vector retrieval (2505.19189v2)

Published 25 May 2025 in cs.IR and cs.DB

Abstract: Although Multi-Vector Retrieval (MVR) has achieved the state of the art on many information retrieval (IR) tasks, its performance highly depends on how to decompose queries into smaller pieces, say phrases or tokens. However, optimizing query decomposition for MVR performance is not end-to-end differentiable. Even worse, jointly solving this problem and training the downstream retrieval-based systems, say RAG systems could be highly inefficient. To overcome these challenges, we propose Performance-Oriented Query Decomposer (POQD), a novel query decomposition framework for MVR. POQD leverages one LLM for query decomposition and searches the optimal prompt with an LLM-based optimizer. We further propose an end-to-end training algorithm to alternatively optimize the prompt for query decomposition and the downstream models. This algorithm can achieve superior MVR performance at a reasonable training cost as our theoretical analysis suggests. POQD can be integrated seamlessly into arbitrary retrieval-based systems such as Retrieval-Augmented Generation (RAG) systems. Extensive empirical studies on representative RAG-based QA tasks show that POQD outperforms existing query decomposition strategies in both retrieval performance and end-to-end QA accuracy. POQD is available at https://github.com/PKU-SDS-lab/POQD-ICML25.

Summary

Performance-Oriented Query Decomposer for Multi-vector Retrieval

This paper introduces the Performance-Oriented Query Decomposer (POQD), a framework for improving Multi-Vector Retrieval (MVR) by optimizing how queries are decomposed. POQD uses an LLM to split each query into sub-queries and tunes the decomposition prompt against the performance of the downstream retrieval-based system.

Overview

The paper argues that query decomposition is central to MVR performance, and in turn to the accuracy of retrieval-based systems such as Retrieval-Augmented Generation (RAG). Existing MVR methods such as ColBERT decompose queries at the token level, which can be suboptimal because token-level pieces may fail to capture the nuanced similarities between queries and documents. POQD addresses this limitation by using one LLM as the query decomposer and an LLM-based optimizer to search for the best decomposition prompt.
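To make the role of decomposition concrete, here is a minimal sketch of ColBERT-style late-interaction ("MaxSim") scoring, in which each query piece is matched to its most similar document piece and the matches are summed. The function names and the toy 2-dimensional embeddings are illustrative assumptions; POQD's exact scoring function may differ.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors (0.0 if either is zero)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0


def maxsim_score(query_vecs, doc_vecs):
    """Sum, over query pieces, of the best cosine match among document pieces."""
    return sum(max(cosine(q, d) for d in doc_vecs) for q in query_vecs)


def rank_documents(query_vecs, corpus):
    """Return (doc_id, score) pairs sorted from highest to lowest score.

    `corpus` maps a document id to its list of piece embeddings.
    """
    scored = [(doc_id, maxsim_score(query_vecs, vecs)) for doc_id, vecs in corpus.items()]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)


# Toy example: the query has two pieces (tokens, phrases, or sub-queries).
query = [[1.0, 0.0], [0.0, 1.0]]
corpus = {
    "doc_a": [[1.0, 0.0], [0.0, 1.0]],  # matches both query pieces
    "doc_b": [[1.0, 0.0]],              # matches only the first piece
}
ranking = rank_documents(query, corpus)
```

Because the score is a sum over query pieces, changing how the query is split changes which document wins, which is why the choice of decomposition matters.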

Theoretical Contributions

The authors propose an end-to-end training algorithm that alternates between optimizing the query decomposition prompt and training the downstream models. Alternation is needed because optimizing the decomposition for MVR performance is not end-to-end differentiable, and solving it jointly with downstream training could be highly inefficient. The accompanying theoretical analysis addresses the convergence and efficacy of this procedure under specific hyperparameter settings: according to the paper, with an appropriately chosen number of training steps and loss-reduction threshold, the method can achieve superior results at a reasonable computational cost.
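The alternating scheme can be sketched as a generic loop. This is only an illustration under stated assumptions, not the authors' algorithm: `propose_prompts`, `train_downstream`, and `evaluate_loss` are hypothetical callables standing in for the LLM-based optimizer, the downstream training step, and the training loss, and `min_improvement` is a stand-in for the loss-reduction threshold.

```python
def alternating_optimization(initial_prompt, propose_prompts, train_downstream,
                             evaluate_loss, rounds=3, min_improvement=0.0):
    """Alternate between prompt search and downstream training.

    All callables are hypothetical placeholders supplied by the caller:
      propose_prompts(prompt, history) -> iterable of candidate prompts
      train_downstream(prompt, model)  -> updated model (model may be None)
      evaluate_loss(prompt, model)     -> float, lower is better
    """
    prompt = initial_prompt
    model = train_downstream(prompt, None)
    best_loss = evaluate_loss(prompt, model)
    history = [(prompt, best_loss)]

    for _ in range(rounds):
        # Step 1: search for a better prompt with the downstream model held fixed.
        improved = False
        for candidate in propose_prompts(prompt, history):
            loss = evaluate_loss(candidate, model)
            # Accept a candidate only if it reduces the loss by more than the threshold.
            if best_loss - loss > min_improvement:
                prompt, best_loss, improved = candidate, loss, True
        if not improved:
            break  # no candidate cleared the threshold, so stop early

        # Step 2: continue training the downstream model with the prompt held fixed.
        model = train_downstream(prompt, model)
        best_loss = evaluate_loss(prompt, model)
        history.append((prompt, best_loss))

    return prompt, model, history
```

Accepting a new prompt only when it clears a threshold keeps the expensive retraining step from being triggered by negligible changes, which is one plausible reading of how a loss-reduction threshold bounds the training cost.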

Empirical Results

Experiments on several QA datasets, including WebQA, MultiModalQA, and ManyModalQA, indicate that POQD outperforms existing query decomposition strategies in both retrieval performance and end-to-end QA accuracy. The gains are reported in terms of retrieval hit rate and QA exact-match accuracy, which suggests that the optimized prompt yields sub-queries better suited to the downstream task.
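For readers unfamiliar with the two metrics, the sketch below uses their common definitions: hit rate at k counts a query as a hit if any relevant document appears in the top k results, and exact match compares normalized answer strings. The normalization shown (lowercasing, stripping punctuation and English articles) follows a widely used QA convention and is an assumption here; the paper's evaluation code may normalize differently.

```python
import re
import string


def hit_rate_at_k(ranked_ids_per_query, relevant_ids_per_query, k):
    """Fraction of queries with at least one relevant document in the top k."""
    if not ranked_ids_per_query:
        return 0.0
    hits = sum(
        1
        for ranked, relevant in zip(ranked_ids_per_query, relevant_ids_per_query)
        if set(ranked[:k]) & set(relevant)
    )
    return hits / len(ranked_ids_per_query)


def normalize_answer(text):
    """Lowercase, drop punctuation and English articles, collapse whitespace."""
    text = text.lower()
    text = "".join(ch for ch in text if ch not in string.punctuation)
    text = re.sub(r"\b(a|an|the)\b", " ", text)
    return " ".join(text.split())


def exact_match(predictions, references):
    """Fraction of predictions equal to their reference after normalization."""
    if not predictions:
        return 0.0
    matches = sum(
        1
        for pred, ref in zip(predictions, references)
        if normalize_answer(pred) == normalize_answer(ref)
    )
    return matches / len(predictions)


# Toy data: two queries, each with a ranked list and a set of relevant ids.
ranked = [["d1", "d2"], ["d3", "d4"]]
relevant = [["d2"], ["d9"]]
hr_at_1 = hit_rate_at_k(ranked, relevant, k=1)
hr_at_2 = hit_rate_at_k(ranked, relevant, k=2)
em = exact_match(["The Eiffel Tower.", "paris"], ["eiffel tower", "London"])
```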

Practical Implications

Better query decomposition leads to more accurate retrieval, which matters for search engines, digital assistants, and other retrieval-backed AI systems. According to the authors, POQD can be integrated seamlessly into arbitrary retrieval-based systems such as RAG, so existing pipelines can adopt it without being redesigned.

Speculation on Future Developments

The work leaves room for further research on LLM-based query decomposition, for example trying alternative LLMs as the decomposer or optimizer and reducing optimization cost without sacrificing accuracy. More speculatively, the same performance-oriented prompt optimization could be applied to AI tasks beyond retrieval.

Conclusion

POQD advances multi-vector retrieval by treating query decomposition as something to be optimized for downstream performance. It pairs a theoretical analysis of its training algorithm with empirical gains on RAG-based QA tasks, and methods of this kind are likely to be useful wherever efficient and accurate information retrieval is needed.