Characterize which top-k combination patterns fail for single-vector embeddings
Characterize the classes of top-k document set combinations (equivalently, the structural patterns in the binary query relevance matrix) that single-vector embedding models provably cannot represent, and identify the specific properties that make failure unavoidable regardless of training.
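To make the objects in this problem concrete, here is a minimal Python sketch. It assumes dot-product scoring and one query per k-subset of documents. The function names `topk_qrel_matrix` and `realized_rows` are hypothetical and do not come from the paper. The sketch builds the binary relevance matrix and checks which rows one given embedding reproduces. It tests a single candidate embedding only, so it does not show that a pattern is unrepresentable in general.

```python
from itertools import combinations

import numpy as np


def topk_qrel_matrix(n_docs: int, k: int) -> np.ndarray:
    """Binary relevance matrix with one row (query) per k-subset of documents."""
    subsets = list(combinations(range(n_docs), k))
    qrels = np.zeros((len(subsets), n_docs), dtype=int)
    for row, subset in enumerate(subsets):
        qrels[row, list(subset)] = 1
    return qrels


def realized_rows(qrels: np.ndarray, query_vecs: np.ndarray,
                  doc_vecs: np.ndarray, k: int) -> np.ndarray:
    """For each query, report whether its dot-product top-k equals its relevant set."""
    scores = query_vecs @ doc_vecs.T                # shape: (num_queries, n_docs)
    topk = np.argsort(-scores, axis=1)[:, :k]       # indices of the k highest scores
    retrieved = np.zeros_like(qrels)
    np.put_along_axis(retrieved, topk, 1, axis=1)   # mark retrieved documents
    return (retrieved == qrels).all(axis=1)


# Three documents, every pair is a target top-2 set: rows are (0,1), (0,2), (1,2).
qrels = topk_qrel_matrix(n_docs=3, k=2)

# Assumed 1-D embedding: documents at 1, 2, 3 on a line.
docs_1d = np.array([[1.0], [2.0], [3.0]])
queries_1d = np.array([[-1.0], [1.0], [1.0]])
ok_1d = realized_rows(qrels, queries_1d, docs_1d, k=2)

# Assumed 2-D embedding: documents evenly spaced on the unit circle,
# each query pointing away from the one document it should exclude.
angles = np.deg2rad([0.0, 120.0, 240.0])
docs_2d = np.stack([np.cos(angles), np.sin(angles)], axis=1)
queries_2d = -docs_2d[[2, 1, 0]]
ok_2d = realized_rows(qrels, queries_2d, docs_2d, k=2)

print(ok_1d, ok_2d)
```

In one dimension the pair {0, 2} cannot be the top 2 for any query scalar: a positive scalar ranks documents 2 and 1 first, and a negative scalar ranks 0 and 1 first. The 2-D arrangement recovers all three pairs. The open problem asks for a general account of which such patterns fail, not a case-by-case check like this one.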
References
We have showed the theoretical connection that proves that some combinations cannot be represented by embedding models, however, we cannot prove apriori which types of combinations they will fail on.
Source: On the Theoretical Limitations of Embedding-Based Retrieval (arXiv:2508.21038, Weller et al., 28 Aug 2025), Limitations section