Determine the hardest qrel matrix under sign-rank-based limits
For a fixed number of documents and a fixed number k of relevant documents per query, determine which binary query relevance (qrel) matrices are provably the hardest for single-vector embedding models to represent. Formally, identify which matrices require the largest embedding dimension or maximize sign rank, and give a definitive theoretical proof that they do.
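To make the objects in the problem concrete, here is a minimal Python sketch of the "dense with all combinations" qrel matrix that the source passage conjectures to be hard: one query for every k-subset of the documents. The function names (`dense_qrel_matrix`, `to_sign_matrix`, `exact_rank`) and the sizes (5 documents, k = 2) are illustrative assumptions, not taken from the paper. The sketch computes the ordinary rank of the ±1 matrix 2A − 1. This is only an upper bound on its sign rank, because the matrix trivially matches its own sign pattern, so it does not answer the open question.

```python
from fractions import Fraction
from itertools import combinations


def dense_qrel_matrix(n_docs, k):
    """One query (row) per k-subset of documents; entry 1 marks a relevant document."""
    rows = []
    for subset in combinations(range(n_docs), k):
        relevant = set(subset)
        rows.append([1 if d in relevant else 0 for d in range(n_docs)])
    return rows


def to_sign_matrix(qrels):
    """Map {0, 1} relevance labels to {-1, +1}, i.e. compute 2A - 1 entrywise."""
    return [[2 * x - 1 for x in row] for row in qrels]


def exact_rank(matrix):
    """Rank over the rationals via Gaussian elimination (exact, no floating point)."""
    m = [[Fraction(x) for x in row] for row in matrix]
    rank = 0
    n_cols = len(m[0]) if m else 0
    for col in range(n_cols):
        # Find a row at or below the current pivot position with a nonzero entry.
        pivot = next((r for r in range(rank, len(m)) if m[r][col] != 0), None)
        if pivot is None:
            continue
        m[rank], m[pivot] = m[pivot], m[rank]
        # Eliminate this column from every other row.
        for r in range(len(m)):
            if r != rank and m[r][col] != 0:
                factor = m[r][col] / m[rank][col]
                m[r] = [a - factor * b for a, b in zip(m[r], m[rank])]
        rank += 1
    return rank


# Hypothetical sizes chosen only to keep the example small.
qrels = dense_qrel_matrix(n_docs=5, k=2)
signs = to_sign_matrix(qrels)

print(len(qrels), len(qrels[0]))  # number of queries, number of documents
print(exact_rank(signs))          # ordinary rank: an upper bound on sign rank
```

The ordinary rank is easy to compute, while the sign rank is the minimum rank over all real matrices with the same sign pattern. That difficulty is why the source leaves the hardest matrix unproven.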
References
Although we could not prove the hardest qrel matrix definitively with theory (as the sign rank is notoriously hard to prove), we speculate based on intuition that our theoretical results imply that the more interconnected the qrel matrix is (e.g. dense with all combinations) the harder it would be for models to represent.
Source: On the Theoretical Limitations of Embedding-Based Retrieval (2508.21038 - Weller et al., 28 Aug 2025), in Section: The LIMIT Dataset, Dataset Construction