Robustness of BSA’s learned error bounds under distribution shift
Determine whether the learned per-dimension error-bound models used in BSA (linear regressions trained at preprocessing time on PCA-projected dimensions) remain effective when the distribution of the vector collection shifts between the data used for training and the data encountered at query time.
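One way to probe this question empirically is sketched below. It is an illustrative stand-in and not BSA's actual procedure: the synthetic Gaussian data, the helper names (`fit_pca`, `pair_distances`, `fit_bounds`, `evaluate`), the choice of regressing the full squared distance on the partial squared distance over the first d PCA dimensions, and the 99% residual-quantile margin are all assumptions made for the example. The sketch fits one linear model per dimension on training data, then measures on shifted data how often the resulting lower bound still holds (coverage) and how loose it becomes (slack).

```python
import numpy as np

def fit_pca(X):
    """Return the mean and a matrix whose columns are principal directions."""
    mean = X.mean(axis=0)
    # Rows of Vt are principal directions, ordered by decreasing variance.
    _, _, Vt = np.linalg.svd(X - mean, full_matrices=False)
    return mean, Vt.T

def pair_distances(Z, rng, n_pairs):
    """Sample random pairs; return partial and full squared distances."""
    i = rng.integers(0, len(Z), n_pairs)
    j = rng.integers(0, len(Z), n_pairs)
    sq = (Z[i] - Z[j]) ** 2
    partial = np.cumsum(sq, axis=1)  # column d-1: distance over first d dims
    return partial, partial[:, -1]

def fit_bounds(partial, full, target=0.99):
    """Assumed model: full ~ a*partial + b per dimension, plus a margin eps
    chosen so that a*partial + b - eps is a lower bound on `target` of pairs."""
    n_dims = partial.shape[1]
    models = np.empty((n_dims, 3))
    for d in range(n_dims):
        a, b = np.polyfit(partial[:, d], full, 1)  # slope, intercept
        resid = a * partial[:, d] + b - full
        models[d] = (a, b, np.quantile(resid, target))
    return models

def evaluate(models, partial, full):
    """Per-dimension coverage of the lower bound and its mean slack."""
    a, b, eps = models[:, 0], models[:, 1], models[:, 2]
    resid = a * partial + b - full[:, None]
    coverage = (resid <= eps).mean(axis=0)   # fraction where the bound holds
    slack = (eps - resid).mean(axis=0)       # mean of (full - lower bound)
    return coverage, slack

rng = np.random.default_rng(0)
dim = 32
scales = np.linspace(3.0, 0.2, dim)
X_train = rng.normal(size=(2000, dim)) * scales
# Hypothetical shift: the variance profile is reversed across dimensions.
X_shift = rng.normal(size=(2000, dim)) * scales[::-1]

mean, W = fit_pca(X_train)  # PCA is fitted on training data only
p_tr, f_tr = pair_distances((X_train - mean) @ W, rng, 2000)
p_sh, f_sh = pair_distances((X_shift - mean) @ W, rng, 2000)

models = fit_bounds(p_tr, f_tr)
cov_train, slack_train = evaluate(models, p_tr, f_tr)
cov_shift, slack_shift = evaluate(models, p_sh, f_sh)

for d in (4, 16, 32):
    print(f"d={d:2d}  coverage train={cov_train[d-1]:.3f} shift={cov_shift[d-1]:.3f}  "
          f"slack train={slack_train[d-1]:.2f} shift={slack_shift[d-1]:.2f}")
```

Coverage and slack are tracked separately because a bound can fail in two ways under shift: it can stop holding, which would cost recall, or it can keep holding while becoming too loose to prune, which would cost speed. A real study would replace the synthetic data with actual collections and the stand-in regression with BSA's own models.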
References
However, it is expensive, as a model has to be trained for every dimension in the collection, and their effectiveness has yet to be proven under distribution shifts in the collection.
— PDX: A Data Layout for Vector Similarity Search
(2503.04422 - Kuffo et al., 6 Mar 2025) in Section 2.3, The Power of Pruning