Robustness of BSA’s learned error bounds under distribution shift
Establish whether the learned per-dimension error-bound models used in BSA (linear regressions trained at preprocessing time on PCA-projected dimensions) remain effective when the distribution of the vector collection at query time differs from the distribution of the data the models were trained on.
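One way to probe this question empirically is sketched below. It is a toy experiment under stated assumptions, not BSA's actual training procedure. The data is synthetic Gaussian, and a single linear model is fitted for one prefix length (`PREFIX`), whereas BSA trains a model per dimension. The bound is calibrated with a residual quantile chosen here for illustration. All names (`sample`, `fit_bound`, `violation_rate`, `tail_scale`) are hypothetical. The sketch fits PCA and the bound on one distribution, then measures how often the learned estimate exceeds the true distance, which would cause a wrong prune, on unshifted and on shifted collections.

```python
import numpy as np

rng = np.random.default_rng(0)
D, PREFIX = 32, 8                            # hypothetical dimensionality and prefix length
scales = 1.0 / np.sqrt(np.arange(1, D + 1))  # per-dimension std; variance decays as 1/k

def sample(n, tail_scale=1.0):
    """Gaussian vectors; tail_scale != 1 rescales trailing dimensions (the simulated shift)."""
    s = scales.copy()
    s[PREFIX:] *= tail_scale
    return rng.normal(size=(n, D)) * s

def fit_pca(X):
    """Return a D x D orthogonal matrix whose columns are principal directions."""
    _, _, vt = np.linalg.svd(X - X.mean(axis=0), full_matrices=False)
    return vt.T                              # sorted by decreasing variance

def distances(Q, X, W):
    """Partial (first PREFIX PCA dims) and full squared distances for paired rows."""
    sq = ((Q - X) @ W) ** 2
    return sq[:, :PREFIX].sum(axis=1), sq.sum(axis=1)

def fit_bound(partial, full, quantile=0.01):
    """Least-squares line full ~ a * partial + b, lowered to a residual quantile (assumption)."""
    a, b = np.polyfit(partial, full, 1)
    resid = full - (a * partial + b)
    return a, b + np.quantile(resid, quantile)

def violation_rate(partial, full, model):
    """Fraction of pairs whose learned estimate exceeds the true full distance."""
    a, b = model
    return float(np.mean(a * partial + b > full))

# "Preprocessing": fit PCA and the error-bound model on the original distribution.
train = sample(20_000)
W = fit_pca(train)
model = fit_bound(*distances(sample(20_000), train, W))

# Query time: same queries, collection drawn with and without a shift.
rate_in = violation_rate(*distances(sample(20_000), sample(20_000), W), model)
rate_shift = violation_rate(
    *distances(sample(20_000), sample(20_000, tail_scale=0.25), W), model
)
print(f"violation rate, no shift: {rate_in:.3f}")
print(f"violation rate, shifted:  {rate_shift:.3f}")
```

In this setup the shift shrinks the energy in the trailing dimensions, so the model overestimates the remaining distance and violations become more frequent. A shift in the opposite direction would instead leave the bound valid but loose, which reduces pruning power rather than accuracy. Both effects are relevant when judging whether the learned bounds "remain effective".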
References
However, it is expensive, as a model has to be trained for every dimension in the collection, and their effectiveness has yet to be proven under distribution shifts in the collection.
Source: "PDX: A Data Layout for Vector Similarity Search" (2503.04422, Kuffo et al., 6 Mar 2025), Section 2.3, The Power of Pruning