
Centroid–Residual Quantization in ANN Search

Updated 30 December 2025
  • Centroid–Residual Quantization is a hierarchical vector quantization method that decomposes a data vector into a coarse centroid and a quantized residual.
  • An enhanced variant, Transformed Residual Quantization (TRQ), applies per-cluster orthogonal transformations that align the residuals before second-stage coding, reducing quantization error by up to 50% on some datasets.
  • The approach improves approximate nearest neighbor (ANN) search by lowering storage and distance-computation cost relative to a single flat codebook of equivalent resolution.

Centroid–Residual Quantization, often referred to as Residual Quantization (RQ), is a hierarchical vector quantization strategy that approximates a data vector as the sum of a coarse centroid and a quantized residual. This two-stage quantizer is widely used in large-scale approximate nearest neighbor (ANN) search, where it provides compact codes and reduced computational cost. An enhanced variant, Transformed Residual Quantization (TRQ), introduces per-cluster linear transformations, restricted to orthogonal matrices, that align the residuals before second-stage coding, thereby reducing quantization error and improving retrieval performance. Both models are natural extensions of, and drop-in replacements for, Product Quantization (PQ): two small codebooks jointly span an effective codebook whose size is the product of their sizes, while only their sum must be stored and searched (Yuan et al., 2015).

1. Formal Structure of Two-Stage Residual Quantization

Given a dataset $X = \{x_i\}_{i=1}^N \subset \mathbb{R}^D$, RQ first partitions the data using a coarse codebook $C^{(1)} = \{c^{(1)}_1, \ldots, c^{(1)}_{K_1}\} \subset \mathbb{R}^D$. Each vector $x_i$ is assigned to its nearest centroid via

k(i) = \arg\min_{k=1,\ldots,K_1} \|x_i - c^{(1)}_k\|_2^2,

with the first-stage reproduction $q_1(x_i) = c^{(1)}_{k(i)}$. The residual vector is defined as $r_i = x_i - c^{(1)}_{k(i)}$. The second-stage codebook $C^{(2)} = \{c^{(2)}_1, \ldots, c^{(2)}_{K_2}\}$ is learned by applying k-means clustering to the collection of residuals $\{r_i\}$. Each residual $r_i$ is then assigned:

l(i) = \arg\min_{\ell=1,\ldots,K_2} \|r_i - c^{(2)}_\ell\|_2^2, \quad q_2(r_i) = c^{(2)}_{l(i)}.

The complete two-stage quantizer reconstructs

Q(x_i) = c^{(1)}_{k(i)} + c^{(2)}_{l(i)},

minimizing the mean squared error (MSE)

\mathrm{MSE}_{\mathrm{RQ}} = \frac{1}{N} \sum_{i=1}^N \|x_i - Q(x_i)\|_2^2 = \frac{1}{N} \sum_{i=1}^N \|r_i - c^{(2)}_{l(i)}\|_2^2.
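
To make the construction concrete, the following Python sketch (illustrative only; the use of scikit-learn's KMeans, the toy codebook sizes, and names such as train_rq are assumptions, not a reference implementation) trains both stages and evaluates the MSE above.

```python
import numpy as np
from sklearn.cluster import KMeans

def train_rq(X, K1=64, K2=64, seed=0):
    """Two-stage residual quantizer: coarse codebook plus residual codebook."""
    # Stage 1: coarse k-means codebook C^(1) and assignments k(i)
    km1 = KMeans(n_clusters=K1, n_init=4, random_state=seed).fit(X)
    C1, k_idx = km1.cluster_centers_, km1.labels_

    # Residuals r_i = x_i - c^(1)_{k(i)}
    R = X - C1[k_idx]

    # Stage 2: k-means codebook C^(2) on the pooled residuals, assignments l(i)
    km2 = KMeans(n_clusters=K2, n_init=4, random_state=seed).fit(R)
    C2, l_idx = km2.cluster_centers_, km2.labels_
    return C1, C2, k_idx, l_idx

def rq_mse(X, C1, C2, k_idx, l_idx):
    """Mean squared error of the two-stage reconstruction Q(x) = c1 + c2."""
    Q = C1[k_idx] + C2[l_idx]
    return np.mean(np.sum((X - Q) ** 2, axis=1))

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    X = rng.standard_normal((10_000, 64)).astype(np.float32)  # toy data
    C1, C2, k_idx, l_idx = train_rq(X)
    print("RQ MSE:", rq_mse(X, C1, C2, k_idx, l_idx))
```

A stored code is just the index pair (k(i), l(i)), i.e. roughly log2 K1 + log2 K2 bits per vector, while the effective codebook has K1 K2 reproduction values.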

2. Transformed Residual Quantization: Objective and Model Enhancement

In ordinary RQ, the residuals from each first-stage cluster generally exhibit heterogeneous orientations and scales. TRQ addresses this by learning a cluster-specific orthogonal transformation $T_k$ that rotates the residuals of cluster $k$ into a common frame before second-stage quantization. The reconstruction becomes

Q_{\mathrm{TRQ}}(x_i) = c^{(1)}_{k(i)} + T_{k(i)}^\top c^{(2)}_{l(i)},

with each $T_k$ constrained to be orthogonal: $T_k^\top T_k = I$ for all $k$. The second-stage codebook quantizes the rotated residual $T_{k(i)} r_i$, and the inverse rotation $T_{k(i)}^\top$ maps the selected codeword back to the original space.

The joint minimization objective for the first-stage codebook $C^{(1)}$, the second-stage codebook $C^{(2)}$, and the transforms $\{T_k\}$ is

\min_{C^{(1)}, C^{(2)}, \{T_k\}} \sum_{i=1}^N \left\| T_{k(i)}\,(x_i - c^{(1)}_{k(i)}) - c^{(2)}_{l(i)} \right\|_2^2,

subject to $T_k^\top T_k = I$ for each $k$.
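
A minimal sketch of TRQ encoding and decoding under this model, assuming codebooks C1, C2 and a stack T of per-cluster orthogonal matrices have already been learned (all names are illustrative):

```python
import numpy as np

def trq_encode(x, C1, C2, T):
    """Encode one vector: coarse index k(i), then index l(i) of the rotated residual."""
    k = int(np.argmin(np.sum((C1 - x) ** 2, axis=1)))    # nearest coarse centroid
    r_rot = T[k] @ (x - C1[k])                           # rotated residual T_k r_i
    l = int(np.argmin(np.sum((C2 - r_rot) ** 2, axis=1)))
    return k, l

def trq_decode(k, l, C1, C2, T):
    """Reconstruction Q_TRQ(x) = c^(1)_k + T_k^T c^(2)_l (T_k orthogonal)."""
    return C1[k] + T[k].T @ C2[l]
```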

3. Alternating Optimization and Training Procedure

TRQ optimization employs block-coordinate descent with two alternating steps:

a. Codebook Update:

Fix the transformations $\{T_k\}$ and update $C^{(2)}$ and the assignments $\{l(i)\}$. For each residual cluster $k$, one computes the transformed residuals $V_k' = \{T_k r_i : k(i) = k\}$, pools them across clusters, and applies k-means (or a product quantizer) to obtain $C^{(2)}$. The transformed residuals are then assigned to their nearest second-stage centroids (see the sketch after this step).
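
A possible implementation of this codebook-update step, assuming residuals R, first-stage assignments k_idx, and the current transforms T are held in NumPy arrays (names and the use of scikit-learn's KMeans are illustrative assumptions):

```python
import numpy as np
from sklearn.cluster import KMeans

def update_codebook(R, k_idx, T, K2, seed=0):
    """Codebook update with the transforms held fixed.

    R     : (N, D)     residuals r_i = x_i - c^(1)_{k(i)}
    k_idx : (N,)       first-stage assignments k(i)
    T     : (K1, D, D) current per-cluster orthogonal transforms
    """
    R_rot = np.einsum('nij,nj->ni', T[k_idx], R)  # rows are T_{k(i)} r_i
    km = KMeans(n_clusters=K2, n_init=4, random_state=seed).fit(R_rot)
    return km.cluster_centers_, km.labels_        # new C^(2) and assignments l(i)
```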

b. Transform Update:

Fix $C^{(2)}$ and the assignments, and update each $T_k$ by solving an orthogonal Procrustes problem. Let $V_k$ be the $D \times n_k$ matrix whose columns are cluster $k$'s residuals and $\widehat{W}_k$ the matrix of their corresponding second-stage centroids $c^{(2)}_{l(i)}$. The update is:

T_k = \arg\min_{\Omega^\top \Omega = I} \|\Omega V_k - \widehat{W}_k\|_F^2.

This is solved via the SVD of the cross-covariance $M_k = \widehat{W}_k V_k^\top = A_k \Sigma_k B_k^\top$, giving $T_k = A_k B_k^\top$ (a sketch of this step follows below).
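
The per-cluster Procrustes update can be written directly from the SVD above; the following sketch (NumPy only, illustrative names) returns the optimal orthogonal $T_k$ for one cluster:

```python
import numpy as np

def update_transform(Vk, Wk):
    """Orthogonal Procrustes: argmin over orthogonal Omega of ||Omega Vk - Wk||_F.

    Vk : (D, n_k) residuals of cluster k as columns
    Wk : (D, n_k) their assigned second-stage centroids c^(2)_{l(i)} as columns
    """
    M = Wk @ Vk.T                # cross-covariance W_hat_k V_k^T, shape (D, D)
    A, _, Bt = np.linalg.svd(M)  # M = A Sigma B^T
    return A @ Bt                # optimal rotation T_k = A B^T
```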

A few dozen iterations typically suffice for convergence in practice (Yuan et al., 2015).

4. Quantization Error and Empirical Results

Quantization error in TRQ and its predecessors is measured via MSE:

  • Ordinary RQ:

\mathrm{MSE}_{\mathrm{RQ}} = \frac{1}{N} \sum_i \| r_i - c^{(2)}_{l(i)} \|^2

  • TRQ:

\mathrm{MSE}_{\mathrm{TRQ}} = \frac{1}{N} \sum_i \| T_{k(i)} r_i - c^{(2)}_{l(i)} \|^2

where orthogonality of $T_k$ ensures $\|r_i - c\| = \|T_k r_i - T_k c\|$ for any codeword $c$: the rotation does not distort distances, but it markedly improves the alignment between the pooled residuals and the shared second-stage codebook (a small numerical check follows below).
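
A quick numerical check of this identity, using a random orthogonal matrix generated via QR factorization (purely illustrative):

```python
import numpy as np

rng = np.random.default_rng(0)
D = 16
T, _ = np.linalg.qr(rng.standard_normal((D, D)))  # random orthogonal matrix
r = rng.standard_normal(D)                        # a residual
c = rng.standard_normal(D)                        # a second-stage codeword
# Orthogonality preserves distances: the two printed values match
print(np.linalg.norm(r - c), np.linalg.norm(T @ r - T @ c))
```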

Empirical results indicate substantial error reductions for TRQ versus optimized product quantization (OPQ): on SIFT1M, MSE is reduced by approximately 25%, and on MNIST reductions reach up to 50%. GIST1M, whose features are already close to isotropic, sees a milder MSE improvement of around 10% (Yuan et al., 2015).

TRQ demonstrates significant performance enhancements in ANN search, particularly with inverted-index search frameworks. Because query-time cost is dominated by the number of first-stage cells visited, the additional cost of evaluating a small number of $D \times D$ orthogonal projections (one per active cluster) is minimal.
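
As an illustration of the query-time flow, the sketch below (hypothetical helper search_cell; array names are assumptions) scans one inverted-index cell: the query residual is rotated once by $T_k$, a $K_2$-entry distance table is built, and stored codes are ranked by table lookup. Because $\|q - (c^{(1)}_k + T_k^\top c^{(2)}_l)\| = \|T_k(q - c^{(1)}_k) - c^{(2)}_l\|$, the lookup ranks candidates by their true asymmetric distances.

```python
import numpy as np

def search_cell(q, k, C1, C2, T, cell_codes, cell_ids):
    """Scan one inverted-index cell k for query q with asymmetric distances.

    cell_codes : (n_k,) second-stage code l(i) for each vector stored in cell k
    cell_ids   : (n_k,) database ids of those vectors
    """
    q_rot = T[k] @ (q - C1[k])                 # single D x D rotation per probed cell
    table = np.sum((C2 - q_rot) ** 2, axis=1)  # distances to every c^(2)_l, shape (K2,)
    dists = table[cell_codes]                  # per-vector distance by table lookup
    order = np.argsort(dists)                  # rank candidates within the cell
    return cell_ids[order], dists[order]
```

A full search would probe the w nearest coarse centroids for the query and merge the per-cell candidate lists.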

On the SIFT1B benchmark with 16-byte codes and recall@1 measured at shortlist sizes T = 10,000 and T = 30,000, results included:

  • T = 10,000: OPQ R@1 = 0.359; TRQ R@1 = 0.426 (+8%)
  • T = 30,000: OPQ R@1 = 0.379; TRQ R@1 = 0.446 (+7%)

Similar gains appear for recall@10 and recall@50. On medium-scale datasets (SIFT1M, GIST1M, MNIST), TRQ increases Recall@1 by 5–10 percentage points over OPQ and by 7–12 points over vanilla PQ (Yuan et al., 2015).

5. Comparative Significance and Interpretations

The critical advance of TRQ is the explicit per-cluster alignment of residual distributions prior to the second-stage quantization, realized via orthogonal transformations that enable more effective codebook partitioning. This produces substantially lower quantization error and commensurate improvements in ANN search recall, especially where residuals are anisotropic across clusters.

The improvement magnitude is contingent on the structure of residual spaces: greater diversity in residual orientation or scale across clusters favors TRQ. On datasets where intrinsic isotropy is high, impact is less pronounced. A plausible implication is that further gains may be possible by hybridizing with or extending other vector quantization techniques, particularly in highly structured data environments (Yuan et al., 2015).
