FusedANN: Hybrid Filtered ANNS Methods

Updated 26 March 2026
  • FusedANN is a hybrid approach that integrates filtering constraints into ANN search, effectively addressing low-selectivity challenges in high-dimensional spaces.
  • It combines filter-then-search, search-then-filter, and hybrid indexing paradigms to enhance recall and query speed under stringent filtering conditions.
  • The method leverages state-of-the-art index structures and dynamic parameter tuning to optimize performance in semantic retrieval and vector database applications.

Low-selectivity filtered Approximate Nearest Neighbor Search (ANNS) refers to the task of finding the top-$k$ nearest vectors, by some distance metric, to a query vector, subject to additional filtering constraints (such as labels, ranges, or complex predicates), in the regime where the filter selectivity $s$ (the fraction of database points passing the predicate) is very small, typically $s \leq 1\%$ or even $s \ll 1\%$. This regime poses distinct algorithmic and systems challenges because qualifying candidates are sparse and fragmented in high-dimensional space. Effective and efficient methods for low-selectivity filtered ANNS are of central importance for semantic retrieval, retrieval-augmented generation (RAG), and vector database systems with structured or metadata-based constraints.

1. Formalization and Selectivity Metrics

Let $S$ be a dataset of $n$ vectors in $\mathbb{R}^d$, each accompanied by structured metadata $A_x$ (labels, attributes, timestamps, etc.). A filter predicate $\sigma$ defines a subset $P_\sigma = \{x \in S \mid x \text{ satisfies } \sigma\}$, with selectivity $s = |P_\sigma| / n$ (Jin et al., 3 Jan 2026, Shi et al., 9 Sep 2025, Amanbayev et al., 11 Feb 2026). A low-selectivity filtered-ANNS query thus seeks:

  • Top-$k$ nearest neighbors to a query vector $q$, restricted to $x \in P_\sigma$.

Selectivity notation:

  • Global selectivity $\sigma_g = |P_\sigma|/n$
  • Local selectivity for query $q$, $\sigma_\ell = |\{\text{valid in } \mathcal{N}_q\}| / k$, with $\mathcal{N}_q$ the true $k$-nearest neighbors of $q$ in $S$
  • GLS correlation per query: $\rho_q = (\sigma_\ell/\sigma_g - 1)/(\sigma_\ell/\sigma_g + 1) \in [-1,1)$ (Amanbayev et al., 11 Feb 2026)
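The three quantities above can be computed directly for a single query. The following is a minimal sketch that uses an exact scan to obtain the true $k$ nearest neighbors; the function names `sq_dist` and `selectivity_metrics` and the toy data are illustrative assumptions, not part of any cited system.

```python
def sq_dist(a, b):
    """Squared Euclidean distance between two equal-length sequences."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def selectivity_metrics(vectors, passes, query, k):
    """Return (sigma_g, sigma_l, rho_q) for one query.

    vectors: list of equal-length float lists (the dataset S)
    passes:  list of bools; passes[i] is True if vectors[i] satisfies the predicate
    """
    n = len(vectors)
    sigma_g = sum(passes) / n  # global selectivity |P_sigma| / n
    if sigma_g == 0:
        raise ValueError("predicate matches no points; rho_q is undefined")
    # True (unfiltered) k nearest neighbours of the query, found by exact scan.
    knn = sorted(range(n), key=lambda i: sq_dist(vectors[i], query))[:k]
    sigma_l = sum(passes[i] for i in knn) / k  # local selectivity
    ratio = sigma_l / sigma_g
    rho_q = (ratio - 1) / (ratio + 1)  # GLS correlation
    return sigma_g, sigma_l, rho_q


# Toy usage: ten points on a line; only the two left-most pass the filter.
toy_vectors = [[float(i)] for i in range(10)]
toy_passes = [i < 2 for i in range(10)]
print(selectivity_metrics(toy_vectors, toy_passes, query=[0.0], k=2))
```

A positive $\rho_q$ means qualifying points are over-represented near the query relative to the dataset as a whole; a value of $-1$ means none of the true neighbors pass the filter.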

In the low-selectivity regime ($s \ll 1$), the expected candidate set after pure pre-filtering has size $s n$, and search/filter hybrids typically retrieve $l \gg k$ raw candidates, with $l s \gtrsim k$ (Shi et al., 9 Sep 2025).

2. Algorithmic Paradigms and Structures

A comprehensive taxonomy of filtered ANNS algorithms distinguishes three main paradigms according to the interplay of index construction and filtering (Shi et al., 9 Sep 2025, Li et al., 22 Aug 2025):

  1. Filter-Then-Search: Explicitly selects the subset $P_\sigma$ before running ANNS. Examples: UNG (Unified Navigating Graph), which materializes a pre-filtered candidate set and then applies graph search, and ACORN with pre-filtering (Shi et al., 9 Sep 2025).
  2. Search-Then-Filter: Runs standard ANNS on the whole dataset, then filters the results post hoc. Examples: HNSW or IVFPQ with post-filtering (Shi et al., 9 Sep 2025). Performance degrades significantly as $s \to 0$ because too few valid candidates survive, requiring $l \approx k / s$ retrieved candidates.
  3. Hybrid/Integrated Filtering: Integrates filter awareness into indexing and search. Examples:
    • Stitched and Filtered-DiskANN: Build Vamana graphs with label-limited edges or stitched subgraphs.
    • JAG (Joint Attribute Graphs): Builds proximity graphs with continuous filter/attribute distances for guidance, robust to arbitrary $s$ and diverse predicates (Xu et al., 10 Feb 2026).
    • Curator: Hierarchical partition-based index with embedded per-label/predicate buffers and Bloom filters for precise and efficient filtering at low selectivity (Jin et al., 3 Jan 2026).

These strategies contrast in their cost scaling, robustness, and recall under diminishing $s$.
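To make the first two paradigms concrete, the sketch below contrasts them on the same data. An exact scan stands in for the ANN index (an assumption made to keep the example self-contained), and the names `filter_then_search`, `search_then_filter`, and the `slack` multiplier are hypothetical. The post-filter variant over-fetches roughly $k/s$ raw candidates, which is why its cost grows as the selectivity shrinks: with $k = 10$ and $s = 0.001$, it needs on the order of $10^4$ candidates.

```python
import math
import random


def sq_dist(a, b):
    """Squared Euclidean distance between two equal-length sequences."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def filter_then_search(vectors, passes, query, k):
    """Pre-filtering: restrict to qualifying ids, then rank only those."""
    qualifying = [i for i in range(len(vectors)) if passes[i]]
    qualifying.sort(key=lambda i: sq_dist(vectors[i], query))
    return qualifying[:k]


def search_then_filter(vectors, passes, query, k, s, slack=1.0):
    """Post-filtering: fetch about slack * k / s unfiltered neighbours, then filter."""
    if s <= 0:
        raise ValueError("selectivity estimate s must be positive")
    n = len(vectors)
    fetch = min(n, math.ceil(slack * k / s))  # over-fetch size l
    nearest = sorted(range(n), key=lambda i: sq_dist(vectors[i], query))[:fetch]
    return [i for i in nearest if passes[i]][:k]


# Toy usage on random data with roughly 1% selectivity.
rng = random.Random(0)
data = [[rng.random() for _ in range(8)] for _ in range(2000)]
mask = [rng.random() < 0.01 for _ in range(2000)]
q = [0.5] * 8
pre = filter_then_search(data, mask, q, k=10)
post = search_then_filter(data, mask, q, k=10, s=0.01)
print(len(pre), len(post))  # post may come back short when l was too small
```

Because both functions rank by the same distance, the post-filtered result is always a prefix of the pre-filtered one; any missing tail is the recall lost to an over-fetch that was too small.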

3. Breakdown of Classical Structures under Low Selectivity

Graph-Based Indexes

Graph indexes (HNSW, DiskANN, ACORN) rely on high local connectivity. As $s$ decreases, the induced subgraph on $P_\sigma$ fragments, leaving many qualifying vectors unreachable by traversal (Jin et al., 3 Jan 2026, Amanbayev et al., 11 Feb 2026, Li et al., 22 Aug 2025). Remedying this by increasing the average degree $M \to m/s$ becomes infeasible ($O(n m/s)$ construction and $O(n m/s)$ memory), and even specialized segmentation approaches (e.g., edge covering, multi-entry points) fail below $s \approx 0.1\%$ to $1\%$ (Li et al., 22 Aug 2025).
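The fragmentation effect can be observed on a toy proximity graph. The sketch below builds an exact $m$-nearest-neighbor graph (a simplified stand-in for HNSW or Vamana, not either algorithm), keeps a random subset of nodes at a given selectivity, and reports what fraction of the kept nodes sit in the largest connected component of the induced subgraph. All names and parameter values here are illustrative assumptions.

```python
import random


def sq_dist(a, b):
    """Squared Euclidean distance between two equal-length sequences."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def knn_graph(points, m):
    """Undirected graph linking each point to its m exact nearest neighbours."""
    n = len(points)
    adj = [set() for _ in range(n)]
    for i in range(n):
        dists = sorted((sq_dist(points[i], points[j]), j) for j in range(n) if j != i)
        for _, j in dists[:m]:
            adj[i].add(j)
            adj[j].add(i)
    return adj


def largest_component_fraction(adj, keep):
    """Share of kept nodes in the largest component of the induced subgraph."""
    keep = set(keep)
    if not keep:
        return 0.0
    seen, best = set(), 0
    for start in keep:
        if start in seen:
            continue
        seen.add(start)
        stack, size = [start], 0
        while stack:  # depth-first traversal restricted to kept nodes
            u = stack.pop()
            size += 1
            for v in adj[u]:
                if v in keep and v not in seen:
                    seen.add(v)
                    stack.append(v)
        best = max(best, size)
    return best / len(keep)


# Toy usage: the reachable share typically drops as selectivity shrinks.
rng = random.Random(0)
pts = [[rng.random() for _ in range(4)] for _ in range(300)]
graph = knn_graph(pts, m=8)
for s in (0.5, 0.1, 0.02):
    kept = rng.sample(range(len(pts)), max(1, int(s * len(pts))))
    print(s, round(largest_component_fraction(graph, kept), 3))
```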

Partitioned and Inverted File Structures

Partition-based (IVFFlat, IVFPQ) indexes directly support pre-filtering: cluster selection or inverted lists can be efficiently intersected with filter results. Empirically, IVFPQ/IVFFlat maintain robust query latency and recall at $s \ll 1\%$ where graph-based methods collapse (Amanbayev et al., 11 Feb 2026, Li et al., 22 Aug 2025).
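A minimal sketch of how an inverted-file index can intersect its lists with a filter is given below. It assumes the centroids are supplied by the caller (a real IVF index would train them with k-means) and stores raw vectors rather than PQ codes; `build_ivf` and `ivf_filtered_search` are hypothetical names, not the API of any library.

```python
def sq_dist(a, b):
    """Squared Euclidean distance between two equal-length sequences."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def build_ivf(vectors, centroids):
    """Assign every vector id to the inverted list of its nearest centroid."""
    lists = [[] for _ in centroids]
    for i, v in enumerate(vectors):
        nearest = min(range(len(centroids)), key=lambda j: sq_dist(v, centroids[j]))
        lists[nearest].append(i)
    return lists


def ivf_filtered_search(vectors, centroids, lists, passes, query, k, nprobe):
    """Probe the nprobe closest lists, scoring only ids that pass the filter."""
    probe = sorted(range(len(centroids)), key=lambda j: sq_dist(query, centroids[j]))[:nprobe]
    # Intersect each probed list with the filter before any distance is computed.
    candidates = [i for c in probe for i in lists[c] if passes[i]]
    candidates.sort(key=lambda i: sq_dist(vectors[i], query))
    return candidates[:k]


# Toy usage: two clusters on a line, one qualifying point in each.
vecs = [[0.0], [1.0], [2.0], [10.0], [11.0], [12.0]]
cents = [[1.0], [11.0]]
inv = build_ivf(vecs, cents)
flags = [False, True, False, False, False, True]
print(ivf_filtered_search(vecs, cents, inv, flags, [0.0], k=2, nprobe=2))
```

The number of distance computations is bounded by the qualifying ids in the probed lists, so the scan cost shrinks with $s$ instead of growing.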

Hashing/LSH-Based Indexes

Hashing approaches such as Falconn++ handle low selectivity through aggressive bucket filtering. Falconn++ applies a projection-based filter $F(x) = [r_{i^*} \cdot x \geq t]$ per bucket, keeping only an $\alpha$-fraction of points, which substantially reduces the candidate pool and provably lowers the query-time exponent to $\rho' < \rho$ (Pham et al., 2022). This enables scaling to much lower selectivity than classical LSH.
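The bucket filter $F(x) = [r_{i^*} \cdot x \geq t]$ can be illustrated with a heavily simplified sketch. Two assumptions are made here that are not taken from the source: a point is assigned to the bucket of the direction with the largest projection, and the threshold $t$ is set per bucket as the quantile that retains about an $\alpha$-fraction of its members. The function name `build_filtered_buckets` is hypothetical, and this is not the Falconn++ implementation.

```python
import math
import random


def dot(a, b):
    """Inner product of two equal-length sequences."""
    return sum(x * y for x, y in zip(a, b))


def build_filtered_buckets(points, directions, alpha):
    """Hash points by their arg-max projection, then keep the top alpha-fraction."""
    buckets = {}
    for idx, x in enumerate(points):
        projections = [dot(r, x) for r in directions]
        i_star = max(range(len(directions)), key=lambda i: projections[i])
        buckets.setdefault(i_star, []).append((projections[i_star], idx))
    filtered = {}
    for i_star, members in buckets.items():
        members.sort(reverse=True)  # largest projection first
        keep = max(1, math.ceil(alpha * len(members)))
        t = members[keep - 1][0]  # per-bucket threshold (assumed quantile rule)
        filtered[i_star] = [idx for proj, idx in members if proj >= t]
    return filtered


# Toy usage with random Gaussian directions.
rng = random.Random(0)
dirs = [[rng.gauss(0, 1) for _ in range(8)] for _ in range(16)]
pts = [[rng.gauss(0, 1) for _ in range(8)] for _ in range(1000)]
kept = build_filtered_buckets(pts, dirs, alpha=0.1)
print(sum(len(ids) for ids in kept.values()), "of", len(pts), "points retained")
```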

Tree-Based Partitioning

Curator constructs a global hierarchical $k$-means tree indexing all data, embedding per-label/predicate subindexes via buffers/Bloom filters to support low-selectivity filtering with minimal memory and update overhead (Jin et al., 3 Jan 2026). Tree expansions are sharply bounded in practice, and construction cost is $O(n d \log_C(n/L))$.

4. Filter Types, Cost Models, and Empirical Benchmarks

Supported Filter Types

Advanced methods address various filter types:

  • Equality (label/attribute)
  • Range (numerical, e.g., date intervals)
  • Subset/Containment (multi-label, tag inclusion)
  • Boolean/Complex predicates

JAG is notable for transforming each binary filter $g(a,f)$ into a continuous filter distance $d_F(a,f)$, enabling lexicographically guided search and connectivity smoothing across low-selectivity regimes (Xu et al., 10 Feb 2026).

Cost and Recall Scaling

| Method | Query Time Scaling at Low $s$ | Recall / Performance | Notes |
| --- | --- | --- | --- |
| HNSW post-filter | $O(s^{-\delta})$, $\delta \approx 1$ | Recall@10 collapses for $s \ll 1\%$ | Graph disconnects |
| IVFPQ/IVFFlat | $O(n s)$ | Recall remains stable at $s \ll 1\%$ | Partition pruning |
| Curator | $O((1/s)^\theta)$, $\theta < 1$ | QPS $\sim 20\times$ baseline at $s = 0.1\%$ | Hier. partition+buffers |
| JAG | $O(\log n)$ hops via multi-threshold edges | Recall $> 0.95$ at $s < 10^{-3}$ | Filter-agnostic |
| Falconn++ | $O(d n^{\rho'})$, $\rho' < \rho$ | Empirically 3–10$\times$ faster than Falconn | LSH/filtered-bucket |

Empirical studies confirm that:

5. State-of-the-Art Methods: Constructions and Innovations

Curator

Curator's dual-index combines a global tree (hierarchical $k$-means), per-label/predicate leaf buffers, and Bloom filters at each node. Queries traverse only nodes likely to contain qualifying candidates. For complex predicates, it constructs a temporary subindex mirroring the global tree structure. Curator achieves up to $20.9\times$ query speedup at $s = 0.001$ with only $5.5\%$ build time and $4.3\%$ memory overhead (Jin et al., 3 Jan 2026).
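The idea of pruning tree traversal with per-node Bloom filters can be sketched as follows. This is a toy illustration under several assumptions, not Curator's implementation: the tree is built by hand rather than by hierarchical $k$-means, vector distances are ignored so that only the filter-pruning step is shown, and the classes `BloomFilter` and `Node` and the function `collect` are hypothetical.

```python
import hashlib


class BloomFilter:
    """Tiny Bloom filter stored in a Python int used as a bit set."""

    def __init__(self, num_bits=256, num_hashes=3):
        self.num_bits = num_bits
        self.num_hashes = num_hashes
        self.bits = 0

    def _positions(self, item):
        for seed in range(self.num_hashes):
            digest = hashlib.sha256(f"{seed}:{item}".encode()).digest()
            yield int.from_bytes(digest[:8], "big") % self.num_bits

    def add(self, item):
        for pos in self._positions(item):
            self.bits |= 1 << pos

    def might_contain(self, item):
        # False means definitely absent; True may be a false positive.
        return all((self.bits >> pos) & 1 for pos in self._positions(item))


class Node:
    """Tree node whose Bloom filter summarises every label in its subtree."""

    def __init__(self, children=None, items=None):
        self.children = children or []
        self.items = items or []  # leaf payload: (vector_id, label) pairs
        self.bloom = BloomFilter()
        for _, label in self.items:
            self.bloom.add(label)
        for child in self.children:
            self.bloom.bits |= child.bloom.bits  # union of the children's filters


def collect(node, label):
    """Gather ids carrying `label`, skipping subtrees the Bloom filter rules out."""
    if not node.bloom.might_contain(label):
        return []
    if not node.children:
        return [vid for vid, lab in node.items if lab == label]  # exact check at leaf
    found = []
    for child in node.children:
        found.extend(collect(child, label))
    return found


# Toy usage: a two-leaf tree.
left = Node(items=[(0, "red"), (1, "blue")])
right = Node(items=[(2, "blue")])
root = Node(children=[left, right])
print(collect(root, "blue"))
```

Since a Bloom filter has no false negatives, pruning never drops a qualifying id; a false positive only costs an extra subtree visit, which the exact check at the leaf then resolves.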

JAG

JAG generalizes graph-based methods by introducing filter/attribute distances and constructing multi-threshold proximity graphs. At query time, a lexicographic $(d_F, \mathrm{dist})$ ordering provides continuous search guidance, preventing dead-ends and unifying support for label, range, subset, and Boolean constraints. JAG is the first filter-agnostic proximity graph with empirical robustness across all $s$ and filter types (Xu et al., 10 Feb 2026).
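The lexicographic $(d_F, \mathrm{dist})$ ordering can be sketched for the case of a range filter. The specific form of the filter distance used here (zero inside the interval, otherwise the gap to the nearest endpoint) is an assumption chosen for illustration; the source does not spell out JAG's definition, and the function names are hypothetical.

```python
def sq_dist(a, b):
    """Squared Euclidean distance between two equal-length sequences."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def range_filter_distance(value, lo, hi):
    """Assumed d_F for a range filter: 0 inside [lo, hi], else gap to the interval."""
    if value < lo:
        return lo - value
    if value > hi:
        return value - hi
    return 0.0


def lexicographic_rank(candidates, query, lo, hi):
    """Order (vector, attribute) pairs by filter distance first, then vector distance."""
    return sorted(
        range(len(candidates)),
        key=lambda i: (
            range_filter_distance(candidates[i][1], lo, hi),
            sq_dist(candidates[i][0], query),
        ),
    )


# Toy usage: the closest vector (index 0) fails the range badly, so it ranks last.
cands = [([0.0], 50), ([5.0], 10), ([1.0], 12), ([0.5], 21)]
print(lexicographic_rank(cands, query=[0.0], lo=10, hi=20))
```

Candidates that nearly satisfy the filter still receive a finite, ordered score, which is what lets a graph traversal keep moving toward the qualifying region instead of stalling among non-matching neighbors.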

Falconn++

Falconn++ applies a low-selectivity filter within hash buckets by thresholding on the main projection coordinate, filtering bucket contents from $B$ down to $\alpha B$ points and trading candidate quantity for query time and recall in a controlled way. With carefully chosen parameters ($\alpha \approx 0.01$–$0.1$), Falconn++ exhibits a 3–10$\times$ speedup over Falconn and matches or outperforms HNSW at high recall (Pham et al., 2022).

6. Practical Guidelines and System Integration

Findings across recent studies yield several practical rules (Amanbayev et al., 11 Feb 2026, Shi et al., 9 Sep 2025):

  • Select index by expected selectivity: For $\sigma_g \lesssim 5\%$, partition/inverted-file methods or hybrid trees are preferred; for $\sigma_g \gtrsim 20\%$, graph-based methods are advantageous.
  • Parameterization: Graph degree, number of entry points, $n_\mathrm{probe}$, and ef-search must be judiciously tuned based on $s$ and $k$. In low-$s$ regimes, exact scan fallback may be optimal, especially if candidates are few.
  • System adaptations: Milvus employs hybrid execution and dual-priority queues for robust recall; pgvector in PostgreSQL can expose exact kNN plans via B-tree scans on filter columns, avoiding recall cliffs from post-filtering (Amanbayev et al., 11 Feb 2026).
  • Empirical cost modeling: For graph methods, $T_\mathrm{graph}(s)$ degrades super-linearly as $s \to 0$; for partition/IVF, $T(s) \approx O(n s)$.
  • Robustness to filter type: Only hybrid/integrated and filter-agnostic methods such as JAG and Curator exhibit uniform throughput and recall as $s \to 0$ regardless of filter complexity.
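The selection rules above can be encoded as a small planning heuristic. The sketch below uses the 5% and 20% thresholds from the guidelines; the `exact_scan_budget` cutoff, the strategy labels, and the handling of the unspecified middle band are assumptions added for illustration and would need to be calibrated per system.

```python
def choose_strategy(sigma_g, n, k, exact_scan_budget=10_000):
    """Pick a search strategy from global selectivity sigma_g and dataset size n.

    exact_scan_budget is a hypothetical cap on how many qualifying vectors
    are cheap enough to scan exhaustively.
    """
    if not 0.0 <= sigma_g <= 1.0:
        raise ValueError("sigma_g must lie in [0, 1]")
    qualifying = sigma_g * n  # expected size of P_sigma
    if qualifying <= max(k, exact_scan_budget):
        return "exact-scan"  # few candidates: brute force over P_sigma
    if sigma_g <= 0.05:
        return "partition-or-hybrid-tree"
    if sigma_g >= 0.20:
        return "graph"
    return "benchmark-both"  # the guidelines leave the 5%-20% band open


# Toy usage across selectivity regimes on a 10M-vector collection.
for sel in (0.0005, 0.03, 0.10, 0.50):
    print(sel, choose_strategy(sel, n=10_000_000, k=10))
```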

7. Open Challenges and Future Directions

Despite recent algorithmic progress, several open challenges remain (Li et al., 22 Aug 2025, Shi et al., 9 Sep 2025, Jin et al., 3 Jan 2026):

  • Dynamic index maintenance: Supporting efficient insertions, deletions, and updates across arbitrary filter predicates remains nontrivial, especially for graph and partition-based indices.
  • Auto-tuning: Determining optimal index and search/hyperparameters online as $s$ changes per query/dataset is an open engineering problem.
  • Universal filter-robust indices: While JAG demonstrates filter-agnostic robustness, generalizing these insights to support arbitrary, evolving predicate classes at scale is an ongoing research area.
  • Cost modeling and query planning: Accurate, closed-form models for $T(s)$ are needed for query optimizers in production vector databases and hybrid RAG pipelines.
  • Recall-latency tradeoff diagnostics: The GLS metric enables per-query analysis of recall loss risks, but integrating such diagnostics into production systems is only just emerging (Amanbayev et al., 11 Feb 2026).

In summary, low-selectivity filtered ANNS research has progressed from specialized and brittle solutions to robust, filter-agnostic hybrid and partition-based indices capable of maintaining high throughput and recall even at $s \ll 1\%$. Continued advances in theory, algorithm design, and system integration are essential for fully general, high-performance retrieval under arbitrary structured filtering constraints.
