Proximity Graphs with Filter Support

Updated 6 April 2026
  • Proximity graphs with filter support are advanced data structures that embed user-defined filters directly into graph indexing for constraint-driven similarity search.
  • Key methodologies include filter-agnostic online filtering, segment-tree based multi-graph approaches, and joint attribute graphs that combine vector and filter distances.
  • Empirical evaluations show these approaches offer significant efficiency gains in range-filtered ANN and durable pattern mining, achieving robust performance under varied constraints.

A proximity graph with filter support is a graph-based data structure or index designed to efficiently answer similarity or pattern queries under user-specified constraints or filters. These filters often operate over attribute metadata, lifespans, or arbitrary user predicates, and are integrated directly into the graph construction or traversal algorithms. This approach enables single-stage search processes for tasks such as attribute-constrained nearest neighbor search, range-filtered similarity search, temporal pattern mining under durability constraints, and robust support for arbitrary filter types and selectivities. The following sections review the principal methodologies and theoretical innovations underpinning proximity graphs with filter support, representative solution frameworks, analytical properties, and empirical findings in the literature.

1. Formal Problem Models and Taxonomy

The general proximity graph with filter support problem can be formalized as follows:

  • Given a dataset of vectors $\{v_i\} \subset \mathbb{R}^d$ and corresponding attribute tuples $(a_{i1},\ldots,a_{im})$ or temporal labels,
  • Given a dissimilarity function $\mathrm{dist}(\cdot,\cdot)$ (commonly Euclidean or cosine),
  • Given a filter or predicate $f: V \to \{\mathrm{true}, \mathrm{false}\}$, often with structural or semantic domain,
  • Find the top $K$ nearest neighbors to a query $q$ among those $v$ satisfying $f(v)=\mathrm{true}$, or enumerate subgraphs that satisfy additional filter conditions (such as temporal durability or aggregate attribute coverage) (Zhao et al., 2022, Agarwal et al., 2024, Xu et al., 2024, Xu et al., 10 Feb 2026).
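The formal model above can be made concrete with an exact brute-force baseline, which is the reference answer that any filtered index approximates. This is a minimal sketch: the function name `filtered_knn_bruteforce`, the dictionary-based attributes, and the toy data are assumptions made for illustration and do not come from the cited works.

```python
import heapq
import math

def filtered_knn_bruteforce(vectors, attrs, query, k, predicate):
    """Exact top-k nearest neighbors among points whose attributes satisfy `predicate`."""
    candidates = (
        (math.dist(v, query), i)          # Euclidean distance, then index as tie-breaker
        for i, (v, a) in enumerate(zip(vectors, attrs))
        if predicate(a)                   # f(v) = true
    )
    return [i for _, i in heapq.nsmallest(k, candidates)]

# Toy data (hypothetical): four 2-D points with a numeric "price" attribute.
vectors = [(0.0, 0.0), (1.0, 0.0), (0.0, 2.0), (3.0, 3.0)]
attrs = [{"price": 5}, {"price": 50}, {"price": 20}, {"price": 30}]

# Range filter 10 <= price <= 40 excludes the two points closest to the query.
result = filtered_knn_bruteforce(
    vectors, attrs, (0.0, 0.0), k=2, predicate=lambda a: 10 <= a["price"] <= 40
)
print(result)  # [2, 3]
```

The scan costs one predicate call and one distance computation per point, which is why the graph-based methods below try to avoid touching most of the dataset.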

These filter types include:

  • Arbitrary user predicates: as in AIRSHIP, where ff is a black-box function (Zhao et al., 2022).
  • Numeric attribute ranges: e.g., RFANN queries over a range $[a_l, a_r]$ (Xu et al., 2024).
  • Equality, Boolean, and subset-based filters: e.g., label, tag or attribute conjunction filters (Xu et al., 10 Feb 2026).
  • Temporal durability constraints: where both structure and persistence (e.g., intersection of lifespans) define valid subgraphs (Agarwal et al., 2024).

A table of prominent scenarios:

| Task Type | Filter Type | Key Reference |
| --- | --- | --- |
| Constrained ANN | Boolean/user predicate | (Zhao et al., 2022) |
| Range-filtered ANN | Numeric range | (Xu et al., 2024) |
| Robust filtered ANN (multi-type) | Arbitrary | (Xu et al., 10 Feb 2026) |
| Durable pattern mining | Temporal/lifespan | (Agarwal et al., 2024) |

2. Graph Structures: Index Construction and Attribute Integration

Three primary strategies have emerged to couple proximity graph construction with filter-awareness:

2.1 Filter-Agnostic Graph with Online Filtering

Classical proximity graphs (e.g., HNSW or Vamana) are constructed ignoring filters; filters are enforced only during query traversal or as a post-processing step. This allows arbitrary filters without index change but incurs efficiency and recall penalties under selective filters (Zhao et al., 2022).
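The recall penalty of post-filtering under selective filters can be seen in a small experiment. In this sketch an exact scan stands in for the graph search (an assumption made to keep the example self-contained; a real system would call an HNSW or Vamana index here), and the `oversample` parameter and toy data are hypothetical.

```python
import heapq
import math

def unfiltered_topk(vectors, query, k):
    # Stand-in for a filter-agnostic ANN graph search; exact here for simplicity.
    return heapq.nsmallest(k, range(len(vectors)), key=lambda i: math.dist(vectors[i], query))

def post_filtered_search(vectors, passes, query, k, oversample):
    """Fetch k * oversample unfiltered neighbors, then keep those that pass the filter."""
    pool = unfiltered_topk(vectors, query, k * oversample)
    return [i for i in pool if passes[i]][:k]

# 1-D toy data: point i sits at position i; only every 10th point passes the filter.
vectors = [(float(i),) for i in range(100)]
passes = [i % 10 == 9 for i in range(100)]
query = (0.0,)

few = post_filtered_search(vectors, passes, query, k=3, oversample=2)
enough = post_filtered_search(vectors, passes, query, k=3, oversample=10)
print(few)     # [] : none of the 6 fetched neighbors passes the filter
print(enough)  # [9, 19, 29]
```

With 10% selectivity, an oversampling factor of 2 returns nothing, while a factor of 10 recovers the answer at the cost of ten times more retrieved candidates.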

2.2 Multi-Graph or Segment-Tree Approaches

For range or categorical filters, a collection of "elemental" proximity graphs is precomputed, each indexing a subset (segment) of the dataset according to the attribute's value, such as segment tree partitions for contiguous ranges. At query time, a valid subgraph is dynamically assembled by on-the-fly union or traversal of relevant segment graphs. Space complexity is expressed in terms of the graph degree (Xu et al., 2024).
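The segment-tree partitioning mentioned above relies on the standard fact that any contiguous rank range can be tiled by a small number of tree nodes. The sketch below shows only that textbook decomposition over ranks sorted by attribute value; it is not the edge-selection procedure of iRangeGraph, and the function name `canonical_segments` is an assumption.

```python
def canonical_segments(lo, hi, node_lo, node_hi):
    """Return segment-tree nodes (inclusive rank intervals) that exactly tile [lo, hi]."""
    if hi < node_lo or node_hi < lo:
        return []                          # node is disjoint from the query range
    if lo <= node_lo and node_hi <= hi:
        return [(node_lo, node_hi)]        # node lies fully inside the query range
    mid = (node_lo + node_hi) // 2         # otherwise recurse into both children
    return (canonical_segments(lo, hi, node_lo, mid)
            + canonical_segments(lo, hi, mid + 1, node_hi))

# Eight points ranked 0..7 by attribute value; the query covers ranks 2..6.
segments = canonical_segments(2, 6, 0, 7)
print(segments)  # [(2, 3), (4, 5), (6, 6)]
```

Each returned interval corresponds to one precomputed segment, so a per-segment graph can be reused for every query range that contains it.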

2.3 Joint Attribute Graphs (JAG) Framework

Attributes are mapped to continuous "attribute distances" and "filter distances," producing a unified structure. At each construction layer, the typical vector distance is combined lexicographically with capped attribute distances under a set of thresholds. Edges are allocated and pruned such that the resulting index remains robust to different filter selectivities and types (Xu et al., 10 Feb 2026).
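One way to read the lexicographic combination is as a two-part sort key in which the attribute distance is capped at a threshold before comparison. The following sketch illustrates that reading only; the key layout, the function name `capped_key`, and the candidate values are assumptions and are not taken from the JAG paper.

```python
import math

def capped_key(vec_dist, attr_dist, threshold):
    """Lexicographic sort key: capped attribute distance first, vector distance second."""
    return (min(attr_dist, threshold), vec_dist)

# Hypothetical (vector distance, attribute distance) pairs for three candidate neighbors.
candidates = {"a": (0.2, 5.0), "b": (0.9, 0.0), "c": (0.5, 2.0)}

def ranking(threshold):
    return sorted(candidates, key=lambda n: capped_key(*candidates[n], threshold))

order_vector_only = ranking(0.0)       # cap at 0: attribute distance is ignored
order_mid = ranking(2.0)               # cap at 2: "a" and "c" tie on the attribute part
order_attr_first = ranking(math.inf)   # no cap: attribute distance dominates
print(order_vector_only, order_mid, order_attr_first)
# ['a', 'c', 'b'] ['b', 'a', 'c'] ['b', 'c', 'a']
```

Under this reading, varying the threshold moves the ordering between pure vector proximity and attribute-first proximity, which suggests why building with several thresholds could yield edges useful at different selectivities.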

In the temporal/durable graph case, filter support is enabled structurally through interval trees and cover trees/quadtree indexing over both vector and temporal/lifespan axes (Agarwal et al., 2024).

3. Filter-Aware Query Algorithms

Efficient query processing in proximity graphs with filter support leverages either explicit or implicit filter integration:

3.1 AIRSHIP: Constrained Search with User-Defined Functions

Search begins from sampled "seed" points known to satisfy $f$ (found by evaluating the filter on a sample of the dataset) and employs a two-directional traversal using two priority queues: one for filter-satisfied nodes and one for others. A fractional mixing-ratio heuristic enforces a balance between exploitation within filter-satisfying clusters and exploration into yet-unsatisfied neighborhoods. Nodes are added to result heaps only if $f(v)=\mathrm{true}$ (Zhao et al., 2022).
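The two-queue traversal can be sketched as follows. This is a simplified illustration loosely modeled on the description above, not the AIRSHIP algorithm itself: the deterministic alternation rule, the `ratio` and `budget` parameters, and the stopping condition are all assumptions.

```python
import heapq
import math

def two_queue_search(graph, vectors, passes, query, k, seeds, ratio=0.5, budget=50):
    """Best-first search with separate queues for filter-satisfying and other nodes."""
    dist = lambda i: math.dist(vectors[i], query)
    sat, other = [], []                    # min-heaps of (distance, node)
    visited = set(seeds)
    for s in seeds:
        heapq.heappush(sat if passes(s) else other, (dist(s), s))
    results = []                           # only filter-satisfying nodes land here
    sat_pops = total_pops = 0
    while (sat or other) and total_pops < budget:
        # Assumed rule: prefer the satisfied queue until it exceeds `ratio` of all pops.
        take_sat = bool(sat) and (not other or sat_pops <= ratio * total_pops)
        d, node = heapq.heappop(sat if take_sat else other)
        total_pops += 1
        sat_pops += take_sat
        if take_sat:
            results.append((d, node))
        for nb in graph[node]:
            if nb not in visited:
                visited.add(nb)
                heapq.heappush(sat if passes(nb) else other, (dist(nb), nb))
    return [n for _, n in heapq.nsmallest(k, results)]

# Toy path graph 0-1-...-9 with point i at position i; the filter keeps even nodes >= 4.
graph = {i: [j for j in (i - 1, i + 1) if 0 <= j < 10] for i in range(10)}
vectors = [(float(i),) for i in range(10)]
passes = lambda i: i >= 4 and i % 2 == 0
result = two_queue_search(graph, vectors, passes, (0.0,), k=2, seeds=[8])
print(result)  # [4, 6]
```

In the toy run, nodes 7 and 5 fail the filter but are still expanded from the second queue, which is what lets the search reach nodes 6 and 4 from the seed at 8.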

3.2 iRangeGraph: Dynamic Range-Constrained Traversal

For RFANN, a search is initialized from the median rank of the range-filtered interval. Edges for each node are selected on-the-fly from the elemental graphs covering the relevant range. The beam search is performed only over nodes within the query range $[a_l, a_r]$, with expansion determined by the segment tree (Xu et al., 2024).

3.3 JAG: Unified Greedy-Search Over Attribute and Filter Distances

JAG applies a greedy beam search over a single graph. At query time, candidate neighbors are ranked lexicographically by a combination of filter distance and vector distance, ensuring traversal is guided toward filter-satisfying regions while maintaining vector similarity. Because edge selection at index construction covers all relevant attribute thresholds, no dead-ends arise for any filter type or sparsity (Xu et al., 10 Feb 2026).
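A greedy beam search with a lexicographic ranking key can be sketched as below. Placing the filter distance before the vector distance is an assumption made for illustration, as are the function names and the toy range filter; the sketch is a generic beam search rather than JAG's published procedure.

```python
import heapq

def range_filter_distance(value, lo, hi):
    """0 inside [lo, hi]; otherwise the gap to the nearest endpoint (assumed definition)."""
    return max(lo - value, 0, value - hi)

def beam_search(graph, key, start, beam_width):
    """Greedy beam search; `key(node)` returns a tuple compared lexicographically."""
    visited = {start}
    frontier = [(key(start), start)]       # min-heap of unexpanded candidates
    beam = [(key(start), start)]           # best `beam_width` nodes seen so far
    while frontier:
        k_node, node = heapq.heappop(frontier)
        if len(beam) >= beam_width and k_node > beam[-1][0]:
            break                          # best unexpanded candidate is worse than the beam
        for nb in graph[node]:
            if nb not in visited:
                visited.add(nb)
                entry = (key(nb), nb)
                heapq.heappush(frontier, entry)
                beam.append(entry)
        beam = sorted(beam)[:beam_width]
    return [n for _, n in beam]

# Toy path graph 0-1-...-9: node i has attribute value i and position i.
graph = {i: [j for j in (i - 1, i + 1) if 0 <= j < 10] for i in range(10)}
query_pos = 0
key = lambda n: (range_filter_distance(n, 6, 8), abs(n - query_pos))
result = beam_search(graph, key, start=0, beam_width=2)
print(result)  # [6, 7]
```

Starting far outside the filter range, the filter-distance component decreases at every hop toward node 6, so the search is pulled into the satisfying region before vector distance decides among the nodes inside it.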

3.4 Enumeration of Durable Patterns

Temporal proximity graphs support queries for durable patterns (triangles, paths, etc.) under a given durability threshold via near-linear time algorithms, using interval trees for lifespan overlap and cover/quadtree structures for proximity (Agarwal et al., 2024). Incremental data structures enable reporting only new patterns as durability thresholds are changed interactively.
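To make the query concrete, the sketch below enumerates durable triangles by brute force. It assumes that two points are adjacent when their distance is at most `radius` and that a triangle is durable when the common overlap of the three lifespans is at least `tau`; these parameter names and the cubic-time scan are illustrative assumptions, whereas the cited work uses interval trees and cover trees to avoid this cost.

```python
from itertools import combinations
import math

def durable_triangles(points, lifespans, radius, tau):
    """Brute-force O(n^3) enumeration of triangles whose shared lifespan is >= tau."""
    out = []
    for a, b, c in combinations(range(len(points)), 3):
        close = all(
            math.dist(points[u], points[v]) <= radius
            for u, v in ((a, b), (a, c), (b, c))
        )
        start = max(lifespans[i][0] for i in (a, b, c))   # latest birth
        end = min(lifespans[i][1] for i in (a, b, c))     # earliest death
        if close and end - start >= tau:
            out.append((a, b, c))
    return out

# Toy data (hypothetical): three nearby points plus one far away, each with [start, end].
points = [(0, 0), (1, 0), (0, 1), (5, 5)]
lifespans = [(0, 10), (2, 8), (4, 12), (0, 20)]

triangles_3 = durable_triangles(points, lifespans, radius=1.5, tau=3)
triangles_5 = durable_triangles(points, lifespans, radius=1.5, tau=5)
print(triangles_3, triangles_5)  # [(0, 1, 2)] []
```

The three nearby points share the interval [4, 8], so the triangle is reported for a threshold of 3 and dropped for a threshold of 5, which mirrors the interactive threshold changes mentioned above.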

4. Complexity, Robustness, and Empirical Evaluation

4.1 Complexity

  • AIRSHIP: Query time is expressed in terms of the seed sample size and the number of visited nodes (Zhao et al., 2022).
  • iRangeGraph: Bounds are given for space and build time, for per-node edge selection, and for the size of the beam-search candidate set in top-$K$ queries (Xu et al., 2024).
  • JAG: Index build time is expressed in terms of the beam size, graph degree, and number of thresholds, with a separate bound on query time (Xu et al., 10 Feb 2026).
  • Durable patterns: Bounds are given for preprocessing and for the cost of pattern enumeration and updates (Agarwal et al., 2024).

4.2 Robustness Across Filter Types

  • JAG achieves recall and throughput that remain robust to filter type, selectivity, and correlation with embedding similarity. Attribute and filter distances ensure the graph is navigable under arbitrary constraints (Xu et al., 10 Feb 2026).
  • iRangeGraph matches "oracle" (pre-materialized) approaches in recall and QPS, but with feasible space and construction cost (Xu et al., 2024).
  • AIRSHIP demonstrates throughput improvements over post-filtered HNSW across the evaluated filter selectivities (Zhao et al., 2022).
  • Durable pattern enumeration scales linearly on large proximity graphs (Agarwal et al., 2024).

Empirical evaluations consistently focus on large-scale benchmarks such as SIFT1M, MNIST, LAION, YFCC10M, and web-scale retrieval contexts (Zhao et al., 2022, Xu et al., 2024, Xu et al., 10 Feb 2026).

5. Optimization Strategies and Parameterization

Key algorithmic strategies and their parameter considerations include:

  • Seed sampling: Size the sample according to the filter's retention rate so that enough sampled points satisfy the filter to serve as seeds (AIRSHIP) (Zhao et al., 2022).
  • Mixing ratio: Adaptive balancing in AIRSHIP for optimal trade-off between cluster exploitation and exploration (Zhao et al., 2022).
  • Segment tree design: Balancing segment size and depth determines iRangeGraph's index size and edge-redundancy (Xu et al., 2024).
  • Joint threshold selection: Multiple attribute thresholds in JAG's construction enable consistent performance across selectivities; 3-4 thresholds empirically suffice (Xu et al., 10 Feb 2026).
  • Beam and degree parameters: Search and build beam sizes and graph degrees define the QPS/recall/space trade-off envelope (Zhao et al., 2022, Xu et al., 2024, Xu et al., 10 Feb 2026).
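The seed-sampling consideration reduces to simple arithmetic under uniform sampling: if a filter retains a fraction of the data, the expected number of satisfying points in a sample is the sample size times that fraction. The helper names and the target seed count below are assumptions for illustration, not values from the AIRSHIP paper.

```python
import math

def sample_size_for_seeds(retention, target_seeds):
    """Smallest uniform sample whose expected number of filter-satisfying points
    reaches `target_seeds`, given the fraction `retention` of points that pass."""
    if not 0 < retention <= 1:
        raise ValueError("retention must be in (0, 1]")
    return math.ceil(target_seeds / retention)

def prob_no_seed(retention, sample_size):
    """Probability that an independent uniform sample contains no satisfying point."""
    return (1 - retention) ** sample_size

size = sample_size_for_seeds(retention=0.25, target_seeds=10)
miss = prob_no_seed(retention=0.5, sample_size=2)
print(size, miss)  # 40 0.25
```

The required sample grows inversely with the retention rate, so very selective filters need either a large sample or an auxiliary structure for locating seeds.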

Recommended settings (for moderate selectivity) cover the graph degree, the seed sample size, a parameter matched to graph neighbor statistics, the degree in iRangeGraph, and the number of thresholds in JAG; specific values are given in the cited works.

6. Extensions and Applications

Proximity graphs with filter support generalize to:

  • Multi-attribute filtered search (e.g., combination of range and categorical attributes), with probabilistic edge sampling and query-guided neighbor selection (Xu et al., 2024, Xu et al., 10 Feb 2026).
  • Temporal networks and mining of resilient or persistent structures, enabled by combining proximity and interval/durability constraints, supporting interactive pattern analytics (Agarwal et al., 2024).
  • Robust integration in retrieval and recommendation systems where filters range from simple tags to arbitrary complex Boolean logic (Zhao et al., 2022, Xu et al., 10 Feb 2026).

A plausible implication is that filter-supporting proximity graphs increasingly serve as the backbone for large-scale vector search systems that must efficiently answer filtered queries at web scale, without precomputing dedicated indices for every possible filter configuration.

7. Comparative Summary

A synthesis of modern approaches:

| Method | Filter Support | Index Structure | Complexity | Recall/QPS Robustness |
| --- | --- | --- | --- | --- |
| AIRSHIP | Arbitrary (user-defined predicate) | Single proximity graph + seeds | see 4.1 | High, scalable |
| iRangeGraph | Numeric range | Segment-tree elemental graphs | see 4.1 | Near-oracle |
| JAG | Arbitrary (label, range, subset, Boolean) | Single graph with attribute distances | see 4.1 | Uniform; outperforms state-of-art |
| Durable Graph | Temporal/durability | Cover tree + interval trees | see 4.1 | Scalable to millions; exact in specific cases |

These architectures collectively establish the state-of-the-art in filter-aware similarity and pattern search, demonstrating empirical and theoretical performance nearly matching filter-specialized or filter-naive oracular baselines, but at practical space and compute budgets. For full constructions, algorithms, and analytical proofs, see the cited works (Zhao et al., 2022, Agarwal et al., 2024, Xu et al., 2024, Xu et al., 10 Feb 2026).
