
Frequency-Dominant Neighborhood Structure (F-DNS)

Updated 26 February 2026
  • F-DNS is a unifying framework that captures frequent local patterns in graphs and images by analyzing frequency characteristics of neighborhoods.
  • It employs BFS-based candidate generation and degree-frequency histograms for graphs, alongside DCT-driven perceptual hashing in images.
  • Experimental results show scalable graph mining, robust invariance in image hashing, and effective node feature extraction for various ML tasks.

Frequency-Dominant Neighborhood Structure (F-DNS) represents a unifying framework for extracting and encoding dominant local patterns in both graph-structured data and images. By focusing on the frequency characteristics of local neighborhoods—whether these are graph-theoretic r-neighborhoods, neighbor-degree histograms, or spatial-frequency domains—F-DNS enables efficient pattern mining, robust feature hashing, and local-to-global inference across heterogeneous data modalities.

1. Formal Definitions and Core Mathematical Structures

Across the literature, F-DNS takes distinct but conceptually related forms:

A. Graph Mining (Single-Graph Setting)

F-DNS is formalized as the set of all frequent r-neighborhood patterns in a single labeled graph $G=(V,E,\ell_V,\ell_E)$, with:

  • An r-neighborhood $G_r[v]$ induced over $N_r(v)=\{u\in V : \mathrm{dist}_G(v,u)\leq r\}$, with edges as in $G$ and a designated pivot $v$.
  • A neighborhood pattern $N$ is matched to $v\in V$ if there exists an injective pivoted subgraph isomorphism $f:V(N)\rightarrow N_r(v)$, preserving vertex and edge labels and mapping the pivot to $v$.
  • The support of $N$ is $\mathrm{support}(N) = |M_G(N)|/|V|$, where $M_G(N) = \{v\in V : N\text{ matches }v\}$ (Han et al., 2013).
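The definitions above can be illustrated with a minimal Python sketch. It assumes an undirected graph stored as adjacency lists; the names `r_neighborhood`, `support`, and `author_with_two_papers`, as well as the toy graph, are hypothetical and not taken from Han et al. The matcher is a hand-written stand-in for one specific star-shaped pattern, not a general pivoted subgraph isomorphism test.

```python
from collections import deque

def r_neighborhood(adj, v, r):
    """Return N_r(v): all vertices within distance r of the pivot v (BFS)."""
    dist = {v: 0}
    queue = deque([v])
    while queue:
        u = queue.popleft()
        if dist[u] == r:          # do not expand beyond radius r
            continue
        for w in adj[u]:
            if w not in dist:
                dist[w] = dist[u] + 1
                queue.append(w)
    return set(dist)

def support(adj, matches):
    """support(N) = |M_G(N)| / |V|, where matches(v) decides whether N matches v."""
    matched = [v for v in adj if matches(v)]
    return len(matched) / len(adj)

# Toy bibliographic graph (assumption): two authors, three papers.
adj = {
    "a1": ["p1", "p2"],
    "a2": ["p3"],
    "p1": ["a1"],
    "p2": ["a1"],
    "p3": ["a2"],
}
label = {"a1": "author", "a2": "author",
         "p1": "paper", "p2": "paper", "p3": "paper"}

def author_with_two_papers(v):
    """Hand-coded match for the pattern 'author pivot with two distinct papers'."""
    if label[v] != "author":
        return False
    papers = [u for u in r_neighborhood(adj, v, 1) if label[u] == "paper"]
    return len(papers) >= 2

print(support(adj, author_with_two_papers))
```

Only `a1` matches here, so the pattern is supported by one of the five vertices. Note that the denominator is all of $V$, as in the definition above, not only the vertices that carry the pivot's label.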

B. Graph Embeddings via Neighbor-Degree Frequency

F-DNS is instantiated as histograms or matrices reflecting the frequencies of neighbor degrees up to a given BFS depth:

  • The (vanilla/minimal/dynamic) NDF vector encodes, for $v\in V$, the counts of immediate neighbors with various degrees, optionally binned by intervals $I_1,\ldots,I_m$ for dynamic graphs.
  • Higher-order structures aggregate these frequencies at increasing BFS radii and may be normalized, forming the NDFC or CDF matrices (Shirbisheh, 2022).
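As a rough illustration of the neighbor-degree frequency idea, the sketch below computes an unbinned and an interval-binned vector for one node. The function names, the indexing convention (entry `i` counts neighbors of degree `i + 1`), and the representation of intervals by their left endpoints are assumptions made for this example; they are not taken from Shirbisheh (2022).

```python
from bisect import bisect_right

def vanilla_ndf(adj, v, max_degree):
    """Entry i counts the neighbors of v whose degree is i + 1."""
    vec = [0] * max_degree
    for u in adj[v]:
        vec[len(adj[u]) - 1] += 1
    return vec

def binned_ndf(adj, v, starts):
    """Count neighbors of v per degree interval.

    `starts` holds the sorted left endpoints of I_1..I_m (assumed layout):
    I_j = [starts[j], starts[j+1]) and the last interval is unbounded.
    """
    vec = [0] * len(starts)
    for u in adj[v]:
        j = bisect_right(starts, len(adj[u])) - 1
        vec[j] += 1
    return vec

# Toy undirected graph: a star on node 0 with a short tail 3-4.
adj = {0: [1, 2, 3], 1: [0], 2: [0], 3: [0, 4], 4: [3]}
max_degree = max(len(nbrs) for nbrs in adj.values())

print(vanilla_ndf(adj, 0, max_degree))   # neighbors of 0 have degrees 1, 1, 2
print(binned_ndf(adj, 3, [1, 2]))        # intervals {1} and [2, inf)
```

Binning keeps the vector length fixed when the maximum degree changes, which is presumably why the interval variant is suggested for dynamic graphs.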

C. Perceptual Hashing in Images

F-DNS constitutes a global feature vector that aggregates local dominant frequency similarity patterns:

  • The image is transformed via the 2D Discrete Cosine Transform, $F(u,v)$.
  • Over each $N\times N$ window in frequency space, the dominant frequency structure is captured by computing, at each central coefficient $(u_0,v_0)$, the Euclidean distance between patches (center and neighbor) of size $M\times M$.
  • Summing these local maps over the frequency domain and aggregating yields the F-DNS hash, a vector of $\mathrm{dim}=(N-1)^2$ (typically, $N=9$, $M=3$) (Biswas et al., 2020).

2. Algorithmic Frameworks and Computational Properties

Frequent Neighborhood Pattern Mining in Graphs

The mining of F-DNS proceeds via an Apriori-style, BFS-based enumeration:

  • Candidate Generation: Start with all small, frequent “building block” patterns—paths pivoted at one end and up to the radius bound.
  • Pattern Joining and Pruning: For each size $k$, candidate patterns are generated by joining pairs of frequent patterns of size $k-1$. Any candidate with an infrequent subpattern is discarded, in accordance with the downward-closure property (DCP): if $N$ is a subpattern of $N'$, then $\mathrm{support}(N')\leq \mathrm{support}(N)$.
  • VID-list Optimization: For each pattern, maintain the list of matching vertices (VID-list) to speed up support computation (by intersecting candidate lists). This yields up to $100\times$ speedup in join-and-verify steps (Han et al., 2013).
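The join, prune, and VID-list steps can be sketched in a deliberately simplified setting. Here a pattern is modeled as a set of atomic local features, so intersecting the parents' VID-lists gives the exact match set; for real neighborhood patterns the intersection only narrows the vertices on which the isomorphism check still has to run. The function name `mine_frequent` and the feature names are hypothetical.

```python
from itertools import combinations

def mine_frequent(vid_lists, num_vertices, min_support):
    """Level-wise mining over feature sets, using VID-list intersection.

    vid_lists maps each atomic feature to the set of vertex ids matching it.
    Returns a dict: frequent pattern (frozenset of features) -> VID-list.
    """
    level = {frozenset([f]): vids for f, vids in vid_lists.items()
             if len(vids) / num_vertices >= min_support}
    frequent = dict(level)
    k = 2
    while level:
        next_level = {}
        for a, b in combinations(level, 2):
            cand = a | b
            if len(cand) != k or cand in next_level:
                continue
            # Downward closure: every (k-1)-subpattern must itself be frequent.
            if any(cand - {f} not in level for f in cand):
                continue
            vids = level[a] & level[b]      # VID-list intersection
            if len(vids) / num_vertices >= min_support:
                next_level[cand] = vids
        frequent.update(next_level)
        level = next_level
        k += 1
    return frequent

# Toy input (assumption): 10 vertices, four atomic features.
vid_lists = {
    "writes_paper": {1, 2, 3, 4},
    "cites_self": {2, 3},
    "has_coauthor": {1, 2, 3},
    "rare": {5},
}
frequent = mine_frequent(vid_lists, num_vertices=10, min_support=0.2)
for pattern, vids in frequent.items():
    print(sorted(pattern), sorted(vids))
```

Because `rare` fails the threshold at size 1, no larger candidate containing it is ever generated, which is the pruning effect the DCP guarantees.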

Local Graph Embedding and Isomorphism Testing

Feature extraction proceeds by BFS to depth $r$ centered at each node $v$:

  • Step 1: Compute the NDF vector as degree-frequency bins over $C_1(v)$.
  • Step 2: For $k=0,\ldots,r$, compute mean neighbor-degree frequencies over the $k$-th BFS “circle” (NDFC) or raw frequencies (CDF).
  • Step 3: Stack these as row vectors to construct node-specific matrices for downstream ML or isomorphism refinement.
  • Complexity: For radius $r$ and average degree $\overline{d}$, work is $O(|V|\cdot r\cdot \overline{d})$; all steps use adjacency lists (no matrix assembly needed) and are highly parallelizable (Shirbisheh, 2022).
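One possible reading of Steps 1 to 3 is sketched below: BFS circles are collected around a node, the neighbor-degree vectors of each circle's members are summed, and the sums are optionally divided by the circle size. The helper names and the exact aggregation rule are assumptions for illustration; the precise NDFC and CDF definitions should be taken from Shirbisheh (2022).

```python
from collections import deque

def bfs_circles(adj, v, r):
    """circles[k] lists the vertices at distance exactly k from v, for k = 0..r."""
    dist = {v: 0}
    circles = [[] for _ in range(r + 1)]
    circles[0].append(v)
    queue = deque([v])
    while queue:
        u = queue.popleft()
        if dist[u] == r:
            continue
        for w in adj[u]:
            if w not in dist:
                dist[w] = dist[u] + 1
                circles[dist[w]].append(w)
                queue.append(w)
    return circles

def ndf(adj, v, max_degree):
    """Entry i counts the neighbors of v whose degree is i + 1."""
    vec = [0] * max_degree
    for u in adj[v]:
        vec[len(adj[u]) - 1] += 1
    return vec

def circle_matrix(adj, v, r, max_degree, normalize=True):
    """Row k aggregates the NDF vectors of circle k (mean if normalize, else sum)."""
    rows = []
    for circle in bfs_circles(adj, v, r):
        row = [0.0] * max_degree
        for u in circle:
            for i, count in enumerate(ndf(adj, u, max_degree)):
                row[i] += count
        if normalize and circle:
            row = [x / len(circle) for x in row]
        rows.append(row)
    return rows

# Toy undirected graph: a star on node 0 with a short tail 3-4.
adj = {0: [1, 2, 3], 1: [0], 2: [0], 3: [0, 4], 4: [3]}
for row in circle_matrix(adj, 0, r=2, max_degree=3, normalize=False):
    print(row)
```

Flattening the returned rows gives a fixed-length feature vector per node that uses only adjacency lists, in line with the locality claim above.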

Image Perceptual Hashing

The F-DNS hash algorithm consists of:

  • Preprocessing: Convert to greyscale, apply Gaussian smoothing.
  • DCT Computation: Compute $F(u,v)$ over the entire preprocessed image.
  • Sliding Window Feature Extraction: For each frequency coefficient, extract central and neighboring $M\times M$ patches; compute pairwise Euclidean distances.
  • Aggregation: Sum all local F-DNS maps to produce a global signature vector (e.g., 64D for $N=9$).
  • Similarity: Pearson correlation of F-DNS hashes is used for matching; classification is template-driven and non-parametric (Biswas et al., 2020).
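The pipeline above can be approximated with a short NumPy sketch. Several details are assumptions rather than the method of Biswas et al.: preprocessing (greyscale conversion, Gaussian smoothing) is omitted, patches are anchored at their top-left corner instead of being centered, and the $(N-1)^2$ neighbor positions are taken to be the offsets $(dy, dx)$ with $1 \le dy, dx \le N-1$. The names `dct2`, `fdns_hash_sketch`, and `pearson` are hypothetical.

```python
import numpy as np
from numpy.lib.stride_tricks import sliding_window_view

def dct2(img):
    """Orthonormal 2D DCT-II computed from explicit basis matrices."""
    def basis(n):
        k = np.arange(n)[:, None]
        x = np.arange(n)[None, :]
        c = np.sqrt(2.0 / n) * np.cos(np.pi * (2 * x + 1) * k / (2 * n))
        c[0, :] = np.sqrt(1.0 / n)
        return c
    h, w = img.shape
    return basis(h) @ img @ basis(w).T

def fdns_hash_sketch(img, N=9, M=3):
    """Return an (N-1)^2 vector of summed patch distances in the DCT domain."""
    F = dct2(np.asarray(img, dtype=float))
    H, W = F.shape
    h, w = H - (N - 1), W - (N - 1)       # region where every offset fits
    feats = []
    for dy in range(1, N):
        for dx in range(1, N):
            diff = F[:h, :w] - F[dy:dy + h, dx:dx + w]
            # Squared differences summed over every MxM patch, then sqrt:
            # the Euclidean distance between a patch and its shifted neighbor.
            patches = sliding_window_view(diff ** 2, (M, M))
            dist = np.sqrt(patches.sum(axis=(-1, -2)))
            feats.append(dist.sum())      # aggregate over the frequency domain
    return np.array(feats)

def pearson(a, b):
    """Pearson correlation between two hash vectors."""
    return float(np.corrcoef(a, b)[0, 1])

rng = np.random.default_rng(0)
image = rng.random((32, 32))              # stand-in for a preprocessed image
hash_a = fdns_hash_sketch(image)
hash_b = fdns_hash_sketch(2.0 * image)    # same content, doubled contrast
print(hash_a.shape, pearson(hash_a, hash_b))
```

Because every step is homogeneous in the pixel values, doubling the contrast doubles each hash entry, and Pearson correlation ignores such a scaling. Template-driven classification would then assign a query image to the template whose hash correlates most strongly with it.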

3. Semantic and Theoretical Significance

A. Local-to-Global Inference in Graphs

  • F-DNS captures the “local topology” around graph vertices, summarizing how many vertices share a particular labeled, topological pattern (e.g., “authors with at least two papers,” “self-citation cycles”).
  • In single-graph settings, counting the frequency/proportion of pivots matching a local pattern provides a richer, more informative support measure than the traditional “exists/does not exist” used in subgraph mining (Han et al., 2013).

B. Isomorphism and Centrality

  • The multilevel degree-frequency histograms underlying F-DNS can distinguish many pairs of non-isomorphic graphs, in some cases where 1-WL color refinement fails (Shirbisheh, 2022).
  • Parametric centrality families derived from BFS exploration, aggregating “circle sizes” $s_k(v)$ with $p$-exponential weights, yield features closely tracking classic measures like closeness and PageRank.
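To make the circle-size aggregation concrete, the sketch below uses one plausible weighting, $c_p(v)=\sum_{k=1}^{r} p^{k}\, s_k(v)$ with $0<p<1$. This specific formula is an assumption chosen for illustration and may differ from the family defined by Shirbisheh (2022); `circle_sizes` and `p_centrality` are hypothetical names.

```python
from collections import deque

def circle_sizes(adj, v, r):
    """sizes[k] = s_k(v), the number of vertices at distance exactly k from v."""
    dist = {v: 0}
    sizes = [0] * (r + 1)
    sizes[0] = 1
    queue = deque([v])
    while queue:
        u = queue.popleft()
        if dist[u] == r:
            continue
        for w in adj[u]:
            if w not in dist:
                dist[w] = dist[u] + 1
                sizes[dist[w]] += 1
                queue.append(w)
    return sizes

def p_centrality(adj, v, r, p):
    """Assumed form: sum over k >= 1 of p**k * s_k(v)."""
    sizes = circle_sizes(adj, v, r)
    return sum(p ** k * s for k, s in enumerate(sizes) if k >= 1)

# Path graph 0-1-2-3-4: the middle node should score highest.
adj = {0: [1], 1: [0, 2], 2: [1, 3], 3: [2, 4], 4: [3]}
for v in adj:
    print(v, p_centrality(adj, v, r=2, p=0.5))
```

With $p<1$, nearby vertices contribute more than distant ones, so the score behaves like a truncated, closeness-style measure that needs only a depth-$r$ BFS.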

C. Perceptual Robustness in Images

  • F-DNS hashes provide invariance to content-preserving transforms, especially geometric transformations (rotation, scaling) and various noise operations.
  • The DCT basis allows for a compact separation of informative (high-energy) and less informative (low-energy) spatial components, enabling robust recognition, even across significant distortions (Biswas et al., 2020).

4. Experimental Evidence and Quantitative Results

Graph Mining (Han et al., 2013)

  • On EntityCube ($|V|\approx 4.7$M, $|\Sigma_V|=288$, $|\Sigma_E|=207$), F-DNS mining scales efficiently using minimum support thresholds as low as 0.0001.
  • VID optimizations yield over an order-of-magnitude candidate reduction and 80% reduction in per-candidate verification time.
  • On ArnetMiner, size-4 neighborhood patterns are mined in under a minute, finding $\sim$1,000 significant patterns for author pivots.
  • Patterns include “author writes ≥2 papers” (support $\approx$31.4% of authors), “conference accepts ≥2 papers from same author” ($\approx$25.4%), and cyclic/co-authorship motifs (up to $\approx$10% of all patterns).

Graph Embeddings (Shirbisheh, 2022)

  • Flattened NDFC matrices input to shallow feed-forward neural nets achieve 90–98% accuracy in predicting PageRank and closeness centrality, with accuracy maintained under random edge perturbations and on unseen graphs.
  • No global matrix factorization or linear solve is required; models are lightweight (4–6 layers, minutes of training).

Image Perceptual Hashing (Biswas et al., 2020)

  • On standard image and web page screenshot datasets, F-DNS achieves Pearson correlation $\rho>0.98$ under all perturbations except rotation ($\rho=0.9365$), outperforming RP-IVD ($\rho=0.2959$ under rotation).
  • On DUSI-2K (2,500 Tor screenshots, 16 categories), F-DNS hashing with a template-based classifier achieves 98.75% accuracy, exceeding RP-IVD (95.84%) and Inception-ResNet-v2 (85.19%).

5. Strengths, Limitations, and Extensions

Strengths

  • F-DNS encodes local structural regularities that are highly informative for tasks ranging from graph pattern mining and node embedding to robust perceptual hashing.
  • The support measure in F-DNS preserves the DCP, enabling efficient candidate pruning and scalable algorithms.
  • In perceptual hashing, the resulting features maintain high discrimination with low dimensionality (e.g., 64 floats) and enable non-parametric classification without extensive training (Biswas et al., 2020).
  • NDF-based embeddings provide transferable, inductive node features suitable for dynamic and evolving graphs, requiring only local exploration (Shirbisheh, 2022).

Limitations

  • In hashing, real-valued F-DNS descriptors necessitate floating-point storage and matching; binary quantization (not attempted by Biswas et al., 2020) could offer further compactness and speed.
  • For large images, the $O(PQN^2M^2)$ sliding-window computation, while linear in $PQ$, can be computationally intensive; multi-resolution analysis or keypoint prioritization could address this (Biswas et al., 2020).
  • In graph mining, worst-case exponential isomorphism checks may be required but are heavily mitigated by locality and pruning (Han et al., 2013).

Potential Extensions

  • Binarization and use of locality-sensitive hashing for rapid nearest neighbor search or database indexing in perceptual applications.
  • Substitution of the DCT with other frequency decompositions (e.g., wavelets) for domains where local stationarity does not hold (Biswas et al., 2020).
  • Restricting F-DNS computation to salient keypoints or high-distinctiveness regions in images to improve computational efficiency.

6. Applications and Broader Impact

F-DNS underpins a variety of practical and theoretical advancements:

  • Graph Pattern Discovery: Enables the mining of frequent, interpretable motifs in knowledge graphs and citation networks (e.g., self-citation cycles, author-venue reuse) with direct semantic interpretation (Han et al., 2013).
  • Graph Isomorphism and Node Feature Learning: Supplies a suite of local descriptors for isomorphism testing and accurate regression/classification of node-level graph-theoretic quantities using simple ML models (Shirbisheh, 2022).
  • Image Similarity and Classification: Provides a robust, template-driven mechanism for classification of web screenshots, including obfuscated or variably rendered Tor domains, with state-of-the-art invariance to content-preserving edits (Biswas et al., 2020).

A plausible implication is that the local, frequency-dominant perspective—whether via BFS-driven neighborhood statistics or frequency-domain analysis—captures the essence of recurring structure across disparate data types, rendering F-DNS a foundational concept for feature extraction, pattern recognition, and efficient large-scale data mining.
