Resonant Sparse Geometry Networks (RSGN)
- Resonant Sparse Geometry Networks (RSGN) are brain-inspired models defined by dynamically sparse connectivity and hyperbolic embeddings that capture hierarchical structure.
- They operate on two adaptation timescales: fast, gradient-driven activation propagation and slow, Hebbian-like plasticity that optimizes connectivity.
- RSGN achieves competitive performance with significantly fewer parameters than dense models, demonstrating efficiency on hierarchical and long-range dependency tasks.
Resonant Sparse Geometry Networks (RSGN) are a brain-inspired neural network architecture characterized by dynamically sparse, self-organizing connectivity, with computational nodes embedded in learned hyperbolic space. RSGN introduces two distinct timescales of adaptation: a fast, differentiable propagation of neural activations optimized via gradient descent, and a slow, local correlation-driven plasticity rule for adapting the network's connectivity. Connection strengths are a function of learned affinity, geodesic distance in hyperbolic space, and a hierarchical level bias. This approach yields input-dependent network graphs with competitive accuracy while offering significantly improved parameter efficiency compared to dense-attention models such as Transformers. RSGN’s design and efficacy are detailed in (Hays, 26 Jan 2026).
1. Structural Foundations and Dynamic Graph Construction
RSGN instantiates $N$ computational nodes with positions in a $d$-dimensional Poincaré ball $\mathbb{B}^d$. Each node $i$ maintains a fast-changing activation state alongside slowly evolving parameters: a spatial position $p_i$, an activation threshold $\theta_i$, and a hierarchical level $\ell_i$. Upon receiving an input, only a sparse subset (approximately 1–2%) of nodes is "ignited" via input-specific embedding and soft matching in hyperbolic space.
Across discrete propagation steps, the network operates as follows:
- Messages are passed along dynamically constructed sparse edges, selected according to geodesic proximity, affinity, and hierarchical level bias.
- Node activations are updated using a smooth, differentiable threshold function.
- Local inhibition normalizes activations within each node's radius-$r$ hyperbolic neighborhood, supporting competitive interaction and preventing over-activation.
- After $T$ steps, active node states are aggregated by a learned readout for output computation (a minimal sketch of this loop follows the list).
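A minimal NumPy sketch of this forward loop under stated assumptions: dense arrays are used for clarity, the ignition and inhibition steps are crude stand-ins for the paper's soft matching and local inhibition, and names such as `ignite_frac` are illustrative rather than from the source.

```python
import numpy as np

def rsgn_forward(match_scores, W, thresh, T=5, tau=0.5, ignite_frac=0.02):
    """Illustrative RSGN forward pass: ignition -> T propagation steps -> pooling.

    match_scores : (N,) input-specific soft-matching scores per node (assumed given)
    W            : (N, N) connection weights w_ij, zero where no edge exists
    thresh       : (N,) per-node activation thresholds theta_i
    """
    N = W.shape[0]
    # Ignition: activate only the top ~1-2% of nodes by match score.
    k = max(1, int(ignite_frac * N))
    a = np.zeros(N)
    a[np.argsort(match_scores)[-k:]] = 1.0

    for _ in range(T):
        m = W @ a                                      # message passing on sparse edges
        a = 1.0 / (1.0 + np.exp(-(m - thresh) / tau))  # differentiable soft threshold
        a = a / (1e-8 + a.max())                       # crude global stand-in for inhibition

    return a.mean()  # pooled state, to be fed to a learned readout
```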
Slow-timescale adaptation directly modulates the graph structure through Hebbian-like plasticity, updating connection affinities and thresholds, with occasional structural reconfiguration (pruning and growth of edges) in a reward-modulated manner.
2. Hyperbolic Embedding, Connection Weights, and Hierarchical Organization
Nodes are embedded in the Poincaré ball $\mathbb{B}^d$, whose hyperbolic metric tensor scales distances such that the geodesic distance between nodes $i$ and $j$ is

$$d_{\mathbb{B}}(p_i, p_j) = \operatorname{arcosh}\!\left(1 + \frac{2\,\lVert p_i - p_j\rVert^2}{\left(1 - \lVert p_i\rVert^2\right)\left(1 - \lVert p_j\rVert^2\right)}\right).$$

This embedding ensures that tree-structured or hierarchical regimes map with low distortion: leaves lie near the boundary, ancestors near the origin. Connection weights between nodes are defined as

$$w_{ij} = \sigma\!\left(u_i^{\top} v_j\right)\,\exp\!\left(-\lambda\, d_{\mathbb{B}}(p_i, p_j)\right)\, b(\ell_i, \ell_j).$$

Here, $\sigma$ is the sigmoid, $u_i, v_j$ are learned low-rank embeddings, $\lambda$ is a learned distance-decay parameter, and $b(\ell_i, \ell_j)$ promotes connections that align with the hierarchy. This construction yields an effective receptive field size that adapts to input and local geometry, supporting efficient, context-sensitive computation.
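A small sketch of this weight construction, assuming the standard Poincaré-ball distance formula above; the specific Gaussian form of the level bias `b` is an illustrative assumption, since the source states only that it favors hierarchy-aligned connections.

```python
import numpy as np

def poincare_dist(p, q, eps=1e-7):
    """Geodesic distance between points p, q inside the unit Poincare ball."""
    sq = np.sum((p - q) ** 2)
    denom = (1.0 - np.sum(p ** 2)) * (1.0 - np.sum(q ** 2))
    return np.arccosh(1.0 + 2.0 * sq / (denom + eps))

def connection_weight(u_i, v_j, p_i, p_j, lvl_i, lvl_j, lam=1.0, lvl_sigma=1.0):
    """w_ij = sigmoid(u_i . v_j) * exp(-lam * d_B(p_i, p_j)) * b(lvl_i, lvl_j)."""
    affinity = 1.0 / (1.0 + np.exp(-np.dot(u_i, v_j)))             # learned low-rank affinity
    decay = np.exp(-lam * poincare_dist(p_i, p_j))                 # geodesic distance decay
    bias = np.exp(-((lvl_i - lvl_j) ** 2) / (2 * lvl_sigma ** 2))  # assumed level bias
    return affinity * decay * bias
```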
3. Learning Dynamics: Fast Propagation and Structural Plasticity
Fast Timescale (Differentiable Activation Propagation)
At each time step $t$:
- Message passing: Active neighbors contribute to node $i$'s pre-activation state $m_i^{t} = \sum_{j \in \mathcal{N}(i)} w_{ij}\, a_j^{t}$ via $w_{ij}$-weighted aggregation.
- Activations are updated by a differentiable soft threshold:

  $$a_i^{t+1} = \operatorname{sigmoid}\!\left(\frac{\beta\, m_i^{t} - \theta_i}{\tau}\right)$$

  where $\beta$ scales the contribution from incoming messages and $\tau$ is a fixed temperature.
- State update: Incorporates LayerNorm and residual connections, ensuring stability and normalization.
- Local inhibition further regularizes activity within each node’s local hyperbolic radius.
Output is read via pooling over all node activations after $T$ steps, applying a learned function to the pooled state.
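A per-step sketch of this update with radius-based inhibition; the precomputed neighbor lists and the divisive normalization are assumptions standing in for the paper's exact inhibition rule, and the LayerNorm/residual terms are omitted for brevity.

```python
import numpy as np

def propagate_step(a, W, thresh, neighbors, beta=1.0, tau=0.5):
    """One fast-timescale step: messages, soft threshold, local inhibition.

    neighbors : list of index arrays; neighbors[i] holds the indices within
                hyperbolic radius r of node i, including i itself
    """
    m = W @ a                                                 # w_ij-weighted aggregation
    a_new = 1.0 / (1.0 + np.exp(-(beta * m - thresh) / tau))  # soft threshold
    out = np.empty_like(a_new)
    for i, nbrs in enumerate(neighbors):
        # Divisive normalization within the radius-r neighborhood.
        out[i] = a_new[i] / (1e-8 + a_new[nbrs].sum())
    return out
```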
Slow Timescale (Hebbian Structural Adaptation)
After each batch, slow variables adapt via correlation-driven rules:
- Affinity update: $A_{ij}$ is incremented in proportion to the product of average activations and task reward, i.e., $\Delta A_{ij} = \eta_A\, R\, \bar{a}_i\, \bar{a}_j$.
- Threshold homeostasis: Each $\theta_i$ shifts so as to maintain a target mean activation $a^{\ast}$, i.e., $\theta_i \leftarrow \theta_i + \eta_\theta\left(\bar{a}_i - a^{\ast}\right)$.
- Structural pruning/sprouting: Edges whose affinities fall below a significance threshold are pruned; new edges may be grown between highly correlated but currently unconnected nodes.
The slow drift of positions and hierarchical levels analogously organizes the geometry to reflect task structure over time.
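A sketch of these slow updates with illustrative learning rates and a scalar batch reward; the sprouting step is only indicated, and none of the rate constants come from the source.

```python
import numpy as np

def slow_update(A, thresh, a_bar, reward, eta_a=0.01, eta_t=0.01,
                target=0.05, prune_eps=1e-3):
    """Reward-modulated Hebbian adaptation applied after each batch.

    A      : (N, N) affinities underlying the connection weights
    thresh : (N,) activation thresholds theta_i
    a_bar  : (N,) batch-averaged activations
    reward : scalar task reward R
    """
    A = A + eta_a * reward * np.outer(a_bar, a_bar)  # Delta A_ij = eta * R * a_i * a_j
    thresh = thresh + eta_t * (a_bar - target)       # homeostasis toward target activity
    A[np.abs(A) < prune_eps] = 0.0                   # prune insignificant edges
    # Sprouting: new edges between highly correlated, unconnected nodes (omitted).
    return A, thresh
```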
4. Computational Complexity and Efficiency
RSGN achieves favorable computational efficiency relative to dense-attention architectures:
- With $k$ the average number of active nodes and $\bar{m}$ the average local neighbor count, the dominant per-step cost is $O(k\,\bar{m})$.
- Empirically, $k \ll N$ and $\bar{m} \ll N$ render a complete forward pass $O(T\,k\,\bar{m})$, i.e., effectively linear in the number of nodes $N$.
- In contrast, Transformer self-attention over $n$ tokens incurs $O(n^2)$ cost.
- The result is input-dependent, sub-quadratic memory and computation scaling, as the short calculation after this list illustrates.
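A back-of-the-envelope comparison under assumed sizes; all numbers are illustrative, and the RSGN count ignores feature width for simplicity.

```python
# Illustrative operation counts; none of these sizes come from the source.
N, T = 2000, 5              # nodes, propagation steps
k = int(0.02 * N)           # ~2% of nodes ignited
m_bar = 8                   # average local neighbor count
rsgn_ops = T * k * m_bar    # O(T k m_bar) forward pass

n, d_model = 128, 64        # tokens and width for a Transformer layer
attn_ops = n * n * d_model  # O(n^2 d) self-attention

print(f"RSGN ~{rsgn_ops:,} ops vs. attention ~{attn_ops:,} ops")
# RSGN ~1,600 ops vs. attention ~1,048,576 ops
```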
The following table summarizes key architectural points:
| Mechanism | Property | Scalability |
|---|---|---|
| Sparse hyperbolic routing | Dynamic, input-driven locality | $O(k\,\bar{m})$ per step |
| Two-timescale adaptation | Plastic connectivity and weights | Linear in $N$ |
| Local inhibition | Prevents over-activation | $O(k\,\bar{m})$ per step |
5. Experimental Benchmarks and Ablation Studies
Benchmark Tasks and Baselines
RSGN has been evaluated on:
- Hierarchical sequence classification (20-class; input sequences with multi-scale structure and noise, random baseline 5%)
- Long-range dependency classification (10-class; key signals at beginning/end of 128-token sequences with 112 distractor tokens, random baseline 10%)
Baselines include MLP, 2-layer bidirectional LSTM, standard 2-layer Transformer, and fixed-pattern Sparse Transformer.
Results Summary
| Model | Hierarchical Accuracy (%) | Params | Long-Range Acc (%) | Params |
|---|---|---|---|---|
| Transformer | 30.1 ± 0.2 | 403,348 | 100.0 ± 0.0 | 600,330 |
| RSGN (+Hebb) | 23.8 ± 0.2 | 41,672 | 96.5 ± 0.5 | 40,382 |
| RSGN (no Hebb) | 23.8 ± 0.1 | 41,672 | 96.1 ± 0.2 | 40,382 |
| LSTM | 18.1 ± 0.4 | 566,292 | 100.0 ± 0.0 | 563,722 |
| MLP | 16.0 ± 0.8 | 281,364 | — | — |
| Sparse Transformer | 15.9 ± 0.2 | 403,348 | — | — |
RSGN achieves 79% of the Transformer's accuracy in hierarchical classification using approximately 10× fewer parameters, and 96.5% of Transformer performance on the long-range task with approximately 15× fewer parameters. Ablations demonstrate robustness across variation in node count and propagation steps. Removing Hebbian adaptation leaves hierarchical accuracy unchanged (23.8% in both settings) and reduces long-range accuracy by roughly 0.4 points (96.5% vs. 96.1%).
6. Advantages, Limitations, and Potential Extensions
Advantages
- Parameter Efficiency: Competitive performance relative to dense-attention baselines with 10–15× fewer parameters.
- Input-Dependent Routing: Adaptive, context-dependent sparsity rather than fixed dense connectivity.
- Hierarchical Representation: Hyperbolic embedding enabling direct encoding of multi-scale and hierarchical structure.
- Two-Timescale Learning: Combination of fast, end-to-end gradient descent and slow, local, reward-modulated plasticity permits continual structural adaptation.
Limitations
- Absolute task accuracy is lower than that of the best-performing Transformer baselines on these benchmarks.
- Current hardware (e.g., GPUs) is not optimized for sparse, asynchronous computation, which may limit realized speedup in practice; neuromorphic hardware may better align with RSGN's computational model.
- Scalability to very large models and standard NLP or vision tasks has not been demonstrated.
- Careful tuning of both fast and slow learning rates is required.
Prospective Research Directions
- Hybrid architectures blending RSGN's dynamic sparse routing with attention modules.
- Continual and online learning scenarios exploiting RSGN's structural plasticity.
- Multimodal embeddings with distinct hyperbolic submanifolds.
- Efficient inference on neuromorphic or event-based platforms.
- Interfaces with biological or brain–computer interface systems leveraging sparse coding and reward-modulated adaptation.
7. Concluding Synthesis
Resonant Sparse Geometry Networks instantiate a biologically inspired computational paradigm that integrates sparse, geometry-driven connectivity, local inhibitory dynamics, and two-timescale adaptation. RSGN achieves sub-quadratic computational and memory complexity while flexibly adapting its computational graph to each input. Experimental results indicate strong parameter efficiency, interpretable multi-scale representations, and task-dependent adaptability. These findings suggest that sparse, hierarchical, and dynamically plastic architectures may represent a promising avenue for the development of efficient, biologically plausible neural models (Hays, 26 Jan 2026).