Learning Structural-Functional Brain Representations through Multi-Scale Adaptive Graph Attention for Cognitive Insight

Published 31 Mar 2026 in cs.CV | (2603.29967v1)

Abstract: Understanding how brain structure and function interact is key to explaining intelligence, yet modeling them jointly is challenging because the structural and functional connectomes capture complementary aspects of organization. We introduce Multi-scale Adaptive Graph Network (MAGNet), a Transformer-style graph neural network framework that adaptively learns structure-function interactions. MAGNet leverages source-based morphometry from structural MRI to extract inter-regional morphological features and fuses them with functional network connectivity from resting-state fMRI. A hybrid graph integrates direct and indirect pathways, while local-global attention refines connectivity importance and a joint loss simultaneously enforces cross-modal coherence and optimizes the prediction objective end-to-end. On the ABCD dataset, MAGNet outperformed relevant baselines, demonstrating effective multimodal integration for advancing our understanding of cognitive function.


Authors (5)

Summary

The paper introduces MAGNet, a Transformer-style graph neural network that adaptively learns direct, cross-modal, and indirect brain interactions for improved cognitive prediction.
It employs a hybrid brain graph construction using structural, functional, and multi-scale detour edges to capture complex connectivity patterns.
Experimental results on the ABCD dataset demonstrate that MAGNet outperforms state-of-the-art models, reducing errors and enhancing interpretability of neurobiological associations.

Multi-Scale Adaptive Graph Attention for Structural–Functional Brain Representation Learning

Introduction

The complexity of the human brain arises from the interplay between structural connectivity (its anatomical pathways) and functional connectivity, defined by temporally correlated neural activity. Accurately modeling these multi-modal connectomes is essential for explaining individual differences in cognitive abilities, particularly intelligence, which spans fluid, crystallized, and general domains. "Learning Structural-Functional Brain Representations through Multi-Scale Adaptive Graph Attention for Cognitive Insight" (2603.29967) addresses gaps in the modeling of structure–function relationships by introducing MAGNet, a Transformer-style graph neural network that adaptively learns direct and indirect cross-modal connectome interactions and aligns structural and functional information for improved cognitive prediction and interpretability.

Hybrid Brain Graph Construction

Traditional approaches often utilize either static structural adjacency or segregated late-fusion pipelines, limiting their ability to capture the complex hierarchy of cross-modal dependencies and indirect pathways. MAGNet builds a hybrid brain graph integrating four classes of connections: unimodal direct (structural SBM-based, functional FNC-based), direct cross-modal (CMC), and multi-scale indirect detour (MDC) edges.

Unimodal edge construction retains the top-$k$ connections, enforcing relevance and sparsity in the SBM- and FNC-derived graphs, while CMC edges use cosine similarity across connectivity profiles, anchoring the graph in inter-modal structure–function coupling. MDC edges use a multi-radius search (from short-range to long-range) across the structural graph, driven by an enhanced depth-first search, to capture indirect anatomical detours between functionally connected regions. These detours reflect the hierarchical structural integration that supports cognition.
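The top-$k$ sparsification and cosine-similarity cross-modal edges described above could look roughly like the following NumPy sketch. The function names, the use of absolute weight for ranking, the symmetrization step, and the `threshold` parameter are illustrative assumptions, not details taken from the paper.

```python
import numpy as np

def top_k_edges(conn, k):
    """Keep, for each region, its k strongest connections (assumes k < n)."""
    n = conn.shape[0]
    strength = np.abs(conn).astype(float)        # rank by absolute weight (assumption)
    np.fill_diagonal(strength, -np.inf)          # never select self-loops
    nearest = np.argsort(-strength, axis=1)[:, :k]
    keep = np.zeros((n, n), dtype=bool)
    keep[np.repeat(np.arange(n), k), nearest.ravel()] = True
    keep |= keep.T                               # symmetrize: undirected graph
    return np.where(keep, conn, 0.0)

def cross_modal_edges(sbm, fnc, threshold=0.5):
    """Cosine similarity between structural and functional connectivity profiles."""
    s = sbm / (np.linalg.norm(sbm, axis=1, keepdims=True) + 1e-12)
    f = fnc / (np.linalg.norm(fnc, axis=1, keepdims=True) + 1e-12)
    sim = s @ f.T      # sim[i, j]: structural profile of i vs functional profile of j
    return np.where(sim >= threshold, sim, 0.0)  # threshold is a hypothetical cutoff
```

Because of the symmetrization, a region can end up with more than `k` neighbors when another region selects it; a stricter variant would keep only mutual selections.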

Figure 1: Schematic overview of hybrid brain graph construction and MAGNet architecture incorporating multi-scale direct and indirect structural-functional edges.
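The multi-radius detour search behind the MDC edges can be illustrated with a plain depth-limited depth-first search. The summary does not say what the paper's "enhanced" DFS adds, so the sketch below substitutes the basic algorithm. The adjacency-dictionary input, the radii `(2, 3, 4)`, and both function names are assumptions made for illustration.

```python
def find_detour(adj, src, dst, max_hops):
    """Depth-limited DFS for a simple path src -> dst using 2..max_hops edges."""
    stack = [(src, [src])]
    while stack:
        node, path = stack.pop()
        for nxt in adj.get(node, ()):
            if nxt == dst:
                if len(path) >= 2:       # a detour needs at least one intermediate region
                    return path + [dst]
                continue                 # skip the direct src-dst edge
            if nxt not in path and len(path) < max_hops:
                stack.append((nxt, path + [nxt]))
    return None

def detour_edges(struct_adj, functional_pairs, radii=(2, 3, 4)):
    """For each functionally connected pair, record the smallest radius with a detour."""
    edges = {}
    for u, v in functional_pairs:
        for r in sorted(radii):          # short-range first, then longer ranges
            path = find_detour(struct_adj, u, v, r)
            if path is not None:
                edges[(u, v)] = {"radius": r, "path": path}
                break
    return edges

# Toy structural chain 0 - 1 - 2 - 3 with three functionally connected pairs.
chain = {0: [1], 1: [0, 2], 2: [1, 3], 3: [2]}
found = detour_edges(chain, [(0, 2), (0, 3), (0, 1)])
```

In this toy example the pair (0, 2) is bridged at radius 2 and (0, 3) at radius 3, while (0, 1) gets no detour edge because its only structural route is the direct connection.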

MAGNet Architecture

Local Edge-Aware Attention

MAGNet applies edge-aware message passing: attention coefficients are computed not from node features alone but jointly with edge attributes that encode connection type and origin (direct/indirect, structural/functional/cross-modal). Aggregation therefore becomes specific to modality and interaction type, preserving biologically meaningful distinctions that the fixed, uniform kernels of standard GNNs discard.
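One way to condition attention on edge attributes is a GAT-style score that concatenates both endpoint features with a learned embedding of the edge type. The single-head NumPy sketch below assumes that form; the parameter names (`W`, `E`, `a`), the LeakyReLU slope, and the choice to add the edge embedding to each message are illustrative and may differ from MAGNet's actual layer.

```python
import numpy as np

def edge_aware_attention(h, edge_index, edge_type, W, E, a):
    """One attention head whose scores depend on both endpoints and the edge type.

    h: (n, d_in) node features; edge_index: (m, 2) rows of (source, target);
    edge_type: (m,) integer labels; W: (d_in, d); E: (n_types, d); a: (3 * d,).
    """
    z = h @ W                                     # projected node features
    src, dst = edge_index[:, 0], edge_index[:, 1]
    e = E[edge_type]                              # one embedding per edge
    raw = np.concatenate([z[dst], z[src], e], axis=1) @ a
    raw = np.where(raw > 0, raw, 0.2 * raw)       # LeakyReLU with slope 0.2
    # Softmax over the incoming edges of each target node.
    scores = np.exp(raw - raw.max())
    denom = np.zeros(h.shape[0])
    np.add.at(denom, dst, scores)
    alpha = scores / denom[dst]
    # Aggregate messages; each message carries its edge embedding.
    out = np.zeros_like(z)
    np.add.at(out, dst, alpha[:, None] * (z[src] + e))
    return out, alpha
```

With edge types such as 0 = SBM, 1 = FNC, 2 = CMC, and 3 = MDC (a hypothetical encoding), two edges between the same pair of regions can receive different attention purely because of their type.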

Global Self-Attention

Following local refinement, a stack of multi-head self-attention layers models long-range dependencies between regions that need not be adjacent in the graph, a property essential for reconstructing distributed functional network architectures. This global stage is what makes MAGNet a Transformer-style GNN capable of graph-wide feature synthesis. Node embeddings are then globally pooled and passed to a dense prediction head.
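The global stage can be pictured as standard scaled dot-product multi-head self-attention over all region embeddings, followed by pooling and a linear head. The sketch below assumes a residual connection, mean pooling, and a single linear output layer; the summary only says "globally pooled" and "dense prediction head", so these specifics are illustrative.

```python
import numpy as np

def softmax(x, axis=-1):
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)

def multi_head_self_attention(x, Wq, Wk, Wv, Wo, n_heads):
    """Every region attends to every other region, regardless of graph distance."""
    n, d = x.shape
    dh = d // n_heads                             # assumes d is divisible by n_heads
    # Split each projection into heads: (n_heads, n, dh).
    q = (x @ Wq).reshape(n, n_heads, dh).transpose(1, 0, 2)
    k = (x @ Wk).reshape(n, n_heads, dh).transpose(1, 0, 2)
    v = (x @ Wv).reshape(n, n_heads, dh).transpose(1, 0, 2)
    attn = softmax(q @ k.transpose(0, 2, 1) / np.sqrt(dh))   # (n_heads, n, n)
    heads = (attn @ v).transpose(1, 0, 2).reshape(n, d)      # concatenate heads
    return x + heads @ Wo, attn                   # residual connection (assumption)

def predict_score(x, w_out, b_out):
    """Mean-pool node embeddings into one graph vector, then apply a linear head."""
    return float(x.mean(axis=0) @ w_out + b_out)
```

Unlike the local layer, nothing here depends on the edge list, which is what lets information flow between regions that share no edge in the hybrid graph.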

Joint Loss Formulation

A dual-term loss balances task-driven supervision (intelligence regression, measured by MSE) with explicit enforcement of structure-function consistency. The consistency term penalizes deviations between the observed FNC and the FNC predicted from the refined node embeddings, so that the learned representations capture neurobiological structure–function coupling rather than only maximizing downstream performance.
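A minimal sketch of such a two-term objective follows. It assumes the predicted FNC is decoded as the cosine similarity between node embeddings and that the two terms are combined with a weight `lam`; neither the decoder nor the weighting is specified in this summary, so both are placeholders.

```python
import numpy as np

def joint_loss(y_pred, y_true, node_emb, fnc_obs, lam=0.1):
    """Task MSE plus a structure-function consistency penalty.

    node_emb: (n, d) refined node embeddings; fnc_obs: (n, n) observed FNC.
    """
    task = np.mean((y_pred - y_true) ** 2)

    # Hypothetical decoder: predicted FNC = cosine similarity of embeddings.
    z = node_emb / (np.linalg.norm(node_emb, axis=1, keepdims=True) + 1e-12)
    fnc_pred = z @ z.T

    # Compare off-diagonal entries only; self-connectivity is uninformative.
    off_diag = ~np.eye(fnc_obs.shape[0], dtype=bool)
    consistency = np.mean((fnc_pred[off_diag] - fnc_obs[off_diag]) ** 2)

    return task + lam * consistency, task, consistency
```

Setting `lam=0` recovers pure regression, which corresponds to the ablation that removes the consistency term.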

Experimental Results

Dataset and Preprocessing

Evaluation leverages the ABCD baseline dataset, comprising sMRI and rs-fMRI from over 7,600 subjects, with per-subject intelligence assessments. ICA-based pipelines extract 53 matched intrinsic brain components in both modalities, further mapped to compact subject-level SBM and FNC matrices.

Comparison to State-of-the-Art

MAGNet outperforms a range of GNN and multimodal integration baselines, including GAT, GT, SFDN, SFIN, Joint GCN, and BrainNN (2603.29967). Across fluid, crystallized, and total intelligence scores, it yields the lowest MSE and MAE and the highest Pearson correlation. Notably, including MDC and CMC edges yields measurable correlation improvements (up to 0.04 over the prior state of the art on fluid intelligence), supporting the claim that indirect structural pathways and explicit cross-modal alignment matter for predictive modeling in cognitive neuroscience.
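For reference, the three reported metrics can be computed from held-out predictions as in the sketch below. The function name and the toy values are illustrative only and are not results from the paper.

```python
import numpy as np

def regression_metrics(y_true, y_pred):
    """MSE, MAE, and Pearson correlation between true and predicted scores."""
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    err = y_pred - y_true
    return {
        "mse": float(np.mean(err ** 2)),
        "mae": float(np.mean(np.abs(err))),
        "pearson_r": float(np.corrcoef(y_true, y_pred)[0, 1]),
    }

# Toy check: a constant offset leaves the correlation perfect but not the errors.
metrics = regression_metrics([1, 2, 3, 4], [2, 3, 4, 5])
```

The toy case shows why the metrics are reported together: correlation ignores a constant bias that MSE and MAE penalize.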

Ablation Studies

Ablations support MAGNet's architectural choices. Excluding MDC or CMC connections degrades performance substantially, with correlation dropping by 0.10 (MDC) and 0.06 (CMC) across cognitive outcomes. Removing the structure-function consistency term produces pronounced increases in the error metrics and a reduction in correlation of up to 0.12, underscoring the importance of both multimodal fusion and regularization.

Figure 2: Performance from ablation studies quantifying the contribution of MDC, CMC, and consistency loss to model predictive accuracy and robustness.

Neurobiological Interpretability

MAGNet's attention weights are used to identify the top 3% most salient network interactions supporting each intelligence domain. Fluid intelligence is most strongly associated with subcortical and cognitive control networks, implicating circuits critical for adaptive reasoning and executive function. Crystallized intelligence relates to cognitive control and DMN connections, reflecting knowledge retrieval and semantic processing. Total intelligence highlights distributed integration involving cognitive control, DMN, visual, and sensorimotor networks, consistent with convergent evidence from connectomics.
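Selecting the most salient interactions from an attention matrix might be done as follows. The sketch assumes a single region-by-region attention matrix (for example, averaged over heads and subjects) that is symmetrized before ranking; how the paper aggregates attention is not stated here, and `labels` is a hypothetical list of component names.

```python
import numpy as np

def top_fraction_edges(attn, labels, fraction=0.03):
    """Return the strongest `fraction` of region pairs as (label_i, label_j, weight)."""
    n = attn.shape[0]
    sym = (attn + attn.T) / 2                     # treat attention as undirected
    iu, ju = np.triu_indices(n, k=1)              # each unordered pair once
    weights = sym[iu, ju]
    n_keep = max(1, int(round(fraction * weights.size)))
    order = np.argsort(-weights)[:n_keep]         # strongest pairs first
    return [(labels[iu[t]], labels[ju[t]], float(weights[t])) for t in order]
```

With 53 components there are 1,378 unordered pairs, so a 3% cutoff keeps 41 of them under this rounding rule.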

Figure 3: MAGNet-derived significant structure-function connections for fluid, crystallized, and composite intelligence, mapping distinct domain-specific brain network profiles.

Implications and Future Directions

The technical advances in MAGNet provide a framework for fine-grained interrogation of structure-function–cognition relationships at the single-subject level. The ability to model multi-scale indirect anatomical detours and enforce explicit functional alignment addresses critical methodological limitations in the literature. Practically, these representations may inform individualized prediction of cognitive trajectories, early detection of neurodevelopmental disorders, and intervention targeting.

Future extensions proposed include integrating dynamic FNC for temporal adaptation, longitudinal modeling to track individual neurocognitive development, and generalization to other behavioral and clinical phenotypes. Such directions are likely to synergize with emerging multi-modal, multi-task architectures across connectomic research and translational neuroscience.

Conclusion

MAGNet demonstrates that Transformer-style, edge- and modality-aware GNNs leveraging multi-scale hybrid brain graphs produce significant gains in both predictive performance and neurobiological interpretability for cognitive modeling. Its integration of direct, cross-modal, and detour pathways and explicit structure-function regularization highlights essential architecture and training principles for advancing multi-modal neural representation learning (2603.29967).
