Large-scale Graph Representation Learning of Dynamic Brain Connectome with Transformers (2312.14939v1)

Published 4 Dec 2023 in q-bio.NC, cs.CV, cs.LG, and eess.IV

Abstract: Graph Transformers have recently been successful in various graph representation learning tasks, providing a number of advantages over message-passing Graph Neural Networks. Utilizing Graph Transformers for learning the representation of the brain functional connectivity network is also gaining interest. However, studies to date have overlooked the temporal dynamics of functional connectivity, which fluctuates over time. Here, we propose a method for learning the representation of dynamic functional connectivity with Graph Transformers. Specifically, we define the connectome embedding, which holds the position, structure, and time information of the functional connectivity graph, and use Transformers to learn its representation across time. We perform experiments with over 50,000 resting-state fMRI samples obtained from three datasets, which is the largest number of fMRI data used in studies by far. The experimental results show that our proposed method outperforms other competitive baselines in gender classification and age regression tasks based on the functional connectivity extracted from the fMRI data.


Summary

  • The paper introduces TeNeT, a novel graph transformer method that captures dynamic brain connectivity by integrating structural, positional, and time embeddings from fMRI data.
  • It leverages a two-step transformer architecture with spatial and temporal modules to effectively process dynamic connectome graphs, outperforming traditional GNN and static transformer baselines.
  • Experiments on over 50,000 samples demonstrate superior performance in gender classification (AUROC) and age regression (R²), underscoring its scalability and robustness.

The paper "Large-scale Graph Representation Learning of Dynamic Brain Connectome with Transformers" (2312.14939) introduces TeNeT, a novel Graph Transformer (GT)-based method designed to learn representations of dynamic functional connectivity (FC) from resting-state fMRI data. The core problem addressed is the limitation of existing methods, such as Graph Neural Networks (GNNs) and static GTs, in effectively capturing the temporal fluctuations of brain FC and demonstrating performance on large-scale datasets, which is crucial for generalizability.

TeNeT tackles this by processing fMRI data as a sequence of dynamic FC graphs across time. The practical implementation involves several key steps:

  1. Data Preprocessing: Start with the ROI-timeseries matrix extracted from 4D fMRI data. The paper uses the Schaefer atlas with 400 ROIs.
  2. Connectome Embedding: For each time point $t$, a connectome embedding $\mathcal{E}_t$ is constructed. This embedding combines three types of information:
    • Structure Embedding: Calculated from the dynamic FC graph at time $t$. This dynamic FC is typically derived using a sliding-window approach on the ROI-timeseries, computing correlation coefficients within a window of length $\Gamma$ and stride $S$. The correlation matrix $\bar{\mathcal{C}}_t$ represents edge weights.
    • Position Embedding: Derived from the structure embedding by combining the correlation matrix (after removing self-loops) with an identity matrix. This combined graph embedding is then projected to a lower dimension $D$ using a two-layer MLP.
    • Time Embedding: Obtained from the ROI-timeseries within the sliding window using a GRU, capturing temporal context.
    • The final connectome embedding $\mathcal{E}_t$ is formed by concatenating the MLP-processed graph embedding and the GRU-derived time embedding for each time point. This results in a sequence of embeddings $(\mathcal{E}_1, \mathcal{E}_2, \ldots, \mathcal{E}_T)$.
  3. TeNeT Architecture: The model uses a two-step Transformer composition $f = h \circ g$.
    • Spatial Transformer ($g$): A Connectome Transformer processes each time point's connectome embedding $\mathcal{E}_t$ independently using self-attention. This layer is augmented by injecting structural information from the dynamic graph, specifically the 1-hop connectivity matrix $\bar{\mathcal{C}}_t$ and node degree information. A learnable token is prepended to the embedding sequence at each time point. After $L$ layers, the token vector $\mathbf{v}_t$ for each time point $t$ captures spatially-attended features at that specific moment.
    • Temporal Transformer ($h$): A standard Transformer Encoder takes the sequence of token vectors $(\mathbf{v}_1, \mathbf{v}_2, \ldots, \mathbf{v}_T)$ generated by the spatial transformer as input. It applies self-attention across the time dimension to learn the dynamic patterns. Again, a learnable token is used. After $L$ layers, the final token vector $\mathcal{E}_{\text{dyn}}$ represents the comprehensive dynamic FC representation for the entire fMRI sample.
  4. Downstream Tasks: The final token vector $\mathcal{E}_{\text{dyn}}$ is used as input to a simple classifier (for gender) or regressor (for age) head. Hedged code sketches of these steps follow this list.
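
For step 1, a minimal parcellation sketch using nilearn is shown below. The library, the public sample dataset, and the `standardize=True` setting are assumptions for illustration; the paper specifies only the Schaefer atlas with 400 ROIs, not the extraction tooling.

```python
from nilearn import datasets
from nilearn.maskers import NiftiLabelsMasker

# Schaefer 400-ROI parcellation, as reported in the paper
atlas = datasets.fetch_atlas_schaefer_2018(n_rois=400)
# Any preprocessed 4D fMRI image works here; this public sample is just for illustration
func = datasets.fetch_development_fmri(n_subjects=1).func[0]
masker = NiftiLabelsMasker(labels_img=atlas.maps, standardize=True)
roi_timeseries = masker.fit_transform(func)   # array of shape (n_timepoints, 400)
```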
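
For step 2, the sketch below builds one connectome embedding $\mathcal{E}_t$ per sliding window: a two-layer MLP over the correlation matrix (self-loops removed, identity added) supplies the structure/position term, and the last hidden state of a GRU over the windowed ROI-timeseries, broadcast to every node, supplies the time term. The class and argument names (`ConnectomeEmbedding`, `n_roi`, `dim`), the exact tensor shapes, and the broadcast are assumptions; treat this as a sketch rather than the authors' implementation.

```python
import torch
import torch.nn as nn

class ConnectomeEmbedding(nn.Module):
    # Structure/position term: two-layer MLP over the windowed correlation matrix with
    # self-loops removed plus an identity matrix. Time term: last GRU hidden state over
    # the windowed ROI-timeseries, broadcast to every node. E_t is their concatenation.
    def __init__(self, n_roi: int = 400, dim: int = 512):
        super().__init__()
        self.mlp = nn.Sequential(nn.Linear(n_roi, dim), nn.ReLU(), nn.Linear(dim, dim))
        self.gru = nn.GRU(input_size=n_roi, hidden_size=dim, batch_first=True)

    def forward(self, window_ts: torch.Tensor) -> torch.Tensor:
        # window_ts: (window_length, n_roi) ROI signals inside one sliding window
        corr = torch.corrcoef(window_ts.T)                        # (N, N) correlation matrix
        graph_in = corr - torch.diag(torch.diag(corr)) + torch.eye(corr.shape[0])
        graph_emb = self.mlp(graph_in)                            # (N, dim) graph term
        _, h = self.gru(window_ts.unsqueeze(0))                   # GRU over the window
        time_emb = h[-1].expand(corr.shape[0], -1)                # (N, dim) time term
        return torch.cat([graph_emb, time_emb], dim=-1)           # (N, 2*dim) embedding E_t


def dynamic_connectome_embeddings(timeseries, window_length, stride, embed):
    # Slide a window over the (T, N) ROI-timeseries and embed each window.
    windows = range(0, timeseries.shape[0] - window_length + 1, stride)
    return torch.stack([embed(timeseries[s:s + window_length]) for s in windows])
```

With `dim=512`, the concatenated embedding is 1024-dimensional per node, matching the reported hidden dimension. For a hypothetical 490-timepoint, 400-ROI scan, `dynamic_connectome_embeddings(torch.randn(490, 400), 50, 16, ConnectomeEmbedding())` returns a (28, 400, 1024) tensor; the window length $\Gamma = 50$ and stride $S = 16$ here are placeholders, not the paper's values.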
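
For steps 3 and 4, the two-step composition $f = h \circ g$ can be sketched with stock PyTorch encoder layers. The structure-aware attention bias (1-hop connectivity $\bar{\mathcal{C}}_t$ and node degree) is deliberately omitted, and the class name, task head, and default sizes are assumptions; only the 4-layer, 1024-dimensional configuration comes from the paper.

```python
import torch
import torch.nn as nn

class TeNeTSketch(nn.Module):
    # Spatial transformer g attends over the N node embeddings of each window (with a
    # prepended learnable token); temporal transformer h attends over the resulting
    # per-window token vectors v_1..v_T; a linear head maps E_dyn to the task output.
    def __init__(self, emb_dim: int = 1024, n_layers: int = 4, n_heads: int = 8, n_out: int = 2):
        super().__init__()
        make_layer = lambda: nn.TransformerEncoderLayer(emb_dim, n_heads, batch_first=True)
        self.spatial = nn.TransformerEncoder(make_layer(), num_layers=n_layers)   # g
        self.temporal = nn.TransformerEncoder(make_layer(), num_layers=n_layers)  # h
        self.node_token = nn.Parameter(torch.zeros(1, 1, emb_dim))
        self.time_token = nn.Parameter(torch.zeros(1, 1, emb_dim))
        self.head = nn.Linear(emb_dim, n_out)   # classifier (gender) or regressor (age)

    def forward(self, embeddings: torch.Tensor) -> torch.Tensor:
        # embeddings: (T, N, D) sequence of connectome embeddings E_1..E_T
        T = embeddings.shape[0]
        x = torch.cat([self.node_token.expand(T, -1, -1), embeddings], dim=1)  # prepend token
        v = self.spatial(x)[:, 0]                       # (T, D): per-window token vectors v_t
        seq = torch.cat([self.time_token, v.unsqueeze(0)], dim=1)
        e_dyn = self.temporal(seq)[0, 0]                # (D,): dynamic representation E_dyn
        return self.head(e_dyn)
```

For one hypothetical sample with 20 windows, `TeNeTSketch()(torch.randn(20, 400, 1024))` returns two gender logits; switching to `n_out=1` with a regression loss gives the age head.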

The authors conducted extensive experiments using over 50,000 resting-state fMRI samples from three large datasets (UKB, ABCD, HCP subsets), demonstrating the method's capability on an unprecedented scale for fMRI studies. Performance was evaluated on gender classification (AUROC) and age regression (R²).
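
Reproducing those two metrics is straightforward; the snippet below uses scikit-learn (an assumed choice, since the paper does not name its evaluation code) on hypothetical held-out predictions.

```python
import numpy as np
from sklearn.metrics import roc_auc_score, r2_score

# Hypothetical held-out predictions; in practice these come from the TeNeT task heads.
y_true_gender = np.array([0, 1, 1, 0, 1])
y_prob_gender = np.array([0.2, 0.9, 0.7, 0.4, 0.6])   # predicted probability of class 1
y_true_age = np.array([25.0, 31.0, 44.0, 58.0, 63.0])
y_pred_age = np.array([27.0, 30.0, 41.0, 55.0, 66.0])

print("gender AUROC:", roc_auc_score(y_true_gender, y_prob_gender))
print("age R^2:", r2_score(y_true_age, y_pred_age))
```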

Implementation Details & Considerations:

  • Dataset Size: The paper highlights the use of large datasets (>50,000 samples) as crucial for enhancing replicability and generalizability, addressing a major challenge in neuroimaging. Implementing this requires significant data storage and processing capabilities.
  • Hyperparameters: The model configuration included 4 layers with a hidden dimension of 1024. Training utilized the Adam optimizer with a one-cycle learning rate schedule. Hyperparameter tuning (batch size, learning rate) was performed using grid search; a minimal training-loop sketch with these settings follows this list.
  • Computational Requirements: Training on large datasets requires substantial computational resources. The experiments were run on an NVIDIA GeForce RTX 3090, indicating the need for high-end GPUs, potentially multiple for larger-scale training or inference.
  • Dynamic FC Calculation: The sliding-window approach introduces choices for window length ($\Gamma$) and stride ($S$). These parameters impact the temporal resolution and the number of dynamic graphs generated per subject.
  • Memory Usage: Processing sequences of graph embeddings can be memory-intensive, especially with large numbers of ROIs (nodes) and time points. Batch sizing needs to be carefully considered based on available GPU memory.
  • Ablation Studies: The ablation results emphasize the importance of both the GRU-derived time encoding and the dynamic nature of the graph features, suggesting that removing either component degrades performance. This guides implementation choices regarding input features.
  • Scalability: While demonstrated on large datasets, scaling to even larger cohorts or higher resolution atlases would require further optimization of memory usage and distributed training strategies.
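
As noted in the hyperparameters bullet above, a minimal training-loop sketch with Adam and a one-cycle schedule might look as follows. The stand-in model, dummy data, `max_lr`, epoch count, and batch size are placeholders, since the paper selects batch size and learning rate by grid search.

```python
import torch

model = torch.nn.Linear(1024, 2)                      # stand-in for TeNeT + task head
data = [(torch.randn(8, 1024), torch.randint(0, 2, (8,))) for _ in range(10)]  # dummy batches
optimizer = torch.optim.Adam(model.parameters(), lr=1e-4)
scheduler = torch.optim.lr_scheduler.OneCycleLR(
    optimizer, max_lr=1e-3, epochs=30, steps_per_epoch=len(data))
criterion = torch.nn.CrossEntropyLoss()

for epoch in range(30):
    for x, y in data:
        optimizer.zero_grad()
        loss = criterion(model(x), y)
        loss.backward()
        optimizer.step()
        scheduler.step()                              # one-cycle schedule steps once per batch
```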

Practical Applications:

TeNeT can be applied to extract powerful representations of brain connectivity dynamics for various tasks:

  • Predicting Phenotypes: As demonstrated, predicting demographic traits like gender and age from resting-state fMRI data.
  • Clinical Diagnosis/Prognosis: Applying the model to predict clinical phenotypes, such as neurological or psychiatric disorders, which are often associated with altered dynamic FC patterns.
  • Biomarker Discovery: The learned dynamic representations could potentially serve as biomarkers for disease states or treatment response.
  • Understanding Brain Dynamics: Future work on interpreting the attention weights within TeNeT could provide insights into which brain regions and temporal patterns are most informative for specific tasks or conditions.

The comparative results show TeNeT's superiority over static and dynamic GNN baselines, and a static GT baseline, supporting the value of its approach to capturing dynamic FC with Transformers on large datasets. The model's design, integrating position, structure, and time information within a Transformer framework, offers a promising direction for leveraging complex neuroimaging data.