
BrainExplore: fNIRS-based fMRI Prediction Framework

Updated 5 January 2026
  • BrainExplore is a methodological pipeline that predicts fMRI activation markers from fNIRS data using machine learning and neural data augmentation.
  • It employs regression models like Lasso and SVR to map preprocessed fNIRS signals to task-specific cortical fMRI activations, validated on stop-signal and reversal learning tasks.
  • The framework offers a cost-effective, non-invasive alternative for neurocognitive biomarker estimation, especially useful for populations where fMRI is impractical.

The BrainExplore framework is a methodological pipeline for predicting functional magnetic resonance imaging (fMRI) activation markers of cognition from functional near-infrared spectroscopy (fNIRS) data, leveraging ML models and neural data augmentation. It facilitates the use of fNIRS—a portable, low-cost optical neuroimaging modality—as a surrogate for fMRI biomarkers, addressing challenges arising from fMRI's expense and acquisition difficulties, particularly in populations such as infants. The framework was introduced and validated on two cognitive tasks (stop-signal and probabilistic reversal learning) using concurrent fNIRS and fMRI measurements from 50 human participants (Hur et al., 2022).

1. Formal Problem Statement

Let $n$ be the number of subjects after quality control (%%%%1%%%% for the stop-signal task, SST; $n=32$ for probabilistic reversal learning, PRL), $d=48$ the prefrontal fNIRS channel count, and $m$ the number of fMRI activation clusters of interest ($m=8$ for SST, $m=1$ for PRL). Define:

  • $X \in \mathbb{R}^{n \times d}$: subject-wise fNIRS $\beta$-values (from GLM) across channels.
  • $Y \in \mathbb{R}^{n \times m}$: corresponding subject-wise fMRI $\beta$-values (from GLM) for activated clusters.

The goal is to learn a mapping $f_\theta: \mathbb{R}^d \rightarrow \mathbb{R}^m$, parameterized by $\theta$, that predicts fMRI markers from multivariate fNIRS patterns, minimizing prediction error on held-out subjects:

$$f_\theta(X_i) = \hat{Y}_i, \;\; i = 1, \ldots, n$$

2. Signal Preprocessing Pipeline

a) fNIRS Data

  • Acquisition: Raw light-intensity signals from 24 sources × 32 detectors yield 48 overlapping channels.
  • Conversion: Separately extracted time series for total hemoglobin (HbT), oxyhemoglobin (HbO), and deoxyhemoglobin (HbR).
  • Filtering: Band-pass (0.01–0.2 Hz) removes drift and heart-beat noise.
  • Channel-wise z-normalization.
  • GLM Regression: Task regressors (e.g., successful-stop vs. successful-go for SST; trial-by-trial prediction error for PRL) are convolved with a canonical hemodynamic response function (HRF). The result is $\beta^{\mathrm{fNIRS}}_{i,c}$ for subject $i$, channel $c$.
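
The filtering, normalization, and GLM steps above can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: it starts from already-converted hemoglobin time series, uses a simple FFT mask for the band-pass (the source does not name a filter design), and approximates the canonical HRF with a common double-gamma shape. All function names, the sampling rate, and the synthetic data are hypothetical.

```python
import math
import numpy as np

def bandpass_fft(x, fs, low=0.01, high=0.2):
    """Zero all FFT bins outside [low, high] Hz along the time axis (axis 0)."""
    freqs = np.fft.rfftfreq(x.shape[0], d=1.0 / fs)
    spec = np.fft.rfft(x, axis=0)
    spec[(freqs < low) | (freqs > high)] = 0.0
    return np.fft.irfft(spec, n=x.shape[0], axis=0)

def zscore(x):
    """Channel-wise z-normalization: each column gets zero mean, unit variance."""
    return (x - x.mean(axis=0)) / x.std(axis=0)

def double_gamma_hrf(fs, duration=32.0):
    """Double-gamma HRF sampled at fs Hz (an assumed approximation)."""
    t = np.arange(0.0, duration, 1.0 / fs)
    peak = t ** 5 * np.exp(-t) / math.gamma(6)
    undershoot = t ** 15 * np.exp(-t) / math.gamma(16)
    return peak - undershoot / 6.0

def glm_betas(x, boxcar, fs):
    """Fit intercept + HRF-convolved task regressor per channel; return task betas."""
    regressor = np.convolve(boxcar, double_gamma_hrf(fs))[: len(boxcar)]
    design = np.column_stack([np.ones(len(boxcar)), regressor])
    coef, *_ = np.linalg.lstsq(design, x, rcond=None)
    return coef[1]  # shape: (n_channels,)

# Synthetic stand-in for one subject's hemoglobin time series (hypothetical values).
rng = np.random.default_rng(0)
fs, n_samples, n_channels = 10.0, 3000, 48
boxcar = (np.arange(n_samples) % 600 < 200).astype(float)  # 20 s on, 40 s off
task = np.convolve(boxcar, double_gamma_hrf(fs))[:n_samples]
raw = 2.0 * task[:, None] + 0.5 * rng.normal(size=(n_samples, n_channels))

clean = zscore(bandpass_fft(raw, fs))
betas = glm_betas(clean, boxcar, fs)  # one beta per channel
```

Stacking `betas` across subjects would give one row of $X$ per subject.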

b) fMRI Data

  • Preprocessing: Slice-timing correction, realignment, normalization in SPM.
  • GLM Regression: Same task regressors yield voxel-wise $\beta$-values.
  • Cluster-level inference: $p < 0.05$ FWE-corrected for significant clusters.
  • Summary: Mean $\beta^{\mathrm{fMRI}}_{i,k}$ across voxels in active cluster $k$ for each subject $i$.

3. Predictive Modeling Approaches

Four supervised regression models are implemented:

| Model | Objective Function / Algorithm | Regularization |
|---|---|---|
| OLS | $\theta = \arg\min_\theta \frac{1}{n}\sum_i \lVert Y_i - X_i\theta\rVert_2^2$ (closed-form) | None |
| Ridge Regression | $\theta = \arg\min_\theta \frac{1}{n}\sum_i \lVert Y_i - X_i\theta\rVert_2^2 + \lambda\lVert\theta\rVert_2^2$ | $\ell_2$ penalty |
| Lasso Regression | $\theta = \arg\min_\theta \frac{1}{n}\sum_i \lVert Y_i - X_i\theta\rVert_2^2 + \lambda\lVert\theta\rVert_1$ | $\ell_1$ penalty |
| SVR (RBF kernel) | Quadratic program with $\epsilon$-insensitive loss, RBF feature mapping | Margin and RBF |

Hyperparameters ($\lambda$, $C$, $\gamma$) are selected via nested leave-one-out cross-validation. Optimization procedures are closed-form for OLS/ridge, coordinate descent for Lasso, and SMO for SVR.
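
As a rough illustration of the closed-form and coordinate-descent solvers mentioned above, the sketch below implements ridge and Lasso for a single target column using the same $\frac{1}{n}$-scaled objectives as the table. It is a NumPy-only sketch with hypothetical function names and synthetic data; SVR is not covered, and the source does not state which software the authors used.

```python
import numpy as np

def ridge_fit(X, y, lam):
    """Closed-form minimizer of (1/n)||y - X theta||^2 + lam * ||theta||_2^2.

    With lam = 0 this reduces to OLS and requires X.T @ X to be invertible.
    """
    n, d = X.shape
    return np.linalg.solve(X.T @ X + n * lam * np.eye(d), X.T @ y)

def soft_threshold(z, t):
    """Shrink z toward zero by t, clipping at zero."""
    return np.sign(z) * max(abs(z) - t, 0.0)

def lasso_fit(X, y, lam, n_iter=200):
    """Cyclic coordinate descent for (1/n)||y - X theta||^2 + lam * ||theta||_1."""
    n, d = X.shape
    theta = np.zeros(d)
    col_sq = (X ** 2).sum(axis=0)
    residual = y - X @ theta
    for _ in range(n_iter):
        for j in range(d):
            if col_sq[j] == 0.0:
                continue
            residual += X[:, j] * theta[j]   # partial residual without coordinate j
            rho = X[:, j] @ residual
            theta[j] = soft_threshold(rho, n * lam / 2.0) / col_sq[j]
            residual -= X[:, j] * theta[j]   # restore the full residual
    return theta

# Synthetic check with a sparse ground truth (hypothetical values).
rng = np.random.default_rng(0)
X = rng.normal(size=(40, 5))
theta_true = np.array([2.0, 0.0, -1.0, 0.0, 0.0])
y = X @ theta_true

theta_ols = ridge_fit(X, y, lam=0.0)
theta_lasso = lasso_fit(X, y, lam=1e-3)
lam_max = 2.0 * np.max(np.abs(X.T @ y)) / len(y)   # smallest lam giving all zeros
theta_zero = lasso_fit(X, y, lam=1.01 * lam_max)
```

The threshold `n * lam / 2` follows from setting the derivative of the $\frac{1}{n}$-scaled Lasso objective to zero for one coordinate; other libraries scale the objective differently, so their `lam` values are not directly comparable.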

4. Neural Data Augmentation Strategy

For each subject, the initial fNIRS time series $X^{\mathrm{raw}}_i \in \mathbb{R}^{T_i \times 48}$ is channel-wise normalized. The framework generates $S=100$ synthetic replicates via:

$$\varepsilon^{(s)} \sim \mathcal{N}(0, \sigma^2 I_{T_i \times 48}), \; \sigma=0.01$$

$$X^{\mathrm{aug},(s)}_i = X^{\mathrm{raw}}_i + \varepsilon^{(s)}$$

Each augmented time series is GLM-processed to yield $\beta^{\mathrm{aug},(s)}_i \in \mathbb{R}^{48}$. Thus, the effective training set size per subject is increased to 100, enhancing model generalization under leave-one-subject-out (LOSO) cross-validation.
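
The augmentation step can be sketched in a few lines. The example below is illustrative only: the function names are hypothetical, the task regressor is a sine-wave stand-in for an HRF-convolved design column, and the GLM is reduced to an intercept plus one regressor.

```python
import numpy as np

def augment(x_raw, n_replicates=100, sigma=0.01, seed=0):
    """Return n_replicates noisy copies of x_raw, shape (S, T, channels)."""
    rng = np.random.default_rng(seed)
    noise = rng.normal(0.0, sigma, size=(n_replicates,) + x_raw.shape)
    return x_raw[None, :, :] + noise

def task_betas(x, regressor):
    """Least-squares beta of the task regressor for every channel."""
    design = np.column_stack([np.ones(len(regressor)), regressor])
    coef, *_ = np.linalg.lstsq(design, x, rcond=None)
    return coef[1]

# Synthetic, already z-normalized time series for one subject (hypothetical).
rng = np.random.default_rng(1)
n_samples, n_channels = 500, 48
x_raw = rng.normal(size=(n_samples, n_channels))
x_raw = (x_raw - x_raw.mean(axis=0)) / x_raw.std(axis=0)
regressor = np.sin(2 * np.pi * np.arange(n_samples) / 100.0)  # stand-in regressor

replicates = augment(x_raw)                                   # (100, 500, 48)
betas_orig = task_betas(x_raw, regressor)                     # (48,)
betas_aug = np.stack([task_betas(rep, regressor) for rep in replicates])  # (100, 48)
```

Because the GLM fit is linear in the data, each augmented $\beta$ equals the original $\beta$ plus a small Gaussian perturbation, so the replicates cluster tightly around the un-augmented estimate.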

5. Training, Validation, and Evaluation Metrics

  • Hyperparameter grids: Ridge/Lasso $\lambda \in [10^{-4}, 10^{2}]$; SVR $C \in \{0.1, 1, 10\}$, $\gamma \in \{10^{-3}, 10^{-2}, 10^{-1}\}$.
  • Cross-validation: Leave-one-subject-out; train on augmented data for $n-1$ subjects, predict $\hat{Y}_i$ on held-out subject.
  • Metrics:

    • Mean Squared Error (MSE):

    $$\mathrm{MSE} = \frac{1}{m} \sum_{k=1}^m (Y_{i,k} - \hat{Y}_{i,k})^2$$

    • Pearson correlation ($r$):

    $$r = \frac{\sum_{k}(Y_{i,k} - \bar{Y})(\hat{Y}_{i,k} - \overline{\hat{Y}})}{\sqrt{\sum_{k}(Y_{i,k} - \bar{Y})^2}\sqrt{\sum_{k}(\hat{Y}_{i,k} - \overline{\hat{Y}})^2}}$$

    • Coefficient of determination ($R^2$):

    $$R^2 = 1 - \frac{\sum_{k}(Y_{i,k} - \hat{Y}_{i,k})^2}{\sum_{k}(Y_{i,k} - \bar{Y})^2}$$
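
The cross-validation loop and the three metrics can be sketched together as below. Several choices are assumptions rather than details from the source: the model is a fixed-$\lambda$ ridge without intercept (the nested hyperparameter search is omitted), the held-out subject is predicted from its un-augmented $\beta$ vector, and the metrics are computed across held-out subjects for a single cluster. All names and data are hypothetical.

```python
import numpy as np

def ridge_fit(X, y, lam):
    """Closed-form ridge for (1/n)||y - X theta||^2 + lam * ||theta||_2^2."""
    n, d = X.shape
    return np.linalg.solve(X.T @ X + n * lam * np.eye(d), X.T @ y)

def mse(y, y_hat):
    return float(np.mean((y - y_hat) ** 2))

def pearson_r(y, y_hat):
    yc, pc = y - y.mean(), y_hat - y_hat.mean()
    return float((yc @ pc) / np.sqrt((yc @ yc) * (pc @ pc)))

def r_squared(y, y_hat):
    return float(1.0 - np.sum((y - y_hat) ** 2) / np.sum((y - y.mean()) ** 2))

def loso_predict(betas_aug, betas_orig, y, lam):
    """Leave-one-subject-out predictions for one fMRI cluster.

    betas_aug: (n, S, d) augmented fNIRS betas; betas_orig: (n, d); y: (n,).
    """
    n, S, d = betas_aug.shape
    y_hat = np.empty(n)
    for i in range(n):
        train = np.arange(n) != i                  # every subject except i
        X_train = betas_aug[train].reshape(-1, d)  # all replicates of training subjects
        y_train = np.repeat(y[train], S)           # each replicate shares its subject's target
        theta = ridge_fit(X_train, y_train, lam)
        y_hat[i] = betas_orig[i] @ theta           # held-out subject, un-augmented betas
    return y_hat

# Synthetic data with a known linear relationship (hypothetical values).
rng = np.random.default_rng(0)
n, S, d = 20, 10, 5
betas_orig = rng.normal(size=(n, d))
betas_aug = betas_orig[:, None, :] + 0.01 * rng.normal(size=(n, S, d))
weights = np.array([1.0, -2.0, 0.5, 0.0, 1.5])
y = betas_orig @ weights + 0.1 * rng.normal(size=n)

y_hat = loso_predict(betas_aug, betas_orig, y, lam=1e-3)
print(mse(y, y_hat), pearson_r(y, y_hat), r_squared(y, y_hat))
```

Splitting by subject rather than by row matters here: all replicates of a subject are near-copies of one another, so a row-wise split would leak the held-out subject into training.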

Key empirical findings:

  • SST task: Lasso regression on HbR predicted fMRI $\beta$ in right IFG ($\mathrm{MSE}=4.787$, $r=0.52$, $p<0.01$), SMA ($\mathrm{MSE}=7.194$, $r=0.48$, $p<0.01$), left IFG ($\mathrm{MSE}=8.158$, $r=0.50$, $p<0.01$).
  • PRL task: SVR(RBF) on HbT predicted IPL ($\mathrm{MSE}=0.115$, $r=0.45$, $p<0.05$).

6. Main Results, Functional Relevance, and Limitations

  • SST (response inhibition): fNIRS HbR signals plus Lasso regression best predict fMRI activation in bilateral IFG and SMA (all $p<0.01$).
  • PRL (prediction error): fNIRS HbT signals plus SVR (RBF) predict fMRI activation in IPL ($r=0.45$, $p<0.05$). Subcortical striatal signals could not be recovered, implying fNIRS's limitation to cortical sources.
  • No attempt was made to infer or predict task-based functional connectivity from fNIRS.

Identified limitations include the absence of subcortical coverage and possible confounding from visit or environmental differences and from subjects' emotional state. Proposed extensions are the incorporation of deep neural architectures and domain adaptation for improved subcortical prediction, deployment in infant and patient populations with fMRI contraindications, and augmentation with dynamic functional connectivity features.

7. Implications and Prospective Extensions

The BrainExplore framework demonstrates that standard machine learning regression models, when trained on noise-augmented fNIRS data, can non-invasively estimate cortical fMRI markers from fNIRS measurements, with statistically significant correlations between predicted and measured activation ($r$ between 0.45 and 0.52 in the reported regions). These surrogate markers may facilitate study of populations where fMRI is impractical (infants, specific patients), broaden access to neurocognitive biomarker research, and lay groundwork for the transfer of validated markers across modalities. Suggested future directions include:

  • Adopting nonlinear or deep learning architectures to capture residual or subcortical activations.
  • Extending the pipeline to predictive modeling of functional connectivity.
  • Validating in broader populations and integrating additional neuroimaging features for enhanced generalizability.

A plausible implication is that data-augmented ML protocols using fNIRS can substitute for key aspects of fMRI-based cognitive phenotyping in settings where access or feasibility constraints are paramount (Hur et al., 2022).
