BrainExplore: fNIRS-based fMRI Prediction Framework
- BrainExplore is a methodological pipeline that predicts fMRI activation markers from fNIRS data using machine learning and neural data augmentation.
- It employs regression models like Lasso and SVR to map preprocessed fNIRS signals to task-specific cortical fMRI activations, validated on stop-signal and reversal learning tasks.
- The framework offers a cost-effective, non-invasive alternative for neurocognitive biomarker estimation, especially useful for populations where fMRI is impractical.
The BrainExplore framework is a methodological pipeline for predicting functional magnetic resonance imaging (fMRI) activation markers of cognition from functional near-infrared spectroscopy (fNIRS) data, using machine learning (ML) models and neural data augmentation. It allows fNIRS, a portable, low-cost optical neuroimaging modality, to serve as a surrogate for fMRI biomarkers, addressing the expense and acquisition difficulties of fMRI, particularly in populations such as infants. The framework was introduced and validated on two cognitive tasks, the stop-signal task (SST) and probabilistic reversal learning (PRL), using concurrent fNIRS and fMRI measurements from 50 human participants (Hur et al., 2022).
1. Formal Problem Statement
Let $N$ be the number of subjects after quality control (reported separately for SST and PRL), $D$ the prefrontal fNIRS channel count, and $K$ the number of fMRI activation clusters of interest (task-specific for SST and PRL). Define:
- $X \in \mathbb{R}^{N \times D}$: subject-wise fNIRS $\beta$-values (from GLM) across channels.
- $Y \in \mathbb{R}^{N \times K}$: corresponding subject-wise fMRI $\beta$-values (from GLM) for activated clusters.
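The goal is to learn a mapping $f_\theta$, parameterized by $\theta$, that predicts fMRI markers from multivariate fNIRS patterns, minimizing prediction error on held-out subjects:

$$\min_{\theta} \; \sum_{i \in \text{held-out}} \mathcal{L}\big(y_i, f_\theta(x_i)\big)$$

where $x_i$ and $y_i$ are the fNIRS features and fMRI markers of subject $i$, and $\mathcal{L}$ is a prediction loss such as squared error.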
2. Signal Preprocessing Pipeline
a) fNIRS Data
- Acquisition: Raw light-intensity signals from 24 sources × 32 detectors yield 48 overlapping channels.
- Conversion: Separately extracted time series for total hemoglobin (HbT), oxyhemoglobin (HbO), and deoxyhemoglobin (HbR).
- Filtering: Band-pass (0.01–0.2 Hz) removes drift and heart-beat noise.
- Channel-wise z-normalization.
- GLM Regression: Task regressors (e.g., successful-stop vs. successful-go for SST; trial-by-trial prediction error for PRL) are convolved with the canonical HRF. The result is a coefficient $\beta_{i,c}$ for subject $i$, channel $c$.
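The fNIRS steps listed above (band-pass filtering, channel-wise z-normalization, GLM fit against an HRF-convolved regressor) can be sketched as follows. This is a minimal illustration on synthetic data, not the authors' code: the 10 Hz sampling rate, the Butterworth filter order, the double-gamma HRF shape parameters, and all function names are assumptions introduced here.

```python
import numpy as np
from scipy.signal import butter, sosfiltfilt
from scipy.stats import gamma


def bandpass(x, fs, low=0.01, high=0.2, order=3):
    """Zero-phase Butterworth band-pass along the time axis (axis 0)."""
    sos = butter(order, [low, high], btype="bandpass", fs=fs, output="sos")
    return sosfiltfilt(sos, x, axis=0)


def zscore(x):
    """Channel-wise z-normalization of a (time, channels) array."""
    return (x - x.mean(axis=0)) / x.std(axis=0)


def canonical_hrf(fs, duration=32.0):
    """Double-gamma HRF sampled at fs Hz (shape parameters are assumed)."""
    t = np.arange(0.0, duration, 1.0 / fs)
    h = gamma.pdf(t, 6) - gamma.pdf(t, 16) / 6.0
    return h / h.sum()


def glm_betas(signal, events, fs):
    """Least-squares task beta per channel for one regressor plus intercept."""
    regressor = np.convolve(events, canonical_hrf(fs))[: len(events)]
    design = np.column_stack([regressor, np.ones_like(regressor)])
    coef, *_ = np.linalg.lstsq(design, signal, rcond=None)
    return coef[0]


# Synthetic demo: 4 channels with increasing task amplitude (hypothetical).
fs = 10.0
n_samples = 3000
rng = np.random.default_rng(0)
events = np.zeros(n_samples)
for onset in range(100, n_samples - 200, 200):
    events[onset:onset + 20] = 1.0  # 2 s boxcar every 20 s
task = np.convolve(events, canonical_hrf(fs))[:n_samples]
amplitudes = np.array([0.0, 1.0, 2.0, 5.0])
raw = task[:, None] * amplitudes + rng.normal(0.0, 1.0, (n_samples, 4))

clean = zscore(bandpass(raw, fs))
betas = glm_betas(clean, events, fs)
print(betas.shape)
```

Because the signal is z-normalized before the GLM, the resulting betas are in units of channel standard deviations rather than hemoglobin concentration.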
b) fMRI Data
- Preprocessing: Slice-timing correction, realignment, normalization in SPM.
- GLM Regression: The same task regressors yield voxel-wise $\beta$-values.
- Cluster-level inference: family-wise error (FWE) correction identifies significant clusters.
- Summary: The mean $\beta$ across voxels in active cluster $k$ is computed for each subject $i$, giving $y_{i,k}$.
3. Predictive Modeling Approaches
Four supervised regression models are implemented:
| Model | Objective Function / Algorithm | Regularization |
|---|---|---|
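| OLS | $\min_w \lVert y - Xw \rVert_2^2$ (closed-form) | None |
| Ridge Regression | Least squares with $\ell_2$ penalty | $\lambda \lVert w \rVert_2^2$ |
| Lasso Regression | Least squares with $\ell_1$ penalty | $\lambda \lVert w \rVert_1$ |
| SVR (RBF kernel) | Quadratic program with $\varepsilon$-insensitive loss, RBF feature mapping | Margin $C$ and RBF $\gamma$ |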
Hyperparameters ($\lambda$, $C$, $\gamma$) are selected via nested leave-one-out cross-validation. Optimization is closed-form for OLS and ridge, coordinate descent for Lasso, and sequential minimal optimization (SMO) for SVR.
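A minimal sketch of the four regressors with nested leave-one-out selection, assuming scikit-learn as the implementation (the source does not name a library). The grid values below are hypothetical placeholders, not those used in the study, and `nested_loo_predict` is a name introduced here. In scikit-learn, `alpha` plays the role of the regularization weight for Ridge and Lasso.

```python
import numpy as np
from sklearn.base import clone
from sklearn.linear_model import Lasso, LinearRegression, Ridge
from sklearn.model_selection import GridSearchCV, LeaveOneOut
from sklearn.svm import SVR

# Hypothetical grids for illustration only; not the values used in the source.
MODELS = {
    "ols": (LinearRegression(), {}),
    "ridge": (Ridge(), {"alpha": [0.01, 0.1, 1.0, 10.0]}),
    "lasso": (Lasso(max_iter=10000), {"alpha": [0.01, 0.1, 1.0]}),
    "svr_rbf": (SVR(kernel="rbf"), {"C": [1.0, 10.0], "gamma": [0.01, 0.1]}),
}


def nested_loo_predict(X, y, name):
    """Outer LOO gives held-out predictions; inner LOO picks hyperparameters."""
    estimator, grid = MODELS[name]
    preds = np.empty(len(y))
    for train, test in LeaveOneOut().split(X):
        if grid:
            # MSE scoring is used because R^2 is undefined on a single sample.
            search = GridSearchCV(estimator, grid, cv=LeaveOneOut(),
                                  scoring="neg_mean_squared_error")
            model = search.fit(X[train], y[train]).best_estimator_
        else:
            model = clone(estimator).fit(X[train], y[train])
        preds[test] = model.predict(X[test])
    return preds


# Synthetic demo: 16 "subjects", 5 "channels", one target cluster.
rng = np.random.default_rng(0)
X = rng.normal(size=(16, 5))
y = 2.0 * X[:, 0] + rng.normal(scale=0.1, size=16)
results = {name: nested_loo_predict(X, y, name) for name in MODELS}
for name, preds in results.items():
    print(name, round(float(np.mean((preds - y) ** 2)), 3))
```

Keeping the hyperparameter search inside the outer loop means the held-out subject never influences model selection, which is the point of nesting.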
4. Neural Data Augmentation Strategy
For each subject, the initial fNIRS time series is channel-wise normalized. The framework then generates synthetic replicates of each normalized series.
Each augmented time series is GLM-processed to yield its own vector of channel-wise $\beta$-values. The effective training set size per subject thus increases to 100, which is intended to improve model generalization under leave-one-subject-out (LOSO) cross-validation.
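The replicate-then-GLM flow can be sketched as below. The generation rule is not reproduced in this text, so additive Gaussian jitter is used purely as a stand-in assumption; it should not be read as the authors' augmentation method. The noise level, the on/off regressor, and the function names are likewise hypothetical, and whether the count of 100 includes the original series is not specified.

```python
import numpy as np


def augment(ts, n_replicates=100, noise_std=0.1, seed=0):
    """Noisy copies of a z-normalized (time, channels) series.

    ASSUMPTION: additive Gaussian jitter is a placeholder, not the source's rule.
    """
    rng = np.random.default_rng(seed)
    noise = rng.normal(0.0, noise_std, size=(n_replicates,) + ts.shape)
    return ts[None, :, :] + noise


def betas_per_replicate(replicates, design):
    """Fit one GLM design (time, regressors) to every replicate.

    Returns the first regressor's beta with shape (replicates, channels).
    """
    pinv = np.linalg.pinv(design)                      # (regressors, time)
    coef = np.einsum("pt,rtc->rpc", pinv, replicates)  # (replicates, regressors, channels)
    return coef[:, 0, :]


# Synthetic demo with a hypothetical on/off task regressor.
rng = np.random.default_rng(1)
n_samples, n_channels = 500, 3
regressor = (np.arange(n_samples) // 50 % 2).astype(float)
design = np.column_stack([regressor, np.ones(n_samples)])
ts = regressor[:, None] * np.array([0.5, 1.0, 2.0]) + rng.normal(size=(n_samples, n_channels))
ts = (ts - ts.mean(axis=0)) / ts.std(axis=0)

replicates = augment(ts)
betas = betas_per_replicate(replicates, design)
original = betas_per_replicate(ts[None], design)[0]
print(betas.shape)
```

Each row of `betas` is one training sample for the subject; with zero-mean jitter the replicate betas scatter around the beta of the unaugmented series.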
5. Training, Validation, and Evaluation Metrics
- Hyperparameter grids: Ridge/Lasso $\lambda$; SVR $C$, $\gamma$.
- Cross-validation: Leave-one-subject-out; train on augmented data from all remaining subjects, predict on the held-out subject.
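- Metrics (for observed $y_i$ and predicted $\hat{y}_i$ over $N$ held-out subjects):
  - Mean Squared Error (MSE): $\mathrm{MSE} = \frac{1}{N} \sum_{i=1}^{N} (y_i - \hat{y}_i)^2$
  - Pearson correlation ($r$): $r = \frac{\sum_i (y_i - \bar{y})(\hat{y}_i - \bar{\hat{y}})}{\sqrt{\sum_i (y_i - \bar{y})^2} \sqrt{\sum_i (\hat{y}_i - \bar{\hat{y}})^2}}$
  - Coefficient of determination ($R^2$): $R^2 = 1 - \frac{\sum_i (y_i - \hat{y}_i)^2}{\sum_i (y_i - \bar{y})^2}$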
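The evaluation loop and the three metrics can be sketched as follows, assuming scikit-learn's `LeaveOneGroupOut` to keep all augmented rows of a subject in the same fold. Averaging predictions over the held-out subject's rows is an assumption made here, as are the Lasso setting, the synthetic data, and the function names.

```python
import numpy as np
from sklearn.base import clone
from sklearn.linear_model import Lasso
from sklearn.model_selection import LeaveOneGroupOut


def loso_predict(X, y, groups, model):
    """One prediction per subject: train on all other subjects' rows, then
    average predictions over the held-out subject's rows (an assumption)."""
    subjects = np.unique(groups)
    y_true = np.empty(len(subjects))
    y_pred = np.empty(len(subjects))
    for train, test in LeaveOneGroupOut().split(X, y, groups):
        fitted = clone(model).fit(X[train], y[train])
        idx = np.searchsorted(subjects, groups[test][0])
        y_true[idx] = y[test].mean()
        y_pred[idx] = fitted.predict(X[test]).mean()
    return y_true, y_pred


def regression_metrics(y_true, y_pred):
    """MSE, Pearson r, and coefficient of determination."""
    resid = y_true - y_pred
    mse = float(np.mean(resid ** 2))
    r = float(np.corrcoef(y_true, y_pred)[0, 1])
    r2 = float(1.0 - np.sum(resid ** 2) / np.sum((y_true - y_true.mean()) ** 2))
    return {"mse": mse, "r": r, "r2": r2}


# Synthetic demo: 12 subjects x 5 augmented rows each, 6 channels.
rng = np.random.default_rng(0)
n_subjects, n_reps, n_channels = 12, 5, 6
base = rng.normal(size=(n_subjects, n_channels))
weights = np.array([1.5, -1.0, 0.0, 0.0, 0.0, 0.0])
target = base @ weights + rng.normal(scale=0.1, size=n_subjects)
X = np.repeat(base, n_reps, axis=0) + rng.normal(scale=0.05, size=(n_subjects * n_reps, n_channels))
y = np.repeat(target, n_reps)
groups = np.repeat(np.arange(n_subjects), n_reps)

y_true, y_pred = loso_predict(X, y, groups, Lasso(alpha=0.01))
metrics = regression_metrics(y_true, y_pred)
print(metrics)
```

Grouping by subject matters here: splitting augmented rows at random would place replicates of the same subject in both training and test sets and inflate every metric.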
Key empirical findings:
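- SST task: Lasso regression on HbR predicted fMRI activation in the right inferior frontal gyrus (IFG), the supplementary motor area (SMA), and the left IFG.
- PRL task: SVR (RBF) on HbT predicted fMRI activation in the inferior parietal lobule (IPL).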
6. Main Results, Functional Relevance, and Limitations
- SST (response inhibition): fNIRS HbR signals combined with Lasso regression best predict fMRI activation in bilateral IFG and SMA.
- PRL (prediction error): fNIRS HbT signals combined with SVR (RBF) predict fMRI activation in IPL. Subcortical striatal signals could not be recovered, which reflects the restriction of fNIRS to cortical sources.
- No attempt was made to infer or predict task-based functional connectivity from fNIRS.
Identified limitations include the absence of subcortical coverage and possible confounding from visit or environmental differences and from subjects' emotional state. Proposed extensions are the incorporation of deep neural architectures and domain adaptation for improved subcortical prediction, deployment in infant and patient populations with fMRI contraindications, and augmentation with dynamic functional connectivity features.
7. Implications and Prospective Extensions
The BrainExplore framework demonstrates that standard machine learning regression models, when augmented with neural data synthesis, can non-invasively estimate cortical fMRI markers from fNIRS measurements with significant accuracy. These surrogate markers may facilitate study of populations where fMRI is impractical (infants, specific patients), broaden access to neurocognitive biomarker research, and lay groundwork for the transfer of validated markers across modalities. Suggested future directions include:
- Adopting nonlinear or deep learning architectures to capture residual or subcortical activations.
- Extending the pipeline to predictive modeling of functional connectivity.
- Validating in broader populations and integrating additional neuroimaging features for enhanced generalizability.
A plausible implication is that data-augmented ML protocols using fNIRS can substitute for key aspects of fMRI-based cognitive phenotyping in settings where access or feasibility constraints are paramount (Hur et al., 2022).