NSD-synthetic: OOD Visual Neuroscience Dataset

Updated 3 March 2026

NSD-synthetic is a visual neuroscience dataset that provides ultra-high field 7T fMRI responses to synthetic images for out-of-distribution evaluation.
It systematically manipulates visual features across multiple classes (e.g., noise, words, spirals) to probe the boundaries of neural processing.
The data are processed with the NSD preprocessing and single-trial GLM pipelines and are used to benchmark neural encoding models built on task-supervised and self-supervised networks.

NSD-synthetic is a visual neuroscience dataset comprising ultra-high field 7T functional MRI (fMRI) responses from human participants to a controlled battery of synthetic images. Designed explicitly for out-of-distribution (OOD) evaluation, the dataset augments the Natural Scenes Dataset (NSD-core) by introducing image classes and stimulus features not present in naturalistic visual exposure. NSD-synthetic enables the benchmarking of computational models of visual processing on neural data that systematically diverge from conventional datasets, facilitating robust model selection and advancing theoretical frameworks of human vision (Gifford et al., 8 Mar 2025).

1. Dataset Composition and Stimulus Design

NSD-synthetic utilizes the eight subjects (subj01–subj08) from NSD-core, each undergoing an additional high-field 7T session comprising eight functional runs (428 s per run, 268 TRs, TR = 1.6 s; alternating fixation and one-back tasks). Each subject completed 744 stimulus trials per session, with 80 (≈ 10.8%) "one-back" repeat trials for behavioral monitoring.

Synthetic Image Set

Number of images: 284 distinct synthetic stimuli.
Hierarchical taxonomy: 8 high-level classes subdivided into 71 subclasses, each represented by 4 images.

Image Class	Example Subclasses	Manipulated Features
Noise	White noise, Pink noise	Spatial frequency spectrum
Natural scenes		Scene structure, natural content
Manipulated scenes	Upside-down, Line-drawing	Inversion, low-level abstraction
Contrast modulation		100%, 50%, 10%, 6%, 4%
Phase-coherence		75%, 50%, 25%, 0% phase randomness
Single words	Positions, lengths	Eccentricity, word length
Spiral gratings	SF levels, phase shifts	Micropattern orientation, log-polar
Chromatic noise	16 hues, pink noise	Hue angle (subject-calibrated), achromatic

Feature dimensions are manipulated across low-level (contrast, hue, spatial frequency, phase coherence), mid-level (oriented spirals, log-polar patterns), and high-level (scene structure, orthographic word content) factors. The design systematically probes representation boundaries of visual cortical processing under deviations from naturalistic image distributions.

2. fMRI Acquisition and Preprocessing Pipeline

Scanning was performed on the Siemens Magnetom 7 T platform (Center for Magnetic Resonance Research, University of Minnesota) equipped with a 32-channel receive, single-channel transmit RF coil (Nova Medical). EPI data acquisition employed 1.8 mm isotropic voxels, 84 slices, TR = 1600 ms, TE = 22 ms, multi-band factor = 3, and standard NSD-core protocol flip angles. Dual-echo field maps (2.2×2.2×3.6 mm, TE₁ = 8.16 ms, TE₂ = 9.18 ms) enabled geometric distortion correction.

Preprocessing closely followed NSD-core:

FreeSurfer anatomical reconstruction → fsaverage surface resampling.
EPI corrections: slice timing, within/across-session motion correction, fieldmap unwarping, gradient nonlinearity correction.
Two resampled outputs for subsequent GLM analysis: 1.8 mm voxels at 1.333 s temporal resolution, and 1 mm voxels at 1 s temporal resolution.
Single-trial GLM ("GLMsingle"): a per-vertex HRF chosen from a library of candidate HRFs, GLMdenoise (noise regressors derived from the data themselves [Charest et al. 2018]), and ridge regression regularization [Rokem & Kay 2020].
Regions-of-interest (ROIs): V1, V2, V3, hV4 (retinotopy), PPA, VWFA (category localizers), with additional ROIs (EBA, FFA) available on request.
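As a rough illustration of what a single-trial GLM estimates, the sketch below builds one regressor per trial and solves for the trial-wise betas with ridge regression. It is not GLMsingle: it assumes one fixed double-gamma HRF for every vertex, omits the GLMdenoise noise regressors, and uses a fixed ridge penalty. The function names, HRF parameters, and toy trial timing are illustrative assumptions.

```python
import numpy as np
from math import gamma


def double_gamma_hrf(t, peak_shape=6.0, under_shape=16.0, under_ratio=1.0 / 6.0):
    """Illustrative double-gamma HRF evaluated at times t >= 0 (seconds)."""
    t = np.asarray(t, dtype=float)
    peak = t ** (peak_shape - 1) * np.exp(-t) / gamma(peak_shape)
    under = t ** (under_shape - 1) * np.exp(-t) / gamma(under_shape)
    return peak - under_ratio * under


def single_trial_design(onsets_s, n_volumes, tr_s):
    """One column per trial: an HRF-shaped response starting at that trial's onset."""
    times = np.arange(n_volumes) * tr_s
    lag = times[:, None] - np.asarray(onsets_s, dtype=float)[None, :]
    return np.where(lag >= 0, double_gamma_hrf(np.clip(lag, 0, None)), 0.0)


def ridge_betas(design, bold, alpha):
    """Closed-form ridge estimate of single-trial betas for every vertex."""
    gram = design.T @ design + alpha * np.eye(design.shape[1])
    return np.linalg.solve(gram, design.T @ bold)


# Toy run (made-up timing): 20 trials, one every 4 s, sampled at 1 s, 3 vertices.
rng = np.random.default_rng(0)
onsets = np.arange(20) * 4.0
X = single_trial_design(onsets, n_volumes=120, tr_s=1.0)
true_betas = rng.normal(size=(20, 3))
bold = X @ true_betas  # noise-free, so the betas should be recovered closely
est_betas = ridge_betas(X, bold, alpha=1e-8)
```

With noise-free toy data and a near-zero penalty the estimated betas match the simulated ones. Real data would need noise regressors and a cross-validated penalty, which this sketch deliberately leaves out.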

3. Dataset Structure, BIDS Compatibility, and Data Access

The dataset adheres to BIDS-compatible directory conventions:

/sub-<ID>/anat/: T1w anatomical images, transforms.
/sub-<ID>/func/: Runwise bold images (Nifti), events.tsv trial logs, fieldmaps.
/derivatives/glm/: Single-trial beta weights (% signal change, per run/subject).
/derivatives/roi/: Per-subject ROI masks (fsaverage).

Metadata for each run, including stimulus onset, duration, image ID/subclass, one-back flags, and behavioral/eye-tracking QC, is provided as events.tsv. Image stimuli (.png/.mat), subclass labels, and generation parameters are accessible in a dedicated folder. Demographic information (age, gender), behavioral results (% correct, d′), and eye-tracking quality control are included.
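A minimal sketch of reading such a trial log with the Python standard library follows. The column names (`onset`, `duration`, `image_id`, `subclass`, `is_oneback`) and the sample rows are hypothetical; the actual headers should be taken from the data manual.

```python
import csv
import io

# Hypothetical excerpt of an events.tsv file; column names and values are
# assumptions made for illustration, not the dataset's actual schema.
SAMPLE_EVENTS_TSV = (
    "onset\tduration\timage_id\tsubclass\tis_oneback\n"
    "0.0\t2.0\t17\twhite_noise\t0\n"
    "4.0\t2.0\t17\twhite_noise\t1\n"
    "8.0\t2.0\t203\tspiral_grating\t0\n"
)


def load_events(handle):
    """Parse a tab-separated events file into typed trial records."""
    reader = csv.DictReader(handle, delimiter="\t")
    return [
        {
            "onset": float(row["onset"]),
            "duration": float(row["duration"]),
            "image_id": int(row["image_id"]),
            "subclass": row["subclass"],
            "is_oneback": row["is_oneback"] == "1",
        }
        for row in reader
    ]


def oneback_fraction(events):
    """Fraction of trials flagged as one-back repeats."""
    return sum(event["is_oneback"] for event in events) / len(events)


events = load_events(io.StringIO(SAMPLE_EVENTS_TSV))
toy_fraction = oneback_fraction(events)

# Session-level figure quoted in Section 1: 80 one-back trials out of 744.
reported_percent = round(100 * 80 / 744, 1)
```

For a real file, `load_events(open(path, newline=""))` would replace the in-memory sample, with `path` pointing at a run's events.tsv.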

Public repositories:

Dataset: http://naturalscenesdataset.org
Data manual: https://cvnlab.slite.page/p/CT9Fwl4_hc/NSD-Data-Manual
Data generation/processing code: https://github.com/cvnlab/nsddatapaper/, https://github.com/cvnlab/nsdcode/
NSD-synthetic analysis scripts: https://github.com/gifale95/NSD-synthetic

4. Out-of-Distribution (OOD) Characterization

The experimental intent of NSD-synthetic is to enable rigorous OOD evaluation. The training ("in-distribution", ID) set comprises >70,000 real photographs from NSD-core. The OOD set consists of the 284 synthetic stimuli that systematically manipulate non-natural visual dimensions.

Empirical Verification

Multidimensional scaling (MDS) of concatenated single-trial fMRI responses demonstrates synthetic trials form a statistically distinct response cluster relative to NSD-core, controlling for session effects. Within the synthetic cluster, further subdivision by image class (e.g., noise, spirals, words) is observed.

A quantitative measure of distributional separation, the ratio

$\delta = \frac{\text{mean}\,\Vert y_i - \mu_{\text{core}}\Vert}{\text{mean}\,\Vert y_j - \mu_{\text{synth}}\Vert}$

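with $\mu_{\text{core}}$ and $\mu_{\text{synth}}$ as the respective cluster centroids in MDS space, yielded $\delta \approx 1.8$. For $\delta$ to be read as a between-to-within ratio, $y_i$ and $y_j$ must both index synthetic-trial responses: the numerator is then their mean distance to the NSD-core centroid and the denominator their mean distance to their own centroid. On that reading, synthetic trials lie about 80% farther from the NSD-core centroid than from the centroid of their own cluster.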
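The ratio can be sketched in a few lines of NumPy. The sketch assumes that both the numerator and the denominator average over synthetic-trial points (distance to the NSD-core centroid versus distance to their own centroid). The 2-D point clouds are simulated stand-ins for MDS coordinates, not dataset values.

```python
import numpy as np


def separation_ratio(synth_points, core_points):
    """Mean distance of synthetic points to the core centroid, divided by
    their mean distance to their own centroid (assumed reading of delta)."""
    mu_core = core_points.mean(axis=0)
    mu_synth = synth_points.mean(axis=0)
    between = np.linalg.norm(synth_points - mu_core, axis=1).mean()
    within = np.linalg.norm(synth_points - mu_synth, axis=1).mean()
    return between / within


# Simulated stand-ins for 2-D MDS coordinates (not real data).
rng = np.random.default_rng(0)
core = rng.normal(loc=0.0, size=(500, 2))
synth = rng.normal(loc=[3.0, 0.0], size=(200, 2))

delta = separation_ratio(synth, core)
same_cloud = separation_ratio(core, core)  # identical clouds give a ratio of 1
```

A ratio of 1 means the two centroids coincide; values above 1 mean the synthetic cluster sits away from the NSD-core centroid relative to its own spread.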

5. Model Benchmarking and Neural Encoding Evaluation

Encoding Frameworks

Linear ridge regression was used to map visual feature embeddings to vertexwise fMRI response. Four pretrained networks were employed for feature extraction:

AlexNet: Task-supervised CNN
ResNet-50: Task-supervised CNN
MoCo: Self-supervised CNN (ResNet-50 backbone)
vit_b_32: Task-supervised Vision Transformer

Pipeline: Center crop and resize 224×224 → $\sqrt{\text{RGB}}$ transform (NSD-synthetic) → standard ImageNet normalization → feature extraction at each network sublayer → PCA (250 components) → ridge regression.
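The pixel-level and dimensionality-reduction stages of this pipeline can be sketched with NumPy alone. The sketch assumes the square-root transform is applied to intensities scaled to [0, 1], that PCA is fit on training features and then applied to test features, and it uses flattened pixels as a stand-in for network activations. The ImageNet mean and standard deviation are the widely used channel statistics; all function names are illustrative.

```python
import numpy as np

# Commonly used ImageNet channel statistics (RGB order, intensities in [0, 1]).
IMAGENET_MEAN = np.array([0.485, 0.456, 0.406])
IMAGENET_STD = np.array([0.229, 0.224, 0.225])


def preprocess(images_uint8, apply_sqrt):
    """images_uint8: (n, H, W, 3), assumed already center-cropped and resized."""
    x = images_uint8.astype(np.float64) / 255.0
    if apply_sqrt:  # assumed form of the sqrt(RGB) step for NSD-synthetic
        x = np.sqrt(x)
    return (x - IMAGENET_MEAN) / IMAGENET_STD


def fit_pca(features, n_components):
    """Return the feature mean and the top principal axes (rows)."""
    mean = features.mean(axis=0)
    _, _, vt = np.linalg.svd(features - mean, full_matrices=False)
    return mean, vt[:n_components]


def apply_pca(features, mean, components):
    """Project features onto previously fitted principal axes."""
    return (features - mean) @ components.T


# Toy 8x8 images; flattened pixels stand in for DNN layer activations.
rng = np.random.default_rng(0)
train_images = rng.integers(0, 256, size=(20, 8, 8, 3), dtype=np.uint8)
test_images = rng.integers(0, 256, size=(5, 8, 8, 3), dtype=np.uint8)

train_feats = preprocess(train_images, apply_sqrt=True).reshape(20, -1)
test_feats = preprocess(test_images, apply_sqrt=True).reshape(5, -1)

mean, components = fit_pca(train_feats, n_components=5)  # 250 in the source
train_reduced = apply_pca(train_feats, mean, components)
test_reduced = apply_pca(test_feats, mean, components)
```

The reduced features would then be the inputs to the ridge regression. The toy example keeps 5 components only because it has 20 images; the source pipeline keeps 250.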

Evaluation Metrics

Pearson correlation across test images:

$r = \frac{\mathrm{cov}(y, \hat{y})}{\sigma_y\,\sigma_{\hat{y}}}$

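Explained variance (negative correlations are set to 0 before squaring):

$R^2 = \max(r, 0)^2$

This is the squared correlation, not the coefficient of determination $1 - \frac{\sum_i (y_i - \hat{y}_i)^2}{\sum_i (y_i - \bar{y})^2}$. The two coincide for an in-sample least-squares fit with an intercept, but not in general for predictions on held-out images.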

Normalized explained variance:

$R^2_{\text{norm}} = \frac{R^2}{R^2_{\text{ceiling}}}, \quad R^2_{\text{norm}} \in [0, 1]$

Group summaries: Mean ± SEM, restricted to vertices with ncsnr > 0.3.
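These metrics are simple to compute per vertex. The sketch below assumes responses are arranged as test images by vertices, that the noise-ceiling values are supplied as an array, and that the group SEM is taken across subjects. The function names and toy numbers are illustrative.

```python
import numpy as np


def pearson_r(y, y_hat):
    """Column-wise Pearson correlation across test images (rows)."""
    yc = y - y.mean(axis=0)
    pc = y_hat - y_hat.mean(axis=0)
    return (yc * pc).sum(axis=0) / np.sqrt((yc ** 2).sum(axis=0) * (pc ** 2).sum(axis=0))


def explained_variance(r):
    """Squared correlation, with negative correlations set to zero first."""
    return np.clip(r, 0.0, None) ** 2


def normalized_explained_variance(r, ceiling_r2):
    """Explained variance as a fraction of the noise ceiling."""
    return explained_variance(r) / ceiling_r2


def mean_over_reliable_vertices(values, ncsnr, threshold=0.3):
    """Average a per-vertex metric over vertices whose ncsnr exceeds the threshold."""
    return values[ncsnr > threshold].mean()


def mean_and_sem(per_subject_values):
    """Group mean and standard error across subjects (assumed SEM convention)."""
    v = np.asarray(per_subject_values, dtype=float)
    return v.mean(), v.std(ddof=1) / np.sqrt(v.size)


# Toy data: 4 test images, 2 vertices (second vertex is anti-correlated).
y = np.array([[1.0, 1.0], [2.0, 2.0], [3.0, 3.0], [4.0, 4.0]])
y_hat = np.array([[1.0, -1.0], [3.0, -2.0], [2.0, -3.0], [4.0, -4.0]])

r = pearson_r(y, y_hat)
r2 = explained_variance(r)
r2_norm = normalized_explained_variance(r, ceiling_r2=np.array([0.8, 0.5]))
subject_score = mean_over_reliable_vertices(r2_norm, ncsnr=np.array([0.5, 0.1]))
```

The anti-correlated vertex contributes zero explained variance rather than a positive squared value, which is the purpose of the clipping step.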

In-distribution vs. Out-of-distribution Performance

On early visual cortex (V1–V3) vertices:

ID test set (NSD-core, n=284): Mean $R^2_{\text{norm}} \approx 0.48 \pm 0.02$
OOD test set (NSD-synthetic, n=284): Mean $R^2_{\text{norm}} \approx 0.26 \pm 0.02$
ID–OOD difference: $\Delta \approx 0.22 \pm 0.01$ (paired across vertices)
Statistical test (AlexNet, V1 vertices): $t(7) = 15.3$ , $p < 0.0001$ , Cohen’s $d \approx 5.4$
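The reported statistics can be related to each other with a short calculation. The sketch assumes the test is paired across the eight subjects (which is what 7 degrees of freedom implies) and that Cohen's d is the paired-samples form $d_z = \bar{\Delta} / s_\Delta$, so that $d_z = t / \sqrt{n}$. The per-subject scores are made-up toy values.

```python
import math
import statistics


def paired_t_and_dz(scores_a, scores_b):
    """Paired t statistic, paired-samples Cohen's d (d_z), and degrees of freedom."""
    diffs = [a - b for a, b in zip(scores_a, scores_b)]
    n = len(diffs)
    mean_diff = statistics.mean(diffs)
    sd_diff = statistics.stdev(diffs)  # sample standard deviation (n - 1)
    t_stat = mean_diff / (sd_diff / math.sqrt(n))
    d_z = mean_diff / sd_diff
    return t_stat, d_z, n - 1


# Made-up per-subject normalized explained variance (8 subjects), not real data.
id_scores = [0.50, 0.47, 0.49, 0.46, 0.48, 0.51, 0.45, 0.48]
ood_scores = [0.27, 0.26, 0.25, 0.24, 0.28, 0.27, 0.25, 0.26]

t_stat, d_z, dof = paired_t_and_dz(id_scores, ood_scores)

# Consistency check on the reported values under the d_z = t / sqrt(n) assumption.
d_from_reported_t = round(15.3 / math.sqrt(8), 1)
```

Under these assumptions, $t(7) = 15.3$ corresponds to $d_z \approx 5.4$, which matches the reported effect size.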

Model Comparison Reveals OOD-Sensitive Differences

NSD-synthetic enables OOD model comparison not apparent in ID regimes.

vit_b_32 vs. AlexNet:
- ID: vit_b_32 marginally outperforms in ventral areas (peak $\Delta R^2_{\text{norm}} \approx 0.10$ ), underperforms in early areas (peak $-0.10$ ).
- OOD: vit_b_32 outperforms AlexNet across all visual areas (peak $\Delta R^2_{\text{norm}} \approx 0.25$ ).
- OOD difference: $t(7) = 12.4$ , $p < 0.0001$ .
ResNet-50 vs. MoCo:
- ID: ResNet-50 favored in higher-level ROIs (peak $\Delta R^2_{\text{norm}} \approx 0.05$ ), MoCo favored in early ROIs (peak $\Delta \approx 0.05$ ).
- OOD: MoCo outperforms throughout the hierarchy (peak $\Delta R^2_{\text{norm}} \approx 0.10$ ).
- OOD difference (V1, MoCo – ResNet-50): $t(7) = 8.7$ , $p<0.0005$ , Cohen’s $d \approx 3.1$ .

OOD Generalization Procedure

Train encoding models on 9,000 subject-unique NSD-core images.
Evaluate using identical regression weights on: (a) held-out NSD-core (ID), (b) NSD-synthetic (OOD).
Use $R^2_{\text{norm}}$ as the primary metric.
Map contrasts onto cortical surfaces with FDR correction ( $q < 0.05$ ).
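The train-once, evaluate-twice logic of this procedure can be sketched on simulated data. Everything here is a toy assumption: a saturating (tanh) "brain" stands in for real responses, random vectors stand in for PCA-reduced network features, and the OOD set is simply drawn with a larger spread than the training set. The FDR-corrected surface mapping is not covered.

```python
import numpy as np


def fit_ridge(features, responses, alpha):
    """Closed-form ridge regression with an unpenalized intercept."""
    mean_x = features.mean(axis=0)
    mean_y = responses.mean(axis=0)
    xc = features - mean_x
    gram = xc.T @ xc + alpha * np.eye(xc.shape[1])
    weights = np.linalg.solve(gram, xc.T @ (responses - mean_y))
    return weights, mean_y - mean_x @ weights


def predict(features, weights, intercept):
    return features @ weights + intercept


def mean_clipped_r2(y, y_hat):
    """Squared Pearson r per vertex (negatives set to 0), averaged over vertices."""
    yc = y - y.mean(axis=0)
    pc = y_hat - y_hat.mean(axis=0)
    r = (yc * pc).sum(axis=0) / np.sqrt((yc ** 2).sum(axis=0) * (pc ** 2).sum(axis=0))
    return float(np.mean(np.clip(r, 0.0, None) ** 2))


rng = np.random.default_rng(0)
n_features, n_vertices = 10, 50
mixing = rng.normal(size=(n_features, n_vertices)) / np.sqrt(n_features)


def toy_brain(features):
    """Simulated saturating responses plus noise (an assumption, not real data)."""
    noise = 0.05 * rng.normal(size=(len(features), n_vertices))
    return np.tanh(features @ mixing) + noise


train_x = 0.3 * rng.normal(size=(9000, n_features))  # in-distribution training set
id_x = 0.3 * rng.normal(size=(284, n_features))      # held-out, same distribution
ood_x = 3.0 * rng.normal(size=(284, n_features))     # shifted distribution

# Fit once, then reuse the identical weights for both test sets.
weights, intercept = fit_ridge(train_x, toy_brain(train_x), alpha=1.0)
id_score = mean_clipped_r2(toy_brain(id_x), predict(id_x, weights, intercept))
ood_score = mean_clipped_r2(toy_brain(ood_x), predict(ood_x, weights, intercept))
```

In this toy setting the linear model fits the near-linear training regime well and degrades once inputs push the simulated responses into saturation, so `id_score` exceeds `ood_score`. The key design point is that no parameters are refit between the two evaluations.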

A plausible implication is that self-supervised networks can model biological vision better than task-supervised networks in OOD regimes (here, MoCo vs. ResNet-50 on a shared backbone), a distinction that traditional ID benchmarks do not reveal.

6. Context and Research Applications

NSD-synthetic enables systematic OOD generalization benchmarking for NeuroAI, complementing existing large-scale datasets restricted to natural images. It provides a standardized, rigorously annotated fMRI resource for investigating model-to-brain generalization under controlled stimulus perturbations. Public access to data, preprocessing pipelines, ROI definitions, and code ensures replicability and extensibility for evaluating new models and hypotheses regarding human visual processing (Gifford et al., 8 Mar 2025).

By foregrounding OOD differentiation—absent in ID-only paradigms—NSD-synthetic advances both the empirical assessment of neural network models and the formulation of computational theories with improved explanatory power for human vision.

References (1)

Gifford et al. (2025). A 7T fMRI dataset of synthetic images for out-of-distribution modeling of vision.
