Optimal design of NSD and similar datasets for generalizable prediction

Determine whether the Natural Scenes Dataset (NSD) and comparable large-scale fMRI datasets are optimally designed to capture the full diversity of human visual experience and to support prediction models that generalize beyond the training data, including zero-shot visual image reconstruction from brain activity.

Background

The paper critiques recent text-guided visual image reconstruction methods that rely on large-scale datasets such as NSD and THINGS-fMRI, noting that they perform strongly on NSD but generalize poorly to other datasets specifically designed to avoid training–test overlap. The authors examine semantic diversity via CLIP text features and show that NSD has limited cluster diversity and substantial overlap between its training and test sets, raising concerns about whether the dataset is suitable for genuine zero-shot prediction.

Within this context, the authors explicitly state uncertainty about whether NSD and similar datasets have been designed to adequately span the space of human visual experiences and enable truly generalizable brain decoding models. Addressing this uncertainty requires systematic evaluation of dataset diversity, train–test splits, and representational coverage in the latent feature spaces used for decoding.
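As a rough illustration of what such an evaluation might look like, the sketch below computes two simple diagnostics on a feature matrix: the cosine similarity of each test item to its nearest training item, and the fraction of test items whose cluster also appears in the training set. This is not the authors' procedure. The function names, the synthetic Gaussian features standing in for CLIP text features, and the cluster labels assigned by construction are all assumptions made for the example.

```python
import numpy as np


def l2_normalize(x):
    """Scale each row to unit length so dot products are cosine similarities."""
    norms = np.linalg.norm(x, axis=1, keepdims=True)
    return x / np.clip(norms, 1e-12, None)


def nearest_train_similarity(train_feats, test_feats):
    """For each test item, cosine similarity to its closest training item."""
    sims = l2_normalize(test_feats) @ l2_normalize(train_feats).T
    return sims.max(axis=1)


def cluster_coverage(train_labels, test_labels):
    """Fraction of test items whose cluster label also occurs in training."""
    return float(np.isin(test_labels, np.unique(train_labels)).mean())


# Synthetic stand-in for latent features (assumption: real use would load
# precomputed features and cluster labels for the actual stimuli).
rng = np.random.default_rng(0)
centers = rng.normal(size=(6, 64))  # six hypothetical semantic clusters


def sample(cluster_ids, n_per_cluster):
    """Draw noisy points around the chosen cluster centers."""
    labels = np.repeat(cluster_ids, n_per_cluster)
    feats = centers[labels] + 0.1 * rng.normal(size=(labels.size, 64))
    return feats, labels


train, train_labels = sample([0, 1, 2, 3], 50)
test_overlap, overlap_labels = sample([0, 1, 2, 3], 10)   # shares clusters with training
test_heldout, heldout_labels = sample([4, 5], 20)         # clusters unseen in training

for name, feats, labels in [
    ("overlapping split", test_overlap, overlap_labels),
    ("held-out split", test_heldout, heldout_labels),
]:
    sim = nearest_train_similarity(train, feats).mean()
    cov = cluster_coverage(train_labels, labels)
    print(f"{name}: mean nearest-train similarity={sim:.3f}, cluster coverage={cov:.2f}")
```

On a test split that shares clusters with the training set, both numbers should be high; on a split drawn from unseen clusters, both should drop. A dataset whose official test set looks like the first case gives little evidence about zero-shot generalization.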

References

It is unclear whether the recently proposed datasets, such as the NSD, are optimally designed to capture the full range of human visual experiences and to support the development of truly generalizable prediction models.

— Spurious reconstruction from brain activity (2405.10078 - Shirakawa et al., 16 May 2024) in Introduction
