
Scaling up ridge regression for brain encoding in a massive individual fMRI dataset (2403.19421v1)

Published 28 Mar 2024 in cs.LG, cs.AI, q-bio.NC, and q-bio.QM

Abstract: Brain encoding with neuroimaging data is an established analysis aimed at predicting human brain activity directly from complex stimulus features such as movie frames. Typically, these features are the latent space representation from an artificial neural network, and the stimuli are image, audio, or text inputs. Ridge regression is a popular prediction model for brain encoding due to its good out-of-sample generalization performance. However, training a ridge regression model can be highly time-consuming when dealing with large-scale deep functional magnetic resonance imaging (fMRI) datasets that include many space-time samples of brain activity. This paper evaluates different parallelization techniques to reduce the training time of brain encoding with ridge regression on the CNeuroMod Friends dataset, one of the largest deep fMRI resources currently available. With multi-threading, our results show that the Intel Math Kernel Library (MKL) significantly outperforms the OpenBLAS library, being 1.9 times faster using 32 threads on a single machine. We then evaluated the Dask multi-CPU implementation of ridge regression readily available in scikit-learn (MultiOutput), and we proposed a new "batch" version of Dask parallelization, motivated by a time complexity analysis. In line with our theoretical analysis, MultiOutput parallelization was found to be impractical, i.e., slower than multi-threading on a single machine. In contrast, the Batch-MultiOutput regression scaled well across compute nodes and threads, providing speed-ups of up to 33 times with 8 compute nodes and 32 threads compared to a single-threaded scikit-learn execution. Batch parallelization using Dask thus emerges as a scalable approach for brain encoding with ridge regression on high-performance computing systems using scikit-learn and large fMRI datasets.
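The idea of parallelizing over batches of brain targets can be illustrated with a minimal sketch. This is not the authors' implementation: it assumes a fixed regularization strength, uses the closed-form ridge solution in NumPy, and substitutes a standard-library thread pool for Dask. The function names `fit_ridge` and `fit_ridge_batched` and all array sizes are hypothetical, chosen only for illustration. The sketch shows that fitting ridge regression on separate batches of target columns gives the same weights as fitting all targets at once, which is what makes the batches safe to distribute.

```python
from concurrent.futures import ThreadPoolExecutor

import numpy as np


def fit_ridge(X, Y, alpha):
    """Closed-form ridge weights for every target column of Y at once."""
    n_features = X.shape[1]
    gram = X.T @ X + alpha * np.eye(n_features)
    return np.linalg.solve(gram, X.T @ Y)


def fit_ridge_batched(X, Y, alpha, n_batches, max_workers=4):
    """Fit one ridge model per batch of target columns, then stitch the weights.

    Assumption: a thread pool stands in for the Dask workers named in the abstract.
    """
    column_batches = np.array_split(np.arange(Y.shape[1]), n_batches)

    def fit_batch(cols):
        # Each batch only needs X and its own slice of the targets.
        return fit_ridge(X, Y[:, cols], alpha)

    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        weights = list(pool.map(fit_batch, column_batches))
    # pool.map preserves batch order, so columns line up with Y.
    return np.hstack(weights)


# Synthetic stand-ins: 200 time samples, 16 stimulus features, 50 brain targets.
rng = np.random.default_rng(0)
X = rng.standard_normal((200, 16))
Y = rng.standard_normal((200, 50))

W_full = fit_ridge(X, Y, alpha=1.0)
W_batched = fit_ridge_batched(X, Y, alpha=1.0, n_batches=8)
```

Because the targets are independent given the features, the batch size becomes a tuning knob: a few large batches keep each worker busy with dense matrix operations, while one task per target would multiply scheduling and data-transfer overhead.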
