Perceptogram: Reconstructing Visual Percepts from EEG (2404.01250v2)
Abstract: Visual neural decoding from EEG has improved significantly due to diffusion models that can reconstruct high-quality images from decoded latents. While recent works have focused on relatively complex architectures to achieve good reconstruction performance from EEG, less attention has been paid to the source of this information. In this work, we attempt to discover EEG features that represent perceptual and semantic visual categories, using a simple pipeline. Notably, the high temporal resolution of EEG allows us to go beyond the static semantic maps obtained from fMRI. We show that (a) training a simple linear decoder from EEG to CLIP latent space, followed by a frozen pre-trained diffusion model, is sufficient to decode images with state-of-the-art reconstruction performance; (b) mapping the decoded latents back to EEG using a linear encoder isolates CLIP-relevant EEG spatiotemporal features; and (c) using other latent spaces that represent lower-level image features yields similar time courses of texture- and hue-related information. We thus use our framework, Perceptogram, to probe EEG signals at various levels of the visual information hierarchy.
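The linear decode-then-encode idea in points (a) and (b) can be illustrated with a minimal sketch. Everything specific here is an assumption made for illustration and is not taken from the paper: the data are synthetic, the array sizes are arbitrary, the "latent" is a random stand-in for a CLIP embedding, ridge regularization with `alpha=1.0` is one possible choice of linear fit, and `fit_ridge` is a hypothetical helper. The diffusion stage is omitted.

```python
import numpy as np

def fit_ridge(X, Y, alpha=1.0):
    """Closed-form ridge regression: W = (X^T X + alpha * I)^{-1} X^T Y."""
    n_features = X.shape[1]
    gram = X.T @ X + alpha * np.eye(n_features)
    return np.linalg.solve(gram, X.T @ Y)

rng = np.random.default_rng(0)

# Hypothetical sizes, chosen only to keep the example small.
n_trials, n_channels, n_times, n_latent = 200, 8, 20, 16

# Synthetic "EEG" epochs (trials x channels x time), flattened per trial.
eeg = rng.standard_normal((n_trials, n_channels, n_times))
X = eeg.reshape(n_trials, -1)

# Synthetic stand-in for image latents (e.g. CLIP embeddings of the stimuli).
true_map = rng.standard_normal((n_channels * n_times, n_latent))
latents = X @ true_map + 0.1 * rng.standard_normal((n_trials, n_latent))

# (a) Linear decoder: flattened EEG -> latent space.
W_dec = fit_ridge(X, latents, alpha=1.0)
decoded = X @ W_dec

# (b) Linear encoder: decoded latents -> EEG. The reconstruction keeps only
# the part of the EEG that is linearly predictable from the latents.
W_enc = fit_ridge(decoded, X, alpha=1.0)
eeg_latent_related = (decoded @ W_enc).reshape(n_trials, n_channels, n_times)

print(W_dec.shape, decoded.shape, eeg_latent_related.shape)
```

In this sketch, `eeg_latent_related` has the same channel-by-time layout as the input, so it could be averaged over trials and inspected per channel and time point; swapping `latents` for a different feature space would probe a different level of image information in the same way.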