Cross-Modal Alignment between Visual Stimuli and Neural Responses in the Visual Cortex

Published 6 Nov 2025 in cs.CE | (2511.04096v1)

Abstract: Investigating the mapping between visual stimuli and neural responses in the visual cortex contributes to a deeper understanding of biological visual processing mechanisms. Most existing studies characterize this mapping by training models to directly encode visual stimuli into neural responses or decode neural responses into visual stimuli. However, due to neural response variability and limited neural recording techniques, these studies suffer from overfitting and lack generalizability. Motivated by this challenge, in this paper we shift the tasks from conventional direct encoding and decoding to discriminative encoding and decoding, which are more reasonable. And on top of this we propose a cross-modal alignment approach, named Visual-Neural Alignment (VNA). To thoroughly test the performance of the three methods (direct encoding, direct decoding, and our proposed VNA) on discriminative encoding and decoding tasks, we conduct extensive experiments on three invasive visual cortex datasets, involving two types of subject mammals (mice and macaques). The results demonstrate that our VNA generally outperforms direct encoding and direct decoding, indicating our VNA can most precisely characterize the above visual-neural mapping among the three methods.