- The paper introduces POYO, a transformer-based model that decodes neural population activity from multi-session recordings without one-to-one neuron correspondence.
- It presents an innovative tokenization approach that represents each spike as a distinct token, preserving temporal dynamics and enhancing efficiency.
- The model demonstrates scalability and rapid fine-tuning capabilities, achieving an R2 score of 0.9512 on center-out tasks in multi-session experiments.
A Unified, Scalable Framework for Neural Population Decoding
The paper presents a novel framework for neural population decoding, introducing an architecture that leverages transformer models to analyze large-scale, multi-session neural recordings. It addresses a critical challenge in the field: integrating recordings from disparate neural datasets, whose neuron sets differ substantially from one session to the next because each recording samples a different population.
Core Contributions
The primary contributions of this work are summarized as follows:
- Large-Scale Multi-Session Model: The authors propose a framework (POYO) that enables the training of large transformer-based models on diverse neural datasets without the need for one-to-one neuron correspondence. This framework is validated on a substantial data corpus comprising 158 recording sessions from seven nonhuman primates.
- Innovative Tokenization Approach: A key element of POYO is its unique approach to tokenizing neural population activity. By representing each spike as a distinct token, the model preserves the intrinsic temporal structure of neural events while enhancing computational efficiency with a sparse representation.
- Scalability and Adaptability: The paper demonstrates that the pretrained model can be rapidly fine-tuned to novel sessions and different neural populations, showing robustness in few-shot learning settings where only minimal additional labels are required.
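The spike-as-token idea above can be illustrated with a minimal sketch. The function and variable names here are illustrative assumptions, not the authors' implementation: each spike becomes a (unit_id, timestamp) pair, so sessions with different neuron counts yield variable-length token sequences instead of fixed-width binned matrices.

```python
def tokenize_spikes(spike_times_per_unit):
    """Turn per-unit spike times into a flat, time-sorted token sequence.

    Each token is a (unit_id, timestamp) pair; unit_id would index a
    learned embedding table, so no binning or fixed neuron count is needed.
    """
    tokens = [(unit_id, t)
              for unit_id, times in enumerate(spike_times_per_unit)
              for t in times]
    tokens.sort(key=lambda tok: tok[1])  # preserve temporal order of events
    return tokens

# Three units with asynchronous spike times (seconds), e.g. one trial
session = [[0.01, 0.35], [0.20], [0.05, 0.22, 0.40]]
tokens = tokenize_spikes(session)
# tokens: [(0, 0.01), (2, 0.05), (1, 0.2), (2, 0.22), (0, 0.35), (2, 0.4)]
```

Because the representation is a sparse event list rather than a dense time-by-neuron matrix, its length scales with spike count, not with recording duration or population size.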
Methodological Insights
The proposed framework utilizes a PerceiverIO backbone with cross-attention mechanisms to build a latent representation of neural population dynamics. The cross-attention step compresses variable-length spike sequences into a fixed-size latent space, keeping downstream compute cost independent of the number of recorded units and enabling efficient, fine-grained analysis. The tokenization scheme, in which each spike is embedded with a unit identifier and timestamp, naturally accommodates the asynchronous nature of neural spiking data.
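The PerceiverIO-style compression step can be sketched as follows. The dimensions and names are illustrative assumptions, not the paper's code: a small, fixed set of learned latent queries cross-attends to the variable-length spike-token embeddings, so the latent summary has the same size no matter how many spikes a session produced.

```python
import numpy as np

def softmax(x, axis=-1):
    # Numerically stable softmax
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def cross_attend(latents, tokens, d_k):
    """One cross-attention read: latents (M, d) query spike tokens (T, d).

    The output has shape (M, d) regardless of the number of input
    tokens T, which is how cost stays fixed across sessions.
    """
    scores = latents @ tokens.T / np.sqrt(d_k)   # (M, T) attention logits
    return softmax(scores, axis=-1) @ tokens     # (M, d) latent summary

rng = np.random.default_rng(0)
d = 16
spike_tokens = rng.normal(size=(500, d))  # 500 spike-token embeddings
latents = rng.normal(size=(32, d))        # 32 learned latent queries
out = cross_attend(latents, spike_tokens, d_k=d)
assert out.shape == (32, d)  # fixed-size summary, independent of T
```

A real transformer block would add learned query/key/value projections, multiple heads, and residual connections; the sketch keeps only the shape logic that makes the architecture session-agnostic.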
Experimental Evaluation
An extensive empirical evaluation is performed across data from diverse experimental conditions, encompassing recordings from motor cortical regions during different behavioral tasks. The results underscore the performance gains of the multi-session model over single-session training, particularly on complex tasks such as random target reaching. An R2 score of 0.9512 on center-out tasks indicates high decoding accuracy, reflecting the efficacy of the proposed architecture. Furthermore, decoding performance improves as the number of units and datasets grows, and the pretrained model transfers effectively to previously unseen data.
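The R2 score used above is the standard coefficient of determination between true and decoded behavior (e.g., cursor or hand velocity). A minimal computation, with made-up numbers purely for illustration, looks like:

```python
import numpy as np

def r2_score(y_true, y_pred):
    """Coefficient of determination: 1 - residual SS / total SS."""
    ss_res = np.sum((y_true - y_pred) ** 2)
    ss_tot = np.sum((y_true - y_true.mean(axis=0)) ** 2)
    return 1.0 - ss_res / ss_tot

# Toy true vs. decoded 2-D velocities (values are illustrative only)
y_true = np.array([[0.0, 1.0], [1.0, 0.0], [2.0, -1.0], [3.0, 0.5]])
y_pred = y_true + 0.05  # a near-perfect decoder with a small bias
r2 = r2_score(y_true, y_pred)  # close to 1.0
```

An R2 of 1.0 means perfect reconstruction, while 0.0 means the decoder does no better than predicting the mean behavior, so values above 0.95 indicate that almost all behavioral variance is captured.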
Theoretical Implications and Future Directions
The authors discuss the potential of this approach for bridging gaps between datasets and enhancing our understanding of neural dynamics. Training models on expansive datasets spanning diverse subjects may yield a more unified view of brain function. The observed scaling behavior, in which decoding performance improves with more units and sessions, suggests pathways to more effective brain-machine interfaces.
Looking forward, the integration of self-supervised learning paradigms could broaden the scope of this research, facilitating the training of neural models on even larger and more varied datasets without explicit behavioral labels. Additionally, understanding the intrinsic organization of learned session and unit embeddings could inform future studies on neural coding and representation.
Overall, this work is a robust contribution to the field of neural decoding, a step toward scalable, unified models that capture the interplay of neural signals across different experimental conditions. The framework both sets the stage for future studies and provides a resource that the broader research community can leverage to advance neural data analysis.