Towards a "universal translator" for neural dynamics at single-cell, single-spike resolution (2407.14668v2)

Published 19 Jul 2024 in q-bio.NC, cs.LG, and cs.NE

Abstract: Neuroscience research has made immense progress over the last decade, but our understanding of the brain remains fragmented and piecemeal: the dream of probing an arbitrary brain region and automatically reading out the information encoded in its neural activity remains out of reach. In this work, we build towards a first foundation model for neural spiking data that can solve a diverse set of tasks across multiple brain areas. We introduce a novel self-supervised modeling approach for population activity in which the model alternates between masking out and reconstructing neural activity across different time steps, neurons, and brain regions. To evaluate our approach, we design unsupervised and supervised prediction tasks using the International Brain Laboratory repeated site dataset, which is comprised of Neuropixels recordings targeting the same brain locations across 48 animals and experimental sessions. The prediction tasks include single-neuron and region-level activity prediction, forward prediction, and behavior decoding. We demonstrate that our multi-task-masking (MtM) approach significantly improves the performance of current state-of-the-art population models and enables multi-task learning. We also show that by training on multiple animals, we can improve the generalization ability of the model to unseen animals, paving the way for a foundation model of the brain at single-cell, single-spike resolution.

Summary

  • The paper introduces a self-supervised multi-session model with a multi-task masking (MtM) strategy to improve neural activity prediction and decoding across diverse brain regions.
  • It demonstrates significant performance gains over traditional baselines by leveraging multi-animal datasets and comprehensive masking techniques.
  • The approach paves the way for scalable, real-time brain-machine interfaces and potential clinical diagnostic tools.

Towards a “Universal Translator” for Neural Dynamics at Single-Cell, Single-Spike Resolution

The paper "Towards a 'universal translator' for neural dynamics at single-cell, single-spike resolution" presents a novel self-supervised modeling approach for neural population activity, with the ultimate goal of developing a foundation model capable of solving diverse tasks across multiple brain regions.

Summary of Contributions

The primary contributions center on a self-supervised learning framework that uses a multi-task masking (MtM) approach to achieve multi-task learning. The framework alternates between masking and reconstructing neural activity across different time steps, neurons, and brain regions, allowing the model to capture dependencies both temporally and spatially.

  1. Novel Masking Approach: The paper introduces a multi-task masking strategy applicable to multi-region datasets, aimed at learning richer representations that improve downstream decoding performance.
  2. Self-Supervised Multi-Session Model: This is the first demonstration of a self-supervised model of multi-region neural recordings trained across different sessions and animals, making it more robust and better able to generalize to unseen data.
  3. Scalability: The authors demonstrate that increasing the amount of training data from multiple animals improves performance on held-out sessions, suggesting that larger, more diverse datasets yield better generalization.
  4. Prompting for Test-Time Adaptation and Ensembling: Prompting tokens let the model adapt to different tasks at test time and improve decoding and few-shot performance (a minimal sketch follows this list).
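To make the prompting idea concrete, here is a minimal PyTorch sketch of learnable task tokens prepended to a transformer encoder's input. The module name, dimensions, and task vocabulary are illustrative assumptions, not the paper's actual architecture.

```python
import torch
import torch.nn as nn

class PromptedEncoder(nn.Module):
    """Transformer encoder with a learnable prompt token prepended to the input.
    The prompt could indicate, e.g., which masking task the model should adapt
    to at test time (names and sizes here are assumptions)."""
    def __init__(self, d_model=128, n_tasks=5, n_layers=4, n_heads=4):
        super().__init__()
        self.task_prompts = nn.Embedding(n_tasks, d_model)   # one token per task
        layer = nn.TransformerEncoderLayer(d_model, n_heads, batch_first=True)
        self.encoder = nn.TransformerEncoder(layer, n_layers)

    def forward(self, x, task_id):
        # x: (batch, time, d_model) embedded neural activity
        prompt = self.task_prompts(task_id).unsqueeze(1)      # (batch, 1, d_model)
        h = self.encoder(torch.cat([prompt, x], dim=1))
        return h[:, 1:]                                       # drop the prompt position

# Test-time "ensembling" could average predictions obtained under different prompts.
model = PromptedEncoder()
out = model(torch.randn(2, 100, 128), task_id=torch.tensor([0, 0]))
```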

Methodological Advancements

The core innovation is a self-supervised learning framework built around masking and reconstruction tasks (a schematic sketch follows the list below):

  • Temporal Masking: This includes both random and causal temporal masking. Random masking predicts randomly selected time bins using surrounding context, whereas causal masking predicts future bins using past activity.
  • Neuron Masking: This involves several schemes:
    • Random Neuron Masking: Randomly selected neurons are masked, helping the model learn spatiotemporal dependencies across the population.
    • Intra-region Neuron Masking: Masks neurons within specific regions to capture intra-area dynamics.
    • Inter-region Neuron Masking: Selects entire brain regions for masking to understand inter-region interactions.
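As a concrete illustration, the sketch below builds the different masks over a single trial's (time × neuron) spike-count array in NumPy. The shapes, mask ratios, and region labels are illustrative assumptions, not the paper's exact implementation.

```python
import numpy as np

rng = np.random.default_rng(0)

def temporal_mask(n_time, n_neurons, ratio=0.3, causal=False):
    """Boolean mask over a (time, neuron) spike-count array.
    Random: hide randomly chosen time bins; causal: hide the final bins."""
    mask = np.zeros((n_time, n_neurons), dtype=bool)
    n_masked = int(ratio * n_time)
    if causal:
        mask[-n_masked:, :] = True                  # predict the future from the past
    else:
        bins = rng.choice(n_time, n_masked, replace=False)
        mask[bins, :] = True                        # predict hidden bins from context
    return mask

def neuron_mask(n_time, n_neurons, region_ids, mode="random", ratio=0.3):
    """mode: 'random' (any neurons), 'intra' (a subset within one region),
    'inter' (every neuron of one whole region)."""
    mask = np.zeros((n_time, n_neurons), dtype=bool)
    if mode == "random":
        cols = rng.choice(n_neurons, int(ratio * n_neurons), replace=False)
    else:
        region = rng.choice(np.unique(region_ids))
        in_region = np.flatnonzero(region_ids == region)
        cols = (rng.choice(in_region, max(1, int(ratio * in_region.size)), replace=False)
                if mode == "intra" else in_region)
    mask[:, cols] = True
    return mask

# Each training step draws one scheme and reconstructs only the masked entries.
T, N = 100, 60
regions = rng.integers(0, 4, size=N)                # hypothetical region label per neuron
masks = [
    temporal_mask(T, N),                            # random temporal
    temporal_mask(T, N, causal=True),               # causal temporal
    neuron_mask(T, N, regions, "random"),
    neuron_mask(T, N, regions, "intra"),
    neuron_mask(T, N, regions, "inter"),
]
mask = masks[rng.integers(len(masks))]              # one scheme per training step
```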

Evaluation and Results

The model’s evaluation is based on unsupervised and supervised prediction tasks using the International Brain Laboratory repeated site dataset, which comprises Neuropixels recordings from multiple animals and sessions. Specific tasks include:

  • Single-Neuron and Region-Level Activity Prediction
  • Forward Prediction: Anticipating future neural activity from past observations.
  • Behavior Decoding: Inferring behavioral variables from neural signals (a toy example follows this list).
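As a point of reference for the decoding task, a cross-validated linear decoder on binned spike counts might look like the sketch below. The data here are synthetic placeholders, not the IBL task variables or the paper's decoding pipeline.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score

rng = np.random.default_rng(0)

# Hypothetical data: binned spike counts per trial (trials x neurons x time bins)
# and a binary behavioral label per trial (e.g., left vs. right choice).
spikes = rng.poisson(1.0, size=(200, 60, 100))
choice = rng.integers(0, 2, size=200)

X = spikes.reshape(len(spikes), -1)            # flatten neurons x time per trial
decoder = LogisticRegression(max_iter=1000)
acc = cross_val_score(decoder, X, choice, cv=5).mean()
print(f"cross-validated decoding accuracy: {acc:.2f}")  # ~0.5 on random data
```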

Empirical results indicate that the multi-task masking approach significantly outperforms state-of-the-art population models. By training on data from multiple animals, the model generalizes better to unseen animals, a substantial step toward a foundation model for neural data.

Baseline Comparisons

The researchers compare their model against several baselines:

  • Zeroth-order and first-order linear predictions.
  • Smoothed zeroth-order predictions.
  • PCA denoising.

Each baseline underscores the added value of the self-supervised masking strategies in capturing the structure of neural activity; a minimal sketch of these baselines follows.
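For orientation, these simple baselines could be implemented roughly as follows over a (time × neuron) activity matrix. The smoothing width, component count, and exact prediction conventions are assumptions; the paper's precise definitions are not reproduced here.

```python
import numpy as np
from scipy.ndimage import gaussian_filter1d
from sklearn.decomposition import PCA

def zeroth_order(x):
    """Predict each time bin with the previous bin's activity."""
    pred = np.roll(x, 1, axis=0).astype(float)
    pred[0] = x[0]
    return pred

def first_order(x):
    """Linear extrapolation from the two preceding bins."""
    pred = np.empty_like(x, dtype=float)
    pred[:2] = x[:2]
    pred[2:] = 2 * x[1:-1] - x[:-2]
    return pred

def smoothed_zeroth_order(x, sigma=2.0):
    """Zeroth-order prediction on temporally smoothed activity."""
    return zeroth_order(gaussian_filter1d(x.astype(float), sigma, axis=0))

def pca_denoise(x, n_components=10):
    """Project activity onto its top principal components and back."""
    pca = PCA(n_components=n_components).fit(x)
    return pca.inverse_transform(pca.transform(x))
```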

Implications and Future Directions

The implications of this work are both theoretical and practical. Theoretically, the approach provides a robust framework for modeling neural dynamics across multiple brain regions and individual subjects. Practically, it paves the way for models that support brain-machine interfaces and, potentially, diagnostic tools for neural disorders.

Future developments in AI, spurred by this research, could involve:

  • More Comprehensive Models: Integrating additional brain regions or neural types.
  • Real-Time Applications: Extending the approach for real-time neural decoding, beneficial for neuroprosthetics.
  • Generalization to Human Data: Translating these methods to human neural recordings and exploring their implications in cognitive neuroscience and clinical settings.

In summary, this paper presents a comprehensive framework for building scalable, generalized models of neural activity. Through intricate masking and reconstruction strategies, it establishes a promising direction for future research in neural decoding and brain-machine interfaces.