Consciousness Prior Framework
- The Consciousness Prior is a representation learning framework that isolates sparse, high-level abstract factors using an attention bottleneck.
- It employs a structured sparse factor graph to enhance interpretability and supports reasoning, planning, and natural language mapping.
- The framework integrates recurrent encoders, attention-based controllers, and language decoders to simulate human-like System 2 cognition.
The Consciousness Prior is a representation learning framework designed to promote the disentanglement of high-level abstract factors through a structural prior inspired by cognitive neuroscience theories of consciousness. It integrates architectural and information-processing constraints, positing that only a sparse subset of high-level variables is selected by attention at any moment to form a low-dimensional "conscious state." This conscious state serves as a bottleneck analogous to the information humans manipulate linguistically in so-called "System 2" cognition. The Consciousness Prior offers a mechanism for organizing and uncovering abstract factors by modeling their joint distribution as a sparse factor graph, providing enhanced interpretability and tractability for reasoning, planning, and natural language mapping (Bengio, 2017).
1. Motivation and Theoretical Foundations
The Consciousness Prior addresses the limitation of current deep learning systems, which predominantly implement "System 1" cognition—fast, intuitive, and highly entangled representations. In contrast, humans routinely engage in "System 2" processes, where concise conscious states involving few abstract concepts are manipulated, reasoned about, and verbalized. The hypothesis underlying this prior is that to learn such disentangled, high-level concepts, the learning system must be biased toward forming low-dimensional conscious states through an attention bottleneck. This corresponds to assuming that the true joint distribution over all abstract factors is well approximated by a sparse factor graph, each factor linking only a handful of variables, resulting in strong, localized "dips" in the overall energy function. Conscious attention is analogized to tractable inference or exploration on this graph (Bengio, 2017).
2. Formal Architecture and Notation
The framework introduces several core modules, each associated with a specific role in the representation learning process:
- $x_t$: Raw observation at time $t$.
- $h_t = F(x_t, h_{t-1})$: The high-dimensional "unconscious" representation, typically generated by a recurrent state encoder $F$.
- $c_t = C(h_t, c_{t-1}, z_t, m_{t-1})$: The low-dimensional conscious state at time $t$, computed by a consciousness (attention) RNN $C$, with $z_t$ as injected noise for stochasticity and $m_{t-1}$ as memory.
- $m_t = M(m_{t-1}, c_t)$: Updated longer-term memory from conscious state $c_t$.
- $V(c_{t-k}, h_t)$: A verifier network that scores the match between a previous conscious state $c_{t-k}$ and the current unconscious representation $h_t$ (e.g., computing a scalar compatibility score).
- $U$: An RNN decoder mapping conscious states to natural-language utterances, $u_t = U(c_t)$.
These modules collectively instantiate a system in which perception, memory, verification, and language are all mediated through attention-driven conscious state selection (Bengio, 2017).
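The module roster above can be sketched as simple callables. The dimensions, update rules, and internals below are illustrative numpy stand-ins, not the paper's specification; only the interfaces ($F$, $C$, $M$, $V$) follow the notation above:

```python
import numpy as np

# Illustrative dimensions for the unconscious state, conscious state, and memory.
D_H, D_C, D_M = 16, 4, 8

def F(x, h_prev):
    """Representation RNN: unconscious state h_t from observation x_t."""
    return np.tanh(x + 0.5 * h_prev)

def C(h, c_prev, z, m_prev):
    """Consciousness RNN: select a low-dimensional conscious state c_t."""
    return np.tanh(h[:D_C] + z + 0.1 * c_prev + 0.1 * m_prev[:D_C])

def M(m_prev, c):
    """Longer-term memory update m_t = M(m_{t-1}, c_t)."""
    return 0.9 * m_prev + 0.1 * np.resize(c, D_M)

def V(c_past, h):
    """Verifier: scalar score matching a past conscious state to h_t."""
    return float(np.resize(c_past, D_H) @ h)

# One step through the interfaces.
rng = np.random.default_rng(0)
x, h, c, m = rng.standard_normal(D_H), np.zeros(D_H), np.zeros(D_C), np.zeros(D_M)
h = F(x, h)
c = C(h, c, 0.1 * rng.standard_normal(D_C), m)
m = M(m, c)
```

Note the deliberate asymmetry in width: $c_t$ is much smaller than $h_t$, which is the bottleneck the prior relies on.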
3. Construction of the Conscious State via Attention Mechanisms
The conscious state $c_t$ is derived from the high-dimensional state $h_t$ using an attention mechanism. The general form is
$$c_t = C(h_t, c_{t-1}, z_t, m_{t-1}).$$
In practical terms, if $h_t$ is interpreted as a set of vectors $\{h_{t,1}, \dots, h_{t,N}\}$, attention weights are computed:
$$\alpha_{t,i} = \frac{\exp(e_{t,i})}{\sum_j \exp(e_{t,j})}, \qquad e_{t,i} = e(h_{t,i}, c_{t-1}, z_t),$$
and the conscious state is assembled as
$$c_t = \sum_i \alpha_{t,i}\, h_{t,i}.$$
Injected noise $z_t$ is used to induce stochasticity, enabling either hard attention (sampling a subset of the $h_{t,i}$) or regularized exploration through perturbation. This approach ensures that the conscious state remains low-dimensional and focuses on subsets of the latent representation most salient for ongoing processing (Bengio, 2017).
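A minimal numpy sketch of this soft-attention readout; the dot-product scoring rule and the use of a single query vector (standing in for a function of $c_{t-1}$) are assumptions for illustration:

```python
import numpy as np

def soft_attention(h_slots, query, noise_scale=0.0, rng=None):
    """Assemble a conscious state c_t as an attention-weighted sum
    over the N slots of h_t.

    h_slots: (N, d) array of slot vectors h_{t,1..N}
    query:   (d,) vector, e.g. derived from c_{t-1}
    """
    scores = h_slots @ query                        # e_{t,i}
    if noise_scale:                                 # injected noise z_t
        rng = rng or np.random.default_rng(0)
        scores = scores + noise_scale * rng.standard_normal(scores.shape)
    scores = scores - scores.max()                  # numerical stability
    alpha = np.exp(scores) / np.exp(scores).sum()   # softmax weights alpha_{t,i}
    return alpha @ h_slots, alpha                   # c_t and the weights

# Three slots of dimension 2; the query favors the first coordinate.
h = np.array([[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]])
c, alpha = soft_attention(h, np.array([2.0, 0.0]))
```

Hard attention would instead sample one or a few slot indices from `alpha` and concatenate the chosen slots.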
4. Sparse Factor Graphs and Energy-Based Modeling
The Consciousness Prior asserts that the true joint distribution over all high-level variables $s = (s_1, \dots, s_n)$ is captured by a sparse factor graph:
$$P(s_1, \dots, s_n) = \frac{1}{Z} \prod_k f_k(s_{S_k}),$$
with each factor $f_k$ depending only on a small subset $s_{S_k}$ of the variables, and equivalently in energy form:
$$E(s) = \sum_k E_k(s_{S_k}), \qquad P(s) \propto e^{-E(s)}.$$
The sparseness of this factorization induces the structural prior: meaningful high-level predictions and manipulations can be realized by attending to small, tractable slices of the graph. Attention, therefore, operationalizes inference about the corresponding sub-graphs, focusing computational resources on relevant variables and their dependencies (Bengio, 2017).
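The sparse-factorization idea can be made concrete with a toy energy function; the variables, subsets, and quadratic potentials below are invented for illustration, not from the paper:

```python
def energy(s, factors):
    """E(s) = sum_k E_k(s_{S_k}); each local energy E_k reads only
    the small subset S_k of variables it is connected to."""
    return sum(E_k(*(s[i] for i in subset)) for subset, E_k in factors)

# Six high-level variables, but every factor touches only two of them.
factors = [
    ((0, 1), lambda a, b: (a - b) ** 2),          # "s0 tracks s1"
    ((1, 2), lambda b, c: (b * c - 1.0) ** 2),    # "s1 and s2 are reciprocal"
    ((3, 4), lambda d, e: (d + e) ** 2),          # "s3 cancels s4"
]

s = [1.0, 1.0, 1.0, 0.5, -0.5, 2.0]   # s5 appears in no factor at all
print(energy(s, factors))              # 0.0 -- every factor is satisfied
```

Attending to a factor means loading only its two variables; the rest of the state (like `s[5]` here) is irrelevant to that local inference, which is what makes the sub-graph tractable.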
5. Integration with Learning Objectives
The Consciousness Prior augments standard learning frameworks by integrating with multiple objectives:
- Reconstruction losses: Auto-encoding approaches reconstruct $x_t$ from $h_t$.
- Prediction and verification: Given $c_{t-k}$ (selecting a few abstract variables) and some target $h_t$, maximize $\log P(h_t \mid c_{t-k})$ using a decoder or conditional VAE/GAN; the verifier network $V$ is trained to approximate $V(c_{t-k}, h_t) \approx \log P(h_t \mid c_{t-k})$.
- Reinforcement learning rewards: If $c_t$ is used for action selection, back-propagate the RL reward through both $C$ and $F$.
- Diversity regularization: To prevent trivial or degenerate attention (e.g., focusing solely on easily predicted elements), maximize mutual information or entropy proxies such as
$$H(\alpha_t) = -\sum_i \alpha_{t,i} \log \alpha_{t,i},$$
compelling the attention mechanism to explore various slices of the factor graph and so cover a diverse set of abstract variables (Bengio, 2017).
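The entropy proxy is straightforward to compute from the softmax attention weights; this sketch assumes the weights $\alpha_t$ from Section 3 and is one of several possible diversity regularizers:

```python
import numpy as np

def attention_entropy(alpha, eps=1e-12):
    """H(alpha_t) = -sum_i alpha_i log alpha_i. Adding -H(alpha_t) to the
    training loss penalizes attention that collapses onto one easy slot."""
    alpha = np.clip(alpha, eps, 1.0)   # avoid log(0)
    return float(-(alpha * np.log(alpha)).sum())

uniform = np.full(4, 0.25)                      # maximally diverse attention
peaked = np.array([0.97, 0.01, 0.01, 0.01])     # nearly degenerate attention
# H(uniform) = log(4) ~= 1.386 nats; H(peaked) is close to 0.
```

A mutual-information variant would additionally compare per-step attention to its time average, but the entropy term alone already discourages the trivial solution.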
6. Mapping Conscious States to Language
The architecture enables a bidirectional mapping between the conscious state $c_t$ and a natural language utterance $u_t$ through an RNN decoder $U$:
$$u_t = U(c_t).$$
The model can be trained via teacher forcing using supervised language data aligned to internal conscious contents, or the process can be inverted to set $c_t$ by conditioning on a provided utterance $u_t$. This procedural symmetry encourages conscious states to align with language-like, discrete, interpretable abstractions (words or short phrases), promoting further disentanglement of representations within the conscious bottleneck (Bengio, 2017).
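As a toy illustration of this bidirectional mapping, one can decode a conscious state to the nearest word embedding and invert the map by lookup; the vocabulary and one-hot embeddings below are invented stand-ins for a trained decoder $U$, not the paper's model:

```python
import numpy as np

# Toy vocabulary and embeddings for the blocks-on-a-table domain.
vocab = ["falls", "block", "table", "rests"]
emb = np.eye(4)   # one-hot embeddings, one per word

def decode(c):
    """u_t = U(c_t): emit the word whose embedding best matches c_t."""
    return vocab[int(np.argmax(emb @ c))]

def encode(word):
    """Inverse direction: set c_t by conditioning on a provided utterance."""
    return emb[vocab.index(word)]

print(decode(encode("table")))   # prints "table"
```

In a real system `decode` would be a seq2seq decoder and `encode` a language encoder, trained jointly so that round-trips through language preserve the conscious content.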
7. Example Architectures and Training Regimes
A typical realization of the Consciousness Prior uses:
- Representation RNN $F$: LSTM, GRU, or Transformer encoder for modeling $h_t$.
- Consciousness RNN $C$: Small attention network, sampling a subset of $h_t$ per step.
- Memory $M$: RNN (e.g., LSTM) updating $m_{t-1}$ to $m_t$.
- Verifier $V$: Feed-forward or attention-based scorer evaluating $V(c_{t-k}, h_t)$.
- Language decoder $U$: Seq2seq decoder attending to $c_t$.
- General training loop:
- Observe $x_t$; update $h_t = F(x_t, h_{t-1})$.
- Compute or sample $c_t = C(h_t, c_{t-1}, z_t, m_{t-1})$.
- Optionally update memory as $m_t = M(m_{t-1}, c_t)$.
- Define a prediction target (e.g., elements of a future representation $h_{t+k}$); select the relevant variables via $c_t$.
- Compute the prediction loss and/or verifier loss from $V(c_{t-k}, h_t)$.
- If language data is available, minimize $-\log P(u_t \mid c_t)$.
- Back-propagate through $F$, $C$, $V$, $U$ (and via RL if applicable).
- Apply regularizers on attention weights $\alpha_t$ to encourage entropy or mutual information.
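The loop above can be sketched end to end with numpy stand-ins. No parameter learning is shown, and every update rule below is an illustrative assumption; the sketch only demonstrates how the quantities flow between steps:

```python
import numpy as np

rng = np.random.default_rng(0)
N, D = 4, 8                     # number of slots in h_t and slot dimension
h = np.zeros((N, D))            # unconscious state h_t as N slots
c = np.zeros(D)                 # conscious state c_t
m = np.zeros(D)                 # longer-term memory m_t
losses = []

for t in range(10):
    x = rng.standard_normal((N, D))         # observe x_t
    h = np.tanh(x + 0.5 * h)                # h_t = F(x_t, h_{t-1})
    z = 0.1 * rng.standard_normal(N)        # injected noise z_t
    scores = h @ c + z                      # attention scores over slots
    alpha = np.exp(scores - scores.max())
    alpha /= alpha.sum()                    # softmax weights alpha_t
    c_prev, c = c, alpha @ h                # c_t from the attention readout
    m = 0.9 * m + 0.1 * c                   # m_t = M(m_{t-1}, c_t)
    v = float(c_prev @ h.mean(axis=0))      # verifier-style score V(c_{t-1}, h_t)
    losses.append(-v)                       # prediction/verifier loss proxy
```

In a trainable version, `F`, the attention scorer, `M`, and `V` would be parameterized networks, and `losses` (plus the entropy regularizer on `alpha`) would be back-propagated through all of them.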
The paper does not supply explicit pseudocode or large-scale experiments, instead proposing validation on controlled environments (e.g., blocks falling from a table), where high-entropy raw input is complemented by low-entropy, abstract latent variables that the agent is incentivized to discover and manipulate (Bengio, 2017).