Consciousness Prior Framework

  • The Consciousness Prior is a representation learning framework that isolates sparse, high-level abstract factors using an attention bottleneck.
  • It employs a structured sparse factor graph to enhance interpretability and supports reasoning, planning, and natural language mapping.
  • The framework integrates recurrent encoders, attention-based controllers, and language decoders to simulate human-like System 2 cognition.

The Consciousness Prior is a representation learning framework designed to promote the disentanglement of high-level abstract factors through a structural prior inspired by cognitive neuroscience theories of consciousness. It integrates architectural and information-processing constraints, positing that only a sparse subset of high-level variables is selected by attention at any moment to form a low-dimensional "conscious state." This conscious state serves as a bottleneck analogous to the information humans manipulate linguistically in so-called "System 2" cognition. The Consciousness Prior offers a mechanism for organizing and uncovering abstract factors by modeling their joint distribution as a sparse factor graph, providing enhanced interpretability and tractability for reasoning, planning, and natural language mapping (Bengio, 2017).

1. Motivation and Theoretical Foundations

The Consciousness Prior addresses the limitation of current deep learning systems, which predominantly implement "System 1" cognition—fast, intuitive, and highly entangled representations. In contrast, humans routinely engage in "System 2" processes, where concise conscious states involving few abstract concepts are manipulated, reasoned about, and verbalized. The hypothesis underlying this prior is that to learn such disentangled, high-level concepts, the learning system must be biased toward forming low-dimensional conscious states through an attention bottleneck. This corresponds to assuming that the true joint distribution over all abstract factors is well approximated by a sparse factor graph, each factor linking only a handful of variables, resulting in strong, localized "dips" in the overall energy function. Conscious attention is analogized to tractable inference or exploration on this graph (Bengio, 2017).

2. Formal Architecture and Notation

The framework introduces several core modules, each associated with a specific role in the representation learning process:

  • $x_t \in X$: Raw observation at time $t$.
  • $h_t = F(x_t, h_{t-1})$: The high-dimensional "unconscious" representation, typically generated by a recurrent state encoder.
  • $c_t = C(h_t, c_{t-1}, m_{t-1}, z_t)$: The low-dimensional conscious state at time $t$, computed by a consciousness (attention) RNN $C$, with $z_t$ as injected noise for stochasticity and $m_{t-1}$ as memory.
  • $m_t = M(m_{t-1}, c_t)$: Updated longer-term memory from conscious state $c_t$.
  • $V(h_t, c_{t-k})$: A verifier network that scores the match between a previous conscious state and the current unconscious representation (e.g., computing $\log P(\text{statement} \mid \text{current world})$).
  • $U(c_t, u_{t-1})$: An RNN decoder mapping conscious states to natural-language utterances.

These modules collectively instantiate a system in which perception, memory, verification, and language are all mediated through attention-driven conscious state selection (Bengio, 2017).
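
As a concrete illustration, the module interfaces can be sketched as below. This is a minimal sketch assuming a PyTorch-style implementation; the dimensions, the choice of GRU cells, and the verifier MLP are illustrative, not prescribed by the paper (the language decoder $U$ is sketched separately in Section 6).

```python
# Illustrative module interfaces for F, C, M, V (PyTorch-style sketch).
import torch
import torch.nn as nn

OBS_DIM, H_DIM, C_DIM, M_DIM = 64, 256, 32, 128   # illustrative sizes

F = nn.GRUCell(OBS_DIM, H_DIM)            # h_t = F(x_t, h_{t-1}): unconscious state
C = nn.GRUCell(H_DIM + M_DIM, C_DIM)      # c_t = C(h_t, c_{t-1}, m_{t-1}, z_t)
M = nn.GRUCell(C_DIM, M_DIM)              # m_t = M(m_{t-1}, c_t): long-term memory
V = nn.Sequential(                        # V(h_t, c_{t-k}): verifier score
    nn.Linear(H_DIM + C_DIM, 64), nn.ReLU(), nn.Linear(64, 1))
# U (the language decoder) is sketched separately in Section 6.

x_t = torch.randn(1, OBS_DIM)
z_t = 0.1 * torch.randn(1, H_DIM)         # injected noise for stochasticity
h_t = F(x_t, torch.zeros(1, H_DIM))
c_t = C(torch.cat([h_t + z_t, torch.zeros(1, M_DIM)], dim=-1),
        torch.zeros(1, C_DIM))
m_t = M(c_t, torch.zeros(1, M_DIM))
score = V(torch.cat([h_t, c_t], dim=-1))  # approximates log P("thought" c_t is true | h_t)
```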

3. Construction of the Conscious State via Attention Mechanisms

The conscious state $c_t$ is derived from the high-dimensional state $h_t$ using an attention mechanism. The general form is

$$c_t = C(h_t, c_{t-1}, m_{t-1}, z_t).$$

In practical terms, if $h_t$ is interpreted as a set of vectors $\{h_t^{(i)}\}$, attention weights $\alpha^{(i)}$ are computed:

$$\alpha^{(i)} = \mathrm{softmax}_i\left(s(h_t^{(i)}, c_{t-1}, m_{t-1})\right),$$

and the conscious state is assembled as

$$c_t = \sum_i \alpha^{(i)} h_t^{(i)} + \text{(optionally keys or type embeddings)}.$$

Injected noise $z_t$ is used to induce stochasticity, enabling either hard attention (sampling a subset) or regularized exploration through perturbation. This approach ensures that the conscious state remains low-dimensional and focuses on subsets of the latent representation most salient for ongoing processing (Bengio, 2017).
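
A minimal sketch of this soft-attention construction follows, assuming $h_t$ is presented as a set of slot vectors; the scoring MLP (here called score_net), the additive Gaussian noise, and all sizes are illustrative assumptions rather than details fixed by the paper.

```python
# Soft attention over slot vectors h_t^(i) producing the conscious state c_t.
import torch
import torch.nn as nn

N_SLOTS, SLOT_DIM, C_DIM, M_DIM = 16, 32, 32, 64   # illustrative sizes

score_net = nn.Sequential(                 # s(h_t^(i), c_{t-1}, m_{t-1})
    nn.Linear(SLOT_DIM + C_DIM + M_DIM, 64), nn.Tanh(), nn.Linear(64, 1))

def conscious_state(h_slots, c_prev, m_prev, noise_std=0.1):
    """Compute c_t = sum_i alpha^(i) h_t^(i) from noisy attention scores."""
    context = torch.cat([c_prev, m_prev], dim=-1)             # conditioning info
    context = context.expand(h_slots.size(0), -1)             # broadcast per slot
    scores = score_net(torch.cat([h_slots, context], dim=-1)).squeeze(-1)
    scores = scores + noise_std * torch.randn_like(scores)    # injected noise z_t
    alpha = torch.softmax(scores, dim=0)                       # attention weights
    return alpha @ h_slots, alpha                               # c_t and weights

h_slots = torch.randn(N_SLOTS, SLOT_DIM)
c_t, alpha = conscious_state(h_slots, torch.zeros(C_DIM), torch.zeros(M_DIM))
```

Replacing the softmax with a top-$K$ selection or a Gumbel-softmax relaxation would give the hard-attention variant mentioned above.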

4. Sparse Factor Graphs and Energy-Based Modeling

The Consciousness Prior asserts that the true joint distribution $P(S)$ over all high-level variables $S = \{V_1, ..., V_n\}$ is captured by a sparse factor graph:

$$P(S) = \frac{1}{Z} \prod_j f_j(S_j),$$

with each factor $f_j$ depending only on a small subset $S_j \subset S$, and equivalently in energy form:

$$E(S) = \sum_j E_j(S_j), \quad P(S) \propto \exp\left(-E(S)\right).$$

The sparseness of this factorization induces the structural prior: meaningful high-level predictions and manipulations can be realized by attending to small, tractable slices of the graph. Attention, therefore, operationalizes inference about the corresponding sub-graphs, focusing computational resources on relevant variables and their dependencies (Bengio, 2017).
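
The tractability of this form can be made concrete with a toy example (not from the paper): four binary variables and three pairwise energy terms, so each factor reads only a small subset $S_j$ and the joint distribution follows from summing factor energies.

```python
# Toy sparse factor graph: P(S) proportional to exp(-sum_j E_j(S_j)).
import itertools
import math

# Each factor: (indices of the variables it touches, energy over those values).
factors = [
    ((0, 1), lambda a, b: 0.0 if a == b else 2.0),              # V0, V1 tend to agree
    ((1, 2), lambda a, b: 1.5 * a * b),                         # V1, V2 rarely both on
    ((2, 3), lambda a, b: -1.0 if (a, b) == (1, 1) else 0.0),   # V2, V3 co-occur
]

def energy(s):
    """E(S) = sum_j E_j(S_j); each E_j reads only its own small subset S_j."""
    return sum(f(*[s[i] for i in idx]) for idx, f in factors)

states = list(itertools.product([0, 1], repeat=4))
weights = [math.exp(-energy(s)) for s in states]
Z = sum(weights)
probs = {s: w / Z for s, w in zip(states, weights)}
print(max(probs, key=probs.get))   # most probable joint configuration
```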

5. Integration with Learning Objectives

The Consciousness Prior augments standard learning frameworks by integrating with multiple objectives:

  • Reconstruction losses: Auto-encoding approaches reconstruct $x_t$ from $h_t$.
  • Prediction and verification: Given $c_t$ (selecting $B$) and some target $A$, maximize $\log P(A \mid B)$ using a decoder or conditional VAE/GAN; the verifier network is trained to approximate

$$V(h_t, c_{t-k}) \approx \log P(\text{``thought'' } c_{t-k} \text{ is true} \mid h_t).$$

  • Reinforcement learning rewards: If $c_t$ is used for action selection, back-propagate the RL reward through both $F$ and $C$.
  • Diversity regularization: To prevent trivial or degenerate attention (e.g., focusing solely on easily predicted elements), maximize mutual information or entropy proxies such as

$$\max I(B; A) \quad \text{or} \quad \max H[B, A],$$

compelling the attention mechanism to explore various $(A, B)$ slices of the factor graph and so cover a diverse set of abstract variables (Bengio, 2017).
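
A minimal sketch of how such terms might be combined is shown below; the Gaussian-style prediction loss standing in for $-\log P(A \mid B)$, the entropy bonus on the attention weights, and the entropy_weight coefficient are all illustrative assumptions, not quantities specified in the paper.

```python
# Combining a prediction loss with an entropy regularizer on attention.
import torch
import torch.nn as nn

C_DIM, A_DIM, N_SLOTS = 32, 8, 16
predictor = nn.Linear(C_DIM, A_DIM)        # parameterizes P(A | B) (Gaussian mean)

def objective(c_t, target_A, alpha, entropy_weight=0.01):
    pred_loss = nn.functional.mse_loss(predictor(c_t), target_A)   # stands in for -log P(A|B)
    attn_entropy = -(alpha * torch.log(alpha + 1e-8)).sum()        # H[alpha], diversity proxy
    return pred_loss - entropy_weight * attn_entropy               # minimized by the learner

c_t = torch.randn(C_DIM, requires_grad=True)
alpha = torch.softmax(torch.randn(N_SLOTS), dim=0)
loss = objective(c_t, torch.randn(A_DIM), alpha)
loss.backward()
```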

6. Mapping Conscious States to Language

The architecture enables a bidirectional mapping between the conscious state $c_t$ and a natural language utterance $u_t$ through an RNN $U$:

$$u_t = U(c_t, u_{t-1}).$$

The model can be trained via teacher forcing using supervised language data aligned to internal conscious contents, or the process can be inverted to set $c_t$ by conditioning on a provided utterance. This procedural symmetry encourages conscious states to align with language-like, discrete, interpretable abstractions (words or short phrases), promoting further disentanglement of representations within the conscious bottleneck (Bengio, 2017).
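
Below is a sketch of teacher-forced training for $U$, assuming $c_t$ initializes the decoder's hidden state; the vocabulary size, GRU decoder, and token alignment are illustrative assumptions, since the paper does not fix a decoder architecture.

```python
# Teacher-forced training of the utterance decoder U conditioned on c_t.
import torch
import torch.nn as nn

VOCAB, EMB, C_DIM = 1000, 64, 32

embed = nn.Embedding(VOCAB, EMB)
decoder = nn.GRUCell(EMB, C_DIM)          # hidden state initialized from c_t
readout = nn.Linear(C_DIM, VOCAB)
loss_fn = nn.CrossEntropyLoss()

def utterance_loss(c_t, target_tokens):
    """Sum of -log P(u_t | c_t) over a target utterance, with teacher forcing."""
    hidden, prev_tok, loss = c_t, torch.tensor([0]), 0.0   # token 0 = <bos>
    for tok in target_tokens:
        hidden = decoder(embed(prev_tok), hidden)
        logits = readout(hidden)
        loss = loss + loss_fn(logits, tok.unsqueeze(0))
        prev_tok = tok.unsqueeze(0)                         # feed the ground-truth token
    return loss

c_t = torch.randn(1, C_DIM)
loss = utterance_loss(c_t, torch.tensor([5, 42, 7, 1]))
```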

7. Example Architectures and Training Regimes

A typical realization of the Consciousness Prior uses:

  • Representation RNN $F$: LSTM, GRU, or Transformer encoder for modeling $x_{1..t}$.
  • Consciousness RNN $C$: Small attention network, sampling a subset of size $K \ll \dim(h_t)$ per step.
  • Memory $M$: RNN (e.g., LSTM) updating from $c_t$ to $m_t$.
  • Verifier $V$: Feed-forward or attention-based scorer evaluating $(h_t, c_{t-k})$.
  • Language decoder $U$: Seq2seq decoder attending to $c_t$.
  • General training loop:
  1. Observe $x_t$, update $h_t = F(x_t, h_{t-1})$.
  2. Compute or sample $c_t = C(h_t, c_{t-1}, m_{t-1}, z_t)$.
  3. Optionally update memory as $m_t = M(m_{t-1}, c_t)$.
  4. Define prediction target $A$; select $B$ from $c_t$.
  5. Compute prediction loss $-\log P(A \mid B)$ and/or verifier loss $-V(h_{t+k}, c_t)$.
  6. If language data is available, minimize $-\log P(u_t \mid c_t)$.
  7. Back-propagate through $U$, $V$, $C$, $F$ (and via RL if applicable).
  8. Apply regularizers on attention weights to encourage entropy or mutual information.

The paper does not supply explicit pseudocode or large-scale experiments, instead proposing validation on controlled environments (e.g., blocks falling from a table), where high-entropy raw input is complemented by low-entropy, abstract latent variables that the agent is incentivized to discover and manipulate (Bengio, 2017).
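
To make the loop above concrete, here is a compact, self-contained sketch on toy data. Since the paper gives no pseudocode, everything here is an assumption: the contrastive verifier objective, the use of the next unconscious state as the verification target, single-step truncated backpropagation, and the omission of the language and RL terms.

```python
# Illustrative training loop tying F, C, M, V together on toy data.
import torch
import torch.nn as nn

OBS, H, C_DIM, M_DIM = 16, 64, 16, 32

F = nn.GRUCell(OBS, H)                                                    # unconscious encoder
C = nn.Sequential(nn.Linear(H + C_DIM + M_DIM, C_DIM), nn.Tanh())        # simplified controller
M = nn.GRUCell(C_DIM, M_DIM)                                             # long-term memory
V = nn.Sequential(nn.Linear(H + C_DIM, 32), nn.ReLU(), nn.Linear(32, 1)) # verifier
params = [p for mod in (F, C, M, V) for p in mod.parameters()]
opt = torch.optim.Adam(params, lr=1e-3)

h = torch.zeros(1, H)
c = torch.zeros(1, C_DIM)
m = torch.zeros(1, M_DIM)
xs = torch.randn(20, 1, OBS)                       # toy observation sequence

for t in range(len(xs) - 1):
    h = F(xs[t], h.detach())                       # 1. update h_t (truncated BPTT)
    z = 0.1 * torch.randn(1, C_DIM)                # 2. injected noise z_t
    c = C(torch.cat([h, c.detach(), m.detach()], dim=-1)) + z
    m = M(c, m.detach())                           # 3. update memory m_t
    with torch.no_grad():
        h_next = F(xs[t + 1], h)                   # 4. future unconscious state as target
    pos = V(torch.cat([h_next, c], dim=-1))        # 5. verifier on the matched pair...
    neg = V(torch.cat([h_next, torch.randn_like(c)], dim=-1))   # ...vs. a mismatched thought
    loss = -torch.log(torch.sigmoid(pos - neg) + 1e-8).mean()
    # 6. language loss and RL reward omitted; 8. attention regularizer omitted
    opt.zero_grad()
    loss.backward()                                # 7. backprop through V, C, F
    opt.step()
```

Note that in this simplified objective the memory module $M$ receives no gradient; a fuller implementation would let later verification or prediction terms flow back through $m_t$.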

References

Bengio, Y. (2017). The Consciousness Prior. arXiv preprint arXiv:1709.08568.