
Emotion-Sensitive Neurons in Neural Models

Updated 13 January 2026
  • Emotion-Sensitive Neurons (ESNs) are defined as neural units displaying selective activation correlated with specific emotion classes, as shown by targeted ablation and steering experiments.
  • They are identified using metrics like Activation Probability, Mean Activation Difference, and Contrastive Activation Selection to ensure precise detection across models.
  • Their manipulation in audio-language and cross-domain frameworks offers actionable insights for interpretability, control, and safety in affective AI systems.

Emotion-Sensitive Neurons (ESNs) are individual neural units whose activation is selectively correlated with particular emotion classes and whose manipulation via causal interventions yields emotion-specific effects on model predictions. ESNs have been formalized in both large audio-LLMs (LALMs) and cross-domain frameworks such as SHArE, where they encode appraisals of core values (e.g., valence, arousal) and participate directly in the ultra-fast emotional judgments essential to both human and artificial cognition (Zhao et al., 6 Jan 2026, Opong-Mensah, 2020).

1. Formal Definition and Mechanistic Role

In LALMs, ESNs are specified as decoder SwiGLU-MLP gate units with positive activations that exhibit selective, emotion-conditioned firing patterns. For an emotion $e$, an ESN is characterized by an activation profile $\{a_{l,n,t}\}_{t}$ such that the probability

\mathrm{LAP}^{(e)}_{l,n} = P^{(e)}_{l,n} = \frac{K^{(e)}_{l,n}}{T_e}

is elevated for one emotion compared to others, where $K^{(e)}_{l,n}$ counts positive firings and $T_e$ is the number of evaluation examples for emotion $e$. Ablating these neurons selectively degrades recognition of their associated emotion ("self-deactivation") far more than recognition of other classes ("cross-deactivation"). Conversely, gain amplification steers outputs toward the target emotion ("steering"). This demonstrates their causal necessity and partial sufficiency for affective decisions in speech-to-text inference (Zhao et al., 6 Jan 2026).
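The LAP statistic above can be computed directly from a matrix of gate activations. The following is a minimal NumPy sketch; the function name, array shapes, and label encoding are illustrative assumptions, not taken from the paper:

```python
import numpy as np

def activation_probability(acts, labels, n_emotions):
    """LAP^(e)_{l,n} = K^(e)_{l,n} / T_e for one layer.

    acts:   [T, N] gate activations (T examples, N neurons).
    labels: [T] integer emotion labels.
    Returns [n_emotions, N] matrix of per-emotion firing probabilities.
    """
    lap = np.zeros((n_emotions, acts.shape[1]))
    for e in range(n_emotions):
        mask = labels == e                       # the T_e examples of emotion e
        lap[e] = (acts[mask] > 0).mean(axis=0)   # K^(e) / T_e per neuron
    return lap
```

A neuron whose row-$e$ entry is high while its other entries stay low is a candidate ESN for emotion $e$.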

In the SHArE framework, ESNs are dynamical units whose membrane voltage dynamics encode not just stimulus classification but valenced appraisal: $\gamma_{i,c} = \eta_{i,c}\,\Delta s_i$, with valence sensitivity $\eta_{i,c}$ and perception depth $\Delta s_i$. These parameters endow the ESN with the ability to compute and transmit emotional judgments at both the connectionist and conductance-based simulation levels (Opong-Mensah, 2020).
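The appraisal $\gamma_{i,c} = \eta_{i,c}\,\Delta s_i$ can be illustrated with a toy unit that accumulates emotional judgments as it fires. This is a minimal sketch under assumed semantics (per-value state accumulation), not the SHArE implementation:

```python
from dataclasses import dataclass

@dataclass
class ESNUnit:
    """Toy SHArE-style unit: carries a valence sensitivity eta_{i,c} per
    core value c; firing with perception depth delta_s emits an appraisal
    gamma_{i,c} = eta_{i,c} * delta_s that updates emotional state."""
    eta: dict           # core value -> valence sensitivity eta_{i,c}
    state: dict = None  # accumulated emotional state per core value

    def fire(self, delta_s):
        if self.state is None:
            self.state = {c: 0.0 for c in self.eta}
        for c, eta_c in self.eta.items():
            self.state[c] += eta_c * delta_s  # gamma_{i,c} = eta * delta_s
        return self.state
```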

2. ESN Identification: Metrics and Algorithms

ESN selection in LALMs relies on four primary neuron-selector metrics:

  • Activation Probability (LAP): Frequency of positive activation for an emotion.
  • Activation Probability Entropy (LAPE): Entropy of emotion-conditioned activation probabilities, quantifying selectivity.
  • Mean Activation Difference (MAD): Difference between the mean activation for $e$ and the mean over all other emotions.

\mathrm{MAD}^{(e)}_{l,n} = M^{(e)}_{l,n} - \bar{M}^{(-e)}_{l,n}

  • Contrastive Activation Selection (CAS): Margin between top and runner-up emotion activation probabilities, yielding ESNs via

\mathrm{CAS}^{(e)}_{l,n} = P^{(e)}_{l,n} - \max_{e' \neq e} P^{(e')}_{l,n}

The top-$k$ neurons by metric score are selected per emotion (Zhao et al., 6 Jan 2026).
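Metric-based top-$k$ selection can be sketched as follows, assuming per-emotion activation probabilities and mean activations have already been computed; the exact LAPE and CAS formulations, tie-breaking, and shapes are illustrative assumptions:

```python
import numpy as np

def select_esns(lap, mean_act, e, k):
    """Rank neurons for emotion e by the four selector metrics.

    lap:      [E, N] per-emotion positive-activation probabilities.
    mean_act: [E, N] per-emotion mean activations.
    Returns a dict of top-k neuron indices per metric.
    """
    E = lap.shape[0]
    others = [c for c in range(E) if c != e]
    # MAD: mean activation for e minus the mean over all other emotions.
    mad = mean_act[e] - mean_act[others].mean(axis=0)
    # LAPE: entropy of each neuron's normalized per-emotion firing profile.
    p = lap / (lap.sum(axis=0, keepdims=True) + 1e-12)
    lape = -(p * np.log(p + 1e-12)).sum(axis=0)
    # CAS: top-vs-runner-up margin, counted only where e is the top class.
    order = np.sort(lap, axis=0)
    cas = np.where(lap.argmax(axis=0) == e, order[-1] - order[-2], -np.inf)
    return {
        "LAP": np.argsort(-lap[e])[:k],
        "LAPE": np.argsort(lape)[:k],   # low entropy = high selectivity
        "MAD": np.argsort(-mad)[:k],
        "CAS": np.argsort(-cas)[:k],
    }
```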

In SHArE, identification is formalized by assigning to each ANN neuron the parameters $\eta_{i,c}$ and $\Delta s_i$, thereby rendering each unit an ESN with explicit valence/arousal sensitivity and correlation coefficients embedded in its activation dynamics (Opong-Mensah, 2020).

3. Inference-Time Intervention and Causal Validation

LALM ESNs are validated by three key interventions:

  • Ablation/Deactivation: Setting all ESNs for emotion $e$ to zero via a binary mask

\tilde{a}_{l,n,t} = \bigl(1 - m^{(e)}_{l,n}\bigr)\, a_{l,n,t}, \qquad m^{(e)}_{l,n} \in \{0, 1\}

causes targeted performance drops for emotion $e$ (up to −14.63% accuracy), with minimal impact on other classes.

  • Targeted Steering (Gain Amplification): Scaling ESN outputs for emotion $e$ by a gain factor $\alpha > 1$,

\tilde{a}_{l,n,t} = \bigl(1 + (\alpha - 1)\, m^{(e)}_{l,n}\bigr)\, a_{l,n,t}

increases recognition of the target emotion by +2.7–3.3 points, demonstrating the sufficiency of the selected neurons (Zhao et al., 6 Jan 2026).

  • Agnostic Injection: Non-specific gain amplification using multiple emotion masks (methods: 2-Pass, Mix, Union), yielding less specificity but quantifying cross-emotion circuit interactions.
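Both ablation and steering can be expressed as elementwise rescalings of the gate activations by a binary ESN mask. The sketch below assumes NumPy arrays and a simple mask-and-scale semantics consistent with the description above, not the paper's exact code:

```python
import numpy as np

def intervene(acts, esn_idx, mode="ablate", alpha=2.0):
    """Inference-time edit of gate activations.

    acts:    [T, N] activations; esn_idx: indices of ESNs for emotion e.
    ablate:  a <- (1 - m) * a           (self-deactivation test)
    steer:   a <- (1 + (alpha-1)*m) * a (targeted gain amplification)
    """
    m = np.zeros(acts.shape[1])  # binary ESN mask m^(e)
    m[esn_idx] = 1.0
    if mode == "ablate":
        return acts * (1.0 - m)
    return acts * (1.0 + (alpha - 1.0) * m)
```

Non-ESN neurons are untouched in both modes, which is what makes the self- vs cross-deactivation comparison meaningful.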

In SHArE, causal interpretability extends to conductance-based models: synaptic conductance is modulated by valence and perception (via $\eta_{i,c}$ and $\Delta s_i$), and spike events trigger judgment functions $\gamma_{i,c} = \eta_{i,c}\,\Delta s_i$, mapping neural voltage trajectories to emotional state updates (Opong-Mensah, 2020).

4. Layer-Wise Distribution and Transfer Properties

ESNs in Qwen2.5-Omni, Kimi-Audio, and Audio Flamingo 3 cluster non-uniformly across the decoder, with maxima in layers 0, 6–8, and 19–22, and sparse representation in central layers. This robustly replicates across architectures and emotionally annotated datasets.

Cross-dataset transfer analyses show that ESNs identified on one corpus yield diagonal self-effect signatures when ablated on another (shared classes only), albeit with reduced magnitude. Emotional specificity is asymmetric: “anger” and “sadness” consistently transfer more strongly than “neutral,” suggesting a shared mechanistic encoding with some dataset-dependent adaptation (Zhao et al., 6 Jan 2026).

5. Experimental Paradigms and Benchmarks

Empirical validation follows a strict protocol:

  • Model Platforms: Qwen2.5-Omni-7B, Kimi-Audio-7B, Audio Flamingo 3 (28-layer decoders, speech input, text output).
  • Benchmark Datasets: IEMOCAP, MELD, MSP-Podcast, five-way emotion labels (anger, joy/happiness, neutral, sadness, frustration/surprise).
  • Data Pools: Identification performed on correctly answered items per-model/emotion (min 200, max 1000), evaluation on balanced held-out sets.
  • Prompting: Multiple-choice speech emotion recognition (SER), randomized map, greedy decoding. Performance statistics confirm highly emotion-selective necessity and sufficiency across methods and datasets, with statistical robustness evidenced by consistent effects across three models (Zhao et al., 6 Jan 2026).

6. Biological and Artificial Generalization: SHArE Perspectives

The SHArE framework generalizes ESNs beyond digital LALMs, embedding them into biological simulations and abstract policy networks:

  • Biological Simulation: ESNs are mapped to real neurons via conductance-based models, with membrane voltage traces providing real-time valence proxies.
  • Artificial Networks: ANN units are treated as ESNs by augmenting them with valence sensitivity $\eta_{i,c}$ and perception depth $\Delta s_i$, enabling sentiment manifolds and emotion-region clustering in latent activation space.
  • Therapy and Machine Motivation: Gradients of the judgment function $\gamma_{i,c}$ enable trajectory design to steer patient ESN states (behavioral intervention) or to imbue artificial agents with synthetic motivational drives analogous to human emotional processes ("hunger," "sociality") (Opong-Mensah, 2020).
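The trajectory-design idea can be sketched as gradient descent on the squared error between the current appraisal $\gamma = \eta\,\Delta s$ and a target value, adjusting perception depth. This is an illustrative toy under assumed scalar dynamics, not the SHArE procedure:

```python
def steer_toward(target_gamma, eta, delta_s, lr=0.1, steps=100):
    """Drive the appraisal gamma = eta * delta_s toward target_gamma.

    Since loss = (gamma - target)^2 and gamma = eta * delta_s, the
    gradient w.r.t. delta_s is 2 * (gamma - target) * eta.
    """
    for _ in range(steps):
        gamma = eta * delta_s
        delta_s -= lr * 2.0 * (gamma - target_gamma) * eta
    return delta_s
```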

7. Implications for Interpretability, Control, and Safety

The existence of compact ESN sets delivers a mechanistic account of affective computation at the neuron level in LALMs and broader neural frameworks. Targeted ESN interventions provide actionable control handles:

  • Interpretability: ESNs elucidate the internal routing of paralinguistic features through selective modulating subspaces (SwiGLU gates).
  • Controllability: Gain amplification can steer agent behavior or conversational tone (e.g., increasing “empathy” by amplifying “sadness” circuits).
  • Safety: Understanding and modulating ESNs offers a pathway to mitigate undesired tone and affective bias and to monitor ethical alignment in emotional AI agents. A plausible implication is that refined ESN-driven interventions could underpin next-generation affectively capable, transparent conversational systems and advance neuroscientific modeling of emotion at single-neuron granularity (Zhao et al., 6 Jan 2026, Opong-Mensah, 2020).
