
Emotion-Sensitive Neurons in Neural Models

Updated 13 January 2026
  • Emotion-Sensitive Neurons (ESNs) are defined as neural units displaying selective activation correlated with specific emotion classes, as shown by targeted ablation and steering experiments.
  • They are identified using metrics like Activation Probability, Mean Activation Difference, and Contrastive Activation Selection to ensure precise detection across models.
  • Their manipulation in audio-language and cross-domain frameworks offers actionable insights for interpretability, control, and safety in affective AI systems.

Emotion-Sensitive Neurons (ESNs) are individual neural units whose activation is selectively correlated with particular emotion classes and whose manipulation via causal interventions yields emotion-specific effects on model predictions. ESNs have been formalized both in large audio-language models (LALMs) and in cross-domain frameworks such as SHArE, where they encode appraisals of core values (e.g., valence, arousal) and participate directly in the ultra-fast emotional judgments essential to both human and artificial cognition (Zhao et al., 6 Jan 2026, Opong-Mensah, 2020).

1. Formal Definition and Mechanistic Role

In LALMs, ESNs are specified as decoder SwiGLU-MLP gate units with positive activations that exhibit selective, emotion-conditioned firing patterns. For an emotion $e$, an ESN is characterized by an activation profile $\{a_{l,n,t}\}_{t}$ such that the probability

$$\mathrm{LAP}^{(e)}_{l,n} = P^{(e)}_{l,n} = \frac{K^{(e)}_{l,n}}{T_e}$$

is elevated for one emotion compared to others, where $K^{(e)}_{l,n}$ counts positive firings and $T_e$ is the size of the evaluation set for emotion $e$. Ablations of these neurons selectively degrade recognition of their associated emotions (“self-deactivation”) far more than for other classes (“cross-deactivation”). Conversely, gain amplification steers outputs toward the target emotion (“steering”). This demonstrates their causal necessity and partial sufficiency for affective decisions in speech-to-text inference (Zhao et al., 6 Jan 2026).
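The LAP statistic above can be computed directly from cached gate activations. A minimal NumPy sketch, assuming activations for each emotion are stacked into an array of shape (T_e, layers, neurons); the tensor layout and the function name are illustrative, not from the paper:

```python
import numpy as np

def activation_probability(acts_by_emotion):
    """LAP^{(e)}_{l,n} = K^{(e)}_{l,n} / T_e: the fraction of evaluation
    items on which gate unit (l, n) fires positively, per emotion class.

    acts_by_emotion: dict emotion -> array of shape (T_e, layers, neurons).
    Returns: dict emotion -> array of shape (layers, neurons) of probabilities.
    """
    return {e: (acts > 0).mean(axis=0) for e, acts in acts_by_emotion.items()}

# Toy example: 10 evaluation items, 2 decoder layers, 3 gate units.
# Shifting "neutral" activations negative makes its units fire less often.
rng = np.random.default_rng(0)
acts = {"anger": rng.normal(size=(10, 2, 3)),
        "neutral": rng.normal(loc=-1.0, size=(10, 2, 3))}
lap = activation_probability(acts)
```

Each entry of `lap[e]` lies in [0, 1]; a unit whose probability is high for one emotion and low for the rest is an ESN candidate.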

In the SHArE framework, ESNs are dynamical units whose membrane voltage dynamics encode not just stimulus classification but valenced appraisal: $\gamma_{i,c} = \eta_{i,c}\,\Delta s_i$ with valence sensitivity $\eta_{i,c}$ and perception depth $\Delta s_i$. These parameters endow the ESN with the ability to compute and transmit emotional judgments at both the connectionist and conductance-based simulation levels (Opong-Mensah, 2020).
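As a toy illustration of that appraisal step (array shapes and names are assumptions for the sketch, not from SHArE's implementation), the valenced appraisal is an elementwise scaling of each neuron's perception depth by its per-core-value valence sensitivity:

```python
import numpy as np

def valenced_appraisal(eta, delta_s):
    """gamma_{i,c} = eta_{i,c} * delta_s_i for every neuron i, core value c.

    eta:     (neurons, core_values) valence sensitivities.
    delta_s: (neurons,) perception depths.
    """
    return eta * delta_s[:, None]  # broadcast each depth across core values

eta = np.array([[0.5, -0.2],    # neuron 0: sensitivities to 2 core values
                [1.0,  0.3]])   # neuron 1
delta_s = np.array([2.0, 0.5])  # perception depths per neuron
gamma = valenced_appraisal(eta, delta_s)
```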

2. ESN Identification: Metrics and Algorithms

ESN selection in LALMs relies on four primary neuron-selector metrics:

  • Activation Probability (LAP): Frequency of positive activation for an emotion.
  • Activation Probability Entropy (LAPE): Entropy of emotion-conditioned activation probabilities, quantifying selectivity.
  • Mean Activation Difference (MAD): Difference between mean activation for $e$ and all other emotions.

$$\mathrm{MAD}^{(e)}_{l,n} = M^{(e)}_{l,n} - \bar{M}^{(-e)}_{l,n}$$

  • Contrastive Activation Selection (CAS): Margin between top and runner-up emotion activation probabilities, yielding ESNs via

$$s_{l,n}^{\mathrm{CAS}(e)} = \begin{cases} P^{(1)}_{l,n} - P^{(2)}_{l,n}, & \text{if } e = e^{(1)}_{l,n} \\ -\infty, & \text{otherwise} \end{cases}$$

The top $r\%$ of neurons by metric score are selected per emotion (Zhao et al., 6 Jan 2026).
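A sketch of CAS-based selection under assumed array conventions (a precomputed LAP tensor of shape (emotions, layers, neurons); all names are illustrative): each neuron is scored by its top-vs-runner-up probability margin for its preferred emotion, and the top fraction of neurons per emotion is kept.

```python
import numpy as np

def select_esns_cas(lap, r_frac):
    """Contrastive Activation Selection: per emotion e, score each neuron
    by P^(1) - P^(2) if e is its top emotion, else -inf; keep the top
    r_frac of all neurons per emotion.

    lap: (E, L, N) activation probabilities. Returns {e: bool mask (L, N)}.
    """
    E, L, N = lap.shape
    order = np.argsort(-lap, axis=0)              # emotions ranked per unit
    top, runner = order[0], order[1]
    p1 = np.take_along_axis(lap, top[None], axis=0)[0]
    p2 = np.take_along_axis(lap, runner[None], axis=0)[0]
    margin = p1 - p2                              # (L, N) CAS margins
    k = max(1, int(r_frac * L * N))
    masks = {}
    for e in range(E):
        score = np.where(top == e, margin, -np.inf).ravel()
        idx = np.argpartition(-score, k - 1)[:k]  # indices of k best scores
        mask = np.zeros(L * N, dtype=bool)
        mask[idx] = score[idx] > -np.inf          # drop -inf padding
        masks[e] = mask.reshape(L, N)
    return masks
```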

In SHArE, identification is formalized by assigning to each ANN neuron the parameters $\{\eta_{j,c}, \Delta s_j, \rho_{k,j}\}$, thereby rendering each unit an $\mathrm{ESN}^{(n)}_j$ with explicit valence/arousal sensitivity and correlation coefficients embedded in its activation dynamics (Opong-Mensah, 2020).

3. Inference-Time Intervention and Causal Validation

LALM ESNs are validated by three key interventions:

  • Ablation/Deactivation: Setting all ESNs for emotion $e_{\text{src}}$ to zero via the binary mask

$$\tilde g^{\mathrm{abl}}_{l,t} = g_{l,t} \odot r^{(m,e_{\mathrm{src}})}_l$$

causes targeted performance drops for $e_{\text{src}}$ (up to −14.63% accuracy change), with minimal impact on other classes.

  • Targeted Steering (Gain Amplification): Scaling ESN outputs by $1+\alpha$

$$\tilde g^{\mathrm{steer}}_{l,t} = g_{l,t} \odot s^{(m,e_{\mathrm{src}})}_l(\alpha)$$

increases $e_{\text{src}}$ recognition by +2.7–3.3 points, demonstrating sufficiency of selected neurons (Zhao et al., 6 Jan 2026).

  • Agnostic Injection: Non-specific gain amplification using multiple emotion masks (methods: 2-Pass, Mix, Union), yielding less specificity but quantifying cross-emotion circuit interactions.
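The two targeted interventions reduce to simple elementwise masking of the gate activations. A minimal sketch of that arithmetic (function names and shapes are illustrative; a real intervention would hook the model's SwiGLU gate outputs at inference time):

```python
import numpy as np

def ablate(gate_acts, esn_mask):
    """g~ = g * r, with r = 0 on the source emotion's ESNs, 1 elsewhere."""
    return gate_acts * ~esn_mask

def steer(gate_acts, esn_mask, alpha):
    """g~ = g * s(alpha), scaling ESN outputs by (1 + alpha)."""
    return gate_acts * np.where(esn_mask, 1.0 + alpha, 1.0)

g = np.array([1.0, 2.0, -0.5, 3.0])          # gate activations in one layer
mask = np.array([True, False, True, False])  # ESNs selected for e_src
ablated = ablate(g, mask)                    # ESN positions zeroed
steered = steer(g, mask, 0.5)                # ESN positions scaled by 1.5
```

Non-ESN positions pass through unchanged in both cases, which is what makes the interventions emotion-selective.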

In SHArE, causal interpretability extends to conductance-based models: synaptic conductance $g_{ij}(t)$ is modulated by valence and perception ($\eta_{j,c}\Delta s_j$), and spike events trigger judgment functions $D$, mapping neural voltage trajectories to emotional state updates (Opong-Mensah, 2020).

4. Layer-Wise Distribution and Transfer Properties

ESNs in Qwen2.5-Omni, Kimi-Audio, and Audio Flamingo 3 cluster non-uniformly across the decoder, with maxima in layers 0, 6–8, and 19–22, and sparse representation in central layers. This robustly replicates across architectures and emotionally annotated datasets.

Cross-dataset transfer analyses show that ESNs identified on one corpus yield diagonal self-effect signatures when ablated on another (shared classes only), albeit with reduced magnitude. Emotional specificity is asymmetric: “anger” and “sadness” consistently transfer more strongly than “neutral,” suggesting a shared mechanistic encoding with some dataset-dependent adaptation (Zhao et al., 6 Jan 2026).

5. Experimental Paradigms and Benchmarks

Empirical validation follows a strict protocol:

  • Model Platforms: Qwen2.5-Omni-7B, Kimi-Audio-7B, Audio Flamingo 3 (28-layer decoders, speech input, text output).
  • Benchmark Datasets: IEMOCAP, MELD, MSP-Podcast, five-way emotion labels (anger, joy/happiness, neutral, sadness, frustration/surprise).
  • Data Pools: Identification performed on correctly answered items per-model/emotion (min 200, max 1000), evaluation on balanced held-out sets.
  • Prompting: Multiple-choice speech emotion recognition (SER), randomized map, greedy decoding. Performance statistics confirm highly emotion-selective necessity and sufficiency across methods and datasets, with statistical robustness evidenced by consistent effects across three models (Zhao et al., 6 Jan 2026).

6. Biological and Artificial Generalization: SHArE Perspectives

The SHArE framework generalizes ESNs beyond digital LALMs, embedding them into biological simulations and abstract policy networks:

  • Biological Simulation: ESNs mapped to real neurons via conductance-based models, with membrane voltage $V_{m,i}(t)$ and real-time valence proxies $\eta_{i,c}\Delta s_i$.
  • Artificial Networks: ANN units treated as ESNs by augmenting them with $\{\eta, \Delta s, \rho\}$, enabling sentiment manifolds and emotion-region clustering in latent activation space.
  • Therapy and Machine Motivation: Gradients $\partial \underline{\alpha}/\partial \tilde a$ enable trajectory design to steer patient ESN states (behavioral intervention) or to imbue artificial agents with synthetic motivational drives analogous to human emotional processes (“hunger,” “sociality”) (Opong-Mensah, 2020).

7. Implications for Interpretability, Control, and Safety

The existence of compact ESN sets delivers a mechanistic account of affective computation at the neuron level in LALMs and broader neural frameworks. Targeted ESN interventions provide actionable control handles:

  • Interpretability: ESNs elucidate the internal routing of paralinguistic features through selective modulating subspaces (SwiGLU gates).
  • Controllability: Gain amplification can steer agent behavior or conversational tone (e.g., increasing “empathy” by amplifying “sadness” circuits).
  • Safety: Understanding and modulating ESNs offers a pathway to mitigating undesired tone and affective bias, and to monitoring ethical alignment in emotional AI agents. A plausible implication is that refined ESN-driven interventions could underpin next-generation affectively capable, transparent conversational systems, as well as advance neuroscientific modeling of emotion at single-neuron granularity (Zhao et al., 6 Jan 2026, Opong-Mensah, 2020).