Reasoning vs surface-cue detection in emotion understanding models
Determine whether natural language processing models for emotion understanding genuinely reason about the emotional states expressed in text, or primarily detect surface-level affective cues, when performing multi-label emotion classification.
References
As a result, it remains unclear whether models genuinely reason about emotional states or merely detect surface-level affective cues.
— Emotion Entanglement and Bayesian Inference for Multi-Dimensional Emotion Understanding
(2604.00819 - Kotaprolu et al., 1 Apr 2026) in Related Work — Limitations of existing benchmarks and our approach