
Fuzzy Modeling in VAD for Emotion Recognition

Updated 14 December 2025
  • Fuzzy Modeling in VAD is a technique that partitions the VAD cube into 27 overlapping fuzzy cuboids using interval type-2 (IT2) fuzzy sets to capture uncertainty in self-reported emotion ratings.
  • The approach integrates spatial-temporal EEG features with fuzzy VAD representations through deep architectures, resulting in significant accuracy improvements in emotion classification.
  • Empirical results show that using fuzzy VAD enhances cross-subject generalization by 5–6% compared to crisp VAD baselines, demonstrating its practical benefits in affective computing.

The fuzzy modeling of emotional states in Valence-Arousal-Dominance (VAD) space addresses key limitations in traditional affective computing: the presence of subjective biases and experiment-specific variability in self-reported emotion ratings. By partitioning the crisp VAD cube into soft, overlapping interval type-2 (IT2) fuzzy sets, this approach yields a generic and flexible framework for robust emotion recognition. Deep architectures integrating fuzzy VAD representations with spatial and temporal EEG features provide substantial accuracy improvements and enhanced generalizability across subjects (Asif et al., 15 Jan 2024).

1. Construction of Interval Type-2 Fuzzy Sets in VAD Space

Fuzzy partitioning of each VAD dimension (Valence, Arousal, and Dominance) is achieved by modeling each axis with three linguistic labels (Low, Medium, High), each represented as an IT2 fuzzy set. An IT2 fuzzy set is bounded by two Gaussian membership functions, the Lower Membership Function (LMF) and the Upper Membership Function (UMF), which together encapsulate the Footprint of Uncertainty (FoU):

$$\mathrm{FoU}(X^{\mathrm{dim}}) = \left[\mu^{\mathrm{LMF}}(X^{\mathrm{dim}}),\ \mu^{\mathrm{UMF}}(X^{\mathrm{dim}})\right]$$

The Gaussian parameters (means, variances, cut-offs) for each fuzzy label are empirically specified in the study (see Table I), and each membership degree is defined by equations (4)–(9) for all dimensions. This "thickens" the partitions and allows soft assignment of points in $[1, 9]^3$ to one or more fuzzy labels, formalizing the inherent uncertainty and subjective variability in ratings.
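
As a concrete illustration, the following minimal sketch builds Gaussian LMF/UMF pairs for one axis and evaluates the footprint of uncertainty at a given rating. The centers and spreads are placeholder values chosen for the sketch, not the parameters of Table I.

```python
# Minimal sketch of IT2 Gaussian memberships for one VAD axis.
# The means and sigmas below are illustrative placeholders, not the
# empirically specified parameters of Table I in the source paper.
import numpy as np

CENTERS = {"Low": 2.0, "Medium": 5.0, "High": 8.0}  # assumed label centers on [1, 9]
SIGMA_LMF = 1.0   # narrower Gaussian -> lower membership bound
SIGMA_UMF = 1.8   # wider Gaussian   -> upper membership bound

def gaussian(x, mean, sigma):
    return np.exp(-0.5 * ((x - mean) / sigma) ** 2)

def it2_membership(x, label):
    """Return the footprint of uncertainty [LMF(x), UMF(x)] for one linguistic label."""
    mean = CENTERS[label]
    return gaussian(x, mean, SIGMA_LMF), gaussian(x, mean, SIGMA_UMF)

if __name__ == "__main__":
    for label in ("Low", "Medium", "High"):
        lo, hi = it2_membership(6.5, label)
        print(f"rating 6.5, {label}: FoU = [{lo:.3f}, {hi:.3f}]")
```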

2. Fuzzy Partitioning of the VAD Cuboid

The original VAD space is the crisp cube $[1, 9]^3$, partitioned along each axis into three overlapping fuzzy bands via the IT2 sets. Cross-over points and supports are governed by the nonzero tails of the respective Gaussians. The resulting space is decomposed into $3^3 = 27$ fuzzy "cuboids," each corresponding to a triplet of Low/Medium/High status along the V, A, and D axes. In the cuboid lattice model, these 27 fuzzy classes provide auxiliary supervision through a secondary softmax output.

No extra boundary equations are required; overlap regions are implicitly handled by the constructed membership functions.
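
For intuition, the sketch below softly assigns a rating triplet to the 27 cuboids by combining per-axis memberships with a product t-norm; both the t-norm and the single-Gaussian memberships are assumptions of this illustration rather than the paper's aggregation rule.

```python
# Illustrative soft assignment of a VAD rating triplet to the 27 fuzzy cuboids.
# The product t-norm and Gaussian parameters are assumptions of this sketch.
import itertools
import numpy as np

CENTERS = {"Low": 2.0, "Medium": 5.0, "High": 8.0}  # placeholder label centers
SIGMA = 1.8                                          # placeholder spread

def membership(x, label):
    return float(np.exp(-0.5 * ((x - CENTERS[label]) / SIGMA) ** 2))

def cuboid_memberships(valence, arousal, dominance):
    """Map a (V, A, D) rating to a soft membership for each of the 27 cuboids."""
    return {
        (lv, la, ld): membership(valence, lv) * membership(arousal, la) * membership(dominance, ld)
        for lv, la, ld in itertools.product(CENTERS, repeat=3)
    }

if __name__ == "__main__":
    scores = cuboid_memberships(6.5, 3.0, 5.5)
    # A single rating typically overlaps several cuboids with nonzero membership.
    for triplet, score in sorted(scores.items(), key=lambda kv: kv[1], reverse=True)[:3]:
        print(triplet, round(score, 3))
```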

3. Mapping Self-Reported VAD Scores to Fuzzy Memberships

Given a trial with subject-reported $(x_V, x_A, x_D)$, each rating is mapped to its six membership degrees (three from the LMF, three from the UMF) per axis:

$$\Gamma = \phi(x_V, x_A, x_D) \equiv \left[\mu_{\mathrm{Low}}^{\mathrm{LMF}}(x_V), \ldots, \mu_{\mathrm{High}}^{\mathrm{UMF}}(x_D)\right]$$

resulting in an 18-dimensional membership vector. In certain model variants, fuzzy cluster memberships derived from Fuzzy C-Means (FCM) replace the direct IT2 memberships, as formalized in equation (14). These fuzzy labels serve as input to the deep recognition framework alongside the EEG features.
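
A minimal sketch of the mapping $\phi$ is given below; the ordering of the 18 components and the Gaussian parameters are illustrative assumptions.

```python
# Sketch of phi: (x_V, x_A, x_D) -> 18-dimensional membership vector Gamma.
# Parameters and component ordering are assumptions; the paper's Table I values
# and equations (4)-(9) define the actual memberships.
import numpy as np

CENTERS = {"Low": 2.0, "Medium": 5.0, "High": 8.0}   # placeholder centers
SIGMAS = {"LMF": 1.0, "UMF": 1.8}                    # placeholder LMF/UMF spreads

def gaussian(x, mean, sigma):
    return np.exp(-0.5 * ((x - mean) / sigma) ** 2)

def phi(x_v, x_a, x_d):
    """Concatenate LMF and UMF memberships of Low/Medium/High over V, A, D."""
    gamma = [
        gaussian(x, CENTERS[label], SIGMAS[bound])
        for x in (x_v, x_a, x_d)
        for bound in ("LMF", "UMF")
        for label in ("Low", "Medium", "High")
    ]
    return np.asarray(gamma)   # shape (18,)

if __name__ == "__main__":
    print(phi(6.5, 3.0, 5.5).round(3))
```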

4. Deep Fuzzy Framework: Architecture and Fusion

4.1 Spatial-Temporal EEG Feature Extraction

EEG signals are represented as stacked Short-Time Fourier Transform (STFT) spectrograms across channels. A spatial module applies two Conv–ReLU–MaxPool blocks (kernel $3 \times 3$, 32/64 filters, dropout 0.2), then flattens the output to $Y_{\mathrm{flat}}$.

Temporal dependencies are modeled by repeating $Y_{\mathrm{flat}}$ ($R = 4$) and passing the sequence through two stacked LSTM layers (128 units, dropout 0.2). The final hidden state $H_R^{[2]}$ aggregates the temporal EEG features.
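
The PyTorch sketch below mirrors this spatial-temporal branch under assumed input dimensions (32 spectrogram channels of size 64×64); only the kernel size, filter counts, dropout rate, LSTM width, and R = 4 come from the description above.

```python
# Sketch of the spatial-temporal EEG branch: two Conv-ReLU-MaxPool blocks over
# stacked STFT spectrograms, then two stacked LSTM layers over the repeated
# flattened feature. Input shape and any size not stated in the text are assumed.
import torch
import torch.nn as nn

class SpatialTemporalEEG(nn.Module):
    def __init__(self, in_channels=32, spec_size=(64, 64), repeat=4, lstm_units=128):
        super().__init__()
        self.repeat = repeat
        self.spatial = nn.Sequential(
            nn.Conv2d(in_channels, 32, kernel_size=3, padding=1), nn.ReLU(),
            nn.MaxPool2d(2), nn.Dropout(0.2),
            nn.Conv2d(32, 64, kernel_size=3, padding=1), nn.ReLU(),
            nn.MaxPool2d(2), nn.Dropout(0.2),
            nn.Flatten(),
        )
        with torch.no_grad():  # infer the flattened feature size of Y_flat
            flat_dim = self.spatial(torch.zeros(1, in_channels, *spec_size)).size(1)
        self.lstm = nn.LSTM(flat_dim, lstm_units, num_layers=2,
                            batch_first=True, dropout=0.2)

    def forward(self, x):                                    # x: (batch, channels, freq, time)
        y_flat = self.spatial(x)                             # (batch, d)
        seq = y_flat.unsqueeze(1).repeat(1, self.repeat, 1)  # repeat R times: (batch, R, d)
        _, (h_n, _) = self.lstm(seq)
        return h_n[-1]                                       # final hidden state, (batch, 128)

if __name__ == "__main__":
    model = SpatialTemporalEEG()
    print(model(torch.randn(2, 32, 64, 64)).shape)           # torch.Size([2, 128])
```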

4.2 Fuzzy Module and Variant Models

The fuzzy block consumes $\Gamma \in \mathbb{R}^{18}$ through cascaded Dense–ReLU–Dropout layers, and its output depends on the model variant:

  • Model 1: 24-way softmax over emotion classes.
  • Model 2: FCM-derived memberships $u_{ij}$, then 24-way softmax.
  • Model 3: Dual-output softmax over 27 cuboids ($p_1, \ldots, p_{27}$) and 24 emotions, trained jointly (equation 16).

4.3 Fusion Strategy

The final feature fusion concatenates $H_R^{[2]}$ (EEG, 128-dim) with the last dense-layer output of the fuzzy block, preceding a joint softmax classification over the 24 emotion classes.
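
A compact PyTorch sketch of this fusion, with a Model-3-style auxiliary cuboid head, is shown below; the hidden width of the fuzzy block and the exact placement of the cuboid head are assumptions of the sketch.

```python
# Sketch of the fusion stage: the 128-dim EEG feature H_R^[2] is concatenated with
# the last dense-layer output of the fuzzy block before a 24-way emotion classifier;
# a secondary 27-way head over the fuzzy cuboids provides auxiliary supervision.
# Hidden sizes and head placement are assumptions, not the paper's exact values.
import torch
import torch.nn as nn

class FuzzyVADFusion(nn.Module):
    def __init__(self, eeg_dim=128, fuzzy_in=18, fuzzy_hidden=64,
                 n_emotions=24, n_cuboids=27):
        super().__init__()
        self.fuzzy_trunk = nn.Sequential(      # cascaded Dense-ReLU-Dropout layers
            nn.Linear(fuzzy_in, fuzzy_hidden), nn.ReLU(), nn.Dropout(0.2),
            nn.Linear(fuzzy_hidden, fuzzy_hidden), nn.ReLU(), nn.Dropout(0.2),
        )
        self.cuboid_head = nn.Linear(fuzzy_hidden, n_cuboids)        # p_1 ... p_27
        self.emotion_head = nn.Linear(eeg_dim + fuzzy_hidden, n_emotions)

    def forward(self, h_eeg, gamma):
        # h_eeg: (batch, 128) final LSTM hidden state; gamma: (batch, 18) memberships
        h_fuzzy = self.fuzzy_trunk(gamma)
        fused = torch.cat([h_eeg, h_fuzzy], dim=1)
        return self.emotion_head(fused), self.cuboid_head(h_fuzzy)

if __name__ == "__main__":
    model = FuzzyVADFusion()
    emo_logits, cub_logits = model(torch.randn(4, 128), torch.rand(4, 18))
    print(emo_logits.shape, cub_logits.shape)  # (4, 24) (4, 27)
```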

5. Optimization, Training, and Loss Functions

Training employs the Adam optimizer (learning rate $10^{-4}$, batch size 32, 100 epochs, early stopping). Losses are standard cross-entropy for Models 1 and 2, and additive cross-entropy over the dual outputs of Model 3:

$$\mathcal{L} = \mathcal{L}_{\mathrm{emo}} + \lambda \mathcal{L}_{\mathrm{cub}}, \quad \lambda = 1,$$

where

$$\mathcal{L}_{\mathrm{emo}} = -\sum_{i=1}^{24} y_i \log\hat{y}_i, \qquad \mathcal{L}_{\mathrm{cub}} = -\sum_{j=1}^{27} z_j \log p_j.$$

As the IT2 membership functions are fixed and non-differentiable, gradients propagate through the subsequent dense layers only, enabling adaptive weighting of VAD dimensions per subject during learning.
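
A minimal sketch of the additive objective, assuming the two logit heads from the fusion sketch above:

```python
# Joint loss L = L_emo + lambda * L_cub with lambda = 1, using standard
# cross-entropy on the 24-way emotion head and the 27-way cuboid head.
import torch
import torch.nn as nn

criterion = nn.CrossEntropyLoss()
lam = 1.0

def joint_loss(emo_logits, cub_logits, emo_targets, cub_targets):
    """Additive cross-entropy over emotion classes and fuzzy cuboids."""
    return criterion(emo_logits, emo_targets) + lam * criterion(cub_logits, cub_targets)

if __name__ == "__main__":
    # Stand-in logits; in the full model these come from the fusion network,
    # and gradients reach its dense layers but not the fixed IT2 memberships.
    emo_logits = torch.randn(4, 24, requires_grad=True)
    cub_logits = torch.randn(4, 27, requires_grad=True)
    emo_y, cub_y = torch.randint(0, 24, (4,)), torch.randint(0, 27, (4,))
    loss = joint_loss(emo_logits, cub_logits, emo_y, cub_y)
    loss.backward()
    print(float(loss))
```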

6. Comparative Performance and Ablation Analysis

Empirical evaluation on the DENS dataset uses 7-second windows and subject-reported VAD scores. The tested models yield 24-way emotion-recognition accuracy as follows:

| Model | Accuracy (%) | Approach |
|---|---|---|
| IT2 MF (Model 1) | 96.09 | Interval Type-2 fuzzy sets |
| Cuboid lattice (Model 3) | 95.75 | 27-way fuzzy cuboids |
| FCM clusters (Model 2) | 95.31 | Fuzzy C-Means clustering |

Cross-subject generalization accuracy improves by 5–6% with fuzzy modeling; for example, the Group 1 vs. Group 2 split attains 78.35% with fuzzy VAD versus 72.97% without. Single-subject ablation studies indicate lower performance for the crisp VAD baseline (95.01%) and for omission of the VAD input (93.54%), while exclusive use of the UMF (95.82%) or the LMF (94.65%) is outperformed by the full IT2 construction.

7. Implications and Application Domains

The generic nature of IT2 fuzzy VAD representations enables robust emotion modeling under subjectivity and inter-experimental variability. Joint fuzzy-EEG fusion facilitates improved accuracy and consistent cross-subject transfer, offering practical advantages in affective computing, human-computer interaction, and mental health monitoring. A plausible implication is that uncertainty-enriched emotion modeling extends to contexts where annotation reliability is variable or cross-population adaptation is crucial.

Real-world deployment is supported by empirically robust classification, modular architectural components, and generalizable fuzzy cuboid mappings. The methodology advances the interpretability and stability of emotion recognition in neurocognitive interfaces (Asif et al., 15 Jan 2024).
