CBCP EEG Dataset for Childhood Empathy

Updated 15 September 2025
  • CBCP dataset is an EEG-based collection capturing multi-channel recordings from 57 children to quantify empathy responses.
  • It employs a standardized film viewing and post-assessment protocol to generate binary empathy labels, ensuring reproducible model evaluation.
  • The dataset supports advanced deep learning through multi-view fusion of cognitive and emotional EEG signals within the BEAM framework.

The CBCP dataset is an EEG-based collection used for empirical validation in early-childhood empathy research, designed specifically to facilitate objective, quantitative assessment of young children's empathetic responses. In the context of the BEAM (Brainwave Empathy Assessment Model) framework (Xie et al., 8 Sep 2025), the CBCP dataset offers high-resolution, multi-channel EEG recordings paired with validated behavioral empathy assessments, supporting advanced deep learning approaches in developmental neuroscience and affective computing.

1. Dataset Composition and Acquisition Protocol

The CBCP dataset consists of EEG signals from 57 typically developing children aged 4 to 6 years (mean 4.91 ± 1.07 years). During data acquisition, each child participated in a standardized 6-minute viewing session of the Pixar short film “Partly Cloudy.” Immediately following the film, each child completed a post-test assessment that quantified behavioral empathy through a willingness-to-help rating in a negative scenario.

EEG data were systematically collected with multi-channel equipment, yielding multi-view time series for each subject. Labels were generated by median-splitting the post-test empathy scores into binary classes—high and low empathy. The dataset segmentation followed a subject-level split: 70% for training, 20% for validation, and 10% for testing, repeated across five random seeds to ensure statistical robustness.

| Attribute | Value | Description |
|---|---|---|
| N | 57 | Number of children (subjects) |
| Age | 4.91 ± 1.07 years | Mean ± SD |
| Data Type | Multi-channel EEG (multi-view) | Spatio-temporal signals per subject |
| Session | 6 min, “Partly Cloudy” + post-assessment | Standardized empathy elicitation protocol |
| Labels | Binary (high/low empathy via median split) | Derived from behavioral willingness-to-help rating |

The subject-level splitting and repeated randomization mitigate sampling bias and support generalizable model evaluation.
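
As a rough illustration of this protocol, the Python sketch below median-splits hypothetical post-test scores into binary labels and performs a subject-level 70/20/10 split repeated over random seeds. The function and variable names (e.g., `make_labels_and_splits`, `post_test_scores`) and the tie-handling at the median are assumptions for illustration, not part of the released dataset or the BEAM codebase.

```python
import numpy as np

def make_labels_and_splits(scores, seed, train_frac=0.7, val_frac=0.2):
    """Median-split empathy labels plus a subject-level 70/20/10 split.

    `scores` is a hypothetical array of per-subject willingness-to-help
    ratings; exact CBCP field names and tie handling are assumptions.
    """
    scores = np.asarray(scores, dtype=float)
    labels = (scores > np.median(scores)).astype(int)  # 1 = high, 0 = low empathy

    rng = np.random.default_rng(seed)
    order = rng.permutation(len(scores))               # shuffle subject indices
    n_train = int(train_frac * len(scores))
    n_val = int(val_frac * len(scores))
    train_ids = order[:n_train]
    val_ids = order[n_train:n_train + n_val]
    test_ids = order[n_train + n_val:]                 # remaining ~10% of subjects
    return labels, train_ids, val_ids, test_ids

# Repeated across five random seeds, as in the evaluation protocol:
# for seed in range(5):
#     labels, tr, va, te = make_labels_and_splits(post_test_scores, seed)
```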

2. Experimental Design and Evaluation Criteria

Each EEG sample from the CBCP dataset is paired with behaviorally grounded empathy labels. The design leverages both cognitive (Theory-of-Mind, ToM) and emotional (EM) EEG views, facilitating multi-modal analysis through neural architectures. The evaluation protocol used in BEAM ensures rigorous assessment via stratified experiments, and metrics are reported as mean ± standard deviation across repetitions.

Three classification metrics are systematically employed:

  • Accuracy: Proportion of correctly classified subjects.
  • Specificity: True negative rate (correct classification of low empathy).
  • Sensitivity: True positive rate (correct classification of high empathy).

Performance statistics for BEAM (Proposed Method) were: accuracy 0.647 ± 0.008, specificity 0.651 ± 0.009, sensitivity 0.646 ± 0.009. These results establish a quantitative benchmark for future models trained and evaluated on CBCP.
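
A minimal sketch of how these three metrics could be computed from binary predictions is given below; the helper name `empathy_metrics` and the aggregation over seeds are illustrative assumptions, not code from the BEAM paper.

```python
import numpy as np

def empathy_metrics(y_true, y_pred):
    """Accuracy, specificity, and sensitivity for binary empathy labels
    (1 = high empathy, 0 = low empathy)."""
    y_true, y_pred = np.asarray(y_true), np.asarray(y_pred)
    tp = np.sum((y_true == 1) & (y_pred == 1))
    tn = np.sum((y_true == 0) & (y_pred == 0))
    fp = np.sum((y_true == 0) & (y_pred == 1))
    fn = np.sum((y_true == 1) & (y_pred == 0))
    accuracy = (tp + tn) / len(y_true)
    specificity = tn / (tn + fp)   # true negative rate (low-empathy class)
    sensitivity = tp / (tp + fn)   # true positive rate (high-empathy class)
    return accuracy, specificity, sensitivity

# Results are then reported as mean ± standard deviation over the five seeds:
# accs = [empathy_metrics(y, p)[0] for y, p in per_seed_predictions]
# print(f"accuracy: {np.mean(accs):.3f} ± {np.std(accs):.3f}")
```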

3. Feature Representation and Multi-View Structure

The CBCP dataset’s EEG records are formatted as $X \in \mathbb{R}^{C \times W}$, where $C$ is the number of EEG channels and $W$ is the temporal window length. The multi-view paradigm is explicit:

  • $Z_{\text{ToM}}$: Encodes the cognitive Theory-of-Mind (ToM) features.
  • $Z_{\text{EM}}$: Encodes the emotional (EM) features.

The dataset structure allows for extraction and fusion of these distinct representations:

$Z_n = [\text{Com}(Z_n), \text{Sep}(Z_n)]$

Each $Z_n$ (for $n \in \{\text{ToM}, \text{EM}\}$) is decomposed into common and separate latent components, aligned with advanced feature fusion methods in BEAM. This structure supports both joint and differential analysis of cognitive versus emotional EEG signal patterns.
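
As a minimal sketch of this decomposition, one view's encoder output could be projected into common and separate parts and re-concatenated as follows. The linear Com/Sep projections, layer sizes, and variable names here are assumptions and may differ from BEAM's actual architecture.

```python
import torch
import torch.nn as nn

class ViewDecomposer(nn.Module):
    """Illustrative decomposition of one EEG view (ToM or EM) into common
    and separate latent parts, mirroring Z_n = [Com(Z_n), Sep(Z_n)].
    Layer choices and dimensions are assumptions, not BEAM's exact design."""

    def __init__(self, in_dim: int, latent_dim: int):
        super().__init__()
        self.common = nn.Linear(in_dim, latent_dim)    # Com(.)
        self.separate = nn.Linear(in_dim, latent_dim)  # Sep(.)

    def forward(self, z_view: torch.Tensor) -> torch.Tensor:
        # z_view: (batch, in_dim) features from one view's encoder
        z_com = self.common(z_view)
        z_sep = self.separate(z_view)
        return torch.cat([z_com, z_sep], dim=-1)       # [Com(Z_n), Sep(Z_n)]

# Hypothetical usage with encoder outputs for the two views:
# decomposer = ViewDecomposer(in_dim=256, latent_dim=128)
# z_tom_fused = decomposer(z_tom)   # cognitive (ToM) view
# z_em_fused  = decomposer(z_em)    # emotional (EM) view
```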

4. Usage in Deep Learning Empathy Models

CBCP’s design supports models requiring objective, high-dimensional neural time series coupled with reliable outcome labels. It enables training deep frameworks such as BEAM, which utilizes:

  • LaBraM-based Encoder: Transformer model for spatio-temporal EEG patch embedding.
  • Feature Fusion: Latent decomposition and similarity-based fusion loss:

$L_{\text{Fusion}} = \dfrac{|\text{Sim}_{\text{Sep}}|}{\text{Sim}_{\text{Com}} + 1 + \varepsilon}$

Where:

  • $\text{Sim}_{\text{Com}}$ quantifies the cosine similarity of the common parts across views,
  • $\text{Sim}_{\text{Sep}}$ quantifies the cosine similarity of the separate parts.

  • Contrastive Learning: The InfoNCE loss ($L_{\text{Contra}}$) is employed to enforce discriminative latent representations for empathy classification (see the sketch below):

$L_{\text{Contra}} = -\dfrac{1}{B} \sum_{i=1}^{B} \log \dfrac{\exp\left((z_i \cdot z_i^+)/\tau\right)}{\sum_{j=1}^{B} \exp\left((z_i \cdot z_j)/\tau\right)}$

where $B$ is the batch size, $\tau$ is the temperature parameter, and $z_i$ and $z_i^+$ are the latent representations of the anchor and positive samples, respectively.
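
The sketch below shows one plausible PyTorch rendering of both loss terms. How BEAM samples positives and negatives, and whether representations are normalized before the dot products, are assumptions here; the function names are illustrative.

```python
import torch
import torch.nn.functional as F

def fusion_loss(com_tom, com_em, sep_tom, sep_em, eps=1e-8):
    """Similarity-based fusion loss: encourage the common parts of the two
    views to align while keeping the separate parts dissimilar."""
    sim_com = F.cosine_similarity(com_tom, com_em, dim=-1).mean()  # Sim_Com
    sim_sep = F.cosine_similarity(sep_tom, sep_em, dim=-1).mean()  # Sim_Sep
    return sim_sep.abs() / (sim_com + 1.0 + eps)

def info_nce(z_anchor, z_positive, tau=0.1):
    """InfoNCE over a batch: z_anchor[i] pairs with z_positive[i]; the other
    positives in the batch serve as negatives for row i."""
    logits = (z_anchor @ z_positive.t()) / tau            # (B, B) dot products
    targets = torch.arange(z_anchor.size(0), device=z_anchor.device)
    # cross_entropy over the diagonal reproduces -1/B * sum_i log softmax_i(i)
    return F.cross_entropy(logits, targets)
```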

CBCP thus provides sufficient sample diversity, granularity, and label reliability for data-hungry deep learning models targeting neurodevelopmental empathy assessment.

5. Benchmarking, Performance Results, and Model Comparison

The CBCP dataset constitutes the empirical testbed for evaluating methods in childhood empathy prediction. In the case of BEAM, comparison with state-of-the-art alternatives (BIOT, ST-Transformer, and SVM-asymmetry) showed that BEAM achieved consistently superior performance:

| Model | Accuracy | Specificity | Sensitivity |
|---|---|---|---|
| BEAM (Proposed) | 0.647 ± 0.008 | 0.651 ± 0.009 | 0.646 ± 0.009 |
| BIOT | < 0.647 | < 0.651 | < 0.646 |
| ST-Transformer | < 0.647 | < 0.651 | < 0.646 |

The low standard deviation of BEAM's metrics indicates robustness to variations in subject-level splitting, suggesting that the CBCP dataset supports reliable benchmarking of temporal deep learning models in neurobehavioral assessment scenarios.

6. Scientific Impact and Plausible Implications

CBCP’s design and application facilitate objective, scalable assessment of empathy in early childhood, overcoming limitations of self-report and observer-only methodologies. Its integration with multimodal EEG and advanced neural models enables nuanced evaluation of both cognitive and emotional empathy processes.

A plausible implication is that CBCP, if extended to larger and more heterogeneous samples, could underpin the development of predictive neurobehavioral diagnostics or personalized intervention strategies for prosocial development in children. The robust feature extraction and multi-view fusion paradigms it supports are instrumental for future studies aiming to disentangle the neurophysiological substrates of empathy and related affective traits.

7. Limitations and Directions for Future Research

CBCP, in its current instantiation, includes 57 subjects with binary empathy labels derived from a specific post-test paradigm. While effective for benchmarking, any expansion in sample size, age range, or time/event segmentation may further enhance model generalizability. The dataset’s utility for transfer learning, domain adaptation, or finer-grained behavioral stratification remains an area for future exploration. Moreover, its exclusive focus on EEG during passive film viewing may constrain some aspects of ecological validity, suggesting the benefit of supplementing CBCP with data from interactive or real-world empathy elicitation protocols.
