- The paper demonstrates a robust method for quantifying interfering and high-risk behaviors in real-world ASD classrooms using privacy-preserving video analysis.
- It combines multi-person 2D pose estimation and tracking with body joint, temporal, and person attention mechanisms to monitor and detect the targeted behaviors.
- The study offers practical insights for continuous classroom monitoring while advancing theoretical approaches for explainable AI in real-world settings.
Explainable Artificial Intelligence for Quantifying Interfering and High-Risk Behaviors in Autism Spectrum Disorder in a Real-World Classroom Environment Using Privacy-Preserving Video Analysis
Overview
The paper explores the application of explainable artificial intelligence (XAI) techniques to quantify interfering and high-risk behaviors in children with Autism Spectrum Disorder (ASD) in real-world classroom environments. The work leverages state-of-the-art video-based group activity recognition to monitor behaviors such as aggression, self-injury, and restricted and repetitive behaviors (RRBs), with a focus on preserving student privacy. The research stands out by demonstrating the utility of these techniques in real-world, uncontrolled settings rather than controlled laboratory environments.
Methodology
The core methodology involves several advanced AI-powered components:
- Multi-person 2D Pose Estimation and Tracking: The study employs DEKR (Disentangled Keypoint Regression) to detect body joints and uses the Hungarian matching algorithm to associate individuals across video frames.
- Body Joint and Temporal Attention: The model applies attention mechanisms to identify the joint movements and timeframes most relevant for recognizing target behaviors.
- Person Attention Model: This component identifies the individual most likely to be exhibiting the target behavior in a group activity setting, sharpening the model's focus and improving accuracy.
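The tracking step above relies on Hungarian matching: detections in the current frame are assigned to existing tracks by minimizing total association cost. A minimal sketch of this idea, using person centroids and Euclidean distance as an assumed cost (the paper's exact cost function is not specified here), via `scipy.optimize.linear_sum_assignment`:

```python
import numpy as np
from scipy.optimize import linear_sum_assignment

def match_poses(prev_centers, curr_centers):
    """Frame-to-frame person association via Hungarian matching.

    prev_centers: (N, 2) centroids of tracked people in the previous frame
    curr_centers: (M, 2) centroids of detections in the current frame
                  (e.g., the mean of each person's detected 2D joints).
    Returns (rows, cols): track rows[k] is matched to detection cols[k]
    with minimal total distance.
    """
    # Pairwise Euclidean distances form the assignment cost matrix.
    cost = np.linalg.norm(
        prev_centers[:, None, :] - curr_centers[None, :, :], axis=-1
    )
    return linear_sum_assignment(cost)

# Two tracked people; in the new frame they are detected in swapped order.
prev = np.array([[10.0, 10.0], [50.0, 50.0]])
curr = np.array([[51.0, 49.0], [11.0, 10.0]])
rows, cols = match_poses(prev, curr)
# Optimal assignment: track 0 -> detection 1, track 1 -> detection 0
```

In practice, tracking systems add a gating threshold so that unmatched detections spawn new tracks rather than being forced into a poor assignment.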
Data Collection and Setting
The study was conducted at The Center for Discovery (TCFD), a specialized center in New York. It involved nine male students aged 12 to 20, diagnosed with ASD and moderate to severe intellectual disability. Data were collected using two cameras at different angles (a top-down view and a side view), capturing a comprehensive picture of classroom activities. The dataset consisted of approximately 44 hours of video, manually annotated for target behaviors by trained research assistants.
Results
The proposed XAI system achieved a 77% F1-score for detecting target behaviors in top-down view videos. Evaluated against several baseline and attention-based models, the Person Attention model yielded the best results. The system also showed some capacity for behavior prediction, though with limited accuracy (a 53% F1-score at a 3-minute prediction horizon).
The detailed per-category results revealed that restricted and repetitive behaviors were detected with the highest true positive rate (TPR, 39.48%), while elopement behaviors had the lowest (0.05%).
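For reference, the F1-scores reported above are the harmonic mean of precision and recall. A minimal sketch with hypothetical detection counts (not the paper's actual confusion-matrix values) shows how such a score is computed:

```python
def f1_score(tp, fp, fn):
    """F1 = harmonic mean of precision and recall."""
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)

# Hypothetical counts: 77 true positives, 23 false positives,
# 23 false negatives give precision = recall = 0.77, hence F1 = 0.77.
f1 = f1_score(77, 23, 23)
```

F1 is a common choice for behavior detection because the target behaviors are rare, so raw accuracy would be dominated by the majority "no behavior" class.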
Implications and Future Work
This research carries significant practical and theoretical implications for monitoring ASD behaviors:
- Practical Implications: The developed system can alleviate the burden on classroom staff, providing continuous and objective behavior monitoring without the need for dedicated observers. This can enhance intervention accuracy and resource allocation in educational settings catering to children with ASD.
- Theoretical Implications: The study expands the understanding of applying XAI in real-world environments. The adoption of attention mechanisms at multiple levels (joint, temporal, and person) offers insights into creating more interpretable and robust AI models for complex group activity recognition tasks.
Future research would benefit from addressing the identified limitations, such as the small dataset size and the need for broader generalization. The authors propose extending the model to classify fine-grained behavior categories and incorporating multi-modal data (e.g., audio-visual analysis) to improve detection accuracy. Additionally, deploying the model in real time on low-cost edge computing devices could facilitate broader adoption and validation.
In conclusion, the paper showcases an important step toward deploying practical, explainable AI tools for behavior analysis in ASD, providing a foundation for more extensive, longitudinal studies to monitor and support children with ASD in various educational settings.