HeartbeatCam: Self-Triggered Photo Elicitation of Stress Events Using Wearable Sensing

Published 5 Apr 2026 in cs.HC | (2604.04314v1)

Abstract: People often recognize what triggered their stress only after the moment has passed. In therapy, this can become a recurring problem: clients are asked to remember what happened between sessions, but the details that matter (where they were, what they saw and heard, what was happening around them) are easy to lose. We introduce HeartbeatCam, a wearable sensing system that gathers contextual information during moments of elevated stress. It uses a consumer smartwatch stress signal to trigger capture from an open-source AR glasses camera, recording a sparse image-audio clip that can later be reviewed and annotated. The system adopts an actionable sensing approach to mental healthcare, using physiological signals along with contextual capture to support collaborative interpretation of stress-triggering moments with mental health professionals.

Authors (2)

Summary

  • The paper introduces HeartbeatCam, a system that automatically captures contextual data during stress events using HRV-based triggers and AR glasses.
  • The paper demonstrates a novel methodology by integrating real-time ultra-short RMSSD HRV analysis with photo elicitation to enhance clinical evaluations.
  • The paper reports initial feedback from two clinicians, who expect that sensor data paired with captured context could improve therapeutic engagement and the accuracy of self-report.

Introduction

The paper "HeartbeatCam: Self-Triggered Photo Elicitation of Stress Events Using Wearable Sensing" (2604.04314) presents a wearable system designed to augment therapeutic outcomes in mental health interventions by capturing naturalistic, first-person contextual data at moments of elevated physiological stress. The system integrates commercially available wrist-worn HRV sensors with open-source AR glasses to automatically trigger egocentric image and audio capture during detected stress episodes. This approach aims to overcome limitations of retrospective self-report in therapy and provide a structured record of environmental context, thereby enhancing clinical discussions on stress triggers, adherence, and emotion regulation strategies.

System Design and Workflow

The HeartbeatCam system comprises three primary architectural components:

  • A consumer-grade wrist-worn HRV sensor (e.g., Garmin) for continuous physiological monitoring,
  • Open-source AR glasses (Brilliant Labs Frame) for egocentric image and audio capture,
  • A mobile companion application orchestrating BLE communication, threshold-based stress event detection, data storage, and annotation.

Trigger logic uses ultra-short RMSSD (the root mean square of successive differences between heartbeat intervals), an HRV feature, with a stress threshold individualized against a rolling one-week baseline. A stress event is detected when RMSSD drops more than 1.5 standard deviations below baseline within a 25-second window; each event triggers the glasses to capture a 720p image and a 3-second audio snippet. To prevent over-capture, image capture is rate-limited to at most one per minute, and the user can pause a capture session manually through the glasses.
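
The trigger rule described above can be sketched in a few lines of Python. This is a minimal illustration rather than the authors' implementation: the names `rmssd`, `StressTrigger`, and `should_capture` are hypothetical, the baseline is assumed to arrive as a plain list of earlier RMSSD values (the summary does not say how the rolling week is stored or updated), and the beat-to-beat (RR) intervals for each 25-second window are assumed to be available in milliseconds.

```python
import math
from statistics import mean, stdev


def rmssd(rr_ms):
    """Root mean square of successive differences between RR intervals (ms)."""
    diffs = [b - a for a, b in zip(rr_ms, rr_ms[1:])]
    if not diffs:
        raise ValueError("need at least two RR intervals")
    return math.sqrt(sum(d * d for d in diffs) / len(diffs))


class StressTrigger:
    """Hypothetical trigger: RMSSD below baseline mean minus k standard deviations."""

    def __init__(self, baseline_rmssd, k=1.5, min_gap_s=60.0):
        # Assumption: baseline_rmssd holds RMSSD values from the past week.
        self.mu = mean(baseline_rmssd)
        self.sigma = stdev(baseline_rmssd)
        self.k = k                  # 1.5 standard deviations, as in the text
        self.min_gap_s = min_gap_s  # at most one capture per minute
        self.last_capture_s = None
        self.paused = False         # set True when the wearer pauses capture

    def should_capture(self, window_rr_ms, now_s):
        """Return True if this window should trigger an image/audio capture."""
        if self.paused:
            return False
        stressed = rmssd(window_rr_ms) < self.mu - self.k * self.sigma
        rate_ok = (
            self.last_capture_s is None
            or now_s - self.last_capture_s >= self.min_gap_s
        )
        if stressed and rate_ok:
            self.last_capture_s = now_s
            return True
        return False
```

In use, the companion app would call `should_capture` once per window with the current timestamp in seconds; a window that crosses the threshold less than 60 seconds after the previous capture is dropped by the rate limit.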

Captured data are accessible for user annotation, with a deliberate delay of 24 hours before the image/audio is revealed to the user, reducing the risk of overstimulation. Metadata (time, HRV, heart rate) is available immediately. Batch export supports collaborative review in clinical sessions, operationalizing photo-elicitation workflows within therapy.
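
The delayed-reveal rule lends itself to a small sketch: metadata is shown immediately, while media paths are withheld until 24 hours have passed. The `Capture` record and `visible_fields` function below are hypothetical names chosen for illustration, and the field layout is an assumption, not the app's actual data model.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta

REVEAL_DELAY = timedelta(hours=24)  # delay stated in the text


@dataclass
class Capture:
    """Hypothetical record for one triggered capture."""
    captured_at: datetime
    rmssd_ms: float
    heart_rate_bpm: float
    image_path: str
    audio_path: str


def visible_fields(capture, now):
    """Return the fields the user may see at time `now`."""
    # Metadata (time, HRV, heart rate) is available immediately.
    view = {
        "captured_at": capture.captured_at,
        "rmssd_ms": capture.rmssd_ms,
        "heart_rate_bpm": capture.heart_rate_bpm,
    }
    # Image and audio are revealed only after the delay has elapsed.
    if now - capture.captured_at >= REVEAL_DELAY:
        view["image_path"] = capture.image_path
        view["audio_path"] = capture.audio_path
    return view
```

Keeping the check in one function means the list view, the annotation screen, and the batch export would all apply the same rule, assuming they all go through it.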

Comparative Context and Methodological Positioning

The system draws from and extends several streams of prior work:

  • Photo-elicitation in therapy: Existing literature supports the use of visual records (e.g., SenseCam, wearable cameras) to cue autobiographical recall and facilitate self-insight in therapy [hodgesSenseCamWearableCamera2011]. However, prior solutions rely on manual triggers or fixed capture intervals, which yield contextually irrelevant data [mairUsingWearableCamera2021]. HeartbeatCam instead makes capture context-sensitive and just-in-time by tying it to physiological signals.
  • Stress event detection via HRV: Ultra-short HRV analysis is an established approach to stress detection, with growing evidence for its reliability on consumer-grade wearables [Kim2018StressAHA, Castaldo2019UltrashortTH]. However, its specificity is limited, and confounding factors (physical activity, postural changes, caffeine) remain a recognized challenge.
  • Actionable sensing in clinical practice: There is consensus that passive sensing alone is insufficient for clinical interpretation; contextual augmentation is required to make sensed events actionable in care [adlerDetectionActionableSensing2024]. HeartbeatCam directly implements this actionable sensing paradigm, linking an objective trigger with rich subjective context and collaborative clinical review.

Preliminary Clinical Stakeholder Feedback

Structured walkthroughs with clinicians (N=2) indicate strong perceived clinical utility. Experts anticipate that HeartbeatCam could reduce "adherence and compliance gaps" in therapy, facilitate visualization of emotional trajectories, and provide data-driven feedback on therapeutic interventions. However, they also raise several critical issues:

  • Therapeutic alliance impact: Clinicians highlighted the risk of undermining rapport if sensor-generated data is privileged over patient self-report, particularly when the two disagree. This is a recognized concern in the digital phenotyping literature and underscores the need for systems to reinforce, rather than destabilize, collaborative, client-centered interpretation.
  • Privacy and interpretability: Contextual data (image, audio) could expose sensitive or unintended information. Safeguards such as user-controlled annotation, delayed image revealing, and data minimization were positively evaluated.
  • Integration of psychometric tools: Future work should consider embedding clinically validated annotation frameworks (e.g., CBT worksheets) to systematize self-report and minimize interpretational ambiguity.

Limitations and Future Directions

HeartbeatCam's main limitation is its reliance on ultra-short HRV as its only stress proxy. HRV is sensitive to many physiological perturbations unrelated to psychological arousal, which lowers the signal-to-noise ratio for relevant events. The authors propose integrating additional signals (skin temperature, IMU for activity/posture, geofencing for context) to better disambiguate stress-related episodes.
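
One way such gating might work is to suppress an HRV trigger whenever accelerometer data suggests the wearer is moving. The sketch below is an assumption-laden illustration, not a method from the paper: `is_moving`, `gated_trigger`, and the 0.15 g variability threshold are all hypothetical, and a real system would need a validated activity classifier.

```python
import math
from statistics import pstdev


def is_moving(accel_xyz_g, threshold_g=0.15):
    """Crude motion check: variability of acceleration magnitude (in g).

    Assumption: accel_xyz_g is a list of (x, y, z) samples covering the
    same window as the HRV measurement; threshold_g is illustrative only.
    """
    magnitudes = [math.sqrt(x * x + y * y + z * z) for x, y, z in accel_xyz_g]
    return pstdev(magnitudes) > threshold_g


def gated_trigger(hrv_stressed, accel_xyz_g):
    """Fire only when HRV indicates stress and the wearer appears still."""
    return hrv_stressed and not is_moving(accel_xyz_g)
```

A hard gate like this trades missed events during movement for fewer activity-driven false triggers; whether that trade suits therapy clients is an open question the proposed field studies would need to answer.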

On the informatics side, the pipeline could be extended with semantic clustering of egocentric media using vision models. Automated clustering would support temporal and thematic review at scale, which becomes necessary as deployments grow in duration and number of participants [Ji2018InvariantICA].
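
As a rough illustration of thematic grouping, the sketch below clusters precomputed image embeddings by cosine similarity. It is deliberately simple and is not the clustering method of the cited work: it assumes some vision model has already turned each capture into a non-zero vector, and `cosine`, `greedy_cluster`, and the 0.9 similarity cutoff are hypothetical choices.

```python
import math


def cosine(a, b):
    """Cosine similarity between two non-zero vectors of equal length."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def greedy_cluster(embeddings, min_similarity=0.9):
    """Assign each embedding to the first cluster whose seed is similar enough.

    Returns one integer label per embedding, in input order. The first
    member of each cluster acts as its seed; seeds are never updated.
    """
    seeds = []
    labels = []
    for vec in embeddings:
        for idx, seed in enumerate(seeds):
            if cosine(vec, seed) >= min_similarity:
                labels.append(idx)
                break
        else:
            # No existing cluster is close enough, so start a new one.
            seeds.append(vec)
            labels.append(len(seeds) - 1)
    return labels
```

Captures that share a label could then be reviewed together in session, for example all moments recorded in the same setting. The result depends on input order, which is acceptable for a sketch but is a reason to prefer an established clustering algorithm in practice.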

The work does not yet include a large-scale, in-the-wild deployment or quantitative efficacy results. The paper proposes future field studies with therapy clients, benchmarking HeartbeatCam against conventional journaling and worksheet approaches on metrics such as recall fidelity, engagement, and downstream clinical outcomes.

Implications for Practice and Theory

HeartbeatCam exemplifies a system architecture that operationalizes "actionable sensing," aligning objective physiological detection with context capture and collaborative reflection. Practically, such systems hold promise for more ecologically valid, user-centered records of distress events, and may make between-session therapeutic interventions more granular and relevant. Theoretically, the work points to the need to consider the interpretational interface (how sensed and experienced data are presented, annotated, and discussed) rather than focusing on sensor accuracy in isolation.

The approach is extensible to adjunctive use in exposure therapies, PTSD interventions, and other conditions where environmental triggers are salient and recall is problematic [backEnhancingProlongedExposure2022, evansUsingSensorCapturedPatientGenerated2024]. The system also provides a template for broader digital phenotyping efforts that require actionable translation of passive physiological signals.

Conclusion

HeartbeatCam proposes an approach to contextualizing wearable physiological data for use in therapeutic settings, using real-time stress detection to trigger egocentric capture. Early qualitative feedback from clinicians supports its design rationale, while also surfacing critical considerations about data interpretation, privacy, and integration into the therapeutic process. The work lays groundwork for future research on multimodal sensor fusion, AI-enabled context mining, and quantitative clinical validation in real-world deployments.
