Self-Developed Wrist-Worn Device
- Self-developed wrist-worn devices are custom-built wearables that integrate diverse sensors and actuators for tracking physiology, motion, and cognition.
- They employ sensing architectures including ultrasound, optical PPG, resistive touch grids, IMUs, and EMG, chosen to balance accuracy against integration constraints.
- Embedded machine learning and optimized signal processing enable real-time data fusion and low-power operation for continuous user monitoring.
A self-developed wrist-worn device refers to a wearable system worn on the wrist and built or prototyped by the end user, engineer, or researcher rather than purchased as a commercial “black box” product. This category encompasses a broad spectrum of functionalities, technical approaches, and application domains. Devices can be designed for continuous physiological monitoring, 3D motion capture, cognitive assessment, rich multimodal input, or advanced haptic actuation. Researchers have detailed comprehensive workflows for the hardware, firmware, and evaluation of such devices across diverse studies, including SonarWatch, EchoWrist, WristSketcher, Pneutouch, and others. The defining characteristic is the explicit, open specification and construction pathway, which enables replication, extension, and critical comparison.
1. Hardware Architectures and Sensing Modalities
Self-developed wrist-worn systems implement a wide variety of sensing principles, typically chosen based on task and integration constraints. Key modalities include:
- Ultrasound Transduction: SonarWatch and EchoWrist employ ultrasonic speakers and microphones arranged across the watch body to sense the environment via reflected chirps or FMCW sweeps. Typical bands are 16.5–20 kHz (SonarWatch) or 20–24 kHz (EchoWrist), with transducers positioned on opposing sides of the device or under the display (Shi et al., 22 Aug 2024, Lee et al., 30 Jan 2024); a chirp-generation sketch follows this list.
- Optical PPG: Photoplethysmographic (PPG) sensors use green LEDs (~520 nm) and photodiodes for pulse waveform acquisition. These can provide both heart rate and activity features at very low energy budgets (Brophy et al., 2020).
- Pressure and Touch Grids: WristSketcher incorporates a flexible, multilayer resistive grid (44×44 points) for high-resolution surface interaction and dynamic 2D input (Ying et al., 2022).
- Inertial (IMU): 9-axis IMUs (3×accel, 3×gyro, 3×mag) are standard for orientation, movement, and gesture context, requiring sampling rates ≥200 Hz for interactive use (Shi et al., 22 Aug 2024).
- EMG: Surface electromyography, using gold-plated dry electrodes and high-impedance analog front-ends, allows muscle activity quantification for force and intent inference (Xiao et al., 5 Oct 2025).
- Haptic Actuation: Custom actuators range from voice-coil force feedback (CoWrHap) to linear-resonant, vibrotactile grid arrays (Heterogeneous Stroke) and complex pneumatic inflatables (Pneutouch) (Adeyemi et al., 2023, Kim et al., 20 Nov 2025, Liu et al., 30 Jan 2025).
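For illustration, the near-ultrasound transmit side referenced above can be sketched as a windowed linear chirp. The sampling rate, duration, and band edges below are assumptions for a SonarWatch-style band, not the published waveform of either system:

```python
import numpy as np

FS = 48_000                         # assumed audio sampling rate (Hz)
F_START, F_STOP = 16_500, 20_000    # SonarWatch-style band; EchoWrist uses ~20-24 kHz
CHIRP_S = 0.012                     # assumed chirp duration (s)

def linear_chirp(f0: float, f1: float, dur_s: float, fs: int = FS) -> np.ndarray:
    """Generate a linear up-chirp (FMCW-style sweep) from f0 to f1 over dur_s seconds."""
    t = np.arange(int(dur_s * fs)) / fs
    k = (f1 - f0) / dur_s                          # sweep rate (Hz/s)
    phase = 2 * np.pi * (f0 * t + 0.5 * k * t**2)
    # A Hann window tapers the edges to limit audible clicks and spectral leakage.
    return np.sin(phase) * np.hanning(t.size)

tx_chirp = linear_chirp(F_START, F_STOP, CHIRP_S)
```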
Device form factors are heavily optimized for skin contact, stability under motion, and battery integration—often within a 30–60 g mass and profile below 10 mm, except in advanced haptic cases (e.g., Pneutouch at 584 g) where additional actuators or pumps are required.
2. Signal Processing and Data Fusion Pipelines
Raw sensor data is subject to multi-stage signal processing prior to downstream inference or interaction:
- Ultrasound/Acoustic: Cross-correlation with reference chirps yields time-of-flight (ToF) for distance estimation (SonarWatch), while frequency-domain profiles (short-time energy, STFT) or 2D spatiotemporal “echo cubes” feed learned models for pose or interaction class (Shi et al., 22 Aug 2024, Lee et al., 30 Jan 2024). Acoustic data is typically band-pass filtered to suppress out-of-band ambient noise; a matched-filter ToF sketch follows this list.
- PPG: Anti-alias low-pass filtering followed by downsampling (to 5–10 Hz), 8 s windowing, and conversion to waveform images (for HAR) or direct 1D CNN input (for HR estimation) enables simultaneous monitoring at ultra-low power (Brophy et al., 2020).
- Pressure/Touch Grids: Noise is mitigated via median filtering, thresholding, and connected-component analysis for touch isolation, followed by temporal smoothing for coordinate stability and simple timing rules for gesture segmentation (Ying et al., 2022).
- IMU/EMG Fusion: Modern frameworks leverage dual-branch transformer or LSTM-based architectures, with cross-modal attention to align proprioceptive and muscle-derived streams for fine-grained pose and force inference. Preprocessing includes baseline normalization, envelope extraction, and quaternion-based orientation representations (Xiao et al., 5 Oct 2025).
- Haptic Encoding: For complex tactile feedback, actuation is programmed using spatiotemporal pattern (STP) encoding, with independently modulated burst parameters (frequency, roughness, inter-stimulus interval) for each tactor (Kim et al., 20 Nov 2025).
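A minimal matched-filter delay estimator for the ToF step above might look as follows; the synthetic echo, sampling rate, and reference tone are assumptions, not the SonarWatch pipeline:

```python
import numpy as np

def estimate_tof(rx: np.ndarray, tx: np.ndarray, fs: int, c: float = 343.0):
    """Cross-correlate a received frame with the reference chirp and return
    (time of flight in seconds, round-trip path length in metres)."""
    corr = np.correlate(rx, tx, mode="full")         # matched filter against the chirp
    lag = np.argmax(np.abs(corr)) - (len(tx) - 1)    # lag of the strongest echo
    tof = max(lag, 0) / fs
    return tof, tof * c                              # distance = ToF x speed of sound

# Synthetic check: an echo delayed by 2 ms corresponds to ~0.69 m round trip at 343 m/s.
fs = 48_000
tx = np.sin(2 * np.pi * 18_000 * np.arange(int(0.01 * fs)) / fs)
rx = np.concatenate([np.zeros(int(0.002 * fs)), 0.3 * tx, np.zeros(200)])
print(estimate_tof(rx, tx, fs))
```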
Signal processing is highly optimized for real-time operation, often exploiting low-power DSP or fixed-point pipelines on microcontrollers (e.g., Q1.15 arithmetic in PuLsE (Giordano et al., 21 Oct 2024)).
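The fixed-point arithmetic mentioned above can be illustrated with a Q1.15 sketch (written in Python for readability; firmware would use 16/32-bit integer C, e.g., via CMSIS-DSP):

```python
Q15_ONE = 1 << 15   # scaling factor for Q1.15 (1 sign bit, 15 fractional bits)

def to_q15(x: float) -> int:
    """Quantize a float in [-1, 1) to a saturated 16-bit Q1.15 integer."""
    return max(-Q15_ONE, min(Q15_ONE - 1, int(round(x * Q15_ONE))))

def q15_mul(a: int, b: int) -> int:
    """Multiply two Q1.15 values: widen, shift right by 15, saturate."""
    return max(-Q15_ONE, min(Q15_ONE - 1, (a * b) >> 15))

def from_q15(x: int) -> float:
    return x / Q15_ONE

# 0.5 * 0.25 == 0.125, recovered to within one quantization step:
print(from_q15(q15_mul(to_q15(0.5), to_q15(0.25))))
```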
3. Embedded Machine Learning and Classification
Many self-developed wrist-worn devices integrate lightweight embedded ML models:
- Tree-Based Models: SonarWatch employs LightGBM classifiers on concatenated feature vectors from IMU and acoustic channels, supporting rapid, low-overhead inference (Shi et al., 22 Aug 2024); a training sketch follows this list.
- Deep CNN/Transformer: EchoWrist uses ResNet-18 for pose/interaction on echo cubes, while Wrist2Finger implements a dual-branch transformer with cross-modal fusion for simultaneous pose and force estimation (Lee et al., 30 Jan 2024, Xiao et al., 5 Oct 2025).
- Temporal Networks: LSTM and CNN-LSTM hybrids have been evaluated for EMG-only force regression, though cross-modal approaches demonstrate superior accuracy (force RMSE=0.213, r=0.76) (Xiao et al., 5 Oct 2025).
- Classical Rule-Based: Devices with limited computational budgets, such as WristSketcher, operate using purely threshold-based gesture segmentation or simple pseudocode state machines (Ying et al., 2022).
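As an illustration of the tree-based route above, a gradient-boosted classifier over concatenated acoustic and IMU features could be trained as sketched below. The feature dimensions, hyperparameters, and synthetic data are assumptions, not SonarWatch's published configuration:

```python
import numpy as np
import lightgbm as lgb

# Assumed layout: per-window acoustic features (e.g., band energies) and IMU
# statistics are concatenated into one feature vector per gesture window.
rng = np.random.default_rng(0)
X_acoustic = rng.normal(size=(600, 32))
X_imu = rng.normal(size=(600, 18))
X = np.hstack([X_acoustic, X_imu])
y = rng.integers(0, 12, size=600)          # 12 gesture classes, as in SonarWatch

clf = lgb.LGBMClassifier(n_estimators=200, num_leaves=31, learning_rate=0.05)
clf.fit(X[:480], y[:480])                  # simple holdout split for the sketch
print("held-out accuracy:", clf.score(X[480:], y[480:]))
```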
Model deployment is constrained by on-device flash and RAM, often requiring architectural shrinkage or quantization (e.g., transferring Inception-V3 HAR to lightweight CNNs for embedded use (Brophy et al., 2020)).
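Post-training integer quantization of a small Keras model with TensorFlow Lite gives a flavor of that workflow; the toy 1D CNN and window size are placeholders rather than the HAR architecture cited above:

```python
import numpy as np
import tensorflow as tf

# Stand-in 1D CNN for a PPG window classifier (architecture is illustrative only).
model = tf.keras.Sequential([
    tf.keras.layers.Input(shape=(80, 1)),           # e.g., an 8 s window at 10 Hz
    tf.keras.layers.Conv1D(8, 5, activation="relu"),
    tf.keras.layers.GlobalAveragePooling1D(),
    tf.keras.layers.Dense(6, activation="softmax"),
])

def representative_data():
    """Calibration samples used to pick int8 quantization ranges."""
    for _ in range(100):
        yield [np.random.rand(1, 80, 1).astype(np.float32)]

converter = tf.lite.TFLiteConverter.from_keras_model(model)
converter.optimizations = [tf.lite.Optimize.DEFAULT]
converter.representative_dataset = representative_data
converter.target_spec.supported_ops = [tf.lite.OpsSet.TFLITE_BUILTINS_INT8]
tflite_int8 = converter.convert()                   # flatbuffer for TFLite Micro
print(f"quantized model size: {len(tflite_int8)} bytes")
```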
4. Evaluation Methodologies and Quantitative Performance
Published systems describe thorough validation regimes tailored to the specific sensing or actuation purpose:
- Gesture/Behavior Recognition: SonarWatch reports 93.7% overall accuracy for 12-class gesture recognition and 97.6% for static gesture detection (WristUp), with accuracy varying by less than 1% across 17–65 dB ambient-noise conditions. WristSketcher achieves 96.0% gesture recognition and lower drawing error than mid-air freehand input, despite minor increases in completion time (Shi et al., 22 Aug 2024, Ying et al., 2022).
- Physiological Sensing: PuLsE demonstrates HR extraction with mean error ≈ 0.7 bpm (σ = 2 bpm) at the wrist's lateral position, r = 0.99 versus an ECG reference, and battery lifetime exceeding 7 days (Giordano et al., 21 Oct 2024). PPG-only approaches sustain HR RMSE ≈ 13–14 bpm and HAR accuracy ≈ 83% at 10 Hz sampling (Brophy et al., 2020).
- Hand Pose and Force: EchoWrist reconstructs 20-joint hand poses with MJEDE = 4.8 mm and achieves 97.6% hand-object interaction accuracy after four fine-tuning sessions (Lee et al., 30 Jan 2024). Wrist2Finger attains MPJPE = 0.57 cm (21 joints) and force RMSE = 0.213 for per-finger grip estimation (Xiao et al., 5 Oct 2025); both pose metrics reduce to the mean per-joint error sketched after this list.
- Haptic/Tactile Displays: Heterogeneous Stroke reports >92% accuracy for full alphanumeric STP transmission; CoWrHap and Pneutouch quantify force/realism via psychophysical and application-focused studies (Kim et al., 20 Nov 2025, Adeyemi et al., 2023, Liu et al., 30 Jan 2025).
- Cognitive Assessment: Multimodal reaction-time (RT) platforms achieve millisecond-level accuracy across haptic, auditory, and visual conditions, statistically comparable to PC-based tools once corrected for stimulus and IMU response latency (Sarkar et al., 1 Sep 2025).
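The joint-error metrics quoted above (MJEDE, MPJPE) reduce to the same computation, a mean Euclidean distance over joints and frames; the array shapes below are assumptions:

```python
import numpy as np

def mean_per_joint_error(pred: np.ndarray, gt: np.ndarray) -> float:
    """Mean Euclidean distance between predicted and ground-truth joint positions.

    pred, gt: arrays of shape (n_frames, n_joints, 3) in consistent units (mm or cm).
    """
    return float(np.linalg.norm(pred - gt, axis=-1).mean())

# Toy check: a constant 5 mm offset on every joint yields an error of exactly 5 mm.
gt = np.zeros((10, 20, 3))
pred = gt + np.array([5.0, 0.0, 0.0])
print(mean_per_joint_error(pred, gt))   # -> 5.0
```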
Most studies implement rigorous cross-validation (often leave-one-participant-out), confusion analysis for class overlap, and explicit error distributions for both static and dynamic tasks.
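A leave-one-participant-out protocol can be expressed with scikit-learn's LeaveOneGroupOut; the classifier, feature matrix, and participant count below are placeholders:

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import LeaveOneGroupOut

# Assumed layout: one feature row per gesture window, tagged with a participant ID.
rng = np.random.default_rng(1)
X = rng.normal(size=(300, 24))
y = rng.integers(0, 5, size=300)
participants = rng.integers(0, 10, size=300)    # 10 hypothetical participants

scores = []
for train_idx, test_idx in LeaveOneGroupOut().split(X, y, groups=participants):
    clf = RandomForestClassifier(n_estimators=100, random_state=0)
    clf.fit(X[train_idx], y[train_idx])         # train on all but one participant
    scores.append(clf.score(X[test_idx], y[test_idx]))

print(f"LOPO accuracy: {np.mean(scores):.3f} ± {np.std(scores):.3f}")
```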
5. Software Architectures and Power Optimization
Self-developed wrist-worn systems require careful coordination of acquisition, processing, and power management layers:
- Firmware Pipeline: Real-time ISR-driven sampling, DMA buffering (audio/IMU), algorithmic feature extraction, and interrupt-driven event handling (stimulus delivery, actuator control) are canonical (Shi et al., 22 Aug 2024, Sarkar et al., 1 Sep 2025).
- Duty-Cycle Regulation: Acoustic and IMU modules are selectively activated; e.g., SonarWatch chirps are active for only 11.7 ms of every 86 ms frame, and the MCU sleeps between feature windows, cutting average draw to ~12.4 mW (Shi et al., 22 Aug 2024).
- Host Interaction: BLE UART, WiFi REST endpoints (Pneutouch), and USB serial interfaces are prevalent for event reporting, configuration, or data offloading (Liu et al., 30 Jan 2025).
- Embedded ML Efficiency: DSP routines for fixed-point FFT (CMSIS-DSP) and quantized inference (CMSIS-NN, TensorFlow Lite Micro) are critical for sustaining low-latency and battery operation (Giordano et al., 21 Oct 2024, Brophy et al., 2020).
Optimizations target milliwatt-scale average power (often a few mW), with system lifetimes of 1–7 days on 150–300 mAh lithium cells, depending on actuator or display power profiles.
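A back-of-envelope sketch shows how duty cycling and cell capacity translate into lifetime; the active/sleep draws and cell size below are assumptions chosen only to land near the ~12 mW average quoted above:

```python
# Rough power and lifetime estimate for a duty-cycled wrist device.
ACTIVE_MW, SLEEP_MW = 90.0, 0.5      # assumed draw while sampling vs. sleeping
DUTY = 11.7 / 86.0                   # active fraction per sensing frame (SonarWatch-style)
BATTERY_MAH, BATTERY_V = 200, 3.7    # assumed small LiPo cell

avg_mw = DUTY * ACTIVE_MW + (1 - DUTY) * SLEEP_MW
avg_ma = avg_mw / BATTERY_V
lifetime_h = BATTERY_MAH / avg_ma

print(f"average draw: {avg_mw:.1f} mW ({avg_ma:.2f} mA)")
print(f"estimated lifetime: {lifetime_h:.0f} h (~{lifetime_h / 24:.1f} days)")
```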
6. Limitations, Open Challenges, and Future Developments
Despite significant progress, self-developed wrist-worn devices face several common constraints:
- Form Factor vs. Function: Achieving rich actuation (e.g., multi-modal haptics, pneumatic inflatables) in a truly watch-sized, ergonomically stable package remains an engineering challenge, with trade-offs in mass and battery endurance (Liu et al., 30 Jan 2025).
- Sensing Ambiguities: Distinguishing closely related gestures or tactile signals requires high spatial acuity and, increasingly, orthogonal feature encoding (frequency, roughness, cross-modal fusion) (Kim et al., 20 Nov 2025).
- Calibration Overhead: Systems reliant on personalized fine-tuning (EchoWrist, Wrist2Finger) must streamline onboarding for practical deployment. Per-user normalization (EMG baseline/max) and robust automated calibration are under active development (Xiao et al., 5 Oct 2025).
- Closed-Loop Control: Most haptic platforms remain open-loop; integrating real-time force feedback via embedded sensors will improve realism and reliability (Adeyemi et al., 2023, Liu et al., 30 Jan 2025).
- Power Constraints and Duty Cycling: For “always-on” or continuous monitoring use cases, advanced event-driven duty cycling, interrupt-driven wake-up (e.g., tilt detection), and low-leakage hardware components are essential (Shi et al., 22 Aug 2024).
Anticipated advances include higher-density sensor grids, fully untethered operation (BLE SoC+LiPo), multi-modal integration (audio, touch, EMG, PPG), and on-device federated learning for privacy-preserving personalization.
7. Comparative Table of Representative Devices
| Device/Paper | Primary Sensing/Output | Key Metric(s) |
|---|---|---|
| SonarWatch (Shi et al., 22 Aug 2024) | Ultrasound+IMU | 93.7% gesture accuracy |
| EchoWrist (Lee et al., 30 Jan 2024) | Ultrasound (FMCW) | 4.8 mm pose error, 97.6% interaction accuracy |
| WristSketcher (Ying et al., 2022) | Resistive touch grid | 96.0% gesture accuracy, lower drawing error than mid-air input |
| PuLsE (Giordano et al., 21 Oct 2024) | Ultrasound pulses | 0.69 bpm HR error, 5.8 mW power |
| Pneutouch (Liu et al., 30 Jan 2025) | Pneumatic haptics | Enjoyment, realism > alternatives |
| CoWrHap (Adeyemi et al., 2023) | Voice-coil actuation | PSE closer to true for H-WNC, no hand dominance effect |
| Heterogeneous Stroke (Kim et al., 20 Nov 2025) | Vibrotactile display | >92% alphanumeric STP accuracy |
| Multimodal RT (Sarkar et al., 1 Sep 2025) | LED/buzzer/vibrator + IMU | ms-class RT, cross-modality equivalence |
This tabulation demonstrates the diversity and high specificity of modern wrist-worn prototypes, each engineered for distinct interactive, physiological, or perceptual research objectives.