
Multi-Modal EPW Control System

Updated 9 January 2026
  • The multi-modal EPW control system is a platform integrating joystick, speech, gesture, and EOG interfaces to provide adaptive, safe mobility for users with movement impairments.
  • It employs a four-layer architecture that fuses sensor data, real-time processing, and cloud connectivity to ensure continuous monitoring and rapid response to hazards.
  • It leverages data-driven LMPC and embedding-based switched-control methods to optimize mode selection and input arbitration, enabling robust predictive control under dynamic conditions.

A Multi-Modal EPW (Electric-Powered Wheelchair) Control System integrates diverse user-intent interfaces and advanced supervisory logic to enable robust, adaptive, and clinically compliant mobility for individuals with significant movement impairments. The term "multi-modal" indicates concurrent support for heterogeneous control channels—joystick, speech, hand gesture, and electrooculogram (EOG)—uniquely prioritized via arbitration algorithms for seamless, context-sensitive operation. State-of-the-art implementations also incorporate medical-grade biophysical monitoring, data-driven predictive control frameworks, and rigorous safety mechanisms consistent with ISO and IEC standards (Hossain et al., 6 Jan 2026).

1. Layered System Architecture and Core Components

The system architecture adheres to a four-layer hierarchy: Sensing, Processing, Communication, and IoT/Cloud (see Fig. 1 of (Hossain et al., 6 Jan 2026)).

  • Sensing Layer: Captures multimodal user inputs and physiological signals.
    • Control interfaces: Analog joystick (X–Y potentiometers + push-button), speech via smartphone microphone, hand-gesture via glove-mounted ADXL345 accelerometer, EOG using LM358N-based signal acquisition.
    • Biophysical sensors: MAX30100 (SpO₂/HR, 18-bit ADC), DS18B20 (skin temp), ADXL345 (fall/convulsion).
  • Processing Layer:
    • NodeMCU ESP32 microcontroller (dual-core, 240 MHz, FreeRTOS, 520 kB SRAM).
    • Sensor buses: I²C (MAX30100, DS18B20), SPI/I²C (IMU), dual 12-bit ADCs (for analog joystick/EOG).
    • L298N dual H-bridge motor driver for actuation.
  • Communication Layer: Transmits sensor and control data.
    • Wi-Fi (IEEE 802.11 b/g/n): Cloud uplink (ThingSpeak server).
    • BLE (AES-128/CCM encryption): Low-latency link with Android caregiver app.
  • IoT/Cloud Layer:
    • Time-series logging, dashboards, and real-time caregiver alerts via the Android (Thunkable) app.
    • Cloud-based vital-sign monitoring, secure alerting (SMS/email, in-app).

Block interconnection (ASCII schematic as per (Hossain et al., 6 Jan 2026)):

[Joystick] ─┐
[Speech]   ─┼─> ESP32 ─> L298N ─> [Motors/Wheels]
[Gesture]  ─┤     │
[EOG]      ─┘     ├─> Wi-Fi ─> Cloud/ThingSpeak ─> Android App
                  └─> BLE ─────────────────────────────────┘

2. Signal Processing, Feature Extraction, and Interface Logic

  • Joystick: Analog [0–3.3 V] mapped to 12-bit digital; dead-zone elimination and linear scaling yield the PWM duty cycle for motor commands (50 Hz update); see the sketch after this list.
  • Speech: Android’s built-in ASR restricts input to {forward, back, left, right, stop}, transmitted over BLE for decoding and actuation.
  • Gesture: ADXL345 tilt data processed within a ±200 ms window; threshold-based detection maps tilts to navigation actions.
  • EOG: Differential electrode placement, LM358N amplification, band-pass filtering (0.1–35 Hz), feature-detection of sustained horizontal/vertical deviations (>12° for ≥2 s) and double-blink for "stop".
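
A minimal sketch of the joystick mapping above, assuming a mid-scale rest position and an illustrative dead-zone width (the source does not state the exact threshold); the 50 Hz PWM output itself is left to the motor-control loop:

```python
# Illustrative sketch (not the authors' firmware): dead-zone elimination and
# linear scaling of a 12-bit joystick ADC reading to a signed duty cycle.
# ADC_MID rest position and DEAD_ZONE width are assumptions.

ADC_MAX = 4095          # 12-bit ADC full scale (0-3.3 V)
ADC_MID = ADC_MAX // 2  # assumed: joystick at rest reads mid-scale
DEAD_ZONE = 150         # assumed: counts ignored around center

def joystick_to_duty(adc_value: int) -> float:
    """Map a raw 12-bit ADC sample to a PWM duty cycle in [-1.0, 1.0]."""
    offset = adc_value - ADC_MID
    if abs(offset) <= DEAD_ZONE:
        return 0.0                       # inside dead zone: no motion
    # Remove the dead zone, then scale the remaining range linearly.
    sign = 1.0 if offset > 0 else -1.0
    magnitude = (abs(offset) - DEAD_ZONE) / (ADC_MID - DEAD_ZONE)
    return sign * min(magnitude, 1.0)

if __name__ == "__main__":
    for sample in (2048, 2300, 4095, 0):
        print(sample, "->", round(joystick_to_duty(sample), 3))
```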

No deep-learning classifiers are deployed; control relies on fixed feature extraction and rule-based logic. Future directions include SVM- and CNN-based classifiers for the non-analog modalities.

3. Mode Arbitration, Safety Logic, and Real-time Control

Control prioritization leverages a "priority-ladder" scheme:

  • Hazard-first logic: If FallFlag ∨ HealthAlert ∨ ObstacleFlag is true, the system executes SafeHalt→Stop immediately (see the sketch after this list).
  • Mode arbitration: Retains last-used non-hazardous mode or transitions per channel availability and command validity.
  • User-mode manual selection: Four dedicated push-buttons.
  • Fixed-step control loop: 50 Hz (20 ms), managed with FreeRTOS task scheduling—sensor polling, arbitration, PWM updates.
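
A simplified sketch of this priority ladder, with flag names mirroring the text; the FreeRTOS task structure and per-channel command-validity checks are abstracted away:

```python
# Illustrative sketch of the hazard-first "priority ladder" (not the authors'
# firmware). Hazards preempt every input channel; otherwise the last valid,
# non-hazardous mode is retained or a manual selection takes over.

from dataclasses import dataclass

@dataclass
class Flags:
    fall: bool = False
    health_alert: bool = False
    obstacle: bool = False

MODES = ("joystick", "speech", "gesture", "eog")

def arbitrate(flags: Flags, requested_mode: str, last_mode: str) -> tuple[str, bool]:
    """Return (active_mode, safe_halt)."""
    if flags.fall or flags.health_alert or flags.obstacle:
        return last_mode, True           # SafeHalt -> Stop, mode unchanged
    if requested_mode in MODES:
        return requested_mode, False     # valid push-button selection wins
    return last_mode, False              # otherwise retain last-used mode

if __name__ == "__main__":
    print(arbitrate(Flags(obstacle=True), "speech", "joystick"))  # halts
    print(arbitrate(Flags(), "eog", "joystick"))                  # switches to EOG
```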

Latency analysis:

Source                     Value (ms)
Sensor ADC                 ≈0.3
ESP32 processing           ≈0.5
Motor driver settling      ≈0.2
Wi-Fi (health uplink)      ≈4
BLE encryption overhead    0.004

Aggregate closed-loop latency (voice/gesture/EOG → actuation) is 20 ± 0.5 ms, dominated by the fixed 20 ms control period rather than by per-component delays. The system draws ≈8.4 W at 24 V (≈350 mA); BLE encryption overhead is <1%. Battery runtime exceeds 10 h on a 5 Ah pack.

4. Calibration, Biophysical Monitoring, and Cloud Alerting

  • Biophysical sensor calibration utilizes two-point referencing:
    • MAX30100 vs. ISO 80601-2-61 pulse-ox simulator
    • DS18B20: ice-water (0 ℃), boiling-water (100 ℃)
    • ADXL345: static ±1g testing

Root-mean-square errors:

  • Heart rate: ≤2 bpm (N = 80 samples; mean bias ≈0.2 bpm)
  • SpO₂: ≤1% (mean bias –0.3%)
  • Temp: ≤0.5 ℃ (mean bias +0.05 ℃)
  • Cloud telemetry: The ESP32 aggregates and pushes SpO₂, HR, temperature, and fall state to ThingSpeak every 1 s over Wi-Fi; BLE provides fallback/local streaming. Alerts for HR > 140 bpm or < 40 bpm, temperature > 38.5 ℃, or SpO₂ < 90% are issued as SMS/email (SMTP) and in-app indicators (threshold logic sketched below).
  • ISO/IEC compliance: Sensor front-end (ISO 80601-2-61), system safety (ISO 7176-31), medical alarm compatibility (IEC 80601-2-78). Critical risk mitigations include a watchdog timer, a latched emergency stop, and battery protection.
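
A minimal sketch of the published alert thresholds; message formatting and the dispatch path (SMTP/in-app) are assumptions:

```python
# Illustrative sketch of the threshold alerting rules (not the authors' cloud
# code): HR > 140 bpm or < 40 bpm, temperature > 38.5 C, SpO2 < 90 %.

def check_vitals(hr_bpm: float, temp_c: float, spo2_pct: float) -> list[str]:
    """Return alert strings for any vital sign outside the published limits."""
    alerts = []
    if hr_bpm > 140 or hr_bpm < 40:
        alerts.append(f"HR out of range: {hr_bpm:.0f} bpm")
    if temp_c > 38.5:
        alerts.append(f"High temperature: {temp_c:.1f} C")
    if spo2_pct < 90:
        alerts.append(f"Low SpO2: {spo2_pct:.0f} %")
    return alerts

if __name__ == "__main__":
    print(check_vitals(hr_bpm=150, temp_c=37.0, spo2_pct=88))
```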

5. Data-Driven and Embedding-Based Multi-Modal Control Methodologies

Two advanced paradigms generalize multi-modal EPW control for dynamic and uncertain environments:

5.1 Data-Driven Multi-Modal LMPC

  • Affine Time-Varying (ATV) Modeling: Learns local dynamics $x_{k+1} \approx A_k x_k + B_k u_k + w_k$ from historical trajectories sampled across "modes" (e.g., floor-material friction variants: carpet/tile/pavement) (Kopp et al., 2024); the identification step is sketched after this list.
  • Sampling Safe Sets: Constructs a convex hull from nearest prior feasible states, ensuring recursive feasibility via tube-based constraint tightening.
  • LMPC Optimization: At each instant, leverages ATV-identified models and convex safe sets to solve for optimal input sequences:

$$\min_{u,\lambda}\; \sum_{k} \ell(x_k, u_k) + \overline{V}(x_{t+N}, \lambda) \quad \text{s.t.} \quad x_{k+1} = A_k x_k + B_k u_k + w_k,\; x_k \in X_{\ominus},\; u_k \in U_{\ominus}$$

  • Mode adaptation: Pseudo-real-time update via selection of nearest data neighbors, with robust fallback (LQR) in case of insufficient local data.
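
A minimal sketch of the ATV identification step under stated assumptions (the function name fit_local_atv and the neighbor count K are illustrative; the convex safe set, tube tightening, and the LMPC optimization itself are omitted):

```python
# Illustrative sketch (not Kopp et al.'s code): fit local A_k, B_k by least
# squares over the K stored transitions nearest the current state, so the
# model tracks the active "mode" (e.g., floor friction).

import numpy as np

def fit_local_atv(x_query, X, U, X_next, K=20):
    """Least-squares [A_k, B_k] from the K transitions nearest x_query.

    X, U, X_next: arrays of shape (N, nx), (N, nu), (N, nx) holding logged
    states, inputs, and successor states."""
    dists = np.linalg.norm(X - x_query, axis=1)
    idx = np.argsort(dists)[:K]                      # K nearest neighbors
    Z = np.hstack([X[idx], U[idx]])                  # regressors (K, nx+nu)
    Theta, *_ = np.linalg.lstsq(Z, X_next[idx], rcond=None)
    nx = X.shape[1]
    A_k = Theta[:nx].T                               # (nx, nx)
    B_k = Theta[nx:].T                               # (nx, nu)
    return A_k, B_k

if __name__ == "__main__":
    # Noise-free synthetic data: the fit recovers the true local model.
    rng = np.random.default_rng(0)
    A_true = np.array([[1.0, 0.1], [0.0, 0.9]])
    B_true = np.array([[0.0], [0.1]])
    X = rng.normal(size=(200, 2)); U = rng.normal(size=(200, 1))
    X_next = X @ A_true.T + U @ B_true.T
    A_k, B_k = fit_local_atv(np.zeros(2), X, U, X_next)
    print(np.round(A_k, 3), np.round(B_k, 3))
```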

5.2 Embedding Methods for Switched Optimal Control

  • Binary embedding: Encodes each of $M$ modes with $b = \lceil \log_2 M \rceil$ binary variables $v_i \in \{0,1\}$, replacing the discrete switching law with continuously relaxed variables $v_i \in [0,1]$ (Sakha et al., 14 Dec 2025).
  • Embedded dynamics/costs: Constructs the weighted sum of subsystem dynamics (and costs) using mode-indicator polynomials $V_k(v)$ (see the original for the explicit form).
  • Concave auxiliary penalty: Forces bang–bang minimizers of the relaxed embedding and excludes invalid mode bitstrings, via:

$$L_M(v) = \alpha \sum_{i=0}^{b-1} v_i (1 - v_i) + \beta \sum_{k=M}^{2^b - 1} \prod_{i:\, k_i = 1} v_i$$

  • Application to EPW: For example, $M = 5$ modes (idle, forward, reverse, turn-left, turn-right) require $b = 3$ bits; direct collocation solvers then yield boundary (binary) optimal schedules implementable directly on physical wheelchair platforms. The penalty is evaluated in the sketch below.
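
A minimal evaluation of the auxiliary penalty $L_M(v)$ for this $M = 5$, $b = 3$ example; the weights α and β are arbitrary illustrative choices:

```python
# Illustrative evaluation of the concave auxiliary penalty L_M(v) for the
# EPW example (M = 5 modes, b = 3 bits). k_i denotes the i-th bit of k;
# alpha and beta are assumed tuning weights.

def penalty(v, M=5, alpha=1.0, beta=1.0):
    b = len(v)  # number of binary embedding variables
    # Concave term: zero only at binary v_i, positive for relaxed values.
    bang_bang = sum(vi * (1.0 - vi) for vi in v)
    # Exclusion term: penalizes bitstrings k = M, ..., 2^b - 1 that encode
    # no valid mode (here 101, 110, 111).
    invalid = 0.0
    for k in range(M, 2 ** b):
        prod = 1.0
        for i in range(b):
            if (k >> i) & 1:
                prod *= v[i]
        invalid += prod
    return alpha * bang_bang + beta * invalid

if __name__ == "__main__":
    print(penalty([0.0, 1.0, 0.0]))  # valid binary code (mode 2): 0.0
    print(penalty([1.0, 0.0, 1.0]))  # invalid code 101: penalized
    print(penalty([0.5, 0.5, 0.0]))  # relaxed values: penalized
```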

6. Experimental Performance, Safety, and Clinical Outcomes

  • Recognition accuracy (20 participants, N = 500 commands):

Modality    Mean Accuracy (95% CI)
Joystick    99% (±0.5%)
Speech      97% (±2%)
Gesture     95% (±3%)
EOG         96% (±2.5%)
  • Biophysical sensor validation: Pearson correlation and Bland–Altman limits confirm medical-grade precision (HR r = 0.98, temperature r = 0.93, SpO₂ r = 0.74; see source for detailed plots).
  • System endurance: >10 h per charge with all input modalities and telemetry enabled.
  • Safety outcomes: Emergency stop and real-time alerts enabled by cloud-integrated sensing and arbitration logic.

This architecture comprehensively addresses accessibility, adaptability, and clinical oversight, and lays the foundation for future semantic intent prediction, semi-autonomous navigation (SLAM), and further autonomy/power-optimization enhancements (Hossain et al., 6 Jan 2026).

7. Limitations and Prospects for Future Development

  • Algorithmic extensibility: Present implementations utilize rule-based classification; future work is directed at integrating convolutional attention modules (CNN/CBAM for vision) and SVM/CNN for EOG/gaze decoding.
  • Predictive control: Adoption of data-driven LMPC and embedding-based switched-optimal-control formulations offers robust performance under model uncertainty and accommodates both discrete (mode) and continuous control objectives (Kopp et al., 2024; Sakha et al., 14 Dec 2025).
  • Safety/standards: Platform aligns with ISO 7176-31 and IEC 80601-2-78, continually adapting to evolving risk profiles through cloud-based analysis and machine learning.
  • Clinical scalability: Initial results verify command accuracy above 95% for varied modalities; large-scale longitudinal studies and adaptive intent modeling are required for broader deployment.

A plausible implication is the future integration of high-dimensional sensor fusion, predictive health event analysis, and shared-control autonomy, capitalizing on the multi-modal system’s extensible architecture for both research and advanced clinical deployment.
