Multi-Modal EPW Control System

Updated 9 January 2026

The multi-modal EPW control system is a platform integrating joystick, speech, gesture, and EOG interfaces to provide adaptive, safe mobility for users with movement impairments.
It employs a four-layer architecture that fuses sensor data, real-time processing, and cloud connectivity to ensure continuous monitoring and rapid response to hazards.
It leverages data-driven LMPC and embedding-based methods to optimize input arbitration, enabling robust predictive control under dynamic conditions.

A Multi-Modal EPW (Electric-Powered Wheelchair) Control System integrates diverse user-intent interfaces and advanced supervisory logic to enable robust, adaptive, and clinically compliant mobility for individuals with significant movement impairments. The term "multi-modal" indicates concurrent support for heterogeneous control channels—joystick, speech, hand gesture, and electrooculogram (EOG)—uniquely prioritized via arbitration algorithms for seamless, context-sensitive operation. State-of-the-art implementations also incorporate medical-grade biophysical monitoring, data-driven predictive control frameworks, and rigorous safety mechanisms consistent with ISO and IEC standards (Hossain et al., 6 Jan 2026).

1. Layered System Architecture and Core Components

The system architecture adheres to a four-layer hierarchy: Sensing, Processing, Communication, and IoT/Cloud (see Fig. 1 of Hossain et al., 6 Jan 2026).

Sensing Layer: Captures multimodal user inputs and physiological signals.
- Control interfaces: Analog joystick (X–Y potentiometers + push-button), speech via smartphone microphone, hand-gesture via glove-mounted ADXL345 accelerometer, EOG using LM358N-based signal acquisition.
- Biophysical sensors: MAX30100 (SpO₂/HR, 18-bit ADC), DS18B20 (skin temp), ADXL345 (fall/convulsion).
Processing Layer:
- NodeMCU ESP32 microcontroller (dual-core, 240 MHz, FreeRTOS, 520 kB SRAM).
- Sensor buses: I²C (MAX30100), 1-Wire (DS18B20), SPI/I²C (ADXL345 accelerometer), dual 12-bit ADCs (for analog joystick/EOG).
- L298N dual H-bridge motor driver for actuation.
Communication Layer: Transmits sensor and control data.
- Wi-Fi (IEEE 802.11 b/g/n): Cloud uplink (ThingSpeak server).
- BLE (AES-128/CCM encryption): Low-latency link with Android caregiver app.
IoT/Cloud Layer:
- Time-series logging, dashboards, and real-time caregiver alerts via Android (Thunkable) app.
- Cloud-based vital-sign monitoring, secure alerting (SMS/email, in-app).
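
The cloud uplink described above can be illustrated with a minimal sketch that builds one ThingSpeak update request. The endpoint and the `api_key`/`fieldN` query parameters follow ThingSpeak's public REST interface; the assignment of vitals to `field1` through `field4`, the function name, and the placeholder key are assumptions made for illustration and are not taken from the source.

```python
from urllib.parse import urlencode

THINGSPEAK_UPDATE_URL = "https://api.thingspeak.com/update"

def build_update_url(api_key, spo2, heart_rate, skin_temp_c, fall_flag):
    """Return the GET URL for one telemetry sample (field mapping is assumed)."""
    params = {
        "api_key": api_key,              # channel write key (placeholder below)
        "field1": f"{spo2:.1f}",         # SpO2 in percent
        "field2": f"{heart_rate:.0f}",   # heart rate in bpm
        "field3": f"{skin_temp_c:.2f}",  # skin temperature in deg C
        "field4": int(fall_flag),        # 1 if a fall was detected, else 0
    }
    return f"{THINGSPEAK_UPDATE_URL}?{urlencode(params)}"

url = build_update_url("WRITE_KEY_PLACEHOLDER", 97.0, 72, 36.6, False)
print(url)
```

The sketch only constructs the URL; on the ESP32 the same query string would be sent by the firmware's HTTP client.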

Block interconnection (ASCII schematic as per Hossain et al., 6 Jan 2026):

[Joystick] ─┐
[Speech]   ─┼─> ESP32 ─┬─> L298N ─> [Motors/Wheels]
[Gesture]  ─┤          ├─> Wi-Fi ─> Cloud/ThingSpeak ─> Android App
[EOG]      ─┘          └─> BLE ───────────────────────> Android App

2. Signal Processing, Feature Extraction, and Interface Logic

Joystick: Analog output (0–3.3 V) is digitized at 12 bits; dead-zone elimination and linear scaling yield the PWM duty cycle for motor commands (50 Hz).
Speech: Android's built-in ASR output is restricted to the vocabulary {forward, back, left, right, stop} and transmitted over BLE for decoding and actuation.
Gesture: ADXL345 tilt data are processed within a ±200 ms window; threshold-based detection maps tilt to navigation actions.
EOG: Differential electrode placement, LM358N amplification, and band-pass filtering (0.1–35 Hz), followed by detection of sustained horizontal/vertical deviations (>12° for ≥2 s) and a double blink for "stop".
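
The joystick path (12-bit reading, dead-zone elimination, linear scaling) can be sketched as follows. This is an illustration under stated assumptions: the rest position `ADC_CENTER`, the dead-zone width `DEAD_ZONE`, and the signed duty-cycle convention are hypothetical values, not parameters reported in the source.

```python
ADC_MAX = 4095      # full scale of a 12-bit ADC
ADC_CENTER = 2048   # assumed rest position of the stick
DEAD_ZONE = 200     # assumed dead-zone half-width in ADC counts

def axis_to_duty(raw):
    """Map a raw 12-bit axis reading to a signed duty cycle in [-1.0, 1.0]."""
    raw = max(0, min(ADC_MAX, raw))       # clamp out-of-range readings
    offset = raw - ADC_CENTER
    if abs(offset) <= DEAD_ZONE:          # ignore jitter around the rest position
        return 0.0
    # Available travel differs by one count on each side of the center.
    span = (ADC_MAX - ADC_CENTER) if offset > 0 else ADC_CENTER
    scaled = (abs(offset) - DEAD_ZONE) / (span - DEAD_ZONE)
    return scaled if offset > 0 else -scaled

for reading in (0, 2048, 2100, 3000, 4095):
    print(reading, round(axis_to_duty(reading), 3))
```

Subtracting the dead zone before scaling keeps the output continuous at the dead-zone edge, so the chair does not jump as the stick leaves the rest region.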

No deep learning classifiers are deployed; control relies on fixed feature extraction and rule-based logic. Future directions include convolutional architectures and SVM/CNN for non-analog modalities.
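
As an example of such rule-based logic, the sustained-deviation rule for EOG (beyond 12° for at least 2 s) can be sketched as a simple run-length check. The sampling rate, the sign convention (positive angle mapped to "right"), and the function name are assumptions for illustration; the source does not specify them, and double-blink detection is omitted.

```python
def detect_sustained_gaze(angles_deg, sample_rate_hz, threshold_deg=12.0, hold_s=2.0):
    """Return 'right', 'left', or None for a horizontal gaze-angle trace.

    A command fires once the angle has stayed beyond the threshold on the
    same side for hold_s seconds' worth of consecutive samples.
    """
    needed = int(round(hold_s * sample_rate_hz))
    run_sign, run_len = 0, 0
    for angle in angles_deg:
        if angle > threshold_deg:
            sign = 1
        elif angle < -threshold_deg:
            sign = -1
        else:
            sign = 0
        if sign != 0 and sign == run_sign:
            run_len += 1                       # same side as before: extend the run
        else:
            run_sign = sign                    # new side or back to neutral: restart
            run_len = 1 if sign != 0 else 0
        if run_sign != 0 and run_len >= needed:
            return "right" if run_sign > 0 else "left"
    return None

print(detect_sustained_gaze([15.0] * 100, sample_rate_hz=50))
```

Any return to the neutral band resets the counter, so brief glances do not accumulate into a command.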

3. Mode Arbitration, Safety Logic, and Real-time Control

Control prioritization leverages a "priority-ladder" scheme:

Hazard-first logic: If FallFlag ∨ HealthAlert ∨ ObstacleFlag is true, the system executes SafeHalt→Stop immediately.
Mode arbitration: Retains last-used non-hazardous mode or transitions per channel availability and command validity.
User-mode manual selection: Four dedicated push-buttons.
Fixed-step control loop: 50 Hz (20 ms), managed with FreeRTOS task scheduling—sensor polling, arbitration, PWM updates.
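
The priority ladder above can be sketched as a single decision function evaluated once per 20 ms tick. The data structures, the channel names, and in particular the fallback order among channels are assumptions made for this illustration; the source states only that hazards take precedence and that the last-used non-hazardous mode is retained when possible.

```python
from dataclasses import dataclass
from typing import Dict, Optional, Tuple

VALID_COMMANDS = {"forward", "back", "left", "right", "stop"}
FALLBACK_ORDER = ("joystick", "speech", "gesture", "eog")  # assumed order

@dataclass
class Flags:
    fall: bool = False
    health_alert: bool = False
    obstacle: bool = False

def arbitrate(flags: Flags,
              commands: Dict[str, Optional[str]],
              last_mode: Optional[str]) -> Tuple[Optional[str], str]:
    """Return (active_mode, command) for one control tick."""
    # Hazard-first: any raised flag forces a stop regardless of user input.
    if flags.fall or flags.health_alert or flags.obstacle:
        return last_mode, "stop"
    # Retain the last-used mode while it still delivers a valid command.
    if last_mode is not None and commands.get(last_mode) in VALID_COMMANDS:
        return last_mode, commands[last_mode]
    # Otherwise switch to the first channel that offers a valid command.
    for mode in FALLBACK_ORDER:
        if commands.get(mode) in VALID_COMMANDS:
            return mode, commands[mode]
    # No channel has anything valid: stay stopped.
    return last_mode, "stop"

print(arbitrate(Flags(obstacle=True), {"joystick": "forward"}, "joystick"))
```

Defaulting to "stop" when no channel is valid keeps the function fail-safe, which matches the hazard-first intent of the ladder.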

Latency analysis:

Source	Value (ms)
Sensor ADC	≈0.3
ESP32 processing	≈0.5
Motor driver settling	≈0.2
Wi-Fi (health uplink)	≈4
BLE encryption overhead	0.004

Aggregate closed-loop latency (voice/gesture/EOG → actuation): 20 ± 0.5 ms. System draws ≈8.4 W at 24 V (≈350 mA); BLE encryption overhead <1%. Battery runtime: >10 h (5 Ah pack).
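
The figures above can be cross-checked with a short calculation, taking the stated values at face value and assuming a constant current draw (an idealization, since motor load is not itemized in the source):

$P = V I = 24\,\text{V} \times 0.35\,\text{A} = 8.4\,\text{W}$

$t_{\text{run}} \approx \frac{5\,\text{Ah}}{0.35\,\text{A}} \approx 14.3\,\text{h}$

$t_{\text{ADC}} + t_{\text{ESP32}} + t_{\text{driver}} \approx 0.3 + 0.5 + 0.2 = 1.0\,\text{ms} \ll 20\,\text{ms}$

The runtime estimate is consistent with the reported >10 h, and the on-board delays occupy roughly 5% of one 20 ms control period, which suggests that the aggregate latency is set mainly by the fixed-step loop rather than by computation.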

4. Calibration, Biophysical Monitoring, and Cloud Alerting

Biophysical sensor calibration utilizes two-point referencing:
- MAX30100 vs. ISO 80601-2-61 pulse-ox simulator
- DS18B20: ice-water (0 ℃), boiling-water (100 ℃)
- ADXL345: static ±1g testing
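
Two-point referencing amounts to fitting a gain and an offset through two known reference values. The sketch below shows this for the temperature sensor; the raw readings are hypothetical numbers chosen for illustration, and the function names are not from the source.

```python
def two_point_calibration(raw_low, raw_high, ref_low, ref_high):
    """Return (gain, offset) such that corrected = gain * raw + offset."""
    if raw_high == raw_low:
        raise ValueError("raw readings at the two reference points must differ")
    gain = (ref_high - ref_low) / (raw_high - raw_low)
    offset = ref_low - gain * raw_low
    return gain, offset

# Hypothetical raw readings in the ice bath (0 deg C) and boiling water (100 deg C).
gain, offset = two_point_calibration(raw_low=0.4, raw_high=99.2,
                                     ref_low=0.0, ref_high=100.0)

def correct(raw):
    """Apply the linear correction to a raw temperature reading."""
    return gain * raw + offset

print(round(correct(36.9), 2))
```

Note that water boils at 100 ℃ only at standard atmospheric pressure, so the upper reference may need adjusting for altitude.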

Root-mean-square errors:

Heart rate: ≤2 bpm (N = 80 samples; mean bias ≈0.2 bpm)
SpO₂: ≤1% (mean bias −0.3%)
Temp: ≤0.5 ℃ (mean bias +0.05 ℃)

Cloud telemetry: The ESP32 aggregates SpO₂, HR, temperature, and fall state and pushes them to ThingSpeak over Wi-Fi every 1 s. BLE provides fallback/local streaming. Alerts for HR > 140 bpm or < 40 bpm, temperature > 38.5 ℃, and SpO₂ < 90% are issued as SMS/email (SMTP) and in-app indicators.
ISO/IEC compliance: Sensor front-end (ISO 80601-2-61), system safety (ISO 7176-31), medical alarm compatibility (IEC 80601-2-78). Critical risk mitigations include a watchdog timer, a latched emergency stop, and battery protection.
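
The alert thresholds quoted above translate directly into a rule check. The following is a minimal sketch; the function name and the alert labels are illustrative choices, and only the numeric limits come from the text.

```python
def vital_sign_alerts(heart_rate_bpm, skin_temp_c, spo2_percent):
    """Return the list of alert labels raised by one set of readings."""
    alerts = []
    if heart_rate_bpm > 140:
        alerts.append("HR_HIGH")
    elif heart_rate_bpm < 40:
        alerts.append("HR_LOW")
    if skin_temp_c > 38.5:
        alerts.append("TEMP_HIGH")
    if spo2_percent < 90:
        alerts.append("SPO2_LOW")
    return alerts

print(vital_sign_alerts(150, 39.0, 85))
```

Readings that sit exactly on a limit (for example SpO₂ of 90%) do not raise an alert, following the strict inequalities in the text.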

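5. Data-Driven and Embedding-Based Multi-Modal Control Methodologies

Two advanced paradigms generalize multi-modal EPW control for dynamic and uncertain environments:

5.1 Data-Driven Multi-Modal LMPC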

Affine Time-Varying (ATV) Modeling: Learns local dynamics $x_{k+1} \approx A_k x_k + B_k u_k + w_k$ from historical trajectories sampled across "modes" (e.g., friction variants of floor material: carpet/tile/pavement) (Kopp et al., 2024).
Sampled safe sets: Constructs a convex hull from the nearest previously recorded feasible states, ensuring recursive feasibility via tube-based constraint tightening.
LMPC Optimization: At each instant, leverages ATV-identified models and convex safe sets to solve for optimal input sequences:

$\min_{u, \lambda} \sum_{k=t}^{t+N-1} \ell(x_k, u_k) + \overline{V}(x_{t+N}, \lambda) \quad \text{s.t.} \quad x_{k+1} = A_k x_k + B_k u_k + w_k, \; x_k \in X_{\ominus}, \; u_k \in U_{\ominus}$

Mode adaptation: Pseudo-real-time update via selection of nearest data neighbors, with robust fallback (LQR) in case of insufficient local data.
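
The local model identification step can be illustrated with a nearest-neighbor least-squares fit. This is a sketch under stated assumptions rather than the cited method: the neighbor count, the constant offset term `c` (one way of making the model affine), and all names are hypothetical, and NumPy is assumed to be available.

```python
import numpy as np

def fit_local_affine(x_query, X, U, X_next, k_neighbors=20):
    """Fit x_next ~ A x + B u + c from the k stored samples nearest to x_query."""
    dists = np.linalg.norm(X - x_query, axis=1)
    idx = np.argsort(dists)[:k_neighbors]
    # Regressor rows are [x, u, 1]; the constant column captures the offset c.
    Z = np.hstack([X[idx], U[idx], np.ones((len(idx), 1))])
    theta, *_ = np.linalg.lstsq(Z, X_next[idx], rcond=None)
    n, m = X.shape[1], U.shape[1]
    A = theta[:n].T
    B = theta[n:n + m].T
    c = theta[n + m]
    return A, B, c

# Synthetic, noise-free data from a known linear system (illustration only).
rng = np.random.default_rng(0)
A_true = np.array([[1.0, 0.1], [0.0, 0.9]])
B_true = np.array([[0.0], [0.1]])
X = rng.normal(size=(200, 2))
U = rng.normal(size=(200, 1))
X_next = X @ A_true.T + U @ B_true.T

A_hat, B_hat, c_hat = fit_local_affine(np.zeros(2), X, U, X_next)
print(np.round(A_hat, 3), np.round(B_hat, 3))
```

When too few neighbors lie near the query state, the fit becomes ill-conditioned; that is the situation in which the text falls back to an LQR controller.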

5.2 Embedding Methods for Switched Optimal Control

Binary embedding: Encodes each of $M$ modes with $b=\lceil\log_2 M\rceil$ binary variables $v_i \in \{0,1\}$, then relaxes them to continuous variables $v_i \in [0,1]$ so that the discrete switching law is replaced by a continuous decision variable (Sakha et al., 14 Dec 2025).
Embedded dynamics/costs: Constructs the weighted sum of subsystem dynamics (and costs) using mode indicator polynomials $V_k(v)$ (see original for explicit form).
Concave auxiliary penalty: Forces bang–bang minimizers of the relaxed embedding and excludes invalid mode bitstrings, via:

$L_M(v) = \alpha \sum_{i=0}^{b-1} v_i (1-v_i) + \beta \sum_{k=M}^{2^b-1} \prod_{i: k_i=1} v_i$

Application to EPW: For example, $M=5$ modes (idle, fwd, rev, turn-L, turn-R) require $b=3$ bits; with the penalty above, solutions obtained from direct collocation solvers lie on the boundary (binary values) and yield mode schedules that can be implemented directly on physical wheelchair platforms.
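
The auxiliary penalty $L_M(v)$ can be evaluated numerically for the $M=5$, $b=3$ example. The sketch below implements the formula as written above; the bit convention ($k_i$ is bit $i$ of $k$, with $v_0$ the least significant bit) and the default weights $\alpha=\beta=1$ are assumptions for illustration.

```python
from math import prod

def mode_bits(M):
    """Number of bits needed to encode M modes (equals ceil(log2 M) for M >= 2)."""
    return max(1, (M - 1).bit_length())

def auxiliary_penalty(v, M, alpha=1.0, beta=1.0):
    """Evaluate L_M(v) for relaxed bits v[i] in [0, 1]; v[0] is the LSB."""
    b = len(v)
    # First term: vanishes only when every bit is exactly 0 or 1.
    bang_bang = sum(vi * (1.0 - vi) for vi in v)
    # Second term: one product per unused code k = M, ..., 2^b - 1.
    invalid = 0.0
    for k in range(M, 2 ** b):
        invalid += prod(v[i] for i in range(b) if (k >> i) & 1)
    return alpha * bang_bang + beta * invalid

M = 5
for code in range(2 ** mode_bits(M)):
    bits = [(code >> i) & 1 for i in range(mode_bits(M))]
    print(code, bits, auxiliary_penalty(bits, M))
```

The five valid codes (0 to 4) incur zero penalty, while fractional bits and the three unused codes are penalized, which is what steers a relaxed solution toward implementable mode schedules.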

6. Experimental Performance, Safety, and Clinical Outcomes

Recognition accuracy (20 participants, $N=500$ commands):

Modality	Mean Accuracy (95% CI)
Joystick	99% (±0.5%)
Speech	97% (±2%)
Gesture	95% (±3%)
EOG	96% (±2.5%)

Biophysical sensor validation: Pearson correlation and Bland-Altman limits indicate close agreement with reference instruments for heart rate ($r=0.98$) and temperature ($r=0.93$), and a weaker correlation for SpO₂ ($r=0.74$); see source for detailed plots.
System endurance: >10 h per charge with all input modalities and telemetry enabled.
Safety outcomes: Emergency stop and real-time alerts enabled by cloud-integrated sensing and arbitration logic.
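
The agreement statistics used for sensor validation (Pearson correlation, mean bias, RMSE, Bland-Altman limits) can be computed as in the sketch below. The sample readings are synthetic values invented for illustration and are not data from the study; NumPy is assumed to be available.

```python
import numpy as np

def agreement_stats(reference, device):
    """Return Pearson r, mean bias, RMSE, and 95% Bland-Altman limits of agreement."""
    reference = np.asarray(reference, dtype=float)
    device = np.asarray(device, dtype=float)
    diff = device - reference
    bias = diff.mean()
    sd = diff.std(ddof=1)                  # sample standard deviation of differences
    return {
        "r": float(np.corrcoef(reference, device)[0, 1]),
        "bias": float(bias),
        "rmse": float(np.sqrt(np.mean(diff ** 2))),
        "loa": (float(bias - 1.96 * sd), float(bias + 1.96 * sd)),
    }

# Synthetic heart-rate pairs (bpm): reference monitor vs. device under test.
ref_hr = [62, 70, 75, 81, 90, 104]
dev_hr = [63, 69, 76, 83, 89, 105]
print(agreement_stats(ref_hr, dev_hr))
```

Reporting the limits of agreement alongside $r$ matters because a high correlation alone does not rule out a systematic offset between the two instruments.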

This architecture comprehensively addresses accessibility, adaptability, and clinical oversight, and lays the foundation for future semantic intent prediction, semi-autonomous navigation (SLAM), and further autonomy/power-optimization enhancements (Hossain et al., 6 Jan 2026).

7. Limitations and Prospects for Future Development

Algorithmic extensibility: Present implementations utilize rule-based classification; future work is directed at integrating convolutional attention modules (CNN/CBAM for vision) and SVM/CNN for EOG/gaze decoding.
Predictive control: Adoption of data-driven LMPC and embedding-based formulations of the switched optimal control problem (SOCP) offers robust performance under model uncertainty and can accommodate mixed discrete (mode) and continuous control objectives (Kopp et al., 2024, Sakha et al., 14 Dec 2025).
Safety/standards: Platform aligns with ISO 7176-31 and IEC 80601-2-78, continually adapting to evolving risk profiles through cloud-based analysis and machine learning.
Clinical scalability: Initial results show mean command accuracy at or above 95% across all four modalities; large-scale longitudinal studies and adaptive intent modeling are required for broader deployment.

A plausible implication is the future integration of high-dimensional sensor fusion, predictive health event analysis, and shared-control autonomy, capitalizing on the multi-modal system’s extensible architecture for both research and advanced clinical deployment.


References

Hossain et al. (2026). Advancing Assistive Robotics: Multi-Modal Navigation and Biophysical Monitoring for Next-Generation Wheelchairs.

Kopp et al. (2024). Data-Driven Multi-Modal Learning Model Predictive Control.

Sakha et al. (2025). On the embedding transformation for optimal control of multi-mode switched systems.
