On Optimizing Electrode Configuration for Wrist-Worn sEMG-Based Thumb Gesture Recognition

Published 6 Apr 2026 in cs.HC | (2604.04623v1)

Abstract: Thumb gestures provide an effective and unobtrusive input modality for wearable and always-available human-machine interaction. Wrist-worn surface electromyography (sEMG) has emerged as a promising approach for compact and wearable human-machine interfaces. However, compared to forearm sEMG, the impact of electrode configuration on wrist-based decoding performance remains understudied. We systematically investigated electrode configuration strategies for wrist-based thumb-movement recognition using high-density (HD) and low-density (LD) sEMG measurement systems. We considered factors such as muscle region, reference scheme, channel count, and spatial density of the electrode. Experimental results show that 1) extensor-side electrodes outperform flexor-side electrodes (HD: 0.871 vs. 0.821; LD: 0.769 vs. 0.705); 2) monopolar recordings consistently outperform bipolar configurations (15 channel with HD monopolar vs. LD bipolar: 0.885 vs. 0.823); and 3) increasing channel count enhances performance, but exhibits diminishing returns. We further show that electrode spatial distribution introduces a trade-off between spatial coverage and compactness. The findings suggest that the effectiveness of wrist-worn sEMG systems depends less on the deployment of a large number of electrodes in a broad sensing area and more on the optimization of electrode placement and the referencing scheme. This work provides practical guidelines for developing efficient wrist-worn sEMG-based gesture recognition systems.


Authors (3)

Summary

The paper demonstrates an optimal electrode configuration strategy that enhances thumb gesture recognition through focused extensor muscle coverage and monopolar referencing.
It employs rigorous comparative analysis using both high-density and low-density arrays, alongside CNN and TCN classifiers, to quantify performance impacts.
Results indicate that a reduced, eight-channel monopolar setup achieves near-maximal accuracy while improving hardware efficiency for wearable sEMG interfaces.

Optimizing Electrode Configuration for Wrist-Worn sEMG-Based Thumb Gesture Recognition

Introduction

This work presents a systematic analysis of electrode configuration strategies for wrist-worn sEMG-based recognition of thumb gestures. It measures deep-learning classification performance while varying muscle region coverage, reference scheme (monopolar vs. bipolar), channel count, and spatial distribution. Using concurrent high-density (HD) and low-density (LD) measurements and comparing CNN and TCN classifiers, the study addresses previously underexplored questions about which wrist-based sEMG configurations best suit compact, efficient human–machine interfaces.

The experimental setup isolates the spatial and physiological factors that influence decoding accuracy, so that each can be quantified separately. The analysis considers not only accuracy but also hardware efficiency, expressed through a figure of merit (FOM) that jointly accounts for accuracy and spatial compactness.

Figure 1: Visual summary of the experimental paradigm, encompassing sensor arrangement, thumb gesture taxonomy, raw sEMG acquisition, motion capture pipeline, and the employed CNN.

Experimental Setup and Data Collection

The experiment used two distinct electrode arrays: the Trigno Maize HD sensor (16 channels per grid on both the extensor and flexor sides; 32 channels total) and the Trigno Quattro LD bipolar system (15 channels). Electrode placement followed anatomical landmarks: the central columns of both grids were aligned to the midline of the wrist to separate extensor and flexor muscle contributions. Synchronized sEMG and four-view video motion capture allowed hand joint kinematics to be aligned with the sEMG traces for six discrete thumb gestures.

Evaluation Protocol and Models

sEMG signals underwent standard preprocessing (band-pass filtering, segmentation, windowing), with 250 ms segments at 50% overlap forming input to both CNN and TCN classifiers. The CNN architecture (depicted in Figure 1) employed two convolutional blocks and two dense layers, each with dropout for regularization. Ten-fold cross-validation with block-wise partitioning ensured robustness and avoided data leakage.
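The segmentation and block-wise splitting described above can be sketched as follows. This is a minimal illustration, not the authors' pipeline: the sampling rate `FS_HZ`, the number and length of recording blocks, and the helper names `sliding_windows` and `blockwise_folds` are assumptions introduced here, and the data are random noise. Only the 250 ms window, the 50% overlap, and the ten folds come from the text.

```python
import numpy as np

FS_HZ = 2000      # ASSUMED sampling rate; the summary does not state it
WIN_MS = 250      # window length given in the text
OVERLAP = 0.5     # 50% overlap given in the text


def sliding_windows(emg, fs_hz=FS_HZ, win_ms=WIN_MS, overlap=OVERLAP):
    """Cut a (samples, channels) recording into (windows, win_len, channels)."""
    win_len = int(round(fs_hz * win_ms / 1000))
    step = max(1, int(round(win_len * (1.0 - overlap))))
    n_windows = (emg.shape[0] - win_len) // step + 1
    if n_windows <= 0:
        return np.empty((0, win_len, emg.shape[1]), dtype=emg.dtype)
    starts = np.arange(n_windows) * step
    return np.stack([emg[s:s + win_len] for s in starts])


def blockwise_folds(block_ids, n_folds=10):
    """Yield (train_idx, test_idx) with every recording block kept in one fold.

    Overlapping windows share samples, so splitting at the window level would
    leak test data into training; splitting by block avoids that.
    """
    block_ids = np.asarray(block_ids)
    unique_blocks = np.unique(block_ids)
    for fold in range(n_folds):
        test_blocks = unique_blocks[fold::n_folds]
        test_mask = np.isin(block_ids, test_blocks)
        yield np.flatnonzero(~test_mask), np.flatnonzero(test_mask)


# Hypothetical data: 20 recording blocks of 1 s each, 16 channels of noise.
rng = np.random.default_rng(0)
blocks = [rng.standard_normal((FS_HZ, 16)) for _ in range(20)]
windows = [sliding_windows(b) for b in blocks]   # windows never straddle blocks
X = np.concatenate(windows)
block_ids = np.concatenate([np.full(len(w), i) for i, w in enumerate(windows)])
print(X.shape)
```

Windows are cut inside each block before concatenation, so no window spans two blocks. How the paper defines a "block" is not specified in this summary; here it is simply one contiguous recording.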

A rigorous ablation protocol systematically varied (a) region of coverage (extensor/flexor/both), (b) electrode count, (c) reference scheme, and (d) spatial density, including random and structured reductions in channel number and connectedness on the HD grid.
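As a rough sketch of the channel-reduction step, the snippet below draws random channel subsets from one 16-channel grid and checks whether a subset forms a single connected patch. The 4x4 row-major layout, the 4-neighbour definition of "connected", and the names `random_subset` and `is_connected` are assumptions made for illustration; the summary does not say how the structured reductions were constructed.

```python
import numpy as np

N_COLS = 4          # ASSUMED: 16-channel grid laid out as 4 x 4, row-major
N_CHANNELS = 16


def random_subset(k, rng, n_channels=N_CHANNELS):
    """Pick k distinct channel indices uniformly at random (sorted)."""
    return np.sort(rng.choice(n_channels, size=k, replace=False))


def is_connected(subset, n_cols=N_COLS):
    """True if the chosen electrodes form one 4-neighbour connected patch."""
    cells = {(int(i) // n_cols, int(i) % n_cols) for i in subset}
    if not cells:
        return False
    stack = [next(iter(cells))]
    seen = set(stack)
    while stack:                      # flood fill over selected cells only
        r, c = stack.pop()
        for nb in ((r + 1, c), (r - 1, c), (r, c + 1), (r, c - 1)):
            if nb in cells and nb not in seen:
                seen.add(nb)
                stack.append(nb)
    return seen == cells


rng = np.random.default_rng(42)
subsets = [random_subset(8, rng) for _ in range(5)]
for s in subsets:
    print(s.tolist(), "connected" if is_connected(s) else "scattered")
```

Given windowed data of shape (windows, samples, channels), a subset is applied by indexing the last axis, for example `X[:, :, subset]`, before retraining the classifier.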

Results and Quantitative Assessment

Muscle Region and Sensor Placement

Extensor-side coverage yielded consistently higher accuracies than flexor-side coverage (HD: 0.871 vs. 0.821; LD: 0.769 vs. 0.705), and combining both regions further increased performance (Figure 2). These results are anatomically plausible: at the wrist, the extrinsic thumb extensors present stronger, more separable sEMG signatures than the flexors, which are dominated by the flexor pollicis longus (FPL).

Figure 2: Quantitative assessment of region configuration, monopolar/bipolar referencing, and progressive channel reduction on classification accuracy.

Reference Scheme and Channel Count

Monopolar arrangements significantly outperformed bipolar ones at matched channel count (HD monopolar, 15 channels vs. LD bipolar, 15 channels: 0.885 vs. 0.823, $p < 0.01$). Increasing channel count improved performance but with diminishing returns: eight-channel HD configurations retained almost all of the accuracy of 15-channel arrays (0.864 vs. 0.885). This points to roughly eight channels as a practical operating point for efficient hardware allocation.
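To put the reported numbers side by side, the short calculation below derives the fraction of accuracy retained at eight channels and compares the cost of channel reduction with the monopolar/bipolar gap. The three accuracies are taken from the text; the dictionary keys and variable names are illustrative only.

```python
# Accuracies quoted in the text above.
acc = {
    "hd_monopolar_15ch": 0.885,
    "ld_bipolar_15ch": 0.823,
    "hd_8ch": 0.864,
}

# Fraction of the 15-channel accuracy that the 8-channel setup keeps.
retained = acc["hd_8ch"] / acc["hd_monopolar_15ch"]

# Absolute cost of dropping from 15 to 8 channels.
reduction_cost = acc["hd_monopolar_15ch"] - acc["hd_8ch"]

# Absolute gap between monopolar and bipolar at 15 channels.
reference_gap = acc["hd_monopolar_15ch"] - acc["ld_bipolar_15ch"]

print(f"retained at 8 channels: {retained:.1%}")        # 97.6%
print(f"cost of 15 -> 8 channels: {reduction_cost:.3f}")  # 0.021
print(f"monopolar vs. bipolar gap: {reference_gap:.3f}")  # 0.062
```

On these figures, nearly halving the channel count costs about a third of what switching from monopolar to bipolar referencing costs, which is the sense in which the reference scheme matters more than raw electrode count.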

Spatial Distribution and Density

Spatial configuration analysis (Figure 3 and Figure 4) found that, for a fixed electrode count, broader spatial coverage (i.e., lower density and a larger median inter-electrode distance) was weakly but significantly associated with higher classification accuracy (correlation coefficients up to $R=0.175$ for six channels; $p \ll 0.05$). However, the FOM (accuracy per unit spatial area) peaked at higher densities, indicating that compact, closely packed arrays are more efficient for on-device deployment despite a slight reduction in raw accuracy.
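The two spatial quantities used in this analysis, median inter-electrode distance and accuracy per unit area, can be sketched as below. Several details are assumptions, since the summary does not give them: the 4x4 row-major layout, the 5 mm pitch (`PITCH_MM`), and the use of a padded bounding box as the "spatial area". The accuracy passed to `figure_of_merit` is a placeholder, not a reported result.

```python
from itertools import combinations

import numpy as np

PITCH_MM = 5.0   # HYPOTHETICAL electrode spacing; not stated in the summary
N_COLS = 4       # ASSUMED 4 x 4 row-major grid


def positions_mm(subset, n_cols=N_COLS, pitch_mm=PITCH_MM):
    """Map channel indices to (x, y) coordinates in millimetres."""
    idx = np.asarray(subset)
    return np.column_stack((idx % n_cols, idx // n_cols)) * pitch_mm


def median_inter_electrode_distance(subset):
    """Median of all pairwise Euclidean distances within the subset."""
    pts = positions_mm(subset)
    dists = [np.linalg.norm(a - b) for a, b in combinations(pts, 2)]
    return float(np.median(dists))


def bounding_area_mm2(subset, pitch_mm=PITCH_MM):
    """Bounding-box area, padded by one pitch so a single row is not zero."""
    pts = positions_mm(subset, pitch_mm=pitch_mm)
    width, height = pts.max(axis=0) - pts.min(axis=0) + pitch_mm
    return float(width * height)


def figure_of_merit(accuracy, subset):
    """Accuracy per unit area (one possible reading of the FOM in the text)."""
    return accuracy / bounding_area_mm2(subset)


compact = [0, 1, 4, 5]      # 2 x 2 patch in one corner
spread = [0, 3, 12, 15]     # the four grid corners

placeholder_acc = 0.8       # placeholder value, NOT a result from the paper
ratio = figure_of_merit(placeholder_acc, compact) / figure_of_merit(placeholder_acc, spread)

print(median_inter_electrode_distance(compact))  # 5.0
print(median_inter_electrode_distance(spread))   # 15.0
print(ratio)                                     # 4.0
```

At equal accuracy the compact patch has four times the FOM of the spread layout under this area definition, so a spread layout would need a very large accuracy gain to win on efficiency. That is consistent with the weak correlation reported above.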

Figure 3: Schematic of spatial density maps derived from the HD Maize grid, demonstrating the high-, medium-, and low-density configurations.

Figure 4: Analysis of accuracy and FOM as a function of spatial density for small channel-count subsets, revealing compact arrays are more efficient even if accuracy peaks with broader coverage.

Explainability: Electrode Importance Analysis

Integrated gradients (IG) attribution mapping (Figure 5) revealed dominant importance of distal and radial-side electrodes, especially on the extensor aspect for swipe and tap gestures. These spatial attributions are consistent with anatomical organization of thumb motor units and validate the physiological relevance of the decoding models' learned structure.
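For readers unfamiliar with the method, the following is a minimal NumPy sketch of integrated gradients on a toy model. It is not the paper's setup: the study applied IG to trained deep networks, whereas the logistic scorer, its random weights, the all-zeros baseline, and the one-feature-per-electrode input here are assumptions chosen so that the gradient can be written analytically.

```python
import numpy as np


def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))


def model(x, w, b):
    """Toy differentiable scorer: one feature per electrode -> probability."""
    return sigmoid(x @ w + b)


def model_grad(x, w, b):
    """Analytic gradient of the toy model with respect to its input."""
    p = model(x, w, b)
    return p * (1.0 - p) * w


def integrated_gradients(x, baseline, w, b, steps=500):
    """Approximate IG along the straight path from baseline to x.

    Uses a midpoint Riemann sum over `steps` points on the path.
    """
    alphas = (np.arange(steps) + 0.5) / steps
    grads = np.stack(
        [model_grad(baseline + a * (x - baseline), w, b) for a in alphas]
    )
    return (x - baseline) * grads.mean(axis=0)


rng = np.random.default_rng(7)
n_electrodes = 16
w = 0.5 * rng.standard_normal(n_electrodes)   # stand-in for learned weights
b = 0.0
x = rng.random(n_electrodes)                  # e.g. per-electrode amplitude
baseline = np.zeros(n_electrodes)             # "no activity" reference input

attr = integrated_gradients(x, baseline, w, b)

# Completeness check: attributions sum to the change in model output.
gap = attr.sum() - (model(x, w, b) - model(baseline, w, b))
print(abs(gap) < 1e-4)
print(np.argsort(-np.abs(attr))[:3])          # three most influential electrodes
```

The completeness check is a useful sanity test for any IG implementation: if the attributions do not sum to the output difference between input and baseline, the path integral is under-sampled or the gradients are wrong.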

Figure 5: Electrode-wise attribution maps for all gestures, derived from IG, highlighting salient contacts and gesture-specific spatial dependencies.

Discussion and Implications

The systematic isolation of each hardware and physiological variable provides strong empirical evidence that wrist-based sEMG decoding of thumb gestures is limited primarily by anatomical constraints and by the referencing scheme. Focused coverage of the extensor muscle region with a reduced, compact, monopolar electrode set offers the best balance of accuracy and hardware efficiency.

These insights have direct implications for real-world wearable sEMG interface design. Unlike prior forearm-focused literature, where broader coverage and very high channel counts yielded further gains, wrist-worn systems gain little from additional electrodes or sensing area beyond about eight well-placed contacts. Commercial or clinical devices should therefore prioritize electrode placement, monopolar referencing, and hardware simplification.

Moreover, the differences in electrode importance across gestures motivate exploration of adaptive, gesture-specific electrode selection or dynamic reconfiguration. Finally, the documented spatial trade-offs point to mechanomyography (MMG) and hybrid sensor approaches as promising alternatives for overcoming the anatomical limitations of the wrist, with possible gains in resolution and robustness.

Conclusion

This study provides practical guidelines for electrode configuration in wrist-worn sEMG-based thumb gesture recognition. It presents strong empirical evidence for the advantages of monopolar recording, extensor-centric placement, and channel reduction to as few as eight electrodes with only a small loss of classification accuracy (0.864 vs. 0.885 at 15 channels). The proposed spatial analysis and FOM metric offer principled tools for hardware deployment decisions. These findings lay groundwork for compact, efficient, and robust wearable human–machine interfaces, and prompt future research into alternative signal modalities and adaptive sensor allocation.
