- The paper demonstrates an optimal electrode configuration strategy that enhances thumb gesture recognition through focused extensor muscle coverage and monopolar referencing.
- It employs rigorous comparative analysis using both high-density and low-density arrays, alongside CNN and TCN classifiers, to quantify performance impacts.
- Results indicate that a reduced, eight-channel monopolar setup achieves near-maximal accuracy while improving hardware efficiency for wearable sEMG interfaces.
Optimizing Electrode Configuration for Wrist-Worn sEMG-Based Thumb Gesture Recognition
Introduction
This work presents an exhaustive analysis of electrode configuration strategies for wrist-worn sEMG-based recognition of thumb gestures, focusing on deep learning-driven classification performance under systematic manipulation of muscle region coverage, reference scheme (monopolar vs. bipolar), channel count, and spatial distribution. Through concurrent high-density (HD) and low-density (LD) measurements, and comparison of CNN and TCN classifiers, the study directly addresses previously underexplored questions regarding optimal wrist-based sEMG configuration for compact, efficient human–machine interfaces.
The experimental setup allows the spatial and physiological factors that influence decoding accuracy to be quantified separately. Beyond maximizing accuracy, the study evaluates hardware efficiency through a figure of merit (FOM) that jointly considers accuracy and spatial compactness.
Figure 1: Visual summary of the experimental paradigm, encompassing sensor arrangement, thumb gesture taxonomy, raw sEMG acquisition, motion capture pipeline, and the employed CNN.
Experimental Setup and Data Collection
The experiment used two distinct electrode arrays: the Trigno Maize HD sensor (one 16-channel grid on each of the extensor and flexor sides; 32 channels total) and the Trigno Quattro LD bipolar system (15 channels). Electrode placement followed anatomical landmarks: the central columns of both grids were aligned with the midline of the wrist to separate extensor and flexor muscle contributions. Synchronized sEMG and four-view video motion capture allowed hand joint kinematics to be matched to the sEMG traces for six discrete thumb gestures.
Evaluation Protocol and Models
sEMG signals underwent standard preprocessing (band-pass filtering, segmentation, windowing), with 250 ms segments at 50% overlap forming input to both CNN and TCN classifiers. The CNN architecture (depicted in Figure 1) employed two convolutional blocks and two dense layers, each with dropout for regularization. Ten-fold cross-validation with block-wise partitioning ensured robustness and avoided data leakage.
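The segmentation and partitioning described above can be sketched as follows. This is a minimal illustration, not the authors' pipeline: the 2000 Hz sampling rate is a hypothetical value (the summary does not state one), and "block-wise" is assumed here to mean that contiguous runs of windows are assigned to the same fold.

```python
def sliding_windows(n_samples, fs_hz, window_ms=250, overlap=0.5):
    """Return (start, stop) sample indices for fixed-length overlapping windows."""
    win = int(round(fs_hz * window_ms / 1000))        # window length in samples
    step = max(1, int(round(win * (1 - overlap))))    # hop between window starts
    return [(s, s + win) for s in range(0, n_samples - win + 1, step)]


def blockwise_folds(n_windows, n_folds=10):
    """Assign contiguous blocks of windows to folds (assumed reading of 'block-wise')."""
    return [i * n_folds // n_windows for i in range(n_windows)]


# Hypothetical example: 1 s of signal sampled at 2000 Hz (assumed rate).
windows = sliding_windows(n_samples=2000, fs_hz=2000)
folds = blockwise_folds(n_windows=20, n_folds=10)
print(len(windows), windows[0], windows[-1])
print(folds)
```

Keeping neighbouring windows in the same fold matters because, at 50% overlap, adjacent windows share half of their samples; a purely random split would place near-duplicate segments in both training and test sets.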
A rigorous ablation protocol systematically varied (a) region of coverage (extensor/flexor/both), (b) electrode count, (c) reference scheme, and (d) spatial density, including random and structured reductions in channel number and connectedness on the HD grid.
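The random channel reductions on the HD grid could be generated along these lines. This is a sketch under stated assumptions: the function name `random_subsets`, the number of draws, and the seed are hypothetical, and the structured (connectedness-constrained) reductions mentioned above are not reproduced here.

```python
import random


def random_subsets(channels, k, n_draws, seed=0):
    """Draw n_draws random k-channel subsets without replacement within each subset."""
    rng = random.Random(seed)  # seeded so the ablation is reproducible
    return [sorted(rng.sample(channels, k)) for _ in range(n_draws)]


# One 16-channel grid, as described for each side of the wrist.
grid_channels = list(range(16))

# Hypothetical ablation schedule: a few subset sizes, five random draws each.
ablation = {k: random_subsets(grid_channels, k, n_draws=5, seed=k) for k in (4, 6, 8)}
for k, subsets in ablation.items():
    print(k, subsets[0])
```

Each subset would then be used to retrain and re-evaluate the classifier, so that accuracy can be compared across channel counts and layouts.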
Results and Quantitative Assessment
Muscle Region and Sensor Placement
Extensor-side coverage yielded consistently higher accuracies than flexor-side coverage (HD: 0.871 vs. 0.821; LD: 0.769 vs. 0.705), and combining both regions further increased performance (Figure 2). These results are anatomically consistent: at the wrist, the extrinsic thumb extensors present stronger, more separable sEMG signatures than the flexors, whose contribution is dominated by the flexor pollicis longus (FPL).
Figure 2: Quantitative assessment of region configuration, monopolar/bipolar referencing, and progressive channel reduction on classification accuracy.
Reference Scheme and Channel Count
Monopolar arrangements significantly outperformed bipolar ones at matched channel count (HD-monopolar 15ch vs. LD-bipolar 15ch: 0.885 vs. 0.823, p<0.01). Accuracy increased monotonically with channel count but with diminishing returns: eight-channel HD configurations retained almost all the accuracy of 15-channel arrays (0.864 vs. 0.885). This suggests that roughly eight channels is a practical ceiling beyond which additional electrodes yield little benefit.
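To make the size of these effects concrete, the reported accuracies can be turned into relative figures. The arithmetic below uses only the numbers quoted above; the variable names are illustrative.

```python
# Accuracies reported in the text.
acc_hd_mono_15ch = 0.885
acc_ld_bipolar_15ch = 0.823
acc_hd_mono_8ch = 0.864

# Fraction of the 15-channel accuracy retained with eight channels.
retention = acc_hd_mono_8ch / acc_hd_mono_15ch

# Fraction of channels removed when going from 15 to 8.
channel_saving = 1 - 8 / 15

# Absolute accuracy gap between monopolar and bipolar at 15 channels.
reference_gap = acc_hd_mono_15ch - acc_ld_bipolar_15ch

print(f"retention: {retention:.3f}")            # 0.976
print(f"channel saving: {channel_saving:.3f}")  # 0.467
print(f"reference gap: {reference_gap:.3f}")    # 0.062
```

In other words, removing nearly half of the channels costs about two percentage points of accuracy, which is smaller than the gap attributed to the reference scheme.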
Spatial Distribution and Density
Spatial configuration analysis (Figure 3 and Figure 4) found that, for fixed electrode count, broader spatial coverage (i.e., lower density, larger inter-electrode median distance) weakly but significantly improved classification accuracy (correlation coefficients up to R=0.175 for 6ch; p≪0.05). However, the FOM (accuracy per spatial area) peaked at higher densities, demonstrating that compact (closely packed) arrays are more efficient for on-device deployment even with a slight reduction in raw accuracy.
Figure 3: Schematic of spatial density maps derived from the HD Maize grid, demonstrating the high-, medium-, and low-density configurations.
Figure 4: Analysis of accuracy and FOM as a function of spatial density for small channel-count subsets, revealing that compact arrays are more efficient even though accuracy peaks with broader coverage.
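The two quantities in this trade-off, median inter-electrode distance and an accuracy-per-area figure of merit, can be illustrated with a small sketch. Several things here are assumptions rather than details from the study: electrode coordinates are in arbitrary grid units, the footprint is taken as a padded bounding box, FOM is read literally as accuracy divided by that area, and the two accuracy values are placeholders chosen only to show the calculation.

```python
from itertools import combinations
from math import hypot
from statistics import median


def median_pairwise_distance(points):
    """Median Euclidean distance over all electrode pairs."""
    return median(hypot(ax - bx, ay - by)
                  for (ax, ay), (bx, by) in combinations(points, 2))


def footprint_area(points, pitch=1.0):
    """Bounding-box area, padded by one pitch so a single row has non-zero area (assumed definition)."""
    xs = [x for x, _ in points]
    ys = [y for _, y in points]
    return (max(xs) - min(xs) + pitch) * (max(ys) - min(ys) + pitch)


def figure_of_merit(accuracy, area):
    """Accuracy per unit area (assumed form of the FOM)."""
    return accuracy / area


# Two hypothetical 4-channel layouts in grid units.
compact = [(0, 0), (0, 1), (1, 0), (1, 1)]   # adjacent 2x2 block
spread = [(0, 0), (0, 3), (3, 0), (3, 3)]    # corners of a wider square

# Placeholder accuracies, NOT results from the study.
fom_compact = figure_of_merit(0.80, footprint_area(compact))
fom_spread = figure_of_merit(0.82, footprint_area(spread))

print(median_pairwise_distance(compact), median_pairwise_distance(spread))
print(fom_compact, fom_spread)
```

With these placeholder numbers, the spread layout has the larger median distance and slightly higher accuracy, but the compact layout has the higher FOM because its footprint is four times smaller. This mirrors the qualitative pattern described above without reproducing the study's values.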
Explainability: Electrode Importance Analysis
Integrated gradients (IG) attribution mapping (Figure 5) revealed dominant importance of distal and radial-side electrodes, especially on the extensor aspect for swipe and tap gestures. These spatial attributions are consistent with anatomical organization of thumb motor units and validate the physiological relevance of the decoding models' learned structure.
Figure 5: Electrode-wise attribution maps for all gestures, derived from IG, highlighting salient contacts and gesture-specific spatial dependencies.
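For readers unfamiliar with integrated gradients, the following sketch shows the computation on a toy model. It is not the study's CNN: the model is a hypothetical logistic unit over four "channels" with hand-picked weights, the baseline is an all-zero input, and the path integral is approximated with a midpoint Riemann sum.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def model(x, w, bias):
    """Toy differentiable model: logistic unit over the input channels."""
    return sigmoid(sum(wi * xi for wi, xi in zip(w, x)) + bias)


def model_grad(x, w, bias):
    """Analytic gradient of the toy model with respect to each input."""
    p = model(x, w, bias)
    return [p * (1.0 - p) * wi for wi in w]


def integrated_gradients(x, baseline, w, bias, steps=200):
    """Approximate IG: (x - baseline) times the path-averaged gradient."""
    avg_grad = [0.0] * len(x)
    for k in range(steps):
        alpha = (k + 0.5) / steps  # midpoint of the k-th sub-interval
        point = [b0 + alpha * (xi - b0) for xi, b0 in zip(x, baseline)]
        for i, g in enumerate(model_grad(point, w, bias)):
            avg_grad[i] += g / steps
    return [(xi - b0) * g for xi, b0, g in zip(x, baseline, avg_grad)]


# Hypothetical weights: channel 0 matters most, channel 3 not at all.
weights = [2.0, -1.0, 0.5, 0.0]
x = [1.0, 1.0, 1.0, 1.0]
baseline = [0.0, 0.0, 0.0, 0.0]

attributions = integrated_gradients(x, baseline, weights, bias=0.0)
print(attributions)
```

A useful sanity check is the completeness property: the attributions sum to the difference between the model output at the input and at the baseline. In the study, per-electrode maps like those in Figure 5 would be obtained by aggregating such attributions over time and over samples of each gesture, though the exact aggregation is not specified in this summary.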
Discussion and Implications
The systematic isolation of each hardware and physiological variable provides strong empirical evidence that wrist-based sEMG decoding of thumb gestures is primarily limited by anatomical constraints and by the referencing scheme. Concentrating coverage on the extensor muscle region with a reduced, compact, monopolar electrode set offers the best balance of accuracy and hardware efficiency.
These insights have significant implications for real-world wearable sEMG interface design. Unlike prior forearm-focused literature, where broader coverage and extreme channel counts yielded proportional gains, wrist-worn systems profit minimally from increased electrode number or spatial area beyond eight well-placed contacts. As such, commercial or clinical devices should prioritize optimal monopolar placement and hardware simplification.
Moreover, the differences in electrode importance across gestures motivate exploration of adaptive, gesture-specific electrode selection or dynamic reconfiguration. Finally, the documented spatial trade-offs point to mechanomyography (MMG) and hybrid sensor approaches as promising alternatives for overcoming the anatomical limitations of the wrist, with possible gains in resolution and robustness.
Conclusion
This study establishes definitive guidelines for electrode configuration in wrist-worn sEMG-based thumb gesture recognition. Strong empirical evidence is provided for the superiority of monopolar recording, optimized extensor-centric placement, and channel reduction to as few as eight electrodes without substantive loss of classification accuracy. The proposed spatial analysis and FOM metric offer principled tools for hardware deployment decisions. These findings lay substantial groundwork for the development of compact, efficient, and robust next-generation wearable human–machine interfaces, and prompt future research into alternative signal modalities and adaptive sensor allocation.