BioMoTouch: Touch-Based Behavioral Authentication via Biometric-Motion Interaction Modeling

Published 8 Apr 2026 in cs.HC and cs.CR | (2604.07071v1)

Abstract: Touch-based authentication is widely deployed on mobile devices due to its convenience and seamless user experience. However, existing systems largely model touch interaction as a purely behavioral signal, overlooking its intrinsic multidimensional nature and limiting robustness against sophisticated adversarial behaviors and real-world variations. In this work, we present BioMoTouch, a multi-modal touch authentication framework on mobile devices grounded in a key empirical finding: during touch interaction, inertial sensors capture user-specific behavioral dynamics, while capacitive screens simultaneously capture physiological characteristics related to finger morphology and skeletal structure. Building upon this insight, BioMoTouch jointly models physiological contact structures and behavioral motion dynamics by integrating capacitive touchscreen signals with inertial measurements. Rather than combining independent decisions, the framework explicitly learns their coordinated interaction to form a unified representation of touch behavior. BioMoTouch operates implicitly during natural user interactions and requires no additional hardware, enabling practical deployment on commodity mobile devices. We evaluate BioMoTouch with 38 participants under realistic usage conditions. Experimental results show that BioMoTouch achieves a balanced accuracy of 99.71% and an equal error rate of 0.27%. Moreover, it maintains false acceptance rates below 0.90% under artificial replication, mimicry, and puppet attack scenarios, demonstrating strong robustness against partial-factor manipulation.

Authors (9)

Summary

The paper introduces BioMoTouch, integrating touchscreen and IMU data with deep learning for robust behavioral authentication.
It fuses TinyViT embeddings of both modalities and applies one-class classifiers, reaching 99.71% balanced accuracy (BAC) and a 0.27% equal error rate (EER), and it reports lower false acceptance under attack than a commercial fingerprint sensor.
Experiments demonstrate resilience against mimicry, artificial replication, and puppet attacks with consistent accuracy over five weeks.

Motivation and Problem Statement

BioMoTouch addresses critical limitations of conventional biometric authentication on mobile devices, specifically the vulnerabilities inherent in static biometrics (fingerprint and facial features) and the challenges posed by advanced adversarial attacks, such as artificial replication and puppet attacks. The central claim of this work is that touch interaction is an intrinsically multi-dimensional signal: capacitive touchscreens capture physiological traits arising from finger morphology, while inertial sensors record behavioral dynamics. The integration and explicit modeling of these modalities are hypothesized to yield robust authentication resistant to partial-factor manipulation.

System Architecture and Methodology

BioMoTouch implicitly collects capacitive touchscreen and inertial measurement unit (IMU) data during natural device usage. It needs no additional hardware, so it can be deployed on commodity devices. The workflow consists of data acquisition, preprocessing, feature extraction, multimodal fusion, and user-specific one-class classification.

Figure 1: The workflow of BioMoTouch illustrating the modalities, preprocessing, and feature fusion pipelines.

Data Collection and Preprocessing

Experimental data were obtained from 38 participants, with capacitive images sampled at 20 fps and IMU signals at 200 Hz. For capacitive frames, the preprocessing pipeline applies adaptive touch detection via median/MAD (median absolute deviation) thresholding, spatial region tracking, and temporal smoothing. For IMU data, it applies wavelet denoising, quaternion-based orientation estimation, and STFT-based spectral feature extraction. Cross-modal temporal alignment then pairs the physiological and motion features in time.
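
The median/MAD thresholding step can be illustrated with a minimal sketch. This is not the authors' implementation: the function name `detect_touch`, the multiplier `k = 3.0`, and the toy 4x4 frame are assumptions made for illustration, and the summary does not state the actual threshold rule. The idea is that a cell counts as touched when its capacitance is far above the frame's robust baseline.

```python
import statistics

def detect_touch(frame, k=3.0):
    """Flag cells whose value exceeds median + k * scaled MAD (k is an assumed setting)."""
    values = [v for row in frame for v in row]
    med = statistics.median(values)
    # Median absolute deviation: a robust spread estimate that the touch blob barely shifts.
    mad = statistics.median(abs(v - med) for v in values)
    # 1.4826 rescales MAD so it is comparable to a standard deviation for Gaussian noise.
    threshold = med + k * 1.4826 * mad
    return [[v > threshold for v in row] for row in frame]

# Toy capacitive frame: low-level noise plus a 2x2 contact region (hypothetical values).
frame = [
    [1, 2, 1, 0],
    [2, 50, 60, 1],
    [1, 55, 58, 2],
    [0, 1, 2, 1],
]
mask = detect_touch(frame)
touched = sum(sum(row) for row in mask)
```

Because both the baseline and the spread are medians, a large contact region does not inflate the threshold the way a mean and standard deviation would, which is presumably why an adaptive rule of this kind suits frames with varying noise floors.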

Figure 2: Illustration of the data collection process, depicting user-device interaction and sensor streams.

Feature Engineering and Multimodal Representation

Feature extraction applies short-time Fourier transform (STFT) time-frequency analysis to the IMU signals (accelerometry and quaternion-derived roll/pitch/yaw). Capacitive data are augmented with temporal warping and amplitude-adaptive noise to simulate natural interaction variability. Both modalities use TinyViT backbones for deep embedding extraction. The fusion network, a two-layer MLP with LeakyReLU and dropout, produces a 320-dimensional representation that emphasizes coordinated physiological-behavioral coupling.
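
To make the STFT step concrete, here is a small standard-library sketch of a magnitude spectrogram for one IMU channel. The window length (64 samples), hop (32 samples), and Hann window are assumptions, since the summary does not give the paper's STFT settings; only the 200 Hz sampling rate comes from the text. A production pipeline would use an FFT library rather than this direct DFT.

```python
import cmath
import math

def stft_magnitude(signal, window_size=64, hop=32):
    """Return one magnitude spectrum per Hann-windowed frame (settings are assumed)."""
    hann = [0.5 - 0.5 * math.cos(2 * math.pi * n / window_size)
            for n in range(window_size)]
    frames = []
    for start in range(0, len(signal) - window_size + 1, hop):
        segment = [signal[start + n] * hann[n] for n in range(window_size)]
        spectrum = []
        # Direct DFT over the non-negative frequency bins only.
        for k in range(window_size // 2 + 1):
            acc = sum(segment[n] * cmath.exp(-2j * math.pi * k * n / window_size)
                      for n in range(window_size))
            spectrum.append(abs(acc))
        frames.append(spectrum)
    return frames

# Synthetic accelerometer axis: a 25 Hz tone sampled at 200 Hz (hypothetical signal).
fs = 200
signal = [math.sin(2 * math.pi * 25 * t / fs) for t in range(256)]
spec = stft_magnitude(signal)
peak_bin = max(range(len(spec[0])), key=lambda k: spec[0][k])
peak_hz = peak_bin * fs / 64
```

Each frame's spectrum becomes one column of a time-frequency image, which is the kind of input an image backbone such as TinyViT can consume.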

Figure 3: Two samples of User A, visualizing IMU STFT spectra across axes and angles, showcasing intra-user spectral consistency.

Figure 4: Visualized feature space of raw and augmented touch interaction data under PCA, delineating compact and well-separated user clusters.

Authentication Protocol and Attack Modeling

BioMoTouch frames authentication as a one-class classification task, using a one-class SVM (OC-SVM), Local Outlier Factor (LOF), and Isolation Forest (IF) to profile the legitimate user. The threat model covers mimicry attacks (behavior imitation), artificial replication (fabrication of biometric traits), and puppet attacks (forced use of genuine biometrics). Empirical evaluation shows low EER and FAR across all attack scenarios, even for attacks that bypass liveness detection.
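
One-class profiling can be sketched with a compact LOF scorer trained only on the legitimate user's samples. This is an illustrative standard-library version, not the paper's configuration: the names `fit_lof` and `lof_score`, the neighbor count `k = 3`, the acceptance threshold of 1.5, and the 2-D toy vectors standing in for 320-dimensional fused embeddings are all assumptions. It also assumes no duplicate training points, which would make a density infinite.

```python
import math

def _knn(point, data, k, skip=None):
    """k nearest neighbors of point in data as (distance, index) pairs."""
    dists = sorted((math.dist(point, q), j) for j, q in enumerate(data) if j != skip)
    return dists[:k]

def _lrd(neighbors, k_dist):
    """Local reachability density: inverse of the mean reachability distance."""
    mean_reach = sum(max(k_dist[j], d) for d, j in neighbors) / len(neighbors)
    return 1.0 / mean_reach

def fit_lof(train, k=3):
    """Precompute k-distances and densities for the enrolled (genuine) samples."""
    neighbors = [_knn(p, train, k, skip=i) for i, p in enumerate(train)]
    k_dist = [nb[-1][0] for nb in neighbors]
    lrd = [_lrd(nb, k_dist) for nb in neighbors]
    return {"train": train, "k": k, "k_dist": k_dist, "lrd": lrd}

def lof_score(model, x):
    """LOF of a new sample: about 1 inside the enrolled cluster, much larger outside."""
    nb = _knn(x, model["train"], model["k"])
    lrd_x = _lrd(nb, model["k_dist"])
    return sum(model["lrd"][j] for _, j in nb) / (len(nb) * lrd_x)

# Hypothetical enrolled embeddings for one user, plus two probe samples.
enrolled = [(0.0, 0.0), (0.1, 0.0), (0.0, 0.1), (0.1, 0.1), (0.05, 0.05), (0.15, 0.05)]
model = fit_lof(enrolled)
genuine_score = lof_score(model, (0.05, 0.02))
impostor_score = lof_score(model, (2.0, 2.0))
accept_genuine = genuine_score < 1.5    # assumed decision threshold
accept_impostor = impostor_score < 1.5
```

Only genuine samples are needed at enrollment, which matches the one-class framing: impostor data are never required to build the user profile.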

Figure 5: Fabrication procedure of fingerprint spoofs, detailing the physical replication protocol for adversarial testing.

Figure 6: Genuine image, illustrating ground-truth comparison in spoof resistance evaluation.

Experimental Results and Numerical Highlights

On the main authentication dataset, BioMoTouch achieved a BAC of 99.71% and an EER of 0.27% with the TinyViT + OC-SVM configuration. Modality ablation showed that capacitive features alone already encode user-specific physiological signatures (EER = 1.00%), but the best result requires multimodal fusion (EER = 0.27%). BioMoTouch kept FAR below 0.90% across artificial replication, mimicry, and puppet attacks, outperforming the commercial fingerprint sensor used for comparison (Live20R: FAR up to 100% under puppet attack).
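
For readers unfamiliar with the reported metrics, the sketch below computes FAR, FRR, EER, and BAC from decision scores. It assumes that higher scores mean "more likely genuine" and that a sample is accepted when its score is at or above the threshold; the score lists are made-up toy values, not data from the paper, and the threshold sweep is a simple approximation of the EER point.

```python
def far_frr(genuine, impostor, threshold):
    """False acceptance and false rejection rates at one threshold."""
    far = sum(s >= threshold for s in impostor) / len(impostor)
    frr = sum(s < threshold for s in genuine) / len(genuine)
    return far, frr

def equal_error_rate(genuine, impostor):
    """Sweep observed scores as thresholds; return (EER, threshold) where FAR and FRR are closest."""
    best = None
    for t in sorted(set(genuine) | set(impostor)):
        far, frr = far_frr(genuine, impostor, t)
        gap = abs(far - frr)
        if best is None or gap < best[0]:
            best = (gap, (far + frr) / 2, t)
    return best[1], best[2]

def balanced_accuracy(genuine, impostor, threshold):
    """BAC = (TPR + TNR) / 2, which equals 1 - (FAR + FRR) / 2."""
    far, frr = far_frr(genuine, impostor, threshold)
    return 1 - (far + frr) / 2

# Toy decision scores (hypothetical).
genuine = [0.9, 0.8, 0.85, 0.7, 0.4]
impostor = [0.1, 0.2, 0.3, 0.5, 0.15]
eer, eer_threshold = equal_error_rate(genuine, impostor)
bac = balanced_accuracy(genuine, impostor, eer_threshold)
```

At the EER operating point FAR and FRR coincide, so BAC is approximately 1 minus EER there. The reported BAC may come from a different operating point, so the two figures need not match exactly.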

Figure 7: ROC curves of the IMU-based method, showing limited discriminative power in isolation.

Figure 8: Decision score distributions of the IMU-based method, exhibiting overlap between genuine and impostor classes.

Reliability and Deployment Robustness

Longitudinal trials demonstrated temporal stability, with EERs consistently below 1.08% across five weeks. BioMoTouch remained effective across different fingers (EER range 0.40%-0.95%), user postures (EER < 1.48% even when walking), finger moisture (EER = 1.24% with wet fingers), and screen protectors (EER ≤ 0.58% for all types).

Figure 9: Long-term EER comparison over five weeks, showcasing temporal robustness versus single-modality baselines.

Figure 10: EERs of different one-class classifiers across fingers, revealing biometric consistency across thumb, index, and middle fingers.

Figure 11: EERs under different user postures, affirming resilience to physical and behavioral context shifts.

Implications and Future Perspectives

The explicit modeling of physiological-behavioral coupling opens new avenues for seamless, unobtrusive mobile authentication that needs no additional hardware. The results challenge the premise that commodity capacitive screens lack fine-grained physiological discriminability. On the theoretical side, BioMoTouch argues that treating biometric modalities as independent is suboptimal for adversarial robustness and that coordinated fusion produces measurable security gains. From a practical standpoint, the framework can be integrated as an auxiliary behavioral biometric that strengthens PIN and fingerprint unlocking workflows. Future directions include domain adaptation for cross-device generalization, continuous authentication during active sessions, and extension of multimodal fusion to additional sensor types.

Conclusion

BioMoTouch delivers a multi-modal behavioral authentication framework that achieves high accuracy, resilience against advanced spoofing vectors, and robust operation across diverse environmental conditions. The strong numerical results validate the core hypothesis of coordinated biometric-motion modeling and its practical utility in strengthening mobile security protocols. The broader implication is a shift toward implicit, liveness-independent, and multimodal behavioral biometrics for both authentication and continuous security monitoring on commodity devices.
