Adapt MobileNet architectures for event and RF inputs in HAR

Develop and evaluate adaptations of the MobileNet family of convolutional neural networks that accept event-camera and radio-frequency (RF) sensing inputs for human action recognition (HAR). The aim is efficient temporal modeling suitable for edge deployment, addressing the compute–temporal fidelity trade-off for event-based and RF data.

Background

Within the discussion of compute–temporal fidelity trade-offs, the paper notes that 3D CNNs achieve high accuracy but are computationally heavy, whereas lighter architectures such as the Temporal Shift Module (TSM) or MobileNet reduce cost at some loss of accuracy. For event-based and RF modalities, this trade-off is stated to be insufficiently studied.
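The compute gap can be made concrete with a back-of-the-envelope multiply-accumulate (MAC) count: a standard 3D convolution couples spatio-temporal filtering with channel mixing, while a MobileNet-style depthwise-separable convolution factorizes them. The layer sizes and helper names below are illustrative assumptions, not figures from the paper.

```python
# Back-of-the-envelope MAC counts (illustrative; sizes are hypothetical).

def conv3d_macs(t, h, w, c_in, c_out, k):
    """MACs for a standard 3D convolution with a k x k x k kernel
    producing a t x h x w output volume with c_out channels."""
    return t * h * w * c_out * c_in * k ** 3

def depthwise_separable_2d_macs(h, w, c_in, c_out, k):
    """MACs for a MobileNet-style block: a k x k depthwise conv
    followed by a 1x1 pointwise conv, per frame."""
    depthwise = h * w * c_in * k ** 2
    pointwise = h * w * c_in * c_out
    return depthwise + pointwise

# Example: one layer on a 16-frame, 56x56 feature map, 64 -> 128 channels.
heavy = conv3d_macs(16, 56, 56, 64, 128, 3)
light = 16 * depthwise_separable_2d_macs(56, 56, 64, 128, 3)  # applied frame-wise
print(f"3D conv MACs:      {heavy:,}")
print(f"Separable 2D MACs: {light:,}")
print(f"Reduction factor:  {heavy / light:.1f}x")
```

Under these assumed sizes, the factorized block is more than an order of magnitude cheaper, which is the efficiency motivating MobileNet adaptations; the open question is how much temporal fidelity such frame-wise processing gives up on event and RF data.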

The authors explicitly identify adapting MobileNet-class architectures to event-camera and RF inputs as an open research area, motivated by the need for efficient spatio-temporal modeling on privacy-preserving modalities and resource-constrained edge devices.
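Feeding an event camera into a frame-based backbone such as MobileNet typically requires an intermediate dense representation; one common choice is a voxel grid that discretizes the asynchronous event stream into a fixed number of temporal bins, which can then be treated as input channels. The sketch below is a minimal pure-Python illustration of that preprocessing step under assumed conventions (events as `(x, y, t, polarity)` tuples with polarity in {-1, +1}); it is not taken from the paper.

```python
def events_to_voxel_grid(events, num_bins, height, width):
    """Accumulate (x, y, t, polarity) events into a voxel grid of shape
    (num_bins, height, width), where each bin covers an equal slice of
    the stream's time span. The bins act as input channels for a
    frame-based CNN such as a MobileNet variant (assumed convention).
    """
    grid = [[[0.0] * width for _ in range(height)] for _ in range(num_bins)]
    if not events:
        return grid
    t0 = min(e[2] for e in events)
    t1 = max(e[2] for e in events)
    span = max(t1 - t0, 1e-9)  # avoid division by zero for a single timestamp
    for x, y, t, polarity in events:
        b = min(int((t - t0) / span * num_bins), num_bins - 1)
        grid[b][y][x] += polarity
    return grid

# Example: four events on a tiny 4x4 sensor, split into 2 temporal bins.
events = [(0, 0, 0.00, +1), (1, 0, 0.01, +1), (2, 3, 0.90, -1), (2, 3, 0.95, -1)]
voxels = events_to_voxel_grid(events, num_bins=2, height=4, width=4)
print(voxels[0][0][0])  # early positive event lands in bin 0
print(voxels[1][3][2])  # the two late negative events accumulate in bin 1
```

Adapting a MobileNet to this input then amounts chiefly to setting the first convolution's input channels to `num_bins` (or `2 * num_bins` if polarities are kept in separate channels), which is one reason such adaptations are plausible yet, as the paper notes, still underexplored.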

References

Adapting architectures such as MobileNet for event or RF inputs is challenging and remains an open area for research [F29].

A Lightweight 3D-CNN for Event-Based Human Action Recognition with Privacy-Preserving Potential (2511.03665 - Dilmaghani et al., 5 Nov 2025), Subsection "Summary of Gaps in Literature" (Compute vs. temporal fidelity)