IMUOptimize: A Data-Driven Approach to Optimal IMU Placement for Human Pose Estimation with Transformer Architecture (2402.08923v2)
Abstract: This paper presents a novel approach for predicting human poses from IMU data, diverging from previous studies such as DIP-IMU, IMUPoser, and TransPose, which use up to 6 IMUs in conjunction with bidirectional RNNs (biRNNs). We introduce two main innovations: a data-driven strategy for optimal IMU placement and a transformer-based model architecture for time series analysis. Our findings indicate that our approach outperforms traditional 6-IMU biRNN models, and that the transformer architecture significantly improves pose reconstruction from data obtained at 24 IMU locations while matching biRNN performance when only 6 IMUs are used. The accuracy gained from our optimally chosen locations, coupled with the parallelizability and performance of transformers, offers a significant improvement for IMU-based pose estimation.
- activPAL: PAL Technologies.
- Captum · Model Interpretability for PyTorch.
- Motion Capture | Movella.com.
- Advanced Computing Center for the Arts and Design. ACCAD MoCap Dataset.
- Breiman, L. Random Forests. Machine Learning 45, 1 (2001), 5–32.
- Carnegie Mellon University. CMU MoCap Dataset.
- Deep inertial poser: learning to reconstruct human pose from sparse inertial measurements in real time. ACM Trans. Graph. 37, 6 (Dec. 2018), 1–15.
- AMASS: Archive of motion capture as surface shapes. In 2019 IEEE/CVF International Conference on Computer Vision (ICCV) (Oct. 2019), pp. 5441–5450.
- IMUPoser: Full-Body Pose Estimation using IMUs in Phones, Watches, and Earbuds. In Proceedings of the 2023 CHI Conference on Human Factors in Computing Systems (Hamburg, Germany, Apr. 2023), ACM, pp. 1–12.
- Troje, N. F. Decomposing biological motion: A framework for analysis and synthesis of human gait patterns. Journal of Vision 2, 5 (Sept. 2002), 2–2.
- Total capture: 3D human pose estimation fusing video and inertial sensors. In Proceedings of the British Machine Vision Conference (BMVC) (Sept. 2017), pp. 14.1–14.13.
- Attention is all you need. In Advances in Neural Information Processing Systems (2017), I. Guyon, U. V. Luxburg, S. Bengio, H. Wallach, R. Fergus, S. Vishwanathan, and R. Garnett, Eds., vol. 30, Curran Associates, Inc.
- Transpose: Keypoint localization via transformer. In IEEE/CVF International Conference on Computer Vision (ICCV) (2021).
- Physical Inertial Poser (PIP): Physics-aware Real-time Human Motion Tracking from Sparse Inertial Sensors, Mar. 2022. arXiv:2203.08528 [cs].
- TransPose: Real-time 3D Human Translation and Pose Estimation with Six Inertial Sensors, May 2021. arXiv:2105.04605 [cs].