EchoWrist: Continuous Hand Pose Tracking and Hand-Object Interaction Recognition Using Low-Power Active Acoustic Sensing On a Wristband (2401.17409v2)
Abstract: Our hands serve as a fundamental means of interaction with the world around us. Therefore, understanding hand poses and interaction context is critical for human-computer interaction. We present EchoWrist, a low-power wristband that continuously estimates 3D hand pose and recognizes hand-object interactions using active acoustic sensing. EchoWrist is equipped with two speakers that emit inaudible sound waves toward the hand. These sound waves interact with the hand and its surroundings through reflection and diffraction, carrying rich information about the hand's shape and the objects it interacts with. The signals captured by the two microphones are fed to a deep learning inference system that recovers hand poses and identifies various everyday hand activities. Results from two 12-participant user studies show that EchoWrist is effective and efficient at tracking 3D hand poses and recognizing hand-object interactions. Operating at 57.9 mW, EchoWrist continuously reconstructs 20 3D hand joints with a mean joint Euclidean distance error (MJEDE) of 4.81 mm and recognizes 12 naturalistic hand-object interactions with 97.6% accuracy.
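To make the reported error metric concrete, here is a minimal sketch of how a mean joint Euclidean distance error could be computed. It assumes MJEDE is the Euclidean distance between each predicted joint and its ground-truth position, averaged over all joints and frames; the function name `mjede` and the nested-list input layout are illustrative choices, not taken from the paper.

```python
import math


def mjede(pred, truth):
    """Mean per-joint Euclidean distance, in the units of the inputs.

    pred, truth: sequences of frames; each frame is a sequence of
    (x, y, z) joint positions. This layout is an assumption made
    for illustration only.
    """
    if len(pred) != len(truth):
        raise ValueError("pred and truth must have the same number of frames")
    total = 0.0
    count = 0
    for pred_frame, truth_frame in zip(pred, truth):
        if len(pred_frame) != len(truth_frame):
            raise ValueError("frames must have the same number of joints")
        for p, t in zip(pred_frame, truth_frame):
            total += math.dist(p, t)  # Euclidean distance for one joint
            count += 1
    if count == 0:
        raise ValueError("no joints to evaluate")
    return total / count


# One frame, two joints: the first is off by 5 units, the second is exact.
example_pred = [[(0.0, 0.0, 0.0), (1.0, 0.0, 0.0)]]
example_truth = [[(3.0, 4.0, 0.0), (1.0, 0.0, 0.0)]]
example_error = mjede(example_pred, example_truth)
```

With coordinates in millimetres, the returned value is directly comparable to a figure such as 4.81 mm; for a 20-joint model, each frame would hold 20 positions.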