
EyeEcho: Continuous and Low-power Facial Expression Tracking on Glasses (2402.12388v2)

Published 13 Feb 2024 in cs.HC

Abstract: In this paper, we introduce EyeEcho, a minimally-obtrusive acoustic sensing system designed to enable glasses to continuously monitor facial expressions. It uses two pairs of speakers and microphones mounted on glasses to emit encoded inaudible acoustic signals directed towards the face, capturing subtle skin deformations associated with facial expressions. The reflected signals are processed through a customized machine-learning pipeline to estimate full facial movements. EyeEcho samples at 83.3 Hz with a relatively low power consumption of 167 mW. Our user study involving 12 participants demonstrates that, with just four minutes of training data, EyeEcho achieves highly accurate tracking performance across different real-world scenarios, including sitting, walking, and after remounting the devices. Additionally, a semi-in-the-wild study involving 10 participants further validates EyeEcho's performance in naturalistic scenarios while participants engage in various daily activities. Finally, we showcase EyeEcho's potential to be deployed on a commercial-off-the-shelf (COTS) smartphone, offering real-time facial expression tracking.
