PrISM-Observer: Intervention Agent to Help Users Perform Everyday Procedures Sensed using a Smartwatch (2407.16785v1)
Abstract: We routinely perform procedures (such as cooking) that consist of a set of atomic steps. Inadvertently omitting or misordering a single step can have serious consequences, especially for people experiencing cognitive challenges such as dementia. This paper introduces PrISM-Observer, a smartwatch-based, context-aware, real-time intervention system designed to support daily tasks by preventing errors. Unlike traditional systems that require users to seek out information, the agent observes user actions and intervenes proactively. It does so by continuously updating its belief about the user's behavior in real time through multimodal sensing and by forecasting optimal intervention moments and methods. We first validated the step-tracking performance of our framework through evaluations across three datasets of differing complexity. We then implemented a real-time agent system using a smartwatch and conducted a user study in a cooking task scenario. The system generated helpful interventions, and participants gave positive feedback. The general applicability of PrISM-Observer to daily tasks promises broad applications, including support for users who require more involved interventions, such as people with dementia or post-surgical patients.