
PrISM-Observer: Intervention Agent to Help Users Perform Everyday Procedures Sensed using a Smartwatch (2407.16785v1)

Published 23 Jul 2024 in cs.HC and cs.AI

Abstract: We routinely perform procedures (such as cooking) that include a set of atomic steps. Often, inadvertent omission or misordering of a single step can lead to serious consequences, especially for those experiencing cognitive challenges such as dementia. This paper introduces PrISM-Observer, a smartwatch-based, context-aware, real-time intervention system designed to support daily tasks by preventing errors. Unlike traditional systems that require users to seek out information, the agent observes user actions and intervenes proactively. This capability is enabled by the agent's ability to continuously update its belief in the user's behavior in real-time through multimodal sensing and forecast optimal intervention moments and methods. We first validated the steps-tracking performance of our framework through evaluations across three datasets with different complexities. Then, we implemented a real-time agent system using a smartwatch and conducted a user study in a cooking task scenario. The system generated helpful interventions, and we gained positive feedback from the participants. The general applicability of PrISM-Observer to daily tasks promises broad applications, for instance, including support for users requiring more involved interventions, such as people with dementia or post-surgical patients.

Summary

  • The paper introduces PrISM-Observer, a smartwatch-based framework leveraging multimodal sensing and a stochastic model to predict user task behavior and provide context-aware interventions.
  • Validation studies showed the framework significantly reduced task step timing errors (exceeding 50% improvement in complex tasks) and achieved high user acceptance in a real-time cooking scenario.
  • PrISM-Observer provides a practical, privacy-aware approach for assisting with everyday tasks, offering implications for supporting users with cognitive challenges and enabling future adaptive systems.

Overview of "PrISM-Observer: Intervention Agent to Help Users Perform Everyday Procedures Sensed using a Smartwatch"

The paper presents PrISM-Observer, a context-aware, smartwatch-based intervention system that helps users carry out everyday procedural tasks accurately. The system monitors user actions through multimodal sensing and intervenes proactively at well-timed moments to prevent errors. The paper addresses lapses in task execution caused by inattention or cognitive challenges and proposes a framework to mitigate them.

Key Contributions

This work contributes to Human-Computer Interaction (HCI) and ubiquitous computing by introducing a general intervention framework applicable to many routine tasks. Its primary contributions are:

  1. Framework for Modeling and Prediction: PrISM-Observer uses a stochastic model that forecasts user task behavior from both current sensor data and the transition probabilities among task steps. These forecasts trigger context-sensitive interventions, delivered as reminders or notifications depending on task specifics and user preferences.
  2. Multimodal Sensing Integration: By employing sound and motion sensors available on smartwatches, the system circumvents the need for intrusive or cumbersome equipment like cameras. This facilitates a seamless integration into users' daily routines, offering a practical and cost-effective solution with minimized privacy concerns.
  3. Practical and Theoretical Evaluation: The paper reports two studies. Study 1 uses three datasets of differing complexity to validate how well the framework predicts task step timings, while Study 2 evaluates a real-time agent system in a controlled cooking task scenario. Together, the studies show improved timing predictions and user acceptance of the interventions.
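
The forecasting idea in the first contribution can be sketched in a few lines. This is a minimal illustration, not the authors' implementation: it assumes the procedure is a Markov chain over steps, that step durations are roughly Gaussian, and that the current step has only just begun. The function names, the `belief`, `transitions`, and `durations` structures, and the toy recipe are all hypothetical. The sketch samples many possible futures (a Monte Carlo estimate) and averages the time until a target step starts, which is the moment an intervention might be scheduled.

```python
import random

def sample_time_to_step(belief, transitions, durations, target, rng, max_steps=1000):
    """Sample one possible future; return seconds until `target` starts, or None."""
    steps = list(belief)
    # Draw the current step from the belief distribution produced by sensing.
    current = rng.choices(steps, weights=[belief[s] for s in steps])[0]
    elapsed = 0.0
    for _ in range(max_steps):
        if current == target:
            return elapsed
        mean, std = durations[current]
        # Assumption: Gaussian step duration, clipped at zero.
        elapsed += max(0.0, rng.gauss(mean, std))
        options = transitions.get(current)
        if not options:
            return None  # procedure ended without reaching the target step
        names = list(options)
        current = rng.choices(names, weights=[options[n] for n in names])[0]
    return None

def forecast_time_to_step(belief, transitions, durations, target, n_samples=2000, seed=0):
    """Average the sampled times; returns None if the target is never reached."""
    rng = random.Random(seed)
    samples = []
    for _ in range(n_samples):
        t = sample_time_to_step(belief, transitions, durations, target, rng)
        if t is not None:
            samples.append(t)
    return sum(samples) / len(samples) if samples else None

# Hypothetical three-step procedure: (mean, std) durations in seconds.
transitions = {"chop": {"boil": 1.0}, "boil": {"serve": 1.0}}
durations = {"chop": (10.0, 0.0), "boil": (5.0, 0.0), "serve": (3.0, 0.0)}

# The sensing model is unsure whether the user is chopping or already boiling.
belief = {"chop": 0.5, "boil": 0.5, "serve": 0.0}
eta = forecast_time_to_step(belief, transitions, durations, "serve")
print(f"Expected seconds until 'serve' starts: {eta:.1f}")
```

As new sensor frames arrive, the belief would be updated and the forecast recomputed, so the scheduled intervention time tightens as the target step approaches.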

Numerical Outcomes and Validation

In Study 1, the framework reduced the error in predicted task step timings, and the benefit grew with task complexity. This finding underscores the value of real-time sensing for refining forecasts. The largest reductions appeared in complex tasks, where the improvement in error exceeded 50% in some cases.
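
To make the "improvement exceeding 50%" figure concrete, the snippet below shows one way such a number could be computed. It is a sketch under assumptions: this summary does not state the exact metric, so mean absolute error between predicted and actual step start times is assumed, and every number is invented for illustration rather than taken from the paper.

```python
def mean_abs_error(predicted, actual):
    """Mean absolute difference, in seconds, between predicted and actual times."""
    if len(predicted) != len(actual) or not predicted:
        raise ValueError("inputs must be non-empty and the same length")
    return sum(abs(p - a) for p, a in zip(predicted, actual)) / len(predicted)

def relative_improvement(baseline_error, model_error):
    """Fraction by which the model's error is lower than the baseline's."""
    return (baseline_error - model_error) / baseline_error

# Illustrative values only: when three steps actually started (seconds).
actual = [60.0, 150.0, 300.0]
# Hypothetical baseline that forecasts from transition statistics alone.
baseline_pred = [90.0, 110.0, 350.0]
# Hypothetical forecast that also uses real-time sensing.
sensed_pred = [70.0, 135.0, 320.0]

baseline_err = mean_abs_error(baseline_pred, actual)
sensed_err = mean_abs_error(sensed_pred, actual)
print(f"baseline MAE: {baseline_err:.1f} s, sensing-informed MAE: {sensed_err:.1f} s")
print(f"relative improvement: {relative_improvement(baseline_err, sensed_err):.0%}")
```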

Study 2 focused on real-world applicability, deploying the framework on an Apple Watch to assess user interaction and perception. The results reflected high accuracy in intervention timing and a positive user response concerning the system's reliability and usability, with most interventions perceived as accurate.

Limitations and Future Implications

Despite its promising results, PrISM-Observer's main limitation is that inaccurate sensor data can lead to timing errors. Future research could explore ways to improve sensing accuracy and add error correction mechanisms, possibly using user feedback to rectify system misjudgments. The paper also points toward adaptive systems that learn from user interactions over time to tailor interventions more precisely.

The research suggests applications beyond the evaluated scenarios, such as supporting individuals with cognitive impairments or extending the framework to more intricate procedural settings like industrial assembly or healthcare compliance. Balancing machine adaptability with user customization is a key challenge for pervasive computing systems that aim to assist users with varied needs.

This paper contributes to the field by enhancing our understanding of contextual intervention systems while providing a scalable platform for future innovations in computational assistants that respect user autonomy and context-awareness.
