Inferring Human Intentions from Predicted Action Probabilities (2308.12194v2)

Published 23 Aug 2023 in cs.HC

Abstract: Predicting the next action that a human is most likely to perform is key to human-AI collaboration and has consequently attracted increasing research interest in recent years. An important factor in next action prediction is human intention: if the AI agent knows the intention, it can predict future actions and plan the collaboration more effectively. Existing Bayesian methods for this task struggle with complex visual input, while deep neural network (DNN) based methods do not provide uncertainty quantification. In this work, we combine both approaches for the first time and show that the predicted next-action probabilities contain information that can be used to infer the underlying intention. We propose a two-step approach to human intention prediction: a DNN predicts the probabilities of the next action, and MCMC-based Bayesian inference then infers the underlying intention from these predictions. This approach not only allows for independent design of the DNN architecture but also enables fast, design-independent inference of human intentions. We evaluate our method in a series of experiments on the Watch-And-Help (WAH) dataset and a keyboard and mouse interaction dataset. Our results show that our approach can accurately predict human intentions from observed actions and from the implicit information contained in next-action probabilities. Furthermore, we show that our approach can predict the correct intention even if only a few actions have been observed.
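
To make the two-step approach concrete, here is a minimal, self-contained Python sketch (not the authors' implementation): a fixed toy table stands in for the DNN's predicted next-action probabilities, and a simple Metropolis sampler plays the role of the MCMC-based Bayesian inference over a discrete set of intentions. All action and intention names, the probability table, the uniform prior, and the sampler settings are illustrative assumptions.

    # Minimal illustrative sketch (not from the paper): infer a discrete intention
    # from DNN-style next-action probabilities with a Metropolis sampler.
    import random

    INTENTIONS = ["prepare_meal", "set_table", "work"]  # toy intention space

    # Stand-in for a trained DNN: P(next action | intention) as a fixed toy table.
    ACTION_PROBS = {
        "prepare_meal": {"open_fridge": 0.7, "grab_plate": 0.2, "walk_to_desk": 0.1},
        "set_table":    {"open_fridge": 0.1, "grab_plate": 0.8, "walk_to_desk": 0.1},
        "work":         {"open_fridge": 0.1, "grab_plate": 0.1, "walk_to_desk": 0.8},
    }

    def unnormalized_posterior(intention, observed_actions):
        # Uniform prior over intentions, so the posterior is proportional to the
        # product of the predicted probabilities of the observed actions.
        p = 1.0
        for action in observed_actions:
            p *= ACTION_PROBS[intention][action]
        return p

    def sample_intentions(observed_actions, n_samples=5000, seed=0):
        # Metropolis sampling over the discrete intention space with a symmetric
        # (uniform) proposal distribution.
        rng = random.Random(seed)
        current = rng.choice(INTENTIONS)
        samples = []
        for _ in range(n_samples):
            proposal = rng.choice(INTENTIONS)
            ratio = (unnormalized_posterior(proposal, observed_actions)
                     / max(unnormalized_posterior(current, observed_actions), 1e-12))
            if rng.random() < min(1.0, ratio):
                current = proposal
            samples.append(current)
        return samples

    if __name__ == "__main__":
        observed = ["open_fridge", "grab_plate"]  # a short observed action sequence
        samples = sample_intentions(observed)
        for intention in INTENTIONS:
            print(intention, samples.count(intention) / len(samples))

In this toy setup, the sampled posterior assigns the most mass to prepare_meal after only two observed actions, illustrating how predicted next-action probabilities implicitly identify the underlying intention even from few observations.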

Authors (3)
  1. Lei Shi (262 papers)
  2. Paul-Christian Bürkner (58 papers)
  3. Andreas Bulling (81 papers)
Citations (1)
