GazeMotion: Gaze-guided Human Motion Forecasting (2403.09885v2)

Published 14 Mar 2024 in cs.CV

Abstract: We present GazeMotion, a novel method for human motion forecasting that combines information on past human poses with human eye gaze. Inspired by evidence from behavioural sciences showing that human eye and body movements are closely coordinated, GazeMotion first predicts future eye gaze from past gaze, then fuses predicted future gaze and past poses into a gaze-pose graph, and finally uses a residual graph convolutional network to forecast body motion. We extensively evaluate our method on the MoGaze, ADT, and GIMO benchmark datasets and show that it outperforms state-of-the-art methods by up to 7.4% in mean per joint position error. Using head direction as a proxy for gaze, our method still achieves an average improvement of 5.5%. Finally, we report an online user study showing that our method also outperforms prior methods in terms of perceived realism. These results show the significant information content available in eye gaze for human motion forecasting, as well as the effectiveness of our method in exploiting this information.
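The abstract describes a three-stage pipeline: forecast future gaze from past gaze, fuse the predicted gaze with past poses into a gaze-pose graph, and regress future poses with a residual graph convolutional network. The sketch below is a minimal, hypothetical PyTorch rendering of that pipeline, not the authors' implementation: the GRU gaze forecaster, the single gaze node appended to the joint graph, the layer widths, and the offsets-from-last-pose decoding are all illustrative assumptions.

```python
import torch
import torch.nn as nn


class GraphConv(nn.Module):
    """Graph convolution with a learnable adjacency over all nodes."""
    def __init__(self, num_nodes: int, in_feat: int, out_feat: int):
        super().__init__()
        self.adj = nn.Parameter(torch.eye(num_nodes))  # learnable node connectivity
        self.lin = nn.Linear(in_feat, out_feat)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # x: (batch, nodes, in_feat) -> (batch, nodes, out_feat)
        return self.adj @ self.lin(x)


class GazeMotionSketch(nn.Module):
    """Hypothetical sketch: gaze forecast -> gaze-pose graph -> residual GCN."""
    def __init__(self, joints: int = 21, hist: int = 10,
                 horizon: int = 30, feat: int = 128):
        super().__init__()
        nodes = joints + 1                              # body joints + one gaze node
        self.horizon = horizon
        self.gaze_net = nn.GRU(3, 3, batch_first=True)  # placeholder gaze forecaster
        self.enc = GraphConv(nodes, hist * 3, feat)
        self.res_block = nn.Sequential(
            GraphConv(nodes, feat, feat), nn.Tanh(),
            GraphConv(nodes, feat, feat),
        )
        self.dec = GraphConv(nodes, feat, horizon * 3)

    def forward(self, past_pose: torch.Tensor, past_gaze: torch.Tensor) -> torch.Tensor:
        # past_pose: (batch, hist, joints, 3); past_gaze: (batch, hist, 3)
        b, t, j, _ = past_pose.shape
        future_gaze, _ = self.gaze_net(past_gaze)       # predict future gaze from past gaze
        gaze_node = future_gaze.reshape(b, 1, -1)       # one graph node for the gaze track
        pose_nodes = past_pose.permute(0, 2, 1, 3).reshape(b, j, -1)
        x = torch.tanh(self.enc(torch.cat([pose_nodes, gaze_node], dim=1)))
        x = x + self.res_block(x)                       # residual graph convolution
        out = self.dec(x)[:, :j].view(b, j, self.horizon, 3)
        last = past_pose[:, -1:].permute(0, 2, 1, 3)    # last observed pose, (b, j, 1, 3)
        return out + last                               # forecast as offsets from last pose


if __name__ == "__main__":
    model = GazeMotionSketch()
    pose_hist = torch.randn(4, 10, 21, 3)        # 4 clips, 10 past frames, 21 joints
    gaze_hist = torch.randn(4, 10, 3)            # past 3D gaze directions
    pred = model(pose_hist, gaze_hist)           # (4, 21, 30, 3) future joint positions
    target = torch.randn_like(pred)              # stand-in ground truth
    mpjpe = (pred - target).norm(dim=-1).mean()  # mean per joint position error (MPJPE)
    print(pred.shape, mpjpe.item())
```

The final lines compute MPJPE, the metric the abstract reports improvements on: the Euclidean distance between predicted and ground-truth joint positions, averaged over joints and frames. Treating gaze as a single extra graph node is one natural fusion choice; the paper's actual graph construction, gaze predictor, and training details are in the full text.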

References (21)
  1. A. Belardinelli, A. R. Kondapally, D. Ruiken, D. Tanneberg, and T. Watabe, “Intention estimation from gaze and motion features for human-robot shared-control object manipulation,” in Proceedings of the 2022 IEEE International Conference on Intelligent Robots and Systems. IEEE, 2022, pp. 9806–9813.
  2. L. Shi, C. Copot, and S. Vanlanduit, “GazeEMD: Detecting visual intention in gaze-based human-robot interaction,” Robotics, vol. 10, no. 2, p. 68, 2021.
  3. A. T. Le, P. Kratzer, S. Hagenmayer, M. Toussaint, and J. Mainprice, “Hierarchical human-motion prediction and logic-geometric programming for minimal interference human-robot tasks,” in Proceedings of the 2021 IEEE International Conference on Robot and Human Interactive Communication. IEEE, 2021, pp. 7–14.
  4. J. Martinez, M. J. Black, and J. Romero, “On human motion prediction using recurrent neural networks,” in Proceedings of the 2017 IEEE Conference on Computer Vision and Pattern Recognition, 2017, pp. 2891–2900.
  5. W. Mao, M. Liu, M. Salzmann, and H. Li, “Multi-level motion attention for human motion prediction,” International Journal of Computer Vision, vol. 129, no. 9, pp. 2513–2535, 2021.
  6. T. Ma, Y. Nie, C. Long, Q. Zhang, and G. Li, “Progressively generating better initial guesses towards next stages for high-quality human motion prediction,” in Proceedings of the 2022 IEEE Conference on Computer Vision and Pattern Recognition, 2022, pp. 6437–6446.
  7. W. Guo, Y. Du, X. Shen, V. Lepetit, X. Alameda-Pineda, and F. Moreno-Noguer, “Back to MLP: A simple baseline for human motion prediction,” in Proceedings of the 2023 IEEE Winter Conference on Applications of Computer Vision, 2023, pp. 4809–4819.
  8. E. G. Freedman, “Coordination of the eyes and head during visual orienting,” Experimental Brain Research, vol. 190, pp. 369–387, 2008.
  9. H. H. Goossens and A. V. Opstal, “Human eye-head coordination in two dimensions under different sensorimotor conditions,” Experimental Brain Research, vol. 114, pp. 542–560, 1997.
  10. Q. Sun, A. Patney, L.-Y. Wei, O. Shapira, J. Lu, P. Asente, S. Zhu, M. McGuire, D. Luebke, and A. Kaufman, “Towards virtual reality infinite walking: dynamic saccadic redirection,” ACM Transactions on Graphics, vol. 37, no. 4, pp. 1–13, 2018.
  11. K. J. Emery, M. Zannoli, J. Warren, L. Xiao, and S. S. Talathi, “OpenNEEDS: A dataset of gaze, head, hand, and scene signals during exploration in open-ended VR environments,” in Proceedings of the 2021 ACM Symposium on Eye Tracking Research and Applications, 2021, pp. 1–7.
  12. Y. Zheng, Y. Yang, K. Mo, J. Li, T. Yu, Y. Liu, K. Liu, and L. J. Guibas, “GIMO: Gaze-informed human motion prediction in context,” in Proceedings of the 2022 European Conference on Computer Vision, 2022.
  13. P. Kratzer, S. Bihlmaier, N. B. Midlagajni, R. Prakash, M. Toussaint, and J. Mainprice, “MoGaze: A dataset of full-body motions that includes workspace geometry and eye-gaze,” IEEE Robotics and Automation Letters, vol. 6, no. 2, pp. 367–373, 2020.
  14. X. Pan, N. Charron, Y. Yang, S. Peters, T. Whelan, C. Kong, O. Parkhi, R. Newcombe, and Y. C. Ren, “Aria Digital Twin: A new benchmark dataset for egocentric 3D machine perception,” in Proceedings of the 2023 IEEE International Conference on Computer Vision, 2023, pp. 20133–20143.
  15. Z. Hu, C. Zhang, S. Li, G. Wang, and D. Manocha, “SGaze: A data-driven eye-head coordination model for realtime gaze prediction,” IEEE Transactions on Visualization and Computer Graphics, vol. 25, no. 5, pp. 2002–2010, 2019.
  16. Z. Hu, A. Bulling, S. Li, and G. Wang, “FixationNet: Forecasting eye fixations in task-oriented virtual environments,” IEEE Transactions on Visualization and Computer Graphics, vol. 27, no. 5, pp. 2681–2690, 2021.
  17. N. F. Duarte, M. Raković, J. Tasevski, M. I. Coco, A. Billard, and J. Santos-Victor, “Action anticipation: Reading the intentions of humans and robots,” IEEE Robotics and Automation Letters, vol. 3, no. 4, pp. 4132–4139, 2018.
  18. H. Kim, Y. Ohmura, and Y. Kuniyoshi, “Memory-based gaze prediction in deep imitation learning for robot manipulation,” in Proceedings of the 2022 IEEE International Conference on Robotics and Automation (ICRA), 2022, pp. 2427–2433.
  19. Z. Hu, S. Li, C. Zhang, K. Yi, G. Wang, and D. Manocha, “DGaze: CNN-based gaze prediction in dynamic scenes,” IEEE Transactions on Visualization and Computer Graphics, vol. 26, no. 5, pp. 1902–1911, 2020.
  20. W. Mao, M. Liu, M. Salzmann, and H. Li, “Learning trajectory dependencies for human motion prediction,” in Proceedings of the 2019 IEEE International Conference on Computer Vision, 2019, pp. 9489–9497.
  21. L. Sidenmark and H. Gellersen, “Eye, head and torso coordination during gaze shifts in virtual reality,” ACM Transactions on Computer-Human Interaction, vol. 27, no. 1, pp. 1–40, 2019.