Map-Aware Human Pose Prediction for Robot Follow-Ahead (2403.13294v1)
Abstract: In the robot follow-ahead task, a mobile robot must maintain its relative position in front of a moving human actor while keeping the actor in sight. To accomplish this task, it is important that the robot understand the full 3D pose of the human (since the head orientation can differ from that of the torso) and predict future human poses so that it can plan accordingly. This prediction task is especially difficult in complex environments with junctions and multiple corridors. In this work, we address the problem of forecasting the full 3D trajectory of a human in such environments. Our main insight is that one can first predict the 2D trajectory and then estimate the full 3D trajectory by conditioning the estimator on the predicted 2D trajectory. With this approach, we achieve results comparable to or better than state-of-the-art methods while being three times faster. As part of our contribution, we present a new dataset in which, in contrast to existing datasets, the human motion covers an area much larger than a single room. We also present a complete robot system that integrates our human pose forecasting network on a mobile robot to enable real-time robot follow-ahead, and we report results from real-world experiments in multiple buildings on campus. Our project page, including supplementary material and videos, can be found at: https://qingyuan-jiang.github.io/iros2024_poseForecasting/