Multi-View Active Sensing for Human-Robot Interaction via Hierarchically Connected Tree (2403.12538v1)

Published 19 Mar 2024 in cs.RO

Abstract: Comprehensive perception of human beings is a prerequisite for ensuring the safety of human-robot interaction. The prevailing visual sensing approach typically involves a single static camera, resulting in a restricted and occluded field of view. In our work, we develop an active vision system that uses multiple cameras to dynamically capture multi-source RGB-D data. An integrated human sensing strategy based on a hierarchically connected tree structure is proposed to fuse localized visual information. The tree model consists of nodes representing keypoints and edges representing keyparts, which remain interconnected to preserve structural constraints during multi-source fusion. Using RGB-D data and HRNet, the 3D positions of keypoints are analytically estimated, and their presence is inferred through a sliding window of confidence scores. Subsequently, the point clouds of reliable keyparts are extracted by drawing occlusion-resistant masks, enabling fine registration between the point clouds and a cylindrical model in hierarchical order. Experimental results demonstrate that our method improves keypart recognition recall from 69.20% to 90.10% compared to a single static camera. Moreover, by overcoming the challenges of localized and occluded perception, the method effectively improves the robotic arm's obstacle avoidance capabilities.
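
The abstract outlines two computations that can be sketched concretely: analytic estimation of a keypoint's 3D position from RGB-D data, and presence inference via a sliding window of confidence scores. The sketch below assumes the standard pinhole camera model; the function names, window size, and confidence threshold are illustrative assumptions rather than values taken from the paper.

```python
from collections import deque

import numpy as np


def backproject_keypoint(u, v, z, fx, fy, cx, cy):
    """Lift a 2D keypoint (u, v) with measured depth z to a 3D point
    in the camera frame using the pinhole model."""
    x = (u - cx) * z / fx
    y = (v - cy) * z / fy
    return np.array([x, y, z])


class KeypointPresence:
    """Decide whether a keypoint is reliably present by averaging its
    detection confidences over a sliding window of recent frames.

    The window size and threshold here are illustrative defaults,
    not values reported in the paper.
    """

    def __init__(self, window_size=10, threshold=0.5):
        self.scores = deque(maxlen=window_size)
        self.threshold = threshold

    def update(self, confidence):
        """Record the latest confidence and return the presence decision."""
        self.scores.append(confidence)
        return sum(self.scores) / len(self.scores) > self.threshold


# Example: a keypoint detected near the image centre at 1.2 m depth.
presence = KeypointPresence(window_size=5, threshold=0.6)
point_3d = backproject_keypoint(320, 240, 1.2,
                                fx=615.0, fy=615.0, cx=319.5, cy=239.5)
is_present = presence.update(0.82)
```

In the paper's pipeline, per-camera keypoint estimates produced in this manner would then be fused along the hierarchically connected tree, with the point clouds of reliable keyparts registered to the cylindrical model in hierarchical order.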
