Improving Visual Perception of a Social Robot for Controlled and In-the-wild Human-robot Interaction (2403.01766v2)
Abstract: Social robots often rely on visual perception to understand their users and the environment. Recent advances in data-driven approaches to computer vision have demonstrated great potential for applying deep-learning models to enhance a social robot's visual perception. However, the high computational demands of deep-learning methods, compared with more resource-efficient shallow-learning models, raise important questions about their effects on real-world interaction and user experience. It is unclear how objective interaction performance and subjective user experience are affected when a social robot adopts a deep-learning-based visual perception model. We employed state-of-the-art human perception and tracking models to improve the visual perception function of the Pepper robot, and conducted a controlled lab study and an in-the-wild human-robot interaction study to evaluate this novel perception function for following a specific user while other people are present in the scene.
Authors: Wangjie Zhong, Leimin Tian, Duy Tho Le, Hamid Rezatofighi