- The paper presents a novel pipeline integrating OpenPose-based detection, 3D tracking, and X3D action recognition on mobile robots.
- The pipeline operates efficiently on edge devices, enabling near real-time decision-making despite environmental challenges.
- Extensive experiments using a custom dataset demonstrate high prediction accuracy and resource-efficient performance in dynamic real-world settings.
Introduction to HAR in Mobile Robots
The role of robots in society is expanding, particularly in areas such as healthcare and ambient assisted living (AAL), where they can provide valuable assistance. These service robots are increasingly sophisticated and are often tasked with interpreting human activities and responding accordingly. This is where Human Action Recognition (HAR) comes into play: a complex process that entails detecting human presence, tracking movements, and ultimately understanding human actions.
The Proposed Pipeline
To ensure efficient and reliable HAR on mobile service robots, we present a comprehensive pipeline that covers the entire process, from initial detection to action recognition. The pipeline is designed to operate in near real-time on edge devices, meaning it runs directly on the robot rather than relying on additional remote processing power. This is critical for autonomous mobile robots that must make swift decisions based on human activity.
The process starts with human detection, using OpenPose to locate and identify the skeletal keypoints of humans captured by the robot's camera. Detected individuals are then tracked in 3D within the robot's operating space; this tracking is crucial for the robot to contextualize its observations in the environment. The next steps are user identification, via a face recognition algorithm, and action recognition, which leverages the robot's sensory data to understand and categorize human actions.
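The stage ordering described above can be sketched as a simple chain of processing steps. The class and function names below are illustrative stand-ins, not the paper's actual interfaces; each real component (OpenPose, the 3D tracker, face recognition, the X3D classifier) would replace the corresponding stub:

```python
from dataclasses import dataclass, field
from typing import Optional

# Hypothetical record passed between pipeline stages; the paper does not
# specify its interfaces, so these fields are illustrative only.
@dataclass
class Observation:
    frame_id: int
    keypoints_2d: list = field(default_factory=list)  # filled by pose estimation
    position_3d: Optional[tuple] = None               # filled by 3D tracking
    user_id: Optional[str] = None                     # filled by face recognition
    action: Optional[str] = None                      # filled by action recognition

def detect_pose(obs: Observation) -> Observation:
    # Stand-in for an OpenPose call returning skeletal keypoints.
    obs.keypoints_2d = [(0.5, 0.4)]  # dummy normalized keypoint
    return obs

def track_3d(obs: Observation) -> Observation:
    # Stand-in for lifting detections into the robot's 3D operating space.
    obs.position_3d = (1.2, 0.3, 0.9)  # dummy coordinates in meters
    return obs

def identify_user(obs: Observation) -> Observation:
    # Stand-in for the face recognition step.
    obs.user_id = "user-0"
    return obs

def recognize_action(obs: Observation) -> Observation:
    # Stand-in for an X3D-based action classifier.
    obs.action = "walking"
    return obs

STAGES = [detect_pose, track_3d, identify_user, recognize_action]

def run_pipeline(frame_id: int) -> Observation:
    """Run one frame through all stages in order."""
    obs = Observation(frame_id)
    for stage in STAGES:
        obs = stage(obs)
    return obs
```

Structuring the stages as a list of callables keeps the chain easy to reorder or extend, which matters when swapping in alternative detectors during benchmarking.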
Challenges and Solutions
Several challenges arise when deploying HAR systems on mobile robots, including viewpoint variation, inconsistent lighting, and obstructions in the robot's field of vision. Overcoming these requires not only a carefully designed pipeline but also lightweight yet efficient algorithms that can handle such conditions in real time without overwhelming the robot's computational capabilities.
To determine the most fitting models for our system, we conducted extensive experiments comparing state-of-the-art detection and recognition solutions, focusing on both efficiency and detection performance. This led us to adopt efficient algorithms such as OpenPose for user detection and X3D for action recognition.
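The trade-off driving this model choice can be sketched as a small selection routine: among candidates whose per-clip latency fits the edge device's near real-time budget, pick the most accurate. The candidate names, accuracy values, and latency figures below are placeholders for illustration, not measurements from the paper:

```python
# Placeholder benchmark table: accuracy and per-clip latency per candidate.
# These numbers are illustrative, NOT results reported in the paper.
CANDIDATES = {
    "x3d_s":    {"accuracy": 0.88, "latency_ms": 45.0},
    "x3d_m":    {"accuracy": 0.91, "latency_ms": 90.0},
    "slowfast": {"accuracy": 0.93, "latency_ms": 210.0},
}

def select_model(candidates: dict, budget_ms: float) -> str:
    """Return the most accurate model whose latency fits the budget."""
    feasible = {name: m for name, m in candidates.items()
                if m["latency_ms"] <= budget_ms}
    if not feasible:
        raise ValueError("no candidate meets the latency budget")
    return max(feasible, key=lambda name: feasible[name]["accuracy"])
```

Under a hypothetical 100 ms budget this rule would prefer a mid-sized model over a heavier, slightly more accurate one, mirroring the accuracy-versus-efficiency balance the experiments targeted.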
Dataset and Experimental Results
To evaluate our system, we compiled a dataset that combines publicly available datasets with data captured specifically for this project. The new dataset contains a variety of daily activities recorded from the robot's perspective, reflecting the real-world scenarios the robot would encounter. When assessing HAR models against this dataset, we prioritized those offering a good balance between accuracy and resource utilization.
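Merging public datasets with newly captured recordings typically requires mapping heterogeneous labels onto a single activity taxonomy. Below is a minimal sketch of such harmonization with a hypothetical label set; the paper's actual activity classes and mappings are not listed here:

```python
# Hypothetical target taxonomy and public-to-common label mapping;
# both are illustrative, not the dataset's real classes.
COMMON_LABELS = {"drinking", "walking", "sitting_down", "waving"}

PUBLIC_TO_COMMON = {
    "drink water": "drinking",
    "walk": "walking",
    "sit down": "sitting_down",
}

def harmonize(samples: list) -> list:
    """Map (source, label) pairs onto the common taxonomy, dropping
    samples whose label has no equivalent in the merged dataset."""
    merged = []
    for src, label in samples:
        if src == "public":
            label = PUBLIC_TO_COMMON.get(label, label)
        if label in COMMON_LABELS:
            merged.append((src, label))
    return merged
```

Dropping unmappable samples rather than guessing a class keeps the merged label space consistent, at the cost of discarding some public data.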
The efficacy of our approach is demonstrated through rigorous testing on a dedicated mobile robot platform, which showed promising real-time performance while maintaining high prediction accuracy. We also introduced variations into the experiments, such as simulating different environmental conditions and user interactions, to ensure the system is well adapted to dynamic, varied real-world applications.
Conclusion
This work presents a viable end-to-end solution for recognizing human actions from mobile robots in real time. It not only enhances the capability of service robots to interpret human behavior but also sets a precedent for future advances in robotic human perception. Our findings offer a pathway to improving interaction between service robots and the people they assist, potentially transforming how such robots are used in critical sectors like healthcare.