EgoLocate: Real-time Motion Capture, Localization, and Mapping with Sparse Body-mounted Sensors (2305.01599v1)

Published 2 May 2023 in cs.CV and cs.GR

Abstract: Human and environment sensing are two important topics in Computer Vision and Graphics. Human motion is often captured by inertial sensors, while the environment is mostly reconstructed using cameras. We integrate the two techniques in EgoLocate, a system that simultaneously performs human motion capture (mocap), localization, and mapping in real time from sparse body-mounted sensors, including 6 inertial measurement units (IMUs) and a monocular phone camera. On one hand, inertial mocap suffers from large translation drift due to the lack of a global positioning signal. EgoLocate leverages image-based simultaneous localization and mapping (SLAM) techniques to locate the human in the reconstructed scene. On the other hand, SLAM often fails when visual features are poor. EgoLocate uses inertial mocap to provide a strong prior for the camera motion. Experiments show that localization, a key challenge for both fields, is largely improved by our technique compared with the state of the art in each field. Our code is available for research at https://xinyu-yi.github.io/EgoLocate/.

Citations (35)

Summary

  • The paper introduces a framework that fuses inertial mocap with image-based SLAM to enhance real-time motion capture and localization.
  • It employs a novel mocap-aware bundle adjustment and Kalman filter to refine pose estimates and reduce tracking errors.
  • Experimental results on datasets like TotalCapture demonstrate reduced root position errors and improved stability for AR/VR applications.

An Overview of EgoLocate: Real-time Motion Capture, Localization, and Mapping with Sparse Body-mounted Sensors

The paper "EgoLocate: Real-time Motion Capture, Localization, and Mapping with Sparse Body-mounted Sensors" presents a sophisticated system combining human motion capture (mocap) and environmental sensing through simultaneous localization and mapping (SLAM). This integration addresses the intrinsic limitations of each approach when used independently and leverages their complementary strengths to achieve more accurate and robust motion analysis and localization in real-time applications.

Core Contributions and Methodology

EgoLocate Framework: The authors propose a framework utilizing six inertial measurement units (IMUs) and a monocular phone camera to perform real-time motion capture, localization, and mapping. The system selectively combines inertial mocap and image-based SLAM techniques to exploit their individual strengths.

  1. Inertial Motion Capture: The system first estimates human motion from the six IMUs. Building on prior work such as PIP, the authors adapt the method by removing the flat-ground assumption and the associated force calculations, which do not hold in general 3D environments where the terrain geometry is unknown.
  2. Camera Tracking: Drawing on ORB-SLAM3, the paper introduces mocap-constrained camera tracking: camera poses are optimized against robust feature point matches together with mocap-derived priors, mitigating the impact of outlier matches and improving pose estimation accuracy (a small optimization sketch follows this list).
  3. Mapping and Loop Closing: The framework features a mocap-aware bundle adjustment in which sparse mocap data are integrated into the SLAM back-end optimization. A key innovation is a mocap-related map point confidence that dynamically weights each map point's constraint in bundle adjustment, refining both pose and map accuracy (see the weighting sketch after this list).
  4. Kalman Filter for Refinement: A prediction-correction scheme based on Kalman filtering updates the human motion state from both mocap and SLAM data, continuously refining the translation estimate by fusing the inertial prediction with the camera-based localization and remaining robust when visual tracking degrades (see the filter sketch below).
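
To make the mocap-constrained camera tracking concrete, the following is a minimal sketch, not the authors' implementation (which builds on ORB-SLAM3 in C++): a camera pose is optimized against 2D-3D feature matches while a weighted prior term keeps the solution close to the pose predicted by inertial mocap. The function names, the `prior_weight` parameter, and the simple pinhole projection are illustrative assumptions.

```python
import numpy as np
from scipy.optimize import least_squares
from scipy.spatial.transform import Rotation as R

def project(points_w, rvec, t, K):
    """Project world points into the image for pose (rvec, t) and intrinsics K."""
    Rcw = R.from_rotvec(rvec).as_matrix()      # world-to-camera rotation
    p_cam = (Rcw @ points_w.T).T + t           # transform into the camera frame
    p_img = (K @ p_cam.T).T
    return p_img[:, :2] / p_img[:, 2:3]        # perspective division

def residuals(x, points_w, obs_2d, K, rvec_prior, t_prior, prior_weight):
    rvec, t = x[:3], x[3:]
    reproj = (project(points_w, rvec, t, K) - obs_2d).ravel()
    # Mocap prior: penalize deviation from the mocap-predicted camera pose.
    prior = prior_weight * np.concatenate([rvec - rvec_prior, t - t_prior])
    return np.concatenate([reproj, prior])

def track_camera(points_w, obs_2d, K, rvec_prior, t_prior, prior_weight=5.0):
    x0 = np.concatenate([rvec_prior, t_prior])  # start from the mocap prediction
    sol = least_squares(
        residuals, x0, loss="huber",            # robust loss downweights outlier matches
        args=(points_w, obs_2d, K, rvec_prior, t_prior, prior_weight),
    )
    return sol.x[:3], sol.x[3:]                 # refined axis-angle rotation and translation
```

In such a formulation, the prior keeps tracking stable when few or poor feature matches are available, which is exactly the failure mode of pure SLAM that the paper targets.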
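
The mocap-aware bundle adjustment can likewise be pictured as ordinary reprojection-error minimization in which each map point's residual is scaled by a confidence weight. The helper below is only an illustrative sketch of that weighting; how the confidence is derived from agreement with the mocap estimate is the paper's contribution and is not reproduced here.

```python
import numpy as np

def weighted_ba_residuals(map_points, observations, project_fn, confidences):
    """Stack confidence-weighted reprojection residuals for one keyframe.

    map_points   : (N, 3) current 3D map point estimates
    observations : (N, 2) matched 2D keypoints
    confidences  : (N,)   weights in [0, 1]; high when a point agrees with
                   the mocap-predicted geometry, low otherwise
    """
    reproj = project_fn(map_points) - observations   # (N, 2) reprojection errors
    # Scaling the residual by sqrt(confidence) scales the squared cost by the confidence.
    return (np.sqrt(confidences)[:, None] * reproj).ravel()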
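
Finally, the prediction-correction refinement can be summarized, under simplifying assumptions, as a small Kalman filter on the root translation: mocap-derived velocity drives the prediction and the SLAM localization serves as the measurement. The noise parameters `q` and `r` and the identity measurement model are assumptions for this sketch, not values from the paper.

```python
import numpy as np

class RootTranslationKF:
    def __init__(self, q=1e-3, r=1e-2):
        self.x = np.zeros(3)      # root position estimate
        self.P = np.eye(3)        # estimate covariance
        self.Q = q * np.eye(3)    # process noise (inertial drift per step)
        self.R = r * np.eye(3)    # measurement noise (SLAM localization)

    def predict(self, mocap_velocity, dt):
        self.x = self.x + mocap_velocity * dt          # integrate inertial mocap motion
        self.P = self.P + self.Q

    def correct(self, slam_position):
        K = self.P @ np.linalg.inv(self.P + self.R)    # Kalman gain (H = I)
        self.x = self.x + K @ (slam_position - self.x)
        self.P = (np.eye(3) - K) @ self.P
```

In a scheme like this, the correction step is simply skipped whenever SLAM localization is unavailable, so the estimate falls back to pure inertial integration until visual tracking recovers.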

Evaluation and Implications

The authors conduct comprehensive experiments using datasets such as TotalCapture and HPS, demonstrating that EgoLocate surpasses state-of-the-art methods in both mocap accuracy and SLAM robustness. Numerical results illustrate significant improvements, with EgoLocate notably reducing root position errors and enhancing tracking stability across diverse scenarios.

Practical Implications: The dynamic integration of mocap and SLAM can benefit numerous applications involving human-environment interaction, including virtual reality (VR), augmented reality (AR), and applications requiring precise motion planning or tracking in unconstrained environments.

Theoretical Implications: The paper also contributes a framework for further research into real-time human-environment sensing systems. The mutual reinforcement of mocap-derived constraints in SLAM optimizations and SLAM localization in mocap offers fresh pathways for developing more agile, adaptable perception systems.

Future Directions: The implementation raises potential areas for future development, such as improving scene understanding through denser environmental reconstructions or addressing more complex dynamic interactions within environments. Additionally, enhancing the handling of degenerate cases or expanding the deployment range to include outdoor or highly dynamic scenes represents promising evolutionary steps for such systems.

Conclusion

EgoLocate effectively merges inertial and visual data, minimizing their respective weaknesses while capitalizing on their strengths. The system exemplifies a practical integration of mocap and SLAM that yields superior motion capture and mapping performance, and it represents a significant step toward joint real-time sensing of human motion and the surrounding environment.
