
Optical Non-Line-of-Sight Physics-based 3D Human Pose Estimation (2003.14414v1)

Published 31 Mar 2020 in cs.CV, cs.LG, cs.RO, and eess.IV

Abstract: We describe a method for 3D human pose estimation from transient images (i.e., a 3D spatio-temporal histogram of photons) acquired by an optical non-line-of-sight (NLOS) imaging system. Our method can perceive 3D human pose by 'looking around corners' through the use of light indirectly reflected by the environment. We bring together a diverse set of technologies from NLOS imaging, human pose estimation and deep reinforcement learning to construct an end-to-end data processing pipeline that converts a raw stream of photon measurements into a full 3D human pose sequence estimate. Our contributions are the design of data representation process which includes (1) a learnable inverse point spread function (PSF) to convert raw transient images into a deep feature vector; (2) a neural humanoid control policy conditioned on the transient image feature and learned from interactions with a physics simulator; and (3) a data synthesis and augmentation strategy based on depth data that can be transferred to a real-world NLOS imaging system. Our preliminary experiments suggest that our method is able to generalize to real-world NLOS measurement to estimate physically-valid 3D human poses.

Citations (54)

Summary

  • The paper introduces a novel framework for 3D human pose estimation using optical non-line-of-sight (NLOS) transient images, integrating computational imaging, deep reinforcement learning, and physics-based modeling.
  • A key aspect is the use of synthetic transient image data generated from MoCap and depth data, along with augmentation strategies, to address limited real-world NLOS datasets and enhance model robustness.
  • Experimental results show the proposed system outperforms baselines in accuracy and physical plausibility, demonstrating generalization to real-world data and potential for applications like privacy-preserving surveillance.

Optical Non-Line-of-Sight Physics-based 3D Human Pose Estimation

The paper "Optical Non-Line-of-Sight Physics-based 3D Human Pose Estimation" presents an approach to estimating 3D human pose from transient images captured by an optical non-line-of-sight (NLOS) imaging system. The work combines computational imaging, human pose estimation, deep reinforcement learning, and physics-based dynamics modeling in a single framework.

Methodology

The researchers propose an end-to-end data processing pipeline that converts a raw stream of photon measurements into a 3D human pose sequence. The input is a transient image, a 3D spatio-temporal histogram of photons (two spatial dimensions plus photon arrival time), which carries information about the scene even when the sensor has no direct line of sight to the subject. The methodology rests on three core contributions:

  1. Learnable Inverse Point Spread Function (PSF): A learnable inverse PSF converts raw transient images into a deep feature vector that the downstream pose estimator consumes. This component addresses the noise and resolution issues inherent in transient imaging.
  2. Neural Humanoid Control Policy: A control policy, conditioned on the transient image feature, is trained with deep reinforcement learning through interaction with a physics simulator. Because the pose is produced by a simulated humanoid, the estimates obey realistic body dynamics and physical laws.
  3. Data Synthesis and Augmentation Strategy: The researchers address the scarcity of real-world NLOS data by generating synthetic transient images from depth data. This part of the pipeline also applies augmentation techniques to narrow the domain gap between synthetic and real-world data.
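
To make the idea of an inverse PSF concrete, the sketch below applies a fixed Wiener-style inverse filter to a blurred signal in the Fourier domain. This is only an illustrative stand-in and not the paper's network: the function names, the `snr` parameter, and the choice of a Wiener filter are all assumptions. In a learnable variant, the filter (or parameters such as `snr`) would presumably be trained rather than fixed.

```python
import numpy as np

def blur_with_psf(volume, psf):
    """Circularly convolve a volume with a PSF of the same shape (hypothetical forward model)."""
    return np.real(np.fft.ifftn(np.fft.fftn(volume) * np.fft.fftn(psf)))

def wiener_inverse_filter(psf, snr=100.0):
    """Frequency-domain Wiener inverse of a PSF.

    `snr` is an assumed signal-to-noise ratio; larger values approach
    a plain inverse filter, smaller values suppress noisy frequencies.
    """
    H = np.fft.fftn(psf)
    return np.conj(H) / (np.abs(H) ** 2 + 1.0 / snr)

def apply_inverse_psf(transient, inverse_filter):
    """Apply a precomputed inverse filter to a transient measurement."""
    return np.real(np.fft.ifftn(np.fft.fftn(transient) * inverse_filter))

# Toy usage on a 1D signal; a real transient image would be a 3D array.
rng = np.random.default_rng(0)
signal = rng.random(32)
psf = np.zeros(32)
psf[:3] = [0.6, 0.3, 0.1]  # small, made-up blur kernel
blurred = blur_with_psf(signal, psf)
recovered = apply_inverse_psf(blurred, wiener_inverse_filter(psf, snr=1e9))
```

With a noise-free measurement and a very high `snr`, `recovered` closely matches `signal`; with noisy data, a lower `snr` trades sharpness for stability.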

The transient images are acquired with an NLOS imaging setup consisting of a pulsed laser and a time-of-flight sensor. Light from the laser scatters off a visible surface, reaches the hidden subject, and returns by way of the same environment; the sensor records the arrival times of the returning photons. Because real NLOS captures are scarce, the large dataset needed for training is generated synthetically: MoCap data synchronized with a depth camera is used to simulate pseudo-transient images.
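
As a rough illustration of how a depth map could be turned into a pseudo-transient image, the sketch below bins each pixel's round-trip time of flight into a temporal histogram. It is a simplified assumption, not the authors' synthesis procedure: the function name, the bin width, and the distance falloff exponent are all hypothetical choices.

```python
import numpy as np

SPEED_OF_LIGHT = 299_792_458.0  # meters per second

def depth_to_pseudo_transient(depth, num_bins=64, bin_width_s=1e-9, falloff_power=4.0):
    """Convert an (H, W) depth map in meters into a (num_bins, H, W) histogram.

    Each valid pixel contributes one return at its round-trip travel time
    2 * depth / c. The 1 / depth**falloff_power intensity is an assumed,
    simplified falloff model. Pixels with zero or non-finite depth are skipped.
    """
    depth = np.asarray(depth, dtype=np.float64)
    h, w = depth.shape
    transient = np.zeros((num_bins, h, w), dtype=np.float32)

    valid = np.isfinite(depth) & (depth > 0)
    ys, xs = np.nonzero(valid)
    d = depth[valid]

    time_of_flight = 2.0 * d / SPEED_OF_LIGHT
    bins = np.floor(time_of_flight / bin_width_s).astype(int)
    intensity = 1.0 / d ** falloff_power

    # Drop returns that arrive after the last time bin.
    keep = bins < num_bins
    transient[bins[keep], ys[keep], xs[keep]] = intensity[keep]
    return transient

# Toy usage: a flat surface 1.5 m away with one missing pixel.
toy_depth = np.full((2, 2), 1.5)
toy_depth[0, 0] = 0.0
toy_transient = depth_to_pseudo_transient(toy_depth)
```

A realistic simulator would also model the relay wall geometry, sensor noise, and temporal jitter, which is where the augmentation strategies mentioned in the summary would come in.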

Experimental Results

The system was evaluated on both synthetic data and real transient images. The results, which the authors describe as preliminary, indicate that the model can generalize to unseen real-world transient measurements. Data synthesis and augmentation proved particularly useful for robustness. Quantitatively, the system outperformed baseline approaches in joint position accuracy, measured by mean per-joint position error (MPJPE), and in the physical plausibility of the generated poses, reflected in lower velocity error and better smoothness metrics.
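
For reference, the two headline metrics can be computed along the following lines. This is a minimal sketch using common definitions; the paper's exact formulations (for example, any root alignment or unit conversion) may differ, and the function names and array layout are assumptions.

```python
import numpy as np

def mpjpe(pred, gt):
    """Mean per-joint position error.

    pred, gt: arrays of shape (T, J, 3) holding T frames of J joint positions.
    Returns the mean Euclidean distance over all frames and joints.
    """
    return float(np.linalg.norm(pred - gt, axis=-1).mean())

def velocity_error(pred, gt):
    """Mean difference between predicted and ground-truth joint velocities.

    Velocities are approximated by frame-to-frame finite differences,
    so the result is expressed per frame rather than per second.
    """
    pred_vel = np.diff(pred, axis=0)
    gt_vel = np.diff(gt, axis=0)
    return float(np.linalg.norm(pred_vel - gt_vel, axis=-1).mean())

# Toy usage: a constant 1-unit offset gives position error but no velocity error.
gt_poses = np.zeros((3, 2, 3))
offset_poses = gt_poses.copy()
offset_poses[..., 0] += 1.0
position_err = mpjpe(offset_poses, gt_poses)
vel_err = velocity_error(offset_poses, gt_poses)
```

The toy case shows why both metrics are reported: a pose sequence can be uniformly offset yet perfectly smooth, or accurate on average yet jittery from frame to frame.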

Implications and Future Work

This research extends the ability of NLOS imaging to interpret and reconstruct human activity without a direct line of sight, with potential applications in privacy-preserving surveillance, autonomous navigation, and emergency response. Combining deep learning with physics-based modeling offers a route to more accurate and physically realistic pose estimates, and the approach could benefit further from faster data processing and smaller hardware.

While the paper addresses noise and data scarcity through synthetic datasets and augmentation, future work could focus on reducing the system's computational cost to enable real-time deployment. Extending the system to a wider range of environments and subjects would also broaden its practical utility.

Overall, by combining NLOS imaging, learned feature extraction, and physics-based control, the authors deliver an effective approach to 3D human pose estimation under non-line-of-sight conditions. The work provides a foundation for further research on NLOS imaging and pose estimation.
