
IncEventGS: Pose-Free Gaussian Splatting from a Single Event Camera (2410.08107v4)

Published 10 Oct 2024 in cs.CV

Abstract: Implicit neural representations and explicit 3D Gaussian Splatting (3D-GS) for novel view synthesis have recently achieved remarkable progress with frame-based cameras (e.g., RGB and RGB-D cameras). Compared to frame-based cameras, a novel type of bio-inspired visual sensor, the event camera, has demonstrated advantages in high temporal resolution, high dynamic range, low power consumption, and low latency. Due to its unique asynchronous and irregular data capturing process, limited work has applied neural representations or 3D Gaussian splatting to event cameras. In this work, we present IncEventGS, an incremental 3D Gaussian Splatting reconstruction algorithm using a single event camera. To recover the 3D scene representation incrementally, IncEventGS exploits the tracking and mapping paradigm of conventional SLAM pipelines. Given the incoming event stream, the tracker first estimates an initial camera motion based on the previously reconstructed 3D-GS scene representation. The mapper then jointly refines both the 3D scene representation and the camera motion based on the trajectory estimated by the tracker. Experimental results demonstrate that IncEventGS delivers superior performance compared to prior NeRF-based methods and other related baselines, even without ground-truth camera poses. Furthermore, our method also outperforms state-of-the-art event visual odometry methods in terms of camera motion estimation. Code is publicly available at: https://github.com/wu-cvgl/IncEventGS.


Summary

  • The paper introduces a pose-free incremental 3D reconstruction method using Gaussian splatting with event camera data, eliminating dependence on ground-truth poses.
  • It employs a SLAM-like framework that processes event streams in chunks and continuously interpolates camera motion in SE(3) space to refine 3D Gaussian scene representations.
  • Experimental results demonstrate enhanced novel view synthesis and more accurate trajectory estimation, outperforming NeRF-based baselines and state-of-the-art event visual odometry methods under challenging conditions.

Overview of IncEventGS: Pose-Free Gaussian Splatting from a Single Event Camera

The paper "IncEventGS: Pose-Free Gaussian Splatting from a Single Event Camera" introduces IncEventGS, a novel algorithm for incremental 3D reconstruction using a single event camera. This approach diverges from traditional frame-based camera systems, exploiting the advantages of event cameras: high temporal resolution, high dynamic range, low power consumption, and low latency. Key to this research is 3D Gaussian splatting, a method that traditionally requires known camera poses. IncEventGS eliminates the need for ground-truth poses, presenting a robust alternative for scenarios where such data is unavailable.

Technical Approach

IncEventGS couples 3D Gaussian splatting with event camera data by adopting a continuous-time trajectory model within a SLAM-like framework. The technique involves:

  • Event Stream Processing: Incoming event data is divided into chunks, each of which is treated as a distinct “image.” The camera pose at any timestamp is obtained by continuous interpolation of the motion trajectory in SE(3), parameterized via its Lie algebra se(3); a minimal sketch of both steps follows this list.
  • 3D Gaussian Representation: The scene is expressed using 3D Gaussian splatting, building upon previous advancements in radiance field representations. This choice improves rendering quality and computational efficiency; the covariance-projection step at the heart of splatting is also sketched below.
  • SLAM Paradigm: The tracking and mapping stages mirror conventional SLAM methodologies. During tracking, camera poses are iteratively refined; mapping then jointly optimizes the 3D Gaussian scene representation and the camera trajectory as new chunks arrive.
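
The following is a minimal, illustrative Python sketch of the first bullet, not the authors' implementation: it accumulates one event chunk into a 2D "event image" and interpolates a pose along a continuous SE(3) trajectory. All function names, argument conventions, and shapes are assumptions for illustration.

```python
import numpy as np
from scipy.linalg import expm, logm

def events_to_image(xs, ys, polarities, height, width):
    """Sum the signed polarities of one event chunk into a 2D frame.

    xs, ys     : integer pixel coordinates, one entry per event
    polarities : +1 / -1 per event
    """
    img = np.zeros((height, width), dtype=np.float32)
    np.add.at(img, (ys, xs), polarities.astype(np.float32))
    return img

def interpolate_pose(T0, T1, t):
    """Interpolate between two 4x4 rigid poses for t in [0, 1].

    Follows the geodesic T0 @ expm(t * logm(T0^-1 @ T1)); the twist
    logm(...) lives in the Lie algebra se(3), so every intermediate
    pose remains a valid rigid-body transform.
    """
    rel = np.linalg.inv(T0) @ T1        # relative motion between the poses
    xi = np.real(logm(rel))             # 4x4 twist matrix in se(3)
    return T0 @ expm(t * xi)            # apply the partial motion to T0
```

Given a chunk's start and end poses from the tracker, the pose at any event timestamp inside the chunk can then be queried as interpolate_pose(T_start, T_end, alpha), with alpha the normalized timestamp.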

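On the rendering side, a 3D-GS rasterizer obtains the 2D image-space footprint of each Gaussian by pushing its covariance through a local affine approximation of the perspective projection (the EWA splatting relation Σ' = J W Σ Wᵀ Jᵀ). Below is a minimal numpy sketch of that single step, with all parameter names assumed; a full renderer additionally sorts and alpha-blends the splats.

```python
import numpy as np

def project_gaussian_cov(sigma3d, R_cw, t_cam, fx, fy):
    """Project a 3D Gaussian covariance to a 2x2 image-space covariance.

    sigma3d : 3x3 world-space covariance of one Gaussian
    R_cw    : 3x3 world-to-camera rotation
    t_cam   : Gaussian mean in camera coordinates, (x, y, z)
    fx, fy  : focal lengths in pixels
    """
    x, y, z = t_cam
    # Jacobian of the projection (u, v) = (fx*x/z, fy*y/z) at the mean
    J = np.array([
        [fx / z, 0.0,    -fx * x / z**2],
        [0.0,    fy / z, -fy * y / z**2],
    ])
    cov_cam = R_cw @ sigma3d @ R_cw.T   # covariance in the camera frame
    return J @ cov_cam @ J.T            # 2x2 covariance on the image plane
```
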
Experimental Results

The paper reports superior performance of IncEventGS across both synthetic and real-world datasets. The algorithm's ability to recover accurate 3D scene representations without ground-truth poses is underscored by comparisons with NeRF-based methods and two-stage systems that integrate external odometry solutions. Key findings include:

  • Novel View Synthesis: IncEventGS significantly outperforms the baselines, achieving higher PSNR and SSIM values alongside lower LPIPS scores, reflecting superior image quality. Its robustness under challenging lighting demonstrates the advantage of event-based sensing over traditional frame-based methods.
  • Camera Trajectory Estimation: The method achieves lower absolute trajectory error (ATE) than existing event-based and visual odometry techniques, highlighting its applicability in real-world settings where pose estimation is crucial. A minimal sketch of how these metrics are computed follows this list.
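
For reference, both headline metrics can be computed with a few lines of numpy. This is a generic sketch of the standard definitions (trajectory evaluation is usually done with toolkits such as evo), not the paper's evaluation code:

```python
import numpy as np

def psnr(img_a, img_b, max_val=1.0):
    """Peak signal-to-noise ratio (dB) between two images in [0, max_val]."""
    mse = np.mean((img_a - img_b) ** 2)
    return 10.0 * np.log10(max_val**2 / mse)

def ate_rmse(gt, est):
    """Absolute trajectory error (RMSE) between corresponding Nx3
    position tracks, after rigid (Kabsch/SVD, no-scale) alignment."""
    gt_c = gt - gt.mean(axis=0)
    est_c = est - est.mean(axis=0)
    U, _, Vt = np.linalg.svd(est_c.T @ gt_c)    # 3x3 cross-covariance SVD
    d = np.sign(np.linalg.det(Vt.T @ U.T))      # guard against reflections
    R = Vt.T @ np.diag([1.0, 1.0, d]) @ U.T     # rotation aligning est -> gt
    aligned = est_c @ R.T + gt.mean(axis=0)
    return np.sqrt(np.mean(np.sum((gt - aligned) ** 2, axis=1)))
```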

Implications and Future Work

From a practical standpoint, IncEventGS opens new avenues for applications in augmented reality, robotics, and autonomous navigation. Removing the dependency on ground-truth poses allows deployment where pose annotations are unavailable, while the event sensor itself copes with fast motion and challenging lighting conditions where traditional frame-based sensors fail.

Theoretically, this work sets a precedent for further exploration into the integration of neural representations with event camera data. Future research directions may focus on optimizing computational efficiency, exploring additional sensor integrations, and expanding the approach to support multiple event sources or collaborative settings.

IncEventGS represents a step towards more adaptive and robust computer vision systems, leveraging the unique characteristics of event cameras. It exemplifies how novel combinations of artificial intelligence and sensor technologies can overcome longstanding challenges in real-time 3D scene reconstruction.
