
DyBluRF: Dynamic Deblurring Neural Radiance Fields for Blurry Monocular Video (2312.13528v2)

Published 21 Dec 2023 in cs.CV

Abstract: Neural Radiance Fields (NeRF), initially developed for static scenes, have inspired many video novel view synthesis techniques. However, the challenge for video view synthesis arises from motion blur, a consequence of object or camera movement during exposure, which hinders the precise synthesis of sharp spatio-temporal views. In response, we propose a novel dynamic deblurring NeRF framework for blurry monocular video, called DyBluRF, consisting of a Base Ray Initialization (BRI) stage and a Motion Decomposition-based Deblurring (MDD) stage. Our DyBluRF is the first that handles the novel view synthesis for blurry monocular video with a novel two-stage framework. In the BRI stage, we coarsely reconstruct dynamic 3D scenes and jointly initialize the base ray, which is further used to predict latent sharp rays, using the inaccurate camera pose information from the given blurry frames. In the MDD stage, we introduce a novel Incremental Latent Sharp-rays Prediction (ILSP) approach for the blurry monocular video frames by decomposing the latent sharp rays into global camera motion and local object motion components. We further propose two loss functions for effective geometry regularization and decomposition of static and dynamic scene components without any mask supervision. Experiments show that DyBluRF outperforms qualitatively and quantitatively the SOTA methods.


Summary

  • The paper introduces DyBluRF, a framework that reconstructs sharp video frames from blurred monocular footage using a two-stage approach.
  • The first stage, Base Ray Initialization (BRI), coarsely reconstructs the dynamic 3D scene and initializes base rays from the inaccurate camera poses of the blurry frames.
  • The second stage, Motion Decomposition-based Deblurring, employs Incremental Latent Sharp-Rays Prediction to separate global and local motions, achieving superior visual quality.

Understanding DyBluRF: Enhancing Video Quality with AI

Overview of DyBluRF

Dynamic Deblurring Neural Radiance Fields (DyBluRF) is a framework for synthesizing sharp novel views from blurry monocular video. It is designed to remain effective under the inaccurate camera poses that typically accompany blurry monocular footage, a setting in which previous state-of-the-art methods struggle.

Challenges in Video Synthesis

Creating immersive video experiences requires synthesizing views that are both sharp and temporally consistent. A key obstacle is motion blur, which arises when the camera or objects in the scene move during the exposure time. A common workaround is to apply 2D video deblurring before Neural Radiance Fields (NeRF) optimization, but this approach has limitations, producing inconsistencies across frames and subpar quality when reconstructing dynamic 3D scenes.
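
To make the blur formation concrete: deblurring-aware radiance field methods typically model a blurry pixel as the temporal average of renders along several latent sharp rays sampled across the exposure. The sketch below illustrates that idea only; `render_ray` and `latent_sharp_rays` are assumed placeholders, not the paper's actual interfaces.

```python
import numpy as np

def render_blurry_pixel(render_ray, latent_sharp_rays):
    """Reproduce one blurry pixel by averaging renders along latent sharp rays.

    render_ray        : any NeRF-style renderer mapping a ray to an RGB triple (assumed).
    latent_sharp_rays : rays sampled across the exposure time (assumed inputs).
    """
    colors = np.stack([np.asarray(render_ray(ray)) for ray in latent_sharp_rays])
    return colors.mean(axis=0)  # physical blur ~ temporal average over the exposure
```
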

DyBluRF's Solution

DyBluRF introduces two critical stages to address the challenges posed by motion blur:

  1. Base Ray Initialization (BRI) Stage: The dynamic 3D scene is coarsely reconstructed while base rays are jointly initialized from the inaccurate camera pose information of the blurry frames. Interleaved optimization keeps dynamic elements accurately mapped despite the imprecise poses, and the resulting base rays are used to predict latent sharp rays in the next stage.
  2. Motion Decomposition-based Deblurring (MDD) Stage: This stage handles the blurriness introduced by both camera and object movement. It uses Incremental Latent Sharp-rays Prediction (ILSP) to decompose the latent sharp rays into a global camera-motion component and a local object-motion component, allowing more accurate rendering of sharp spatio-temporal views (a minimal sketch of this decomposition follows the list).
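
The following is a minimal sketch of the decomposition idea behind the MDD stage: each latent sharp ray is obtained by first applying a global camera-motion transform to the base ray and then adding a local object-motion residual. All names, shapes, and the exact parameterization are illustrative assumptions, not the paper's implementation.

```python
import numpy as np

def predict_latent_sharp_rays(base_origin, base_dir, global_motions, local_dirs):
    """Decompose latent sharp rays into global camera motion plus local object motion.

    base_origin, base_dir : base ray from the BRI stage, shape (3,) each (assumed).
    global_motions        : iterable of (R, t) pairs approximating camera motion
                            across the exposure; R is 3x3, t is (3,) (assumed).
    local_dirs            : per-sample (3,) residuals approximating object motion (assumed).
    """
    rays = []
    for (R, t), d_local in zip(global_motions, local_dirs):
        origin = R @ base_origin + t                        # global camera-motion component
        direction = R @ base_dir + d_local                  # local object-motion residual
        direction = direction / np.linalg.norm(direction)   # keep the direction a unit vector
        rays.append((origin, direction))
    return rays  # latent sharp rays, later averaged to reproduce the blurry frame
```
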

Experimental Validation

DyBluRF was evaluated on a newly synthesized Blurry iPhone Dataset, designed to challenge deblurring algorithms with realistic blur and camera poses. The results show that DyBluRF outperforms existing methods both qualitatively and quantitatively, capturing finer structural detail and rendering motion more consistently.

Implications of DyBluRF

The implications of these advancements are substantial. DyBluRF paves the way for better quality virtual reality content, improved video post-processing, and could serve as a robust tool for filmmakers and video content creators. By dynamically deblurring and synthesizing high-quality videos, the technology introduces new possibilities in realms where immersion and detail are paramount.

Conclusion

DyBluRF marks a significant step forward in video view synthesis. The framework's ability to handle inaccuracies in camera pose and motion blur opens new doors to video enhancement and could potentially transform the field of video processing. This breakthrough not only enhances user experiences but also provides a new baseline for future research within the domain of video quality enhancement.
