
3D Multi-frame Fusion for Video Stabilization (2404.12887v1)

Published 19 Apr 2024 in cs.CV and eess.IV

Abstract: In this paper, we present RStab, a novel framework for video stabilization that integrates 3D multi-frame fusion through volume rendering. Departing from conventional methods, we introduce a 3D multi-frame perspective to generate stabilized images, addressing the challenge of full-frame generation while preserving structure. The core of our RStab framework lies in Stabilized Rendering (SR), a volume rendering module that fuses multi-frame information in 3D space and extends beyond image fusion by incorporating feature fusion. Specifically, SR warps features and colors from multiple frames by projection and fuses them into descriptors to render the stabilized image. However, the precision of the warped information depends on the projection accuracy, which is significantly influenced by dynamic regions. In response, we introduce the Adaptive Ray Range (ARR) module, which integrates depth priors to adaptively define the sampling range for the projection process. Additionally, we propose Color Correction (CC), which assists geometric constraints with optical flow for accurate color aggregation. Thanks to these three modules, RStab demonstrates superior performance compared with previous stabilizers in field of view (FOV), image quality, and video stability across various datasets.
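
As a rough illustration of the two ideas named in the abstract, the sketch below samples depths in a narrow range around a depth prior and then composites per-sample colors with the standard volume-rendering quadrature. It is a minimal sketch under stated assumptions, not the paper's implementation: the function names `adaptive_ray_range` and `composite`, the relative margin `rel_margin`, and the use of plain per-sample densities are all hypothetical choices made for illustration, since the abstract does not specify how ARR or SR are parameterized.

```python
import math


def adaptive_ray_range(depth_prior, rel_margin=0.1, num_samples=8):
    """Return evenly spaced sample depths centered on a depth prior.

    Hypothetical helper: the range is depth_prior * (1 +/- rel_margin).
    Requires num_samples >= 2.
    """
    near = depth_prior * (1.0 - rel_margin)
    far = depth_prior * (1.0 + rel_margin)
    step = (far - near) / (num_samples - 1)
    return [near + i * step for i in range(num_samples)]


def composite(depths, densities, colors):
    """Composite per-sample RGB colors along one ray.

    Uses the standard volume-rendering quadrature:
    alpha_i = 1 - exp(-sigma_i * delta_i), weight_i = T_i * alpha_i,
    where T_i is the transmittance accumulated before sample i.
    """
    transmittance = 1.0
    pixel = [0.0, 0.0, 0.0]
    weights = []
    for i, (sigma, color) in enumerate(zip(densities, colors)):
        # Interval to the next sample; the last sample reuses the previous interval.
        if i + 1 < len(depths):
            delta = depths[i + 1] - depths[i]
        else:
            delta = depths[i] - depths[i - 1]
        alpha = 1.0 - math.exp(-sigma * delta)
        weight = transmittance * alpha
        weights.append(weight)
        pixel = [p + weight * c for p, c in zip(pixel, color)]
        transmittance *= 1.0 - alpha
    return pixel, weights


# Usage: sample around a prior depth of 2.0 and render one ray.
depths = adaptive_ray_range(2.0, rel_margin=0.1, num_samples=5)
densities = [0.0, 0.0, 50.0, 50.0, 0.0]
colors = [[1.0, 0.0, 0.0]] * 2 + [[0.0, 1.0, 0.0]] * 3
pixel, weights = composite(depths, densities, colors)
```

Restricting the samples to a band around the prior concentrates them where the surface is expected to lie, which is the intuition behind limiting the sampling range; how the band is actually chosen in RStab is not stated in the abstract.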

