S-DyRF: Reference-Based Stylized Radiance Fields for Dynamic Scenes (2403.06205v3)
Abstract: Current 3D stylization methods often assume static scenes, which conflicts with the dynamic nature of the real world. To address this limitation, we present S-DyRF, a reference-based spatio-temporal stylization method for dynamic neural radiance fields. Stylizing dynamic 3D scenes remains inherently challenging, however, because stylized reference images are available at only limited points along the temporal axis. Our key insight lies in introducing additional temporal cues beyond the provided reference. To this end, we generate temporal pseudo-references from the given stylized reference; these pseudo-references propagate style information from the reference to the entire dynamic 3D scene. For coarse style transfer, we enforce novel views and times to mimic the style details present in the pseudo-references at the feature level. To preserve high-frequency details, we further derive a collection of stylized temporal pseudo-rays from the temporal pseudo-references, which serve as detailed and explicit guidance for fine style transfer. Experiments on both synthetic and real-world datasets demonstrate that our method yields plausible stylized results of space-time view synthesis on dynamic 3D scenes.
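The coarse stage described above matches rendered features to pseudo-reference features. One common instantiation of such a feature-level style objective is a nearest-neighbour feature matching loss in the spirit of ARF (cited below); the sketch here is illustrative only (function name, feature shapes, and the cosine-distance choice are assumptions, not the paper's actual implementation):

```python
import numpy as np

def nnfm_loss(render_feats, ref_feats):
    """Nearest-neighbour feature matching (illustrative sketch).

    render_feats: (N_render, C) features extracted from a rendered view.
    ref_feats:    (N_ref, C) features extracted from a (pseudo-)reference.
    Each rendered feature is pulled toward its closest reference feature
    under cosine distance; the loss is 0 when every feature has an exact match.
    """
    # Normalize both feature sets to unit length so dot products are cosine similarities.
    r = render_feats / np.linalg.norm(render_feats, axis=1, keepdims=True)
    s = ref_feats / np.linalg.norm(ref_feats, axis=1, keepdims=True)
    sim = r @ s.T                 # (N_render, N_ref) cosine similarities
    nn_sim = sim.max(axis=1)      # best reference match per rendered feature
    return float(np.mean(1.0 - nn_sim))
```

In practice such features would come from a pretrained network (e.g. VGG activations), and the loss would be minimized over the radiance field's parameters while a separate content term keeps the scene's geometry and structure intact.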
- HexPlane: A fast representation for dynamic scenes. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), pages 130–141, 2023.
- PSNet: A style transfer network for point cloud stylization on geometry and color. In Proceedings of the IEEE/CVF Winter Conference on Applications of Computer Vision (WACV), pages 3337–3345, 2020.
- TensoRF: Tensorial radiance fields. In Proceedings of the European Conference on Computer Vision (ECCV), pages 333–350. Springer, 2022a.
- Coherent online video style transfer. In Proceedings of the IEEE/CVF International Conference on Computer Vision (ICCV), pages 1105–1114, 2017.
- UPST-NeRF: Universal photorealistic style transfer of neural radiance fields for 3D scene. arXiv preprint arXiv:2208.07059, 2022b.
- Stylizing 3D scene via implicit representation and hypernetwork. In Proceedings of the IEEE/CVF Winter Conference on Applications of Computer Vision (WACV), pages 1475–1484, 2022.
- Unified implicit neural stylization. In Proceedings of the European Conference on Computer Vision (ECCV), pages 636–654. Springer, 2022.
- StyLit: Illumination-guided example-based stylization of 3D renderings. ACM Transactions on Graphics (TOG), 35(4):1–11, 2016.
- Real-time patch-based stylization of portraits using generative adversarial network. In Proceedings of the 8th ACM/Eurographics Expressive Symposium on Computational Aesthetics and Sketch Based Interfaces and Modeling and Non-Photorealistic Animation and Rendering, pages 33–42, 2019.
- Image style transfer using convolutional neural networks. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), pages 2414–2423, 2016.
- Instruct-NeRF2NeRF: Editing 3D scenes with instructions. In Proceedings of the IEEE/CVF International Conference on Computer Vision (ICCV), 2023.
- Progressive color transfer with dense semantic correspondences. ACM Transactions on Graphics (TOG), 38(2):1–18, 2019.
- Image analogies. In Proceedings of the 28th Annual Conference on Computer Graphics and Interactive Techniques, pages 327–340, 2001.
- StyleMesh: Style transfer for indoor 3D scene reconstructions. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), pages 6198–6208, 2022.
- Real-time neural style transfer for videos. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), pages 783–791, 2017.
- Learning to stylize novel views. In Proceedings of the IEEE/CVF International Conference on Computer Vision (ICCV), pages 13869–13878, 2021.
- Arbitrary style transfer in real-time with adaptive instance normalization. In Proceedings of the IEEE/CVF International Conference on Computer Vision (ICCV), pages 1501–1510, 2017.
- StylizedNeRF: Consistent 3D scene stylization as stylized NeRF via 2D-3D mutual learning. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), pages 18342–18352, 2022.
- Stylizing video by example. ACM Transactions on Graphics (TOG), 38(4):1–11, 2019.
- Perceptual losses for real-time style transfer and super-resolution. In Proceedings of the European Conference on Computer Vision (ECCV), pages 694–711. Springer, 2016.
- Neural 3D mesh renderer. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), pages 3907–3916, 2018.
- Adam: A method for stochastic optimization. arXiv preprint arXiv:1412.6980, 2014.
- Neural neighbor style transfer. arXiv preprint arXiv:2203.13215, 2022.
- Learning blind video temporal consistency. In Proceedings of the European Conference on Computer Vision (ECCV), pages 170–185, 2018.
- Neural 3D video synthesis from multi-view video. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), pages 5521–5531, 2022a.
- Learning linear transformations for fast image and video style transfer. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), pages 3809–3817, 2019.
- SymmNeRF: Learning to explore symmetry prior for single-view view synthesis. In Proceedings of the Asian Conference on Computer Vision (ACCV), pages 1726–1742, 2022b.
- 3D cinemagraphy from a single image. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), pages 4595–4605, 2023.
- Universal style transfer via feature transforms. Advances in Neural Information Processing Systems (NeurIPS), 30, 2017.
- Neural scene flow fields for space-time view synthesis of dynamic scenes. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), pages 6498–6508, 2021.
- Visual attribute transfer through deep image analogy. ACM Transactions on Graphics (TOG), 36(4):1–15, 2017.
- StyleRF: Zero-shot 3D style transfer of neural radiance fields. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), pages 8338–8348, 2023.
- NeRF: Representing scenes as neural radiance fields for view synthesis. In Proceedings of the European Conference on Computer Vision (ECCV), 2020.
- 3D photo stylization: Learning to generate stylized novel views from a single image. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), pages 16273–16282, 2022.
- Instant neural graphics primitives with a multiresolution hash encoding. ACM Transactions on Graphics (TOG), 41(4):1–15, 2022.
- SNeRF: Stylized neural implicit representations for 3D scenes. ACM Transactions on Graphics (TOG), 41(4):1–11, 2022.
- Locally stylized neural radiance fields. In Proceedings of the IEEE/CVF International Conference on Computer Vision (ICCV), pages 307–316, 2023.
- D-NeRF: Neural radiance fields for dynamic scenes. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), pages 10318–10327, 2021.
- 3DSNet: Unsupervised shape-to-shape 3D style transfer. arXiv preprint arXiv:2011.13388, 2020.
- Control4D: Dynamic portrait editing by learning 4D GAN from 2D diffusion-based editor. arXiv preprint arXiv:2305.20082, 2023.
- Make-It-4D: Synthesizing a consistent long-term dynamic scene video from a single image. In Proceedings of the 31st ACM International Conference on Multimedia (ACM MM), pages 8167–8175, 2023.
- Very deep convolutional networks for large-scale image recognition. arXiv preprint arXiv:1409.1556, 2014.
- Two-stage peer-regularized feature recombination for arbitrary image style transfer. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), pages 13816–13825, 2020.
- RAFT: Recurrent all-pairs field transforms for optical flow. In Proceedings of the European Conference on Computer Vision (ECCV), pages 402–419, 2020.
- Interactive video stylization using few-shot patch-based training. ACM Transactions on Graphics (TOG), 39(4):73–1, 2020.
- CLIP-NeRF: Text-and-image driven manipulation of neural radiance fields. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), pages 3835–3844, 2022.
- NeRF-Art: Text-driven neural radiance fields stylization. IEEE Transactions on Visualization and Computer Graphics, 2023.
- DoF-NeRF: Depth-of-field meets neural radiance fields. In Proceedings of the 30th ACM International Conference on Multimedia (ACM MM), pages 1718–1729, 2022a.
- CCPL: Contrastive coherence preserving loss for versatile style transfer. In Proceedings of the European Conference on Computer Vision (ECCV), pages 189–206. Springer, 2022b.
- 3DStyleNet: Creating 3D shapes with geometric and texture style variations. In Proceedings of the IEEE/CVF International Conference on Computer Vision (ICCV), pages 12456–12465, 2021.
- ARF: Artistic radiance fields. In Proceedings of the European Conference on Computer Vision (ECCV), pages 717–733. Springer, 2022.
- Adding conditional control to text-to-image diffusion models. In Proceedings of the IEEE/CVF International Conference on Computer Vision (ICCV), pages 3836–3847, 2023a.
- The unreasonable effectiveness of deep features as a perceptual metric. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), pages 586–595, 2018.
- Ref-NPR: Reference-based non-photorealistic radiance fields for controllable scene stylization. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), pages 4242–4251, 2023b.