StylizedGS: Controllable Stylization for 3D Gaussian Splatting (2404.05220v2)
Abstract: As XR technology advances rapidly, 3D generation and editing are becoming increasingly important, and stylization plays a key role in enhancing the appearance of 3D models. With stylization, users can achieve consistent artistic effects across a 3D scene from a single reference style image, making it a user-friendly editing method. However, recent NeRF-based 3D stylization methods suffer from efficiency issues that degrade the user experience, and their implicit nature limits their ability to transfer geometric pattern styles faithfully. Moreover, flexible control over the stylized scene is highly desirable for artists and fosters creative exploration. To address these issues, we introduce StylizedGS, an efficient 3D neural style transfer framework with adaptable control over perceptual factors, built on the 3D Gaussian Splatting (3DGS) representation. We propose a filter-based refinement that eliminates floaters introduced during scene reconstruction, which would otherwise degrade the stylization. A nearest-neighbor-based style loss is introduced to achieve stylization by fine-tuning the geometry and color parameters of the 3DGS, while a depth preservation loss, together with other regularizations, prevents the geometric content from being distorted. Furthermore, with specially designed losses, StylizedGS lets users control the color, stylization scale, and stylized regions, providing customization capabilities. Our method produces high-quality stylization results characterized by faithful brushstrokes and geometric consistency under flexible controls. Extensive experiments across various scenes and styles demonstrate the effectiveness and efficiency of our method in terms of both stylization quality and inference speed.
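To make the two objectives named in the abstract concrete, the sketch below shows, in PyTorch, how a nearest-neighbor feature-matching style loss and an L1 depth preservation loss could be computed on a rendered view. This is a minimal illustration under assumptions, not the authors' implementation: the VGG-16 layer cut, the loss weight, and the dummy tensors standing in for differentiable 3DGS renders of color and depth are all assumptions made for the example.

```python
# Minimal sketch of the two core objectives described in the abstract:
#  (1) a nearest-neighbor style loss on VGG features of the rendered view,
#  (2) a depth preservation loss that keeps stylized depth near the
#      pre-stylization (reconstructed) depth.
# The rendered color/depth would come from a differentiable 3DGS renderer;
# here they are plain tensors with gradients enabled.
import torch
import torch.nn.functional as F
import torchvision

# Frozen VGG-16 feature extractor (layer choice up to relu3_3 is an assumption).
_vgg = torchvision.models.vgg16(weights="IMAGENET1K_V1").features[:16].eval()
for p in _vgg.parameters():
    p.requires_grad_(False)

def vgg_features(img: torch.Tensor) -> torch.Tensor:
    """img: (1, 3, H, W) in [0, 1] -> (N, C) flattened per-pixel feature vectors."""
    mean = torch.tensor([0.485, 0.456, 0.406]).view(1, 3, 1, 1)
    std = torch.tensor([0.229, 0.224, 0.225]).view(1, 3, 1, 1)
    feat = _vgg((img - mean) / std)            # (1, C, h, w)
    return feat.flatten(2).squeeze(0).t()      # (h*w, C)

def nn_style_loss(render: torch.Tensor, style: torch.Tensor) -> torch.Tensor:
    """Match each rendered feature to its cosine-nearest style feature."""
    fr = F.normalize(vgg_features(render), dim=-1)   # (Nr, C)
    fs = F.normalize(vgg_features(style), dim=-1)    # (Ns, C)
    nearest = (fr @ fs.t()).argmax(dim=1)            # index of best style match
    # Minimize cosine distance to the matched style features.
    return (1.0 - (fr * fs[nearest]).sum(dim=-1)).mean()

def depth_preservation_loss(depth: torch.Tensor, depth_ref: torch.Tensor) -> torch.Tensor:
    """Penalize deviation of stylized depth from the reference depth."""
    return F.l1_loss(depth, depth_ref)

# Example usage with dummy tensors standing in for 3DGS renders.
render = torch.rand(1, 3, 128, 128, requires_grad=True)     # stylized render
style = torch.rand(1, 3, 128, 128)                           # reference style image
depth = torch.rand(1, 1, 128, 128, requires_grad=True)      # stylized depth render
depth_ref = torch.rand(1, 1, 128, 128)                       # pre-stylization depth

loss = nn_style_loss(render, style) + 0.5 * depth_preservation_loss(depth, depth_ref)
loss.backward()  # gradients would flow back to the 3DGS color/geometry parameters
```

In an actual 3DGS fine-tuning loop, `render` and `depth` would be produced per view by the differentiable rasterizer, so the combined loss updates the Gaussians' color and geometry parameters while the depth term constrains how far the geometry may drift.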