NIVeL: Neural Implicit Vector Layers for Text-to-Vector Generation (2405.15217v1)

Published 24 May 2024 in cs.CV and cs.GR

Abstract: The success of denoising diffusion models in representing rich data distributions over 2D raster images has prompted research on extending them to other data representations, such as vector graphics. Unfortunately, due to their variable structure and the scarcity of vector training data, directly applying diffusion models to this domain remains a challenging problem. Workarounds such as optimization via Score Distillation Sampling (SDS) are also fraught with difficulty, as vector representations are non-trivial to optimize directly and tend to result in implausible geometries such as redundant or self-intersecting shapes. NIVeL addresses these challenges by reinterpreting the problem on an alternative, intermediate domain which preserves the desirable properties of vector graphics -- mainly sparsity of representation and resolution-independence. This alternative domain is based on neural implicit fields expressed in a set of decomposable, editable layers. Based on our experiments, NIVeL produces text-to-vector graphics results of significantly better quality than the state of the art.
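
The abstract's central idea -- representing an image as a set of decomposable neural implicit layers that can be queried at any resolution -- can be illustrated with a short sketch. The layer count, network shape, per-layer flat colours, and back-to-front compositing below are assumptions chosen for illustration only; they are not the paper's actual architecture or training setup (which additionally uses a text-to-image diffusion prior for supervision).

```python
# Illustrative sketch only: a coordinate MLP that outputs K per-layer
# occupancies, composited over a white canvas. Names, the layer count,
# and the compositing scheme are assumptions, not NIVeL's implementation.
import torch
import torch.nn as nn

class LayeredImplicitField(nn.Module):
    def __init__(self, num_layers=4, hidden=128):
        super().__init__()
        self.num_layers = num_layers
        self.mlp = nn.Sequential(
            nn.Linear(2, hidden), nn.ReLU(),
            nn.Linear(hidden, hidden), nn.ReLU(),
            nn.Linear(hidden, num_layers),  # one occupancy logit per layer
        )
        # One flat RGB colour per layer, learned alongside the field.
        self.colors = nn.Parameter(torch.rand(num_layers, 3))

    def forward(self, coords):
        # coords: (N, 2) in [0, 1]^2 -> per-layer occupancy in (0, 1)
        occ = torch.sigmoid(self.mlp(coords))      # (N, K)
        canvas = torch.ones(coords.shape[0], 3)    # white background
        # Composite layers back-to-front ("painter's algorithm").
        for k in range(self.num_layers):
            alpha = occ[:, k:k + 1]                # (N, 1)
            canvas = (1 - alpha) * canvas + alpha * self.colors[k]
        return canvas                              # (N, 3) raster colours

# Rasterise by querying the field on a pixel grid; any grid size can be
# sampled, which is what makes the representation resolution-independent.
H = W = 64
ys, xs = torch.meshgrid(torch.linspace(0, 1, H),
                        torch.linspace(0, 1, W), indexing="ij")
grid = torch.stack([xs, ys], dim=-1).reshape(-1, 2)
image = LayeredImplicitField()(grid).reshape(H, W, 3)
```

Because the rasterised output is differentiable with respect to the field's parameters, a pixel-space objective (for example, an SDS-style loss from a pretrained diffusion model, as the abstract mentions) could in principle be backpropagated through such layers; each layer remains an individually editable shape.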

Authors (6)
  1. Vikas Thamizharasan (5 papers)
  2. Difan Liu (23 papers)
  3. Matthew Fisher (50 papers)
  4. Nanxuan Zhao (36 papers)
  5. Evangelos Kalogerakis (44 papers)
  6. Michal Lukac (7 papers)
Citations (2)
