
Sketch-to-Architecture: Generative AI-aided Architectural Design (2403.20186v1)

Published 29 Mar 2024 in cs.CV

Abstract: Recent progress in large-scale models has opened the way for interdisciplinary research in fields such as architecture. We present a novel workflow that uses generative AI models to produce conceptual floorplans and 3D models from simple sketches, enabling rapid ideation and controlled generation of architectural renderings from textual descriptions. Our work demonstrates the potential of generative AI in the architectural design process and points toward a new direction for computer-aided architectural design. Our project website is available at: https://zrealli.github.io/sketch2arc

Authors (3)
  1. Pengzhi Li (7 papers)
  2. Baijuan Li (2 papers)
  3. Zhiheng Li (67 papers)
Citations (13)