From Text to Blueprint: Leveraging Text-to-Image Tools for Floor Plan Creation (2405.17236v1)
Abstract: Artificial intelligence is revolutionizing architecture through text-to-image synthesis, which converts textual descriptions into detailed visual representations. We explore AI-assisted floor plan design, focusing on technical background, practical methods, and future directions. Using tools like Stable Diffusion, AI leverages models such as Generative Adversarial Networks and Variational Autoencoders to generate complex and functional floor plan designs. We evaluate these AI models' effectiveness in generating residential floor plans from text prompts. Through experiments with reference images, text prompts, and sketches, we assess the strengths and limitations of current text-to-image technology in architectural visualization. Architects can use these AI tools to streamline design processes, create multiple design options, and enhance creativity and collaboration. We highlight AI's potential to drive smarter, more efficient floor plan design, contributing to ongoing discussions on AI integration in the design profession and its future impact.
Authors: Xiaoyu Li, Jonathan Benjamin, Xin Zhang