
Large Language and Text-to-3D Models for Engineering Design Optimization (2307.01230v1)

Published 3 Jul 2023 in cs.CL, cs.LG, and cs.NE

Abstract: Current advances in generative AI, i.e. large neural network models capable of producing essays, images, music, and even 3D assets from text prompts, create opportunities for a manifold of disciplines. In the present paper, we study the potential of deep text-to-3D models in the engineering domain, with a focus on the opportunities and challenges of integrating and interacting with 3D assets in computational, simulation-based design optimization. Traditional design optimization of 3D geometries searches for optimal designs in numerical representations, such as B-spline surfaces or deformation parameters in vehicle aerodynamic optimization. Natural language, in contrast, challenges the optimization framework by requiring a different interpretation of variation operators, while at the same time it may ease and motivate human user interaction. Here, we propose and realize a fully automated evolutionary design optimization framework using Shap-E, a recently published text-to-3D asset network by OpenAI, in the context of aerodynamic vehicle optimization. For representing text prompts in the evolutionary optimization, we evaluate (a) a bag-of-words approach based on prompt templates and WordNet samples, and (b) a tokenization approach based on prompt templates and the byte pair encoding method from GPT-4. Our main findings indicate, first, that it is important to ensure that the designs generated from prompts stay within the object class of the application, i.e. diverse and novel designs still need to be realistic, and, second, that more research is required to develop methods in which the strength of text prompt variations and the resulting variations of the 3D designs share causal relations to some degree, in order to improve the optimization.
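The evolutionary loop the abstract describes, using variant (a), can be sketched roughly as follows. This is an illustrative reconstruction, not the authors' code: the Shap-E generation step and the CFD-based drag evaluation are replaced by a toy lookup table so the loop runs standalone, and the small word pools stand in for WordNet samples. All names and the prompt template are assumptions for illustration.

```python
import random

# Illustrative sketch of bag-of-words prompt evolution. In the actual
# framework each rendered prompt would be passed to Shap-E to generate a
# 3D shape, which would then be evaluated by an aerodynamic simulation;
# both steps are stubbed here with fake per-word drag contributions.

random.seed(0)

TEMPLATE = "a {adj} {noun} in the shape of a car"
ADJECTIVES = ["streamlined", "boxy", "aerodynamic", "compact", "sleek"]  # stand-in for WordNet samples
NOUNS = ["vehicle", "sedan", "wagon", "roadster"]

# Toy surrogate fitness: fake drag contribution per word (lower is better).
DRAG = {
    "streamlined": 0.10, "boxy": 0.50, "aerodynamic": 0.15,
    "compact": 0.30, "sleek": 0.20,
    "vehicle": 0.30, "sedan": 0.25, "wagon": 0.40, "roadster": 0.20,
}

def render_prompt(genome):
    """Fill the prompt template from the (adjective, noun) genome."""
    return TEMPLATE.format(adj=genome[0], noun=genome[1])

def surrogate_drag(genome):
    """Stand-in for Shap-E generation plus CFD drag evaluation."""
    return sum(DRAG[word] for word in genome)

def mutate(genome):
    """Variation operator: resample one word slot from its pool."""
    pools = (ADJECTIVES, NOUNS)
    slot = random.randrange(len(genome))
    child = list(genome)
    child[slot] = random.choice(pools[slot])
    return tuple(child)

def evolve(generations=50, offspring_per_gen=4):
    """Simple elitist (1+lambda) evolution over prompt genomes."""
    parent = (random.choice(ADJECTIVES), random.choice(NOUNS))
    best_genome, best_fitness = parent, surrogate_drag(parent)
    for _ in range(generations):
        for _ in range(offspring_per_gen):
            child = mutate(best_genome)
            fitness = surrogate_drag(child)
            if fitness < best_fitness:
                best_genome, best_fitness = child, fitness
    return best_genome, best_fitness

best_genome, best_fitness = evolve()
print(render_prompt(best_genome), best_fitness)
```

Variant (b) keeps the same loop structure but would encode the prompt as a sequence of byte pair encoding token ids, so the mutation operator acts on token sequences rather than on whole words drawn from a pool.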

Authors (3)
  1. Thiago Rios (6 papers)
  2. Stefan Menzel (14 papers)
  3. Bernhard Sendhoff (4 papers)
Citations (7)