
3DALL-E: Integrating Text-to-Image AI in 3D Design Workflows (2210.11603v2)

Published 20 Oct 2022 in cs.HC, cs.AI, cs.CY, cs.LG, and cs.MM

Abstract: Text-to-image AI are capable of generating novel images for inspiration, but their applications for 3D design workflows and how designers can build 3D models using AI-provided inspiration have not yet been explored. To investigate this, we integrated DALL-E, GPT-3, and CLIP within a CAD software in 3DALL-E, a plugin that generates 2D image inspiration for 3D design. 3DALL-E allows users to construct text and image prompts based on what they are modeling. In a study with 13 designers, we found that designers saw great potential in 3DALL-E within their workflows and could use text-to-image AI to produce reference images, prevent design fixation, and inspire design considerations. We elaborate on prompting patterns observed across 3D modeling tasks and provide measures of prompt complexity observed across participants. From our findings, we discuss how 3DALL-E can merge with existing generative design workflows and propose prompt bibliographies as a form of human-AI design history.

Exploring 3DALL-E: Integrating Text-to-Image AI in 3D Design Workflows

The paper "3DALL-E: Integrating Text-to-Image AI in 3D Design Workflows" examines how text-to-image AI can be used in 3D design. It presents 3DALL-E, a system that integrates three AI models (DALL-E, GPT-3, and CLIP) within computer-aided design (CAD) software to give designers image-based inspiration while they model.

System Overview

3DALL-E is a plugin for CAD software, specifically Fusion 360, that generates 2D image inspiration tied to whatever the designer is currently modeling. Designers supply text and image prompts and receive generated images in return. The plugin relies on three models: DALL-E for generating images from prompts, GPT-3 for helping construct text prompts, and CLIP for improving prompt relevance and giving feedback on generated images.
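The division of labor among the three models can be sketched as a small pipeline. This is only an illustration, not the plugin's actual implementation: the function names, the stub models, and the idea of ranking images by a text-image match score are all assumptions made for this example, and no real DALL-E, GPT-3, or CLIP API is called.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class Generation:
    prompt: str
    image_id: str
    score: float


def run_pipeline(
    subject: str,
    suggest_terms: Callable[[str], List[str]],   # stands in for GPT-3
    generate_image: Callable[[str], str],        # stands in for DALL-E
    score_match: Callable[[str, str], float],    # stands in for CLIP
    top_k: int = 2,
) -> List[Generation]:
    """Build prompts, generate one image per prompt, keep the best matches."""
    results = []
    for term in suggest_terms(subject):
        prompt = f"{subject}, {term}"
        image_id = generate_image(prompt)
        results.append(Generation(prompt, image_id, score_match(prompt, image_id)))
    # Highest-scoring generations first (assumed ranking step).
    results.sort(key=lambda g: g.score, reverse=True)
    return results[:top_k]


# Hypothetical stubs so the sketch runs without any model access.
def fake_terms(subject: str) -> List[str]:
    return ["minimalist style", "3D render", "product photo"]


def fake_generate(prompt: str) -> str:
    return f"img-{len(prompt)}"


def fake_score(prompt: str, image_id: str) -> float:
    return len(prompt) / 100


ranked = run_pipeline("toy truck", fake_terms, fake_generate, fake_score)
for g in ranked:
    print(g.prompt, g.image_id, g.score)
```

Passing the three models in as callables keeps the sketch independent of any particular service; real clients could be swapped in for the stubs without changing `run_pipeline`.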

Method and Implementation

Users supply text and image prompts and receive generated images as "AI-provided inspiration." In the study, participants used the plugin to craft text prompts in design language and to generate image prompts that reflected the current state of their CAD model. The 13 designers were asked both to edit existing models and to create new models from scratch, which let the authors explore how 3DALL-E could fit into their workflows.

Findings and Implications

Designer Interaction Patterns

The paper identifies three main interaction patterns where text-to-image AI tools can be beneficial:

  1. AI-First: Designers generate AI-assisted inspiration before substantial modeling begins.
  2. AI-Throughout: AI is continuously consulted throughout the design process.
  3. AI-Last: Designers consult AI tools towards the end, to refine and finalize designs.

Each of these patterns represents distinct phases in design workflows where AI integration might enhance productivity or creativity.

Use Cases and Potential Benefits

The research highlights several practical uses of 3DALL-E:

  • Reference Image Generation: AI-generated images serve as visual references for geometric modeling.
  • Preventing Design Fixation: Varied AI outputs let designers explore multiple directions instead of settling on one idea too early.
  • Inspiration Across Disciplines: Designers from varying domains, including industrial design and robotics, found the plugin useful, which suggests it has value across disciplines.

Prompt Bibliographies

A noteworthy contribution is the proposal of "prompt bibliographies," a way of tracking the prompts and AI outputs that inspired a design. Recording them clarifies attribution and preserves a design history, making the AI's contribution to the creative process transparent.
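One way to picture a prompt bibliography is as an ordered log of prompts attached to a model. The sketch below is a minimal illustration under assumptions: the paper proposes the concept, but the class names, fields, and output format here are hypothetical and not taken from it.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class PromptEntry:
    step: int                 # order in which the prompt was issued
    prompt: str
    kind: str                 # "text" or "image" (assumed labels)
    kept_images: List[str] = field(default_factory=list)


@dataclass
class PromptBibliography:
    model_name: str
    entries: List[PromptEntry] = field(default_factory=list)

    def record(self, prompt: str, kind: str = "text",
               kept_images: Optional[List[str]] = None) -> PromptEntry:
        """Append a prompt and any generated images the designer kept."""
        entry = PromptEntry(len(self.entries) + 1, prompt, kind,
                            list(kept_images or []))
        self.entries.append(entry)
        return entry

    def influential(self) -> List[PromptEntry]:
        """Prompts whose outputs the designer actually kept as references."""
        return [e for e in self.entries if e.kept_images]

    def render(self) -> str:
        """Format the log as a readable, citation-like list."""
        lines = [f"Prompt bibliography for {self.model_name}"]
        for e in self.entries:
            kept = ", ".join(e.kept_images) or "none"
            lines.append(f"[{e.step}] ({e.kind}) {e.prompt} | kept: {kept}")
        return "\n".join(lines)


bib = PromptBibliography("toy truck")
bib.record("toy truck, 3D render")
bib.record("toy truck with large wheels", kept_images=["img-2"])
print(bib.render())
```

Storing which images were kept alongside each prompt is what would let such a log double as attribution: it separates prompts that merely were tried from those that shaped the final design.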

Future Directions and Challenges

While 3DALL-E shows promise in augmenting CAD workflows, future iterations should address challenges such as making text-to-image generation more efficient and integrating the AI more tightly with existing design software. Adapting the AI to a range of creative tasks while safeguarding proprietary design data also remains an open area for future work.

In summary, 3DALL-E demonstrates one way to integrate generative AI into the workflows of CAD designers and offers insight into practical use cases and possible improvements. The findings suggest that text-to-image AI can act as a collaborator in creative work, helping designers look beyond their initial ideas.

Authors (4)
  1. Vivian Liu (12 papers)
  2. Jo Vermeulen (6 papers)
  3. George Fitzmaurice (6 papers)
  4. Justin Matejka (7 papers)
Citations (82)