
Content-Centric Prototyping of Generative AI Applications: Emerging Approaches and Challenges in Collaborative Software Teams (2402.17721v1)

Published 27 Feb 2024 in cs.HC and cs.SE

Abstract: Generative AI models are increasingly powering software applications, offering the capability to produce expressive content across varied contexts. However, unlike previous iterations of human-AI design, the emerging design process for generative capabilities primarily hinges on prompt engineering strategies. Given this fundamental shift in approach, our work aims to understand how collaborative software teams set up and apply design guidelines and values, iteratively prototype prompts, and evaluate prompts to achieve desired outcomes. We conducted design studies with 39 industry professionals, including designers, software engineers, and product managers. Our findings reveal a content-centric prototyping approach in which teams begin with the content they want to generate, then identify specific attributes, constraints, and values, and explore methods to give users the ability to influence and interact with those attributes. Based on associated challenges, such as the lack of model interpretability and overfitting the design to examples, we outline considerations for generative AI prototyping.

Authors (4)
  1. Hari Subramonyam
  2. Divy Thakkar
  3. Jürgen Dieber
  4. Anoop Sinha