
Survey of User Interface Design and Interaction Techniques in Generative AI Applications (2410.22370v1)

Published 28 Oct 2024 in cs.HC, cs.AI, cs.CL, and cs.LG

Abstract: The applications of generative AI have become extremely impressive, and the interplay between users and AI is even more so. Current human-AI interaction literature has taken a broad look at how humans interact with generative AI, but it lacks specificity regarding the user interface designs and patterns used to create these applications. Therefore, we present a survey that comprehensively presents taxonomies of how a human interacts with AI and the user interaction patterns designed to meet the needs of a variety of relevant use cases. We focus primarily on user-guided interactions, surveying interactions that are initiated by the user and do not include any implicit signals given by the user. With this survey, we aim to create a compendium of different user-interaction patterns that can be used as a reference for designers and developers alike. In doing so, we also strive to lower the entry barrier for those attempting to learn more about the design of generative AI applications.

