
In Situ AI Prototyping: Infusing Multimodal Prompts into Mobile Settings with MobileMaker (2405.03806v2)

Published 6 May 2024 in cs.HC

Abstract: Recent advances in multimodal LLMs have made it easier to rapidly prototype AI-powered features, especially for mobile use cases. However, gathering early, mobile-situated user feedback on these AI prototypes remains challenging. The broad scope and flexibility of LLMs mean that, for a given use-case-specific prototype, there is a crucial need to understand the wide range of in-the-wild input users are likely to provide and their in-context expectations for the AI's behavior. To explore the concept of in situ AI prototyping and testing, we created MobileMaker: a platform that enables designers to rapidly create and test mobile AI prototypes directly on devices. This tool also enables testers to make on-device, in-the-field revisions of prototypes using natural language. In an exploratory study with 16 participants, we examined how user feedback on prototypes created with MobileMaker compares to feedback on prototypes created with existing prototyping tools (e.g., Figma, prompt editors). Our findings suggest that MobileMaker prototypes enabled more serendipitous discovery of model input edge cases, discrepancies between the AI's and the user's in-context interpretations of the task, and contextual signals missed by the AI. Furthermore, we learned that while the ability to make in-the-wild revisions led users to feel more fulfilled as active participants in the design process, it might also constrain their feedback to the subset of changes perceived as more actionable or implementable by the prototyping tool.

Authors (6)
  1. Savvas Petridis
  2. Michael Xieyang Liu
  3. Alexander J. Fiannaca
  4. Vivian Tsai
  5. Michael Terry
  6. Carrie J. Cai