Cocobo: Exploring Large Language Models as the Engine for End-User Robot Programming (2407.20712v1)

Published 30 Jul 2024 in cs.HC and cs.AI

Abstract: End-user development allows everyday users to tailor service robots or applications to their needs. One user-friendly approach is natural language programming. However, it encounters challenges such as an expansive user expression space and limited support for debugging and editing, which restrict its application in end-user programming. The emergence of LLMs offers promising avenues for the translation and interpretation between human language instructions and the code executed by robots, but their application in end-user programming systems requires further study. We introduce Cocobo, a natural language programming system with interactive diagrams powered by LLMs. Cocobo employs LLMs to understand users' authoring intentions, generate and explain robot programs, and facilitate the conversion between executable code and flowchart representations. Our user study shows that Cocobo has a low learning curve, enabling even users with zero coding experience to customize robot programs successfully.

Authors (6)
  1. Yate Ge (4 papers)
  2. Yi Dai (20 papers)
  3. Run Shan (1 paper)
  4. Kechun Li (1 paper)
  5. Yuanda Hu (4 papers)
  6. Xiaohua Sun (5 papers)