
Large Language Models Meet User Interfaces: The Case of Provisioning Feedback (2404.11072v1)

Published 17 Apr 2024 in cs.HC and cs.AI

Abstract: Incorporating Generative AI (GenAI) and LLMs in education can enhance teaching efficiency and enrich student learning. Current LLM usage involves conversational user interfaces (CUIs) for tasks like generating materials or providing feedback. However, this presents challenges including the need for educator expertise in AI and CUIs, ethical concerns with high-stakes decisions, and privacy risks. CUIs also struggle with complex tasks. To address these, we propose transitioning from CUIs to user-friendly applications leveraging LLMs via API calls. We present a framework for ethically incorporating GenAI into educational tools and demonstrate its application in our tool, Feedback Copilot, which provides personalized feedback on student assignments. Our evaluation shows the effectiveness of this approach, with implications for GenAI researchers, educators, and technologists. This work charts a course for the future of GenAI in education.

LLMs Meet User Interfaces: The Case of Provisioning Feedback

Introduction to the Paper

The paper explores the use of Generative Artificial Intelligence (GenAI) in educational environments, focusing on how LLMs can be integrated within user interfaces to support educational tasks. It introduces a framework for designing these technologies as standalone, user-centered applications rather than relying purely on conversational user interfaces (CUIs). Drawing on challenges observed in standard CUI-based uses of GenAI, the authors propose a two-component framework to refine the design of, and interaction with, educational technology. As a practical application of this framework, they develop the "GenAI Feedback Provisioning Copilot," a tool that helps instructors generate high-quality, personalized feedback on student assignments.

Key Challenges and Proposed Solutions

The paper identifies several persistent challenges in deploying LLMs through CUIs, including:

  • AI literacy barriers necessitating significant prompt engineering skills from users.
  • Limited user autonomy and guidance, complicating user engagement with open-ended interfaces.
  • Restricted access to institutional data, raising privacy and intellectual property concerns.
  • The difficulty in managing batch tasks effectively due to the conversational nature of most GenAI applications.
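The batch-task limitation can be made concrete: instead of pasting assignments one by one into a chat window, an application can loop over submissions and call an LLM programmatically. The sketch below is illustrative only — `call_llm` is a stand-in for a real API client, and the prompt structure and field names are assumptions, not the paper's implementation:

```python
# Minimal sketch of batch feedback generation via API calls rather than a
# conversational interface. `call_llm` is a placeholder, not a real client.

def call_llm(prompt: str) -> str:
    # Placeholder: a real implementation would call a hosted LLM API here.
    return f"Feedback draft for prompt of {len(prompt)} characters."

def build_prompt(assignment: dict, rubric: str) -> str:
    # Embed the rubric and the student's work in one structured prompt,
    # so the educator does not hand-craft a prompt per student.
    return (
        f"Rubric:\n{rubric}\n\n"
        f"Student submission (grade {assignment['grade']}):\n{assignment['text']}\n\n"
        "Write personalised, actionable feedback."
    )

def batch_feedback(assignments: list[dict], rubric: str) -> list[str]:
    # The batch loop a chat interface cannot express: one call per submission.
    return [call_llm(build_prompt(a, rubric)) for a in assignments]

submissions = [
    {"text": "Essay on feedback loops...", "grade": 72},
    {"text": "Essay on learning analytics...", "grade": 55},
]
drafts = batch_feedback(submissions, rubric="Clarity; evidence; structure.")
print(len(drafts))  # one draft per submission
```

The point of the sketch is the shape of the interaction, not the prompt wording: the application owns prompt construction and iteration, removing both the prompt-engineering burden and the one-message-at-a-time constraint of CUIs.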

To tackle these challenges, the paper introduces a structured framework emphasizing:

  1. Designing Core Educational Tools: Selection of educational tasks, integration of pedagogical theories, and implementation criteria for evaluating GenAI outputs and data.
  2. Designing User Interactions: Developing user interfaces and workflows, handling prompt generation, and facilitating content review to ensure accuracy and pedagogical soundness.
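The content-review step of the second component can be sketched as a simple gate: each generated draft is scored against quality criteria, and drafts that fall short are routed to the educator. The checks and threshold below are crude illustrative proxies, not the paper's pedagogical criteria:

```python
# Illustrative review gate: route low-quality drafts to a human reviewer.
# These heuristics are assumptions for the sketch; the paper's actual
# quality criteria are pedagogically grounded and richer than this.

def quality_score(feedback: str) -> float:
    checks = [
        len(feedback.split()) >= 10,           # substantive length
        "you" in feedback.lower(),             # addresses the student directly
        any(w in feedback.lower() for w in ("improve", "consider", "try")),  # actionable
    ]
    return sum(checks) / len(checks)

def triage(drafts: list[str], threshold: float = 0.67):
    # Split drafts into those safe to send and those needing educator review.
    auto_send, needs_review = [], []
    for d in drafts:
        (auto_send if quality_score(d) >= threshold else needs_review).append(d)
    return auto_send, needs_review

ok, flagged = triage([
    "You explained the core idea well; consider adding evidence to improve section 2.",
    "Good job.",
])
print(len(ok), len(flagged))  # prints "1 1"
```

The design choice this illustrates is that the interface surfaces only the uncertain cases for review, keeping the educator in the loop without requiring them to inspect every output.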

Implementation and Insights from the Feedback Copilot


The "GenAI Feedback Provisioning Copilot" serves as a reference implementation for the proposed framework. It aims to assist educators in crafting personalized feedback for students' graded assignments. This tool generates feedback based on a set of predetermined standards for quality and provides the option for educators to review cases where the feedback may not meet quality standards.

The paper evaluates the Feedback Copilot in an experimental setup involving 338 student assignments. Findings suggest that the advanced version of the tool, which adds further prompting and guidance, produces higher-quality feedback than the base version. Feedback quality was also found to correlate with student performance: lower-performing students received lower-quality feedback, marking a point where closer educator involvement is essential.

Theoretical and Practical Implications

The paper contributes significantly to the discourse on integrating AI in education by addressing both practical and theoretical aspects. Practically, it provides a blueprint for constructing GenAI applications that are more aligned with educational frameworks and sensitive to the challenges educators face. Theoretically, it pushes forward the conversation on how LLMs can be systematically evaluated within educational settings to ensure they not only adhere to pedagogical standards but also genuinely augment the teaching and learning experience.

Future Directions in AI and Education

Looking forward, the paper speculates on several advancements:

  • Improved interfaces for GenAI applications that offer more intuitive and responsive user guidance.
  • Enhanced integration of institutional data within GenAI applications while managing privacy and ethical considerations more robustly.
  • Deeper explorations into tailored GenAI applications across different educational contexts and subjects.

This comprehensive paper enriches our understanding of the intersection between LLMs and user interfaces within an educational context and sets the stage for subsequent innovations and refinements in this promising field.

Authors (8)
  1. Stanislav Pozdniakov
  2. Jonathan Brazil
  3. Solmaz Abdi
  4. Aneesha Bakharia
  5. Shazia Sadiq
  6. Paul Denny
  7. Hassan Khosravi
  8. Dragan Gasevic