Impact of Guidance and Interaction Strategies for LLM Use on Learner Performance and Perception (2310.13712v3)
Abstract: Personalized chatbot-based teaching assistants can be crucial in addressing increasing classroom sizes, especially where direct teacher presence is limited. LLMs offer a promising avenue, with increasing research exploring their educational utility. However, the challenge lies not only in establishing the efficacy of LLMs but also in discerning the nuances of interaction between learners and these models, which impact learners' engagement and results. We conducted a formative study in an undergraduate computer science classroom (N=145) and a controlled experiment on Prolific (N=356) to explore the impact of four pedagogically informed guidance strategies on the learners' performance, confidence and trust in LLMs. Direct LLM answers marginally improved performance, while refining student solutions fostered trust. Structured guidance reduced random queries as well as instances of students copy-pasting assignment questions to the LLM. Our work highlights the role that teachers can play in shaping LLM-supported learning environments.
Authors: Harsh Kumar, Ilya Musabirov, Mohi Reza, Jiakai Shi, Xinyuan Wang, Joseph Jay Williams, Anastasia Kuzminykh, Michael Liut