
RECIPE4U: Student-ChatGPT Interaction Dataset in EFL Writing Education (2403.08272v1)

Published 13 Mar 2024 in cs.CL

Abstract: The integration of generative AI in education is expanding, yet empirical analyses of large-scale and real-world interactions between students and AI systems still remain limited. Addressing this gap, we present RECIPE4U (RECIPE for University), a dataset sourced from a semester-long experiment with 212 college students in English as Foreign Language (EFL) writing courses. During the study, students engaged in dialogues with ChatGPT to revise their essays. RECIPE4U includes comprehensive records of these interactions, including conversation logs, students' intent, students' self-rated satisfaction, and students' essay edit histories. In particular, we annotate the students' utterances in RECIPE4U with 13 intention labels based on our coding schemes. We establish baseline results for two subtasks in task-oriented dialogue systems within educational contexts: intent detection and satisfaction estimation. As a foundational step, we explore student-ChatGPT interaction patterns through RECIPE4U and analyze them by focusing on students' dialogue, essay data statistics, and students' essay edits. We further illustrate potential applications of RECIPE4U dataset for enhancing the incorporation of LLMs in educational frameworks. RECIPE4U is publicly available at https://zeunie.github.io/RECIPE4U/.

Authors (7)
  1. Jieun Han (12 papers)
  2. Haneul Yoo (21 papers)
  3. Junho Myung (14 papers)
  4. Minsun Kim (17 papers)
  5. Tak Yeon Lee (14 papers)
  6. So-Yeon Ahn (8 papers)
  7. Alice Oh (82 papers)
Citations (4)

Summary

Exploring the Dynamics of Student-ChatGPT Interactions in EFL Writing Education with RECIPE4U

Introduction to RECIPE4U

The RECIPE (Revising an Essay with ChatGPT on an Interactive Platform for EFL learners) platform applies LLMs to English as a Foreign Language (EFL) writing education. RECIPE4U (RECIPE for University), developed by a research team led by Jieun Han and Haneul Yoo at KAIST, South Korea, is a dataset capturing semester-long interactions between EFL learners and ChatGPT. It supports detecting students' intents and estimating their satisfaction with ChatGPT responses, and it provides a foundation for understanding and improving LLM-integrated education.

Methodology and Dataset Overview

RECIPE4U is sourced from a semester-long experiment with 212 college students in EFL writing courses, making it one of the first datasets of real-world student-ChatGPT dialogues in EFL education. It includes conversation logs, student intents, self-rated satisfaction, and essay edit histories, and student utterances are annotated with 13 intention labels. Together, these records support in-depth analysis of student interaction patterns and can inform the development of more effective educational tools.

The Core Contributions

  • Dataset Creation: The RECIPE4U dataset collects dialogues in which students in EFL writing courses worked with ChatGPT to revise their essays. Because it spans a full semester, it is a rich resource for examining the utility and implications of ChatGPT in EFL writing instruction.
  • Baseline Models: The research introduces baseline models for intent detection and satisfaction estimation within this dataset. These models serve as a benchmark for future studies to improve upon, fostering advancements in educational applications of LLMs.
  • Interaction Analysis: An exploration of students' interaction patterns with ChatGPT is provided. The insights drawn from conversation logs and essay edits offer valuable implications for the design of LLM-enhanced educational technologies.
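
To make the two subtasks concrete, the sketch below frames intent detection as classification and satisfaction estimation as score prediction, and evaluates a trivial majority-class and mean-score baseline on toy data. It is not the paper's baseline models or the released schema: the `Turn` fields, the intent label strings, and the 1-5 satisfaction scale are all hypothetical placeholders chosen for illustration.

```python
from collections import Counter
from dataclasses import dataclass
from statistics import mean


@dataclass
class Turn:
    """One student utterance. Field names are hypothetical, not the released schema."""
    utterance: str
    intent: str        # placeholder label, not one of the paper's 13 intention labels
    satisfaction: int  # self-rated score; the 1-5 scale is an assumption


def majority_intent(train):
    """Return the most frequent intent label among the training turns."""
    return Counter(t.intent for t in train).most_common(1)[0][0]


def intent_accuracy(train, test):
    """Accuracy of always predicting the majority training intent."""
    guess = majority_intent(train)
    return sum(t.intent == guess for t in test) / len(test)


def satisfaction_mae(train, test):
    """Mean absolute error of always predicting the mean training score."""
    guess = mean(t.satisfaction for t in train)
    return mean(abs(t.satisfaction - guess) for t in test)


# Toy data invented for this sketch; none of it comes from RECIPE4U.
train = [
    Turn("Can you check my grammar?", "request_revision", 4),
    Turn("Fix the second sentence.", "request_revision", 5),
    Turn("What does coherent mean?", "ask_information", 3),
]
test = [
    Turn("Please rewrite my conclusion.", "request_revision", 4),
    Turn("Is this a good thesis?", "request_evaluation", 2),
]

if __name__ == "__main__":
    print("intent accuracy:", intent_accuracy(train, test))
    print("satisfaction MAE:", satisfaction_mae(train, test))
```

Baselines this simple are useful mainly as a floor: any learned intent or satisfaction model should be compared against them before its scores are interpreted.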

Key Findings and Implications

A detailed examination of the RECIPE4U dataset reveals that students not only sought linguistic corrections from ChatGPT but also engaged in elaborate discussions regarding essay content and organization. The analysis also uncovers a pattern of anthropomorphizing ChatGPT, where students treated the AI as a peer, suggesting a potential pathway to more engaging and less intimidating educational environments.

From an educational standpoint, the ability to detect student intent and satisfaction with AI responses presents a novel avenue for personalized learning experiences. Additionally, the dataset highlights areas for improvement in AI-driven educational tools, such as the need for more accurate and context-sensitive feedback mechanisms.

Predictions for Future Developments

The data derived from RECIPE4U paves the way for numerous future explorations in AI-enhanced education. It's conceivable that subsequent research will focus on optimizing LLM interactions tailored to individual learning styles and preferences, thereby increasing the efficacy of AI tutors in EFL and beyond. Furthermore, the role of AI in facilitating peer-like interactions suggests a transformative potential for digital learning, breaking down traditional barriers to student engagement.

Conclusion

RECIPE4U is an early dataset of real-world interactions between EFL learners and ChatGPT. It provides a foundation for future improvements in LLM-integrated education and reflects a shift towards more interactive and personalized learning. As LLMs continue to evolve, datasets such as RECIPE4U can inform how they are integrated into language learning and teaching.