AutoTutor meets Large Language Models: A Language Model Tutor with Rich Pedagogy and Guardrails (2402.09216v3)

Published 14 Feb 2024 in cs.CL and cs.HC

Abstract: LLMs have found several use cases in education, ranging from automatic question generation to essay evaluation. In this paper, we explore the potential of using LLMs to author Intelligent Tutoring Systems. A common pitfall of LLMs is that they stray from desired pedagogical strategies, for example by leaking the answer to the student, and in general provide no guarantees. We posit that while LLMs with certain guardrails can take the place of subject experts, the overall pedagogical design still needs to be handcrafted for the best learning results. Based on this principle, we create a sample end-to-end tutoring system named MWPTutor, which uses LLMs to fill in the state space of a pre-defined finite state transducer. This approach retains the structure and pedagogy of traditional tutoring systems, which learning scientists have developed over the years, while bringing in the additional flexibility of LLM-based approaches. Through a human evaluation study on two datasets based on math word problems, we show that our hybrid approach achieves a better overall tutoring score than an instructed, but otherwise free-form, GPT-4. MWPTutor is completely modular and opens up the scope for the community to improve its performance by improving individual modules or by supplying different teaching strategies for it to follow.

Overview of "AutoTutor meets Large Language Models"

The paper provides a comprehensive exploration of using LLMs to author Intelligent Tutoring Systems (ITSs). The authors recognize the potential of LLMs in educational applications and address the limitations of traditional rule-based ITSs, such as AutoTutor, which often require extensive manual authoring and face scalability challenges.

Key Contributions

  1. Hybrid Tutoring System, MWPTutor: The authors introduce a new tutoring system named MWPTutor, which uses LLMs to fill in the state space of a pre-defined finite state transducer. This approach combines the structured pedagogy of traditional tutoring systems with the flexibility of LLM-based generation; a minimal sketch of the pattern follows this list.
  2. Maintenance of Pedagogical Integrity: While LLMs are employed to generate content, the overarching pedagogical architecture is crafted by learning scientists. This ensures that the educational strategies remain robust and are not degraded by the inherent unpredictability of LLM outputs.
  3. Human Evaluation Study: A comparative evaluation on math word problems shows that MWPTutor achieves higher tutoring scores than an instructed but otherwise free-form GPT-4, maintaining educational efficacy while remaining modular and extensible.
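
To make the division of labor concrete, here is a minimal sketch of an FST-style tutor whose per-state utterances are generated by an LLM. This is an illustrative pattern, not the authors' implementation: the state names, prompts, and the `llm` callable are hypothetical placeholders.

```python
from typing import Callable

# Hand-authored pedagogy: the designer fixes the states and transitions,
# so the control flow never depends on the LLM.
TRANSITIONS = {
    ("ASK_STEP", "correct"): "ASK_STEP",      # advance to the next sub-problem
    ("ASK_STEP", "incorrect"): "GIVE_HINT",
    ("GIVE_HINT", "any"): "ASK_STEP",         # re-ask the step after hinting
}

# LLM-filled state space: the model only authors the utterance for the
# current state; it cannot change the teaching strategy.
PROMPTS = {
    "ASK_STEP": "Ask the student to solve this step: {step}",
    "GIVE_HINT": "Give a hint for this step without revealing the answer: {step}",
}

def tutor_turn(state: str, step: str, llm: Callable[[str], str]) -> str:
    """Generate the tutor's utterance for the current state."""
    return llm(PROMPTS[state].format(step=step))

def next_state(state: str, verdict: str) -> str:
    """Hand-crafted transition function: the pedagogy, not the LLM, decides."""
    key = (state, verdict) if (state, verdict) in TRANSITIONS else (state, "any")
    return TRANSITIONS.get(key, state)  # stay in place if no rule matches
```

The design choice this illustrates is the paper's core principle: the transition table encodes the learning-science pedagogy, while the LLM is confined to filling in natural-language content for each state.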

Numerical Results

The paper presents quantitative evaluations showing that MWPTutor, using GPT-4 as a backend, achieves a perfect success rate on simpler datasets such as MultiArith and performs competitively on more complex datasets like MathDial, with notable improvements over baseline GPT-4. The system's modular design facilitates further enhancement by allowing adjustments to individual components and teaching strategies.
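
One component the paper motivates is a guardrail against answer leakage, its headline example of LLMs straying from the intended pedagogy. As a hypothetical illustration of how a modular design lets such a check be slotted in, the function below screens a tutor utterance for the known final answer before it reaches the student; this is an assumed sketch, not MWPTutor's actual mechanism.

```python
import re

def leaks_answer(utterance: str, final_answer: str) -> bool:
    """Hypothetical guardrail: flag utterances that reveal the final answer.

    Digit lookarounds keep an answer like "12" from matching inside "120".
    """
    pattern = r"(?<!\d)" + re.escape(final_answer) + r"(?!\d)"
    return re.search(pattern, utterance) is not None

# A caller could regenerate or mask any hint that trips the check.
assert leaks_answer("So the total is 42 apples.", "42")
assert not leaks_answer("Try adding the two quantities first.", "42")
assert not leaks_answer("There are 120 students in total.", "12")
```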

Implications for AI and Education

  • Theoretical Implications: This research highlights the potential for hybrid models that combine AI with traditional educational frameworks. By employing LLMs primarily for content generation rather than overall pedagogy, it is possible to harness AI capabilities without compromising on educational integrity.
  • Practical Implications: The use of LLMs in generating tutoring content has the potential to significantly reduce the time and resources needed for developing ITSs. This adaptability can enhance the scalability of educational technologies, allowing for wider implementation across diverse subject areas.
  • Speculation on Future Developments: Future advances in AI could further optimize the integration of LLMs into ITSs, potentially allowing for dynamic pedagogical strategies that adapt in real-time to student needs. As AI models become more transparent and controllable, trust and reliability in educational settings could improve, expanding the scope of these applications.

Conclusion

This paper provides valuable insights into the implementation and evaluation of LLM-based components within Intelligent Tutoring Systems. By maintaining a focus on pedagogical soundness while benefiting from the generative capabilities of LLMs, the authors demonstrate a balanced and effective approach to modern educational challenges. Further research could build on these findings, exploring diverse domains and enhancing the interactive capabilities of such systems. The modular and adaptable nature of MWPTutor provides a promising foundation for future developments in AI-driven education.

Authors (3)
  1. Sankalan Pal Chowdhury
  2. Vilém Zouhar
  3. Mrinmaya Sachan