A review on the use of large language models as virtual tutors

Published 20 May 2024 in cs.CL and cs.AI | (2405.11983v2)

Abstract: Transformer architectures contribute to managing long-term dependencies for Natural Language Processing, representing one of the most recent changes in the field. These architectures are the basis of the innovative, cutting-edge LLMs that have produced a huge buzz in several fields and industrial sectors, among which education stands out. Accordingly, these generative Artificial Intelligence-based solutions have directed the change in techniques and the evolution in educational methods and contents, along with network infrastructure, towards high-quality learning. Given the popularity of LLMs, this review seeks to provide a comprehensive overview of those solutions designed specifically to generate and evaluate educational materials and which involve students and teachers in their design or experimental plan. To the best of our knowledge, this is the first review of educational applications (e.g., student assessment) of LLMs. As expected, the most common role of these systems is as virtual tutors for automatic question generation. Moreover, the most popular models are GPT-3 and BERT. However, due to the continuous launch of new generative models, new works are expected to be published shortly.

Summary

The paper reviews LLMs as virtual tutors, highlighting their roles in question generation, answer grading, and explanation with insights drawn from 29 selected records.
It describes a systematic review methodology based on targeted Google Scholar queries restricted to works published since 2020, followed by screening for language and educational applicability.
The study discusses practical challenges such as ethical concerns, fairness, and integration issues, urging further research with emerging models like GPT-4.

Review of LLMs as Virtual Tutors

Introduction

LLMs, grounded in Transformer architectures, have become increasingly significant in NLP. Models such as GPT-3 and BERT have spread to many sectors, education prominent among them, where they serve as virtual tutors. The review surveys LLM-based solutions designed specifically to generate and evaluate educational materials, restricted to works that involve students and teachers in their design or experimental plan.

Methodology

The authors conducted a systematic review via Google Scholar using specific search queries to identify scholarly works related to the educational application of LLMs. The search queries focused on documents published since 2020, emphasizing works that explicitly employed LLMs for educational purposes, excluding those merely adaptable to educational contexts.

Figure 1: Review pipeline.

Screening retained only records written in English and applicable to the educational domain, leaving 29 records for the review.
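The screening criteria described above can be sketched as a simple filter. This is a minimal illustration, not the authors' tooling: the `Record` fields, the `screen` function, and the boolean `explicit_educational_use` label (standing in for a manual judgment by a reviewer) are all hypothetical, and the cutoff is assumed to include 2020 itself.

```python
from dataclasses import dataclass


@dataclass
class Record:
    """Hypothetical representation of one search result."""
    title: str
    year: int
    language: str  # ISO 639-1 code, e.g. "en"
    explicit_educational_use: bool  # assumed to be assigned manually during screening


def screen(records, min_year=2020):
    """Keep records that satisfy all three criteria named in the text.

    Assumption: "published since 2020" is read as year >= 2020.
    """
    return [
        r for r in records
        if r.year >= min_year
        and r.language == "en"
        and r.explicit_educational_use
    ]


if __name__ == "__main__":
    # Toy, made-up records for illustration only.
    candidates = [
        Record("LLM question generator for schools", 2022, "en", True),
        Record("General-purpose summarizer", 2021, "en", False),
        Record("Older tutoring system", 2018, "en", True),
        Record("Tutor virtual con LLM", 2023, "es", True),
    ]
    for kept in screen(candidates):
        print(kept.title)
```

Only the first toy record passes, since each of the others fails exactly one criterion (explicit educational use, publication year, or language).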

Applications of LLMs in Education

Virtual Tutoring Applications

The selected records most often use LLMs as virtual tutors, predominantly GPT-3 and BERT, for tasks including automatic question generation, answer grading, explanation generation, and text summarization.

Figure 2: Distribution of the LLMs in the records selected.

These models offer functionalities such as:

Question Generation: In solutions built on GPT-3 and T5, a notable percentage of the generated questions were judged useful, and manual assessments corroborate this.
Answer Grading: BERT and its modified variants, applied to answer evaluation, reach an adequate Pearson correlation coefficient, suggesting that they can grade student responses reliably.
Explanations: GPT-3.5 has been explored for generating qualitative conclusion statements in educational settings, with attention to properties such as readability and orthographic correctness.
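To make the answer-grading metric concrete, the sketch below computes a Pearson correlation coefficient between two score lists. It assumes, as is common in automatic grading studies but not stated in this summary, that the comparison is between model-assigned and human-assigned scores. The function name `pearson` and the example scores are hypothetical and are not results from the reviewed works.

```python
import math


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length score lists."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two lists of equal length with at least 2 items")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    # Sum of co-deviations and sums of squared deviations from the means.
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined when one list is constant")
    return cov / math.sqrt(var_x * var_y)


if __name__ == "__main__":
    # Made-up scores for illustration only.
    human_scores = [2.0, 3.5, 4.0, 1.0, 5.0]
    model_scores = [2.5, 3.0, 4.5, 1.5, 4.5]
    print(f"Pearson r = {pearson(human_scores, model_scores):.3f}")
```

A value near 1 means the model ranks and spaces responses much as the human grader does; it does not by itself show that the absolute scores agree, so studies often report an error measure alongside it.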

Other Educational Applications

Beyond tutoring, LLMs have been employed for tasks such as code explanation and correction. For instance, models like Codex have been shown to fix syntax errors and explain code in programming assignments. Additionally, virtual assistants powered by variants of BERT and GPT models have proved effective in supporting both academic and administrative tasks.

Figure 3: Distribution of the tasks in the records selected.

Discussion and Analysis

The widespread application of LLMs in education shows their adaptability and their potential to transform traditional educational methodologies. However, challenges persist regarding their integration and the need for transparency, especially where difficulty adjustment and fairness are pivotal. Despite these advantages, ethical considerations and risks, such as threats to academic integrity and the opacity of the models, call for rigorous scrutiny.

Conclusion

LLMs are at the forefront of current changes in educational practice. Their generative capabilities support enhanced pedagogical approaches and individualized assessment. This review serves as a foundational exploration of the applications of LLMs within education and anticipates further developments with the advent of newer models like GPT-4. Future research should address ethical implications and practical integration into educational ecosystems, so that these technologies are used effectively and responsibly.
