- The paper demonstrates that leveraging LLMs as simulated students in learning-by-teaching significantly improves both immediate and delayed vocabulary retention.
- The paper outlines an innovative system architecture, using GPT-4o to generate context-aware questions that stimulate active learner engagement and detailed responses.
- The paper highlights practical benefits, including personalized pacing and scalability, while addressing challenges of cognitive overload and repetitive questioning.
LLMs as Conversational Students in Learning by Teaching for Vocabulary Acquisition
Learning by Teaching in Digital Education
Learning by Teaching (LbT) has been theoretically and empirically validated as an effective mechanism to enhance comprehension and retention in educational settings, particularly by requiring learners to articulate knowledge and address questions posed by others. Traditional implementations introduce logistical and psychological barriers, such as the stress of peer interaction and the challenge of sourcing suitable learners for explanations. This study addresses these bottlenecks by simulating the student role with LLMs for interactive, scalable LbT in English vocabulary acquisition (2604.17893).
System Architecture: LLMs as Simulated Students
The system leverages GPT-4o as a dynamic question generator and conversational partner, emulating a beginner-level English student. At each learning iteration, participants correct a deliberately erroneous sentence containing the target vocabulary or idiom and provide explanations. The LLM, prompted with student-like characteristics and prior learner responses, generates contextually relevant questions. Learners answer these questions, thus iteratively reinforcing conceptual understanding and addressing knowledge gaps.
Figure 1: The operational flow of the proposed LbT system, integrating LLMs as conversational students for adaptive question generation.
In contrast, the baseline system omits LLM-generated interactions, focusing solely on correcting sentences without conversational reinforcement.
Figure 2: The baseline learning system, devoid of LLM-mediated question generation and interaction.
Experimental Protocol and Interface
Ten university students participated in a crossover design consisting of pretest, LbT interactions, and multiple posttests administered at intervals (immediate, three days, seven days). The learning interface presented multiple-choice questions based on vocabulary from standardized proficiency tests (Eiken), with results and justifications provided per session.

Figure 3: User interface for the pretest phase, featuring multiple-choice selection and immediate feedback.
Memory Retention and Learning Outcomes
Analysis of posttest scores demonstrates clear superiority of the proposed system over the baseline in both immediate and delayed recall. The system achieves improved retention at three and seven days post-learning, with marked gains in correct answer percentages.
Figure 4: Differential performance in percentage of correct answers between the LLM-mediated system and the baseline across successive test intervals.
The gain is particularly pronounced in users who actively engage in the conversational loop, submitting detailed input and maintaining a balanced cognitive load. The correlation between number of words entered per interaction and learning outcome highlights the role of cognitive engagement.
Figure 5: Relationship between average word count per interaction and overall system engagement.
Interaction Analysis and System Limitations
While most participants benefited from the LLM-mediated approach, heterogeneity in interaction behavior led to varying outcomes. Some users experienced cognitive overload, especially when exposed to excessive question generation or attempted to learn large volumes simultaneously, resulting in diminished performance (notably participant p5). Additionally, deterministic settings of GPT-4o (temperature set to zero) led to repetitive questioning, reducing perceived conversational diversity—a problem, as evidenced by duplicated questions documented in the appendix.
Practical and Theoretical Implications
This study establishes LLMs as viable, adaptive, and cost-effective simulated students for LbT paradigms. The system is generalizable beyond vocabulary acquisition, supporting scalable deployment across domains requiring explanatory learning. Practically, LLM-mediated LbT circumvents psychological and logistical challenges inherent to human peer teaching, enabling personalized pacing and content adaptation.
Theoretically, engagement with simulated students via LLMs aligns with constructivist pedagogical principles, as learners must actively resolve knowledge discrepancies. The observed retention improvements substantiate the hypothesis that dynamic interaction, rather than rote correction, promotes deeper memory encoding. Importantly, tailoring conversational complexity and feedback to individual learner characteristics is critical, and future research should focus on adaptive calibration via cognitive modeling.
Future Directions
Extending the system to larger cohorts and diverse domains will enhance statistical reliability and explore cross-linguistic generalization. Automated adjustments of conversational difficulty and question novelty based on real-time interaction metrics may mitigate cognitive overload. Integrating affective feedback analysis and dynamic prompt adaptation can further refine educational efficacy. Additionally, investigating long-term retention and transfer effects in real-world settings is warranted.
Conclusion
LLMs, when deployed as simulated students in interactive Learning by Teaching frameworks, significantly enhance vocabulary acquisition and memory retention. The findings highlight both the potential for personalized, scalable educational interventions and the necessity of adaptive system design to accommodate heterogeneous learner behaviors. The approach provides a robust foundation for future digital education systems leveraging LLMs for dynamic, explainable teaching interactions.
(2604.17893)