Sample Complexity of Supervised Fine-Tuning for Pedagogical Generalization

Determine how many high-quality pedagogical demonstration examples are required for supervised fine-tuning of large language models used as tutors, such that the resulting models generalize across the full range of targeted pedagogical behaviours.

Background

LearnLM-Tutor is trained via supervised fine-tuning on curated pedagogical datasets. The authors note that achieving broad pedagogical generalization depends on the quantity and quality of demonstrations, but the required scale is not established.

Knowing the sample complexity would guide data collection strategies, resource allocation, and methodological choices (e.g., whether to complement SFT with RLHF) in developing pedagogically capable AI tutors.
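One standard way to approach this question empirically is to fine-tune on nested subsets of the demonstration data, measure held-out pedagogical performance at each scale, and extrapolate a fitted learning curve to a target performance level. The sketch below illustrates this with a power-law fit; it is not a method from the paper, and all dataset sizes, error values, and the target threshold are illustrative assumptions.

```python
import numpy as np
from scipy.optimize import curve_fit

# Hypothetical measurements: held-out pedagogical-rubric error after SFT
# on nested subsets of n demonstration dialogues (values are illustrative).
n_examples = np.array([250, 500, 1000, 2000, 4000])
heldout_error = np.array([0.42, 0.35, 0.29, 0.25, 0.22])

def power_law(n, a, b, c):
    # Classic empirical learning curve: error(n) = a * n**(-b) + c,
    # where c is the irreducible error floor.
    return a * n ** (-b) + c

params, _ = curve_fit(power_law, n_examples, heldout_error,
                      p0=(1.0, 0.3, 0.1), maxfev=10000)
a, b, c = params

target_error = 0.15  # desired generalization level (assumed)
if target_error > c:
    # Invert the fitted curve: n = (a / (error - c))**(1/b)
    n_needed = (a / (target_error - c)) ** (1.0 / b)
    print(f"Fitted curve: {a:.2f} * n^(-{b:.2f}) + {c:.2f}")
    print(f"Estimated examples for error {target_error}: ~{n_needed:,.0f}")
else:
    print("Target error is below the fitted asymptote; unreachable by scale alone.")
```

Extrapolations of this kind are only as reliable as the fitted functional form, so in practice one would validate the curve on a held-out data scale before committing collection resources.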

References

"It is unknown how many such examples are required to cover a full range of pedagogical behaviours such that a model fine-tuned on them can generalise well, and manual data collection of this type is expensive."

Towards Responsible Development of Generative AI for Education: An Evaluation-Driven Approach (arXiv:2407.12687, Jurenka et al., 21 May 2024), Discussion section.