Source and Integration of Human Preferences for RLHF in Educational Tutoring
Determine whether the human preference signals used to train reward models for reinforcement learning from human feedback (RLHF) in educational tutoring should be elicited from learners, from educators, or from both, and establish principled methods for combining the two preference sources when both are used.
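To make the combination question concrete, the sketch below shows one naive baseline, not a method proposed in the source paper: train a single Bradley-Terry reward model on preference pairs from both annotator pools and interpolate the two pairwise losses with a scalar mixing weight. The `reward_model` callable, the feature shapes, and the weight `alpha` are all illustrative assumptions.

```python
import torch
import torch.nn.functional as F


def bradley_terry_loss(r_chosen: torch.Tensor, r_rejected: torch.Tensor) -> torch.Tensor:
    # Standard pairwise reward-model loss used in RLHF:
    # minimize -log sigmoid(r_chosen - r_rejected).
    return -F.logsigmoid(r_chosen - r_rejected).mean()


def mixed_preference_loss(
    reward_model,        # callable: response features -> scalar rewards
    learner_pairs,       # (chosen, rejected) feature tensors labeled by learners
    educator_pairs,      # (chosen, rejected) feature tensors labeled by educators
    alpha: float = 0.5,  # learner weight; a free, purely illustrative parameter
) -> torch.Tensor:
    lc, lr = learner_pairs
    ec, er = educator_pairs
    learner_loss = bradley_terry_loss(reward_model(lc), reward_model(lr))
    educator_loss = bradley_terry_loss(reward_model(ec), reward_model(er))
    # Convex combination of the two preference sources. How to set alpha,
    # or whether a fixed scalar is even appropriate, is exactly the open
    # question this problem statement raises.
    return alpha * learner_loss + (1.0 - alpha) * educator_loss


# Toy usage with a linear reward head over 16-dim response features.
if __name__ == "__main__":
    head = torch.nn.Linear(16, 1)
    rm = lambda x: head(x).squeeze(-1)
    learner_pairs = (torch.randn(8, 16), torch.randn(8, 16))
    educator_pairs = (torch.randn(8, 16), torch.randn(8, 16))
    print(mixed_preference_loss(rm, learner_pairs, educator_pairs, alpha=0.7))
```

A fixed scalar weight is the simplest possible scheme; the open problem is precisely that such weights lack principled grounding, for instance when the two pools disagree (learners might favor more engaging responses while educators favor pedagogically sounder ones).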
References
"It is also not clear whether the preferences should be elicited from the learners, educators or both, and how they should be combined if it is the latter."
— Towards Responsible Development of Generative AI for Education: An Evaluation-Driven Approach
(arXiv:2407.12687, Jurenka et al., 21 May 2024), in the appendix section "Challenges with eliciting human preferences for pedagogy"