Formulating a setup that jointly studies demonstrations and evaluative feedback
Formulate a learning setup that jointly models demonstration-based supervision and evaluative feedback (e.g., whether a generated response is good). Such a setup should enable analysis of how the overlap among consistent hypotheses interacts with the additional feedback, and a characterization of how much feedback is required to guarantee performance.
References
However, we leave it open to formulate an interesting setup that enables a study of both types of feedbacks together for our problem, and we do not attempt to investigate this any further.
— Learning to Answer from Correct Demonstrations
(2510.15464 - Joshi et al., 17 Oct 2025) in Appendix A.3 (Overlap of MLE), end of subsection