Cause of few-shot degradation with APE-generated instructions
Determine whether the performance degradation observed on the Rhymes, Large Animal, and Second Letters tasks, when in-context examples are added after prepending instructions generated by Automatic Prompt Engineer (APE), is caused by the APE-selected instructions overfitting to the zero-shot setting, which would explain their poor performance in the few-shot case.
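To make the comparison concrete, here is a minimal sketch of the two prompt formats being contrasted: an APE-selected instruction used alone (zero-shot) versus the same instruction followed by in-context demonstrations (few-shot). The example instruction, demonstration pairs, and the `build_prompt` helper are illustrative assumptions, not taken from the paper.

```python
# Hypothetical sketch of the setup in question: the same APE-selected
# instruction evaluated in a zero-shot vs. a few-shot prompt.
# The instruction text and demo pairs below are assumptions for the
# Rhymes task, not examples from the paper.

def build_prompt(instruction: str, demos: list[tuple[str, str]], query: str) -> str:
    """Prepend the instruction, then any in-context demos, then the query."""
    parts = [instruction]
    for x, y in demos:
        parts.append(f"Input: {x}\nOutput: {y}")
    parts.append(f"Input: {query}\nOutput:")
    return "\n\n".join(parts)

ape_instruction = "Write a word that rhymes with the input word."
demos = [("sing", "ring"), ("day", "gray")]  # in-context examples

zero_shot_prompt = build_prompt(ape_instruction, [], "cat")
few_shot_prompt = build_prompt(ape_instruction, demos, "cat")
# The conjecture: an instruction selected to maximize zero-shot accuracy
# may transfer poorly once demonstrations change the prompt format.
```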
References
Counter-intuitively, adding in-context examples for Rhymes, Large Animal, and Second Letters hurts model performance. We conjecture that it may be because the selected instructions overfit the zero-shot learning scenario and thus do not perform well on the few-shot case.
— Large Language Models Are Human-Level Prompt Engineers
(Zhou et al., 2022, arXiv:2211.01910) in Instruction Induction, Few-shot In-context Learning