Effect of prompt engineering on accuracy, readability, and topic modeling of ChatGPT responses
Determine how adding contextual information or altering prompt wording (prompt engineering) affects answer accuracy, Flesch–Kincaid and SMOG readability indices, and Latent Dirichlet Allocation (LDA) topic-model outputs for responses generated by ChatGPT (GPT-3.5, GPT-4, and GPT-4o mini) to graduate-level statistics exam questions.
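The three outcome measures named above can be illustrated concretely. The sketch below is not the paper's pipeline; it is a minimal, self-contained illustration that computes Flesch–Kincaid grade level and the SMOG index from their published formulas (using a crude vowel-group syllable heuristic, an assumption on our part) and fits an LDA topic model with scikit-learn on a hypothetical toy corpus of model responses.

```python
import math
import re

from sklearn.decomposition import LatentDirichletAllocation
from sklearn.feature_extraction.text import CountVectorizer


def count_syllables(word):
    """Crude heuristic: count groups of consecutive vowels, drop a silent final 'e'."""
    word = word.lower()
    groups = re.findall(r"[aeiouy]+", word)
    n = len(groups)
    if word.endswith("e") and n > 1:
        n -= 1
    return max(n, 1)


def _text_stats(text):
    sentences = [s for s in re.split(r"[.!?]+", text) if s.strip()]
    words = re.findall(r"[A-Za-z']+", text)
    syllables = [count_syllables(w) for w in words]
    return len(sentences), len(words), syllables


def flesch_kincaid_grade(text):
    """Flesch-Kincaid grade level: 0.39*(words/sentences) + 11.8*(syllables/words) - 15.59."""
    n_sent, n_words, syl = _text_stats(text)
    return 0.39 * (n_words / n_sent) + 11.8 * (sum(syl) / n_words) - 15.59


def smog_index(text):
    """SMOG index: 1.0430*sqrt(polysyllables * 30 / sentences) + 3.1291."""
    n_sent, _, syl = _text_stats(text)
    polysyllables = sum(1 for s in syl if s >= 3)
    return 1.0430 * math.sqrt(polysyllables * 30 / n_sent) + 3.1291


# Hypothetical toy corpus standing in for ChatGPT answers to statistics questions.
responses = [
    "The variance measures the spread of the data around the mean.",
    "A confidence interval quantifies uncertainty in the estimated mean.",
    "The p value measures evidence against the null hypothesis.",
    "Standard deviation is the square root of the variance of the data.",
]

vectorizer = CountVectorizer(stop_words="english")
doc_term = vectorizer.fit_transform(responses)
lda = LatentDirichletAllocation(n_components=2, random_state=0)
doc_topics = lda.fit_transform(doc_term)  # per-document topic proportions
```

Because the syllable counter is heuristic, the readability scores will differ slightly from tools such as `textstat`; the point is only to show the quantities being compared across prompt variants.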
In the current paper, we did not change the wording of any question, in order to mimic how we expect students to use ChatGPT for help on homework questions. We therefore cannot comment on how providing context or altering question wording would affect our results on accuracy, reading level, or topic modeling. We reserve for future research an investigation of the effect of prompt engineering, including different prompt-engineering frameworks, on text analytics of output from various generative AI platforms.