Documenting end-to-end reproducibility protocols for prior LLM-based query generation studies
Identify and document the precise end-to-end experimental procedures—including prompt issuance, output handling, query extraction, dataset preparation, and baseline selection—required to fully reproduce the experiments of Wang et al. (2023) and Alaniz et al. (2023) on LLM-based Boolean query generation.
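As an illustration of what documenting these steps could look like in practice, the following is a minimal sketch in Python. It is not the procedure used by either study. The record fields, the model name, and the extraction heuristic (take the first output line containing an uppercase Boolean operator) are all assumptions made for this example.

```python
import hashlib
import json
import re
from dataclasses import asdict, dataclass, field
from typing import Optional

# Assumed heuristic: a Boolean query line contains an uppercase AND, OR, or NOT.
BOOL_OP = re.compile(r"\b(AND|OR|NOT)\b")


def extract_boolean_query(raw_output: str) -> Optional[str]:
    """Return the first line that looks like a Boolean query, or None."""
    for line in raw_output.splitlines():
        if BOOL_OP.search(line):
            return line.strip()
    return None


@dataclass
class RunRecord:
    """One prompt issuance, logged with what is needed to repeat it."""
    model: str                 # hypothetical model identifier
    prompt: str                # exact prompt text sent to the model
    temperature: float
    seed: Optional[int]
    raw_output: str            # unmodified model response
    extracted_query: Optional[str] = None
    prompt_sha256: str = field(init=False, default="")

    def __post_init__(self) -> None:
        # Hash the prompt so later runs can verify they used identical text.
        self.prompt_sha256 = hashlib.sha256(self.prompt.encode("utf-8")).hexdigest()
        if self.extracted_query is None:
            self.extracted_query = extract_boolean_query(self.raw_output)

    def to_json(self) -> str:
        """Serialize the record with sorted keys for stable diffs."""
        return json.dumps(asdict(self), sort_keys=True)


# Example record with made-up values.
record = RunRecord(
    model="hypothetical-model",
    prompt="Write a Boolean query for a review on aspirin and heart attack.",
    temperature=0.0,
    seed=1,
    raw_output=(
        "Here is the query:\n"
        "(heart attack OR myocardial infarction) AND aspirin\n"
        "Hope this helps."
    ),
)
```

Storing the raw output alongside the extracted query keeps the extraction step auditable: if the heuristic changes, the queries can be re-derived without re-issuing any prompts.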
References
While extending the setups by \citet{wang2023chatgpt} and \citet{alaniz2023utility}, we ran into several issues and were unable to fully reproduce the publications, as not enough information was given by the authors.
— A Reproducibility and Generalizability Study of Large Language Models for Query Generation
(2411.14914 - Staudinger et al., 22 Nov 2024) in Section 5 Discussion