Transfer of ensemble benefits to open-ended generation
Determine whether the improvements observed on structured text generation tasks, which come from consensus-seeking generalized-mean f-ensembles and from better posterior approximations via sequential Monte Carlo, transfer to open-ended generation tasks such as creative writing and dialogue.
References
Whether these benefits transfer to open-ended generation tasks (e.g., creative writing, dialogue) remains an open question, as evaluation in such settings is harder to evaluate.
— Ensembling Language Models with Sequential Monte Carlo
(2603.05432 - Chan et al., 5 Mar 2026) in Limitations, Task selection