Performance of Flagship and Open-Source LLMs for Prior Authorization Letter Generation

Determine the performance of flagship large language models and open-source large language models on the task of generating prior authorization request letters.

Background

The evaluation focused on three mid-tier commercial models because of their cost and accessibility for high-volume prior authorization workflows.

The authors explicitly note that the performance of flagship proprietary models and open-source alternatives on this specific task has not yet been established, so their relative capabilities remain unresolved.

References

Finally, the study evaluates mid-tier models from three providers. Performance of flagship models and open-source alternatives remains an open question for this task.

AI-Generated Prior Authorization Letters: Strong Clinical Content, Weak Administrative Scaffolding (2603.29366 - Awan et al., 31 Mar 2026) in Limitations section