Cost-effectiveness of OpenAI’s o1-preview for planning tasks
Determine whether using OpenAI’s o1-preview Large Reasoning Model (LRM) for planning problems is cost-effective, given its reasoning-token pricing scheme, compared with alternative approaches such as classical planners and LLM-Modulo systems.
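One way to make the comparison concrete is to measure dollars spent per verified-correct plan. The sketch below is illustrative only. It assumes reasoning tokens are billed at the output-token rate, and every name in it (RunStats, reasoning_model_cost, cost_per_solved) and every number passed to it is hypothetical rather than taken from the paper.

```python
from dataclasses import dataclass


@dataclass
class RunStats:
    """Aggregate figures for one approach on a benchmark (hypothetical values)."""
    instances: int         # planning problems attempted
    solved: int            # plans verified as correct
    total_cost_usd: float  # total spend for the whole run


def reasoning_model_cost(input_tokens, visible_output_tokens, reasoning_tokens,
                         usd_per_m_input, usd_per_m_output):
    """Cost of one call, assuming hidden reasoning tokens are billed as output."""
    billed_output = visible_output_tokens + reasoning_tokens
    return (input_tokens * usd_per_m_input
            + billed_output * usd_per_m_output) / 1_000_000


def cost_per_solved(stats):
    """Dollars spent per correct plan; infinite if nothing was solved."""
    if stats.solved == 0:
        return float("inf")
    return stats.total_cost_usd / stats.solved
```

Computing cost_per_solved for each approach on the same instance set gives one comparable figure per approach. It does not capture correctness guarantees, which have to be assessed separately.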
References
While o1-preview may provide higher accuracy than LLMs, it still fails to provide any correctness guarantees, and it is unclear that it is at all cost-effective.
Source: LLMs Still Can't Plan; Can LRMs? A Preliminary Evaluation of OpenAI's o1 on PlanBench (2409.13373, Valmeekam et al., 20 Sep 2024), Section 3, Accuracy/Cost Tradeoffs and Guarantees