Papers
Topics
Authors
Recent
Assistant
AI Research Assistant
Well-researched responses based on relevant abstracts and paper content.
Custom Instructions Pro
Preferences or requirements that you'd like Emergent Mind to consider when generating responses.
Gemini 2.5 Flash
Gemini 2.5 Flash 154 tok/s
Gemini 2.5 Pro 40 tok/s Pro
GPT-5 Medium 25 tok/s Pro
GPT-5 High 21 tok/s Pro
GPT-4o 93 tok/s Pro
Kimi K2 170 tok/s Pro
GPT OSS 120B 411 tok/s Pro
Claude Sonnet 4.5 36 tok/s Pro
2000 character limit reached

Evaluating classification performance across operating contexts: A comparison of decision curve analysis and cost curves (2509.24608v1)

Published 29 Sep 2025 in cs.LG and stat.ML

Abstract: Classification models typically predict a score and use a decision threshold to produce a classification. Appropriate model evaluation should carefully consider the context in which a model will be used, including the relative value of correct classifications of positive versus negative examples, which affects the threshold that should be used. Decision curve analysis (DCA) and cost curves are model evaluation approaches that assess the expected utility and expected loss of prediction models, respectively, across decision thresholds. We compared DCA and cost curves to determine how they are related, and their strengths and limitations. We demonstrate that decision curves are closely related to a specific type of cost curve called a Brier curve. Both curves are derived assuming model scores are calibrated and setting the classification threshold using the relative value of correct positive and negative classifications, and the x-axis of both curves are equivalent. Net benefit (used for DCA) and Brier loss (used for Brier curves) will always choose the same model as optimal at any given threshold. Across thresholds, differences in Brier loss are comparable whereas differences in net benefit cannot be compared. Brier curves are more generally applicable (when a wider range of thresholds are plausible), and the area under the Brier curve is the Brier score. We demonstrate that reference lines common in each space can be included in either and suggest the upper envelope decision curve as a useful comparison for DCA showing the possible gain in net benefit that could be achieved through recalibration alone.

Summary

We haven't generated a summary for this paper yet.

Dice Question Streamline Icon: https://streamlinehq.com

Open Questions

We haven't generated a list of open questions mentioned in this paper yet.

Lightbulb Streamline Icon: https://streamlinehq.com

Continue Learning

We haven't generated follow-up questions for this paper yet.

List To Do Tasks Checklist Streamline Icon: https://streamlinehq.com

Collections

Sign up for free to add this paper to one or more collections.