Cost-Effective Retraining of Machine Learning Models (2310.04216v1)

Published 6 Oct 2023 in cs.LG

Abstract: It is important to retrain an ML model in order to maintain its performance as the data changes over time. However, this can be costly as it usually requires processing the entire dataset again. This creates a trade-off between retraining too frequently, which leads to unnecessary computing costs, and not retraining often enough, which results in stale and inaccurate ML models. To address this challenge, we propose ML systems that make automated and cost-effective decisions about when to retrain an ML model. We aim to optimize the trade-off by considering the costs associated with each decision. Our research focuses on determining whether to retrain or keep an existing ML model based on various factors, including the data, the model, and the predictive queries answered by the model. Our main contribution is a Cost-Aware Retraining Algorithm called Cara, which optimizes the trade-off over streams of data and queries. To evaluate the performance of Cara, we analyzed synthetic datasets and demonstrated that Cara can adapt to different data drifts and retraining costs while performing similarly to an optimal retrospective algorithm. We also conducted experiments with real-world datasets and showed that Cara achieves better accuracy than drift detection baselines while making fewer retraining decisions, ultimately resulting in lower total costs.
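
The abstract describes Cara's keep-or-retrain decision only at a high level. The sketch below illustrates one way such a cost-aware loop could look; it is not the paper's actual algorithm. All names here (`CostAwareRetrainer`, `retrain_cost`, the error-rate staleness proxy) are hypothetical assumptions: the idea is simply to retrain only once the accumulated estimated cost of answering queries with a stale model exceeds the fixed cost of retraining.

```python
# Minimal sketch of a cost-aware keep-or-retrain loop, assuming a
# scikit-learn style classifier. Class and variable names are
# hypothetical illustrations, not the paper's Cara implementation.
from sklearn.base import clone


class CostAwareRetrainer:
    def __init__(self, fitted_model, retrain_cost):
        self.model = fitted_model          # currently deployed model
        self.retrain_cost = retrain_cost   # fixed cost of one retraining
        self.staleness_cost = 0.0          # accumulated cost of keeping a stale model
        self.X_seen, self.y_seen = [], []  # full history (retraining reprocesses it all)

    def observe(self, X_batch, y_batch):
        """Consume one labeled batch from the stream; decide retrain vs. keep."""
        # Proxy for staleness cost: error rate of the current model on the
        # incoming batch, i.e. the price of answering these queries stale.
        self.staleness_cost += 1.0 - self.model.score(X_batch, y_batch)
        self.X_seen.extend(X_batch)
        self.y_seen.extend(y_batch)

        # Retrain only once keeping the stale model has become more
        # expensive than paying the fixed retraining cost.
        if self.staleness_cost > self.retrain_cost:
            self.model = clone(self.model).fit(self.X_seen, self.y_seen)
            self.staleness_cost = 0.0
            return "retrain"
        return "keep"


# Usage, assuming an initially fitted classifier and a stream of batches:
#   retrainer = CostAwareRetrainer(initial_model, retrain_cost=0.5)
#   for X_batch, y_batch in stream:
#       decision = retrainer.observe(X_batch, y_batch)
```

Per the abstract, the actual Cara algorithm optimizes this trade-off over streams of both data and predictive queries, and is evaluated against an optimal retrospective algorithm and drift detection baselines.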

