
Monitoring Machine Learning Forecasts for Platform Data Streams (2401.09144v1)

Published 17 Jan 2024 in stat.AP and stat.ML

Abstract: Data stream forecasts are essential inputs for decision making at digital platforms. Machine learning (ML) algorithms are appealing candidates to produce such forecasts. Yet, digital platforms require a large-scale forecast framework that can respond flexibly to sudden performance drops. Re-training ML algorithms as fast as new data batches arrive is usually computationally too costly. On the other hand, infrequent re-training requires specifying the re-training frequency and typically comes at the cost of severe forecast deterioration. To ensure accurate and stable forecasts, we propose a simple data-driven monitoring procedure to answer the question of when the ML algorithm should be re-trained. Instead of investigating instability of the data streams themselves, we test whether the incoming batch of streaming forecast losses differs from a well-defined reference batch. Using a novel dataset comprising 15-minute-frequency data streams from an on-demand logistics platform operating in London, we apply the monitoring procedure to popular ML algorithms including the random forest, XGBoost, and the lasso. We show that monitor-based re-training produces accurate forecasts compared to viable benchmarks while preserving computational feasibility. Moreover, the choice of monitoring procedure matters more than the choice of ML algorithm, permitting practitioners to combine the proposed monitoring procedure with their favorite forecasting algorithm.
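
The abstract's monitoring idea can be made concrete with a short sketch. The Python snippet below is a minimal illustration, not the paper's exact procedure: the choice of a Welch two-sample t-test as the batch-comparison test, the significance level alpha, and the helper name should_retrain are all illustrative assumptions; the paper defers these implementation details to its main text.

```python
# Minimal sketch of loss-batch monitoring (illustrative, not the authors'
# exact procedure): re-train when the incoming batch of forecast losses
# differs significantly from a reference batch.
import numpy as np
from scipy import stats

def should_retrain(reference_losses, incoming_losses, alpha=0.01):
    """Return True when a two-sample test rejects equality of the two
    loss batches. The Welch t-test and alpha=0.01 are assumptions."""
    _, p_value = stats.ttest_ind(reference_losses, incoming_losses,
                                 equal_var=False)
    return p_value < alpha

# Toy usage with simulated squared-error losses (96 observations = one
# day of 15-minute intervals).
rng = np.random.default_rng(0)
reference = rng.exponential(scale=1.0, size=96)  # losses right after training
stable = rng.exponential(scale=1.0, size=96)     # no performance drop
drifted = rng.exponential(scale=3.0, size=96)    # sudden performance drop

print(should_retrain(reference, stable))   # expected: False
print(should_retrain(reference, drifted))  # expected: True
```

In this setting, reference plays the role of the well-defined reference batch recorded just after (re-)training; each incoming 15-minute loss batch is tested against it, and the first rejection triggers re-training of the forecasting algorithm.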

