
Degrees-of-freedom penalized piecewise regression (2312.16512v1)

Published 27 Dec 2023 in stat.ME, cs.NA, and math.NA

Abstract: Many popular piecewise regression models rely on minimizing a cost function on the model fit with a linear penalty on the number of segments. However, this penalty does not take into account varying complexities of the model functions on the segments, potentially leading to overfitting when models with varying complexities, such as polynomials of different degrees, are used. In this work, we improve on this approach by instead using a penalty on the sum of the degrees of freedom over all segments, called degrees-of-freedom penalized piecewise regression (DofPPR). We show that the solutions of the resulting minimization problem are unique for almost all input data in a least squares setting. We develop a fast algorithm that not only computes a minimizer but also determines an optimal hyperparameter exactly, in the sense of rolling cross-validation with the one-standard-error rule. This eliminates manual hyperparameter selection. Our method supports optional user parameters for incorporating domain knowledge. We provide open-source Python/Rust code for the piecewise polynomial least squares case, which can be extended to further models. We demonstrate the practical utility through a simulation study and by applications to real data. A constrained variant of the proposed method gives state-of-the-art results in the Turing benchmark for unsupervised changepoint detection.
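To make the idea of penalizing the summed degrees of freedom concrete, here is a minimal sketch in Python. It is not the authors' algorithm or their Python/Rust package. It assumes the objective has the form "sum of per-segment least squares residuals plus gamma times the sum of per-segment degrees of freedom", where a segment with `dof` degrees of freedom is fitted by a polynomial of degree `dof - 1`. The function names, the `max_dof` cap, and the fixed `gamma` are hypothetical choices for illustration. The hyperparameter is supplied by hand here rather than selected by rolling cross-validation, and the search is a plain dynamic program over all segment boundaries with no speedups.

```python
import numpy as np
from numpy.polynomial import polynomial as P


def segment_cost(x, y, dof):
    """Residual sum of squares of the least squares polynomial with `dof` coefficients."""
    coeffs = P.polyfit(x, y, dof - 1)      # degree = dof - 1 (assumed convention)
    resid = y - P.polyval(x, coeffs)
    return float(resid @ resid)


def dof_penalized_fit(x, y, gamma, max_dof=3):
    """Naive dynamic program: minimize sum of RSS + gamma * sum of per-segment dof.

    Returns a list of (start, stop, dof) half-open index ranges and the optimal cost.
    """
    n = len(x)
    best = np.full(n + 1, np.inf)          # best[r] = optimal cost for the prefix x[:r]
    best[0] = 0.0
    choice = [None] * (n + 1)              # choice[r] = (start of last segment, its dof)
    for r in range(1, n + 1):
        for l in range(r):
            # A segment with m points is allowed at most m degrees of freedom.
            for dof in range(1, min(max_dof, r - l) + 1):
                cost = best[l] + segment_cost(x[l:r], y[l:r], dof) + gamma * dof
                if cost < best[r]:
                    best[r] = cost
                    choice[r] = (l, dof)
    # Backtrack from the end of the data to recover the segmentation.
    segments = []
    r = n
    while r > 0:
        l, dof = choice[r]
        segments.append((l, r, dof))
        r = l
    return segments[::-1], float(best[n])


# Noise-free toy signal: a constant piece followed by a linear piece.
x = np.arange(20, dtype=float)
y = np.where(x < 10, 0.0, 2.0 * (x - 10.0) + 5.0)
segments, cost = dof_penalized_fit(x, y, gamma=1.0)
print(segments, cost)
```

On this toy signal the sketch assigns one degree of freedom to the constant piece and two to the linear piece, so the simpler segment is charged less. A penalty on the segment count alone would charge both segments equally, regardless of the polynomial degree used on each.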

