
Outlier detection in regression: conic quadratic formulations (2307.05975v1)

Published 12 Jul 2023 in math.OC, cs.LG, stat.ME, and stat.ML

Abstract: In many applications, when building linear regression models, it is important to account for the presence of outliers, i.e., corrupted input data points. Such problems can be formulated as mixed-integer optimization problems involving cubic terms, each given by the product of a binary variable and a quadratic term of the continuous variables. Existing approaches in the literature, typically relying on the linearization of the cubic terms using big-M constraints, suffer from weak relaxations and poor performance in practice. In this work we derive stronger second-order conic relaxations that do not involve big-M constraints. Our computational experiments indicate that the proposed formulations are several orders of magnitude faster than existing big-M formulations in the literature for this problem.
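
As a sketch of the model class the abstract describes (a standard trimmed-least-squares setup; the trimming budget $k$ and the constant $M$ below are conventional notation supplied here, not taken from the paper), flagging at most $k$ of $n$ observations $(x_i, y_i)$ as outliers can be written as

$$\min_{\beta \in \mathbb{R}^p,\; z \in \{0,1\}^n} \; \sum_{i=1}^{n} (1 - z_i)\,\bigl(y_i - x_i^{\top}\beta\bigr)^2 \quad \text{s.t.} \quad \sum_{i=1}^{n} z_i \le k,$$

where $z_i = 1$ removes observation $i$ from the loss; expanding the product $z_i \bigl(y_i - x_i^{\top}\beta\bigr)^2$ yields the cubic (binary times quadratic) terms the abstract refers to. The big-M linearization it contrasts against typically introduces deviations $w_i$ that may be nonzero only for flagged points,

$$\min_{\beta,\; w,\; z} \; \sum_{i=1}^{n} \bigl(y_i - x_i^{\top}\beta - w_i\bigr)^2 \quad \text{s.t.} \quad |w_i| \le M z_i, \quad \sum_{i=1}^{n} z_i \le k, \quad z \in \{0,1\}^n,$$

whose continuous relaxation becomes weak when $M$ must be large; the paper's second-order conic relaxations strengthen this model class without any big-M constant.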
