
On support vector machines under a multiple-cost scenario (2312.14795v1)

Published 22 Dec 2023 in stat.ML and cs.LG

Abstract: The Support Vector Machine (SVM) is a powerful tool in binary classification, known to attain excellent misclassification rates. However, many real-world classification problems, such as those found in medical diagnosis and in churn or fraud prediction, involve misclassification costs that may differ between classes. It may be hard for the user to provide precise values for such costs, whereas it may be much easier to identify acceptable misclassification rates. In this paper we propose a novel SVM model in which misclassification costs are accounted for by incorporating performance constraints into the problem formulation. Specifically, our aim is to seek the hyperplane with maximal margin that yields misclassification rates below given threshold values. This maximal-margin hyperplane is obtained by solving a convex quadratic problem with linear constraints and integer variables. The reported numerical experience shows that our model gives the user control over the misclassification rate in one class (possibly at the expense of an increased misclassification rate in the other class) and is feasible in terms of running times.
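The performance constraints described in the abstract can be illustrated with a small check: given a candidate hyperplane, compute the misclassification rate in each class and compare it with a user-chosen threshold. The sketch below is not the paper's optimization model (which requires a mixed-integer quadratic solver); it only evaluates the kind of constraint that model would enforce. The function names, the toy data, and the convention that a score of exactly zero is predicted as +1 are assumptions made for illustration.

```python
def class_error_rates(w, b, X, y):
    """Misclassification rate per class for the rule sign(w.x + b).

    Returns (rate on class +1, rate on class -1). Labels must be +1 or -1.
    A score of exactly 0 is predicted as +1 (an assumed convention).
    """
    errors = {1: 0, -1: 0}
    counts = {1: 0, -1: 0}
    for x_i, y_i in zip(X, y):
        score = sum(w_j * x_j for w_j, x_j in zip(w, x_i)) + b
        predicted = 1 if score >= 0 else -1
        counts[y_i] += 1
        if predicted != y_i:
            errors[y_i] += 1
    # An empty class contributes a rate of 0.0 rather than dividing by zero.
    return tuple(errors[c] / counts[c] if counts[c] else 0.0 for c in (1, -1))


def meets_thresholds(rates, max_pos_error, max_neg_error):
    """True if both per-class error rates are at or below their thresholds."""
    return rates[0] <= max_pos_error and rates[1] <= max_neg_error


# Hypothetical toy data: three points per class, not linearly separable.
X = [(2.0, 0.0), (1.0, 0.0), (-0.5, 0.0), (-1.0, 0.0), (-2.0, 0.0), (0.5, 0.0)]
y = [1, 1, 1, -1, -1, -1]

balanced = class_error_rates((1.0, 0.0), 0.0, X, y)  # one error in each class
shifted = class_error_rates((1.0, 0.0), 1.0, X, y)   # no errors on class +1

# Require at most 10% error on class +1, tolerate up to 70% on class -1.
print(meets_thresholds(balanced, 0.10, 0.70))
print(meets_thresholds(shifted, 0.10, 0.70))
```

Shifting the offset `b` removes every error on the positive class while the error rate on the negative class rises, which is the trade-off the abstract mentions: controlling one class's rate possibly at the expense of the other.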
