
A Distributionally Robust Optimisation Approach to Fair Credit Scoring (2402.01811v1)

Published 2 Feb 2024 in cs.LG and cs.CY

Abstract: Credit scoring has been catalogued by the European Commission and the Executive Office of the US President as a high-risk classification task, a key concern being the potential harm of basing loan approval decisions on models that are biased against certain groups. To address this concern, recent credit scoring research has considered a range of fairness-enhancing techniques put forward by the machine learning community to reduce bias and unfair treatment in classification systems. Although these techniques differ in how they define fairness and in how they impose it, most of them disregard the robustness of the results. This can create situations where unfair treatment is effectively corrected in the training set but reappears when the model produces out-of-sample classifications. In this paper, we instead investigate how to apply Distributionally Robust Optimisation (DRO) methods to credit scoring, empirically evaluating how they perform in terms of fairness, ability to classify correctly, and robustness of the solution against changes in the marginal proportions. We find that DRO methods provide a substantial improvement in fairness, with almost no loss in performance. These results indicate that DRO can improve fairness in credit scoring, provided that further advances are made in implementing these systems efficiently. In addition, our analysis suggests that many of the commonly used fairness metrics are unsuitable for a credit scoring setting, as they depend on the choice of classification threshold.
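
The abstract's closing point, that common fairness metrics depend on the classification threshold, can be illustrated with a minimal sketch. This is not the paper's method or data: the scores, the group labels, the thresholds, and the helper `demographic_parity_gap` are all hypothetical, and the approval-rate gap (demographic parity difference) is used only as one example of a threshold-dependent metric.

```python
import numpy as np

def demographic_parity_gap(scores, group, threshold):
    """Absolute difference in approval rates between two groups at one threshold."""
    approved = scores >= threshold           # applicant is approved if score clears the cut-off
    rate_0 = approved[group == 0].mean()     # approval rate in group 0
    rate_1 = approved[group == 1].mean()     # approval rate in group 1
    return abs(rate_0 - rate_1)

# Hypothetical model scores for eight applicants, four per group (illustrative only).
scores = np.array([0.2, 0.4, 0.6, 0.8, 0.3, 0.5, 0.7, 0.9])
group = np.array([0, 0, 0, 0, 1, 1, 1, 1])

# The same scores yield different fairness readings at different thresholds.
gaps = {t: demographic_parity_gap(scores, group, t) for t in (0.1, 0.45, 0.95)}
for t, gap in gaps.items():
    print(f"threshold={t}: approval-rate gap={gap:.2f}")
```

In this toy data the gap is zero at very low and very high thresholds and non-zero in between, so a conclusion about fairness drawn at one cut-off need not hold at another.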

Authors (3)
  1. Pablo Casas
  2. Christophe Mues
  3. Huan Yu
