FAIRER: Fairness as Decision Rationale Alignment (2306.15299v1)

Published 27 Jun 2023 in cs.LG, cs.AI, and cs.CY

Abstract: Deep neural networks (DNNs) have made significant progress but often suffer from fairness issues, as deep models typically show distinct accuracy differences among certain subgroups (e.g., males and females). Existing research addresses this critical issue by employing fairness-aware loss functions that constrain the last-layer outputs and directly regularize DNNs. Although the fairness of DNNs is improved, it remains unclear how the trained network makes a fair prediction, which limits future fairness improvements. In this paper, we investigate fairness from the perspective of decision rationale and define a parameter parity score to characterize the fair decision process of networks by analyzing neuron influence in various subgroups. Extensive empirical studies show that unfairness can arise from the unaligned decision rationales of subgroups. Existing fairness regularization terms fail to achieve decision rationale alignment because they constrain only last-layer outputs while ignoring intermediate neuron alignment. To address this issue, we formulate fairness as a new task, i.e., decision rationale alignment, which requires DNNs' neurons to have consistent responses across subgroups at both intermediate processes and the final prediction. To make this idea practical during optimization, we relax the naive objective function and propose gradient-guided parity alignment, which encourages gradient-weighted consistency of neurons across subgroups. Extensive experiments on a variety of datasets show that our method significantly enhances fairness while sustaining a high level of accuracy, outperforming other approaches by a wide margin.
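The abstract describes the method only at a high level. As a concrete illustration, here is a minimal PyTorch sketch of one plausible reading of the gradient-guided parity alignment idea: treat a gradient-weighted activation as a per-neuron influence proxy for each subgroup, then penalize the discrepancy between subgroups alongside the usual task loss. All names here (MLP, gradient_weighted_influence, fairer_style_loss, lam) are hypothetical, and the gradient-times-activation proxy is an assumption rather than the paper's exact parameter parity score.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

torch.manual_seed(0)

# Hypothetical two-layer MLP standing in for the paper's DNN.
class MLP(nn.Module):
    def __init__(self, d_in=16, d_hid=32, d_out=2):
        super().__init__()
        self.fc1 = nn.Linear(d_in, d_hid)
        self.fc2 = nn.Linear(d_hid, d_out)

    def forward(self, x):
        h = torch.relu(self.fc1(x))   # intermediate neurons to be aligned
        return h, self.fc2(h)

def gradient_weighted_influence(model, x, y):
    """Per-neuron influence proxy (an assumption, not the paper's exact
    score): gradient of the task loss w.r.t. an intermediate activation,
    multiplied by the activation itself, averaged over the batch."""
    h, logits = model(x)
    loss = F.cross_entropy(logits, y)
    # create_graph=True keeps the penalty differentiable w.r.t. weights
    (grad_h,) = torch.autograd.grad(loss, h, create_graph=True)
    return (grad_h * h).mean(dim=0)   # one influence score per hidden neuron

def fairer_style_loss(model, xa, ya, xb, yb, lam=1.0):
    """Task loss on both subgroup batches, plus a penalty pushing the
    subgroups' gradient-weighted neuron influences to agree."""
    _, la = model(xa)
    _, lb = model(xb)
    task = F.cross_entropy(la, ya) + F.cross_entropy(lb, yb)
    align = F.mse_loss(gradient_weighted_influence(model, xa, ya),
                       gradient_weighted_influence(model, xb, yb))
    return task + lam * align

# Toy usage: two subgroup batches (e.g., male / female) over one feature space.
model = MLP()
xa, ya = torch.randn(8, 16), torch.randint(0, 2, (8,))
xb, yb = torch.randn(8, 16), torch.randint(0, 2, (8,))
loss = fairer_style_loss(model, xa, ya, xb, yb)
loss.backward()
```

In this toy setup, lam trades off accuracy against alignment, and only one hidden layer is aligned; the paper's relaxation presumably applies a comparable gradient-weighted consistency term at every intermediate layer as well as the final prediction.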

Authors (6)
  1. Tianlin Li (43 papers)
  2. Qing Guo (146 papers)
  3. Aishan Liu (72 papers)
  4. Mengnan Du (90 papers)
  5. Zhiming Li (47 papers)
  6. Yang Liu (2253 papers)
Citations (9)