
DSAP: Analyzing Bias Through Demographic Comparison of Datasets (2312.14626v1)

Published 22 Dec 2023 in cs.CV

Abstract: In the last few years, Artificial Intelligence systems have become increasingly widespread. Unfortunately, these systems can share many biases with human decision-making, including demographic biases. Often, these biases can be traced back to the data used for training, where large uncurated datasets have become the norm. Despite our knowledge of these biases, we still lack general tools to detect and quantify them, as well as to compare the biases in different datasets. Thus, in this work, we propose DSAP (Demographic Similarity from Auxiliary Profiles), a two-step methodology for comparing the demographic composition of two datasets. DSAP can be deployed in three key applications: to detect and characterize demographic blind spots and bias issues across datasets, to measure dataset demographic bias in single datasets, and to measure dataset demographic shift in deployment scenarios. An essential feature of DSAP is its ability to robustly analyze datasets without explicit demographic labels, offering simplicity and interpretability for a wide range of situations. To show the usefulness of the proposed methodology, we consider the Facial Expression Recognition task, where demographic bias has previously been found. The three applications are studied over a set of twenty datasets with varying properties. The code is available at https://github.com/irisdominguez/DSAP.
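For intuition, the following is a minimal sketch of the two-step idea the abstract describes, under stated assumptions: `predict_group` stands in for an auxiliary demographic classifier (e.g., a FairFace-style model), and the Ružička (weighted Jaccard) similarity is used as an illustrative profile-comparison measure. The function names and metric choice here are assumptions for illustration, not the paper's exact implementation; see the linked repository for the authors' code.

```python
from collections import Counter

def demographic_profile(samples, predict_group):
    """Step 1: approximate a dataset's demographic composition by running
    an auxiliary demographic model over its samples (no explicit labels
    needed) and normalising the predicted group counts into a profile."""
    counts = Counter(predict_group(s) for s in samples)
    total = sum(counts.values())
    return {group: n / total for group, n in counts.items()}

def ruzicka_similarity(p, q):
    """Step 2: compare two profiles with the Ruzicka (weighted Jaccard)
    similarity: sum of per-group minima over sum of per-group maxima.
    Returns 1.0 for identical profiles, falling toward 0.0 as they diverge."""
    groups = set(p) | set(q)
    num = sum(min(p.get(g, 0.0), q.get(g, 0.0)) for g in groups)
    den = sum(max(p.get(g, 0.0), q.get(g, 0.0)) for g in groups)
    return num / den if den else 1.0

# Hypothetical example: gender profiles inferred for two datasets.
p = {"female": 0.6, "male": 0.4}
q = {"female": 0.3, "male": 0.7}
print(ruzicka_similarity(p, q))  # (0.3 + 0.4) / (0.6 + 0.7) ~= 0.538
```

In this framing, the three applications listed in the abstract correspond to which profiles are compared: two different datasets (blind spots and bias across datasets), one dataset against a reference such as a balanced or population-level profile (bias in a single dataset), or training data against deployment data over time (demographic shift).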

Authors (3)
  1. Iris Dominguez-Catena (7 papers)
  2. Daniel Paternain (6 papers)
  3. Mikel Galar (13 papers)
Citations (2)
