Improving Fairness and Mitigating MADness in Generative Models
Abstract: Generative models unfairly penalize data belonging to minority classes, suffer from model autophagy disorder (MADness), and learn biased estimates of the underlying distribution parameters. Our theoretical and empirical results show that training generative models with intentionally designed hypernetworks leads to models that 1) are fairer when generating datapoints belonging to minority classes, 2) are more stable in a self-consuming (i.e., MAD) setting, and 3) learn parameters that are less statistically biased. To further mitigate unfairness, MADness, and bias, we introduce a regularization term that penalizes discrepancies between a generative model's estimated weights when trained on real data versus on its own synthetic data. To facilitate training existing deep generative models within our framework, we offer a scalable implementation of hypernetworks that automatically generates a hypernetwork architecture for any given generative model.
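The regularizer described above penalizes the gap between parameters fit on real data and parameters re-fit on the model's own synthetic output. As a minimal sketch of that idea, here is a toy 1-D Gaussian analogue (not the paper's actual deep-model implementation; `fit_gaussian` and `consistency_penalty` are hypothetical names introduced for illustration):

```python
import numpy as np

rng = np.random.default_rng(0)

def fit_gaussian(x):
    """MLE of a 1-D Gaussian: sample mean and (biased) sample variance."""
    return x.mean(), x.var()

def consistency_penalty(real, n_synth=10_000):
    """Fit parameters on real data, sample synthetic data from the fitted
    model, re-fit on the synthetic sample, and penalize the squared
    discrepancy between the two parameter estimates."""
    mu, var = fit_gaussian(real)
    synth = rng.normal(mu, np.sqrt(var), size=n_synth)
    mu_s, var_s = fit_gaussian(synth)
    return (mu - mu_s) ** 2 + (var - var_s) ** 2

real = rng.normal(3.0, 1.5, size=5_000)
penalty = consistency_penalty(real)
print(penalty)
```

In a self-consuming training loop, an estimator whose parameters drift when re-fit on its own samples compounds that drift across generations; adding a term like `consistency_penalty` to the training loss pushes the model toward estimates that are stable under self-consumption, which is the intuition behind the mitigation above.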