Improving Fairness and Mitigating MADness in Generative Models

Published 22 May 2024 in cs.LG and stat.ML (arXiv:2405.13977v3)

Abstract: Generative models unfairly penalize data belonging to minority classes, suffer from model autophagy disorder (MADness), and learn biased estimates of the underlying distribution parameters. Our theoretical and empirical results show that training generative models with intentionally designed hypernetworks leads to models that 1) are fairer when generating data points belonging to minority classes, 2) are more stable in a self-consumed (i.e., MAD) setting, and 3) learn parameters that are less statistically biased. To further mitigate unfairness, MADness, and bias, we introduce a regularization term that penalizes discrepancies between a generative model's estimated weights when trained on real data versus its own synthetic data. To facilitate training existing deep generative models within our framework, we offer a scalable implementation of hypernetworks that automatically generates a hypernetwork architecture for any given generative model.
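
To make the abstract's two ingredients concrete, here is a minimal, hypothetical PyTorch sketch: a hypernetwork that emits the flat weight vector of a target model, and a regularizer that penalizes the gap between the weights estimated on real data and on the model's own synthetic data. All names (HyperNet, weight_consistency_penalty, lam) are illustrative assumptions, not the authors' implementation.

```python
import torch
import torch.nn as nn

class HyperNet(nn.Module):
    """Hypothetical hypernetwork: maps a conditioning code to a flat
    parameter vector for a target generative model of known size."""

    def __init__(self, code_dim: int, n_target_params: int, hidden: int = 256):
        super().__init__()
        self.body = nn.Sequential(
            nn.Linear(code_dim, hidden),
            nn.ReLU(),
            nn.Linear(hidden, n_target_params),
        )

    def forward(self, code: torch.Tensor) -> torch.Tensor:
        return self.body(code)


def weight_consistency_penalty(w_real: torch.Tensor,
                               w_synth: torch.Tensor) -> torch.Tensor:
    # Mean squared discrepancy between the weights estimated from real
    # data and from the model's own generated (synthetic) data.
    return (w_real - w_synth).pow(2).mean()


if __name__ == "__main__":
    hyper = HyperNet(code_dim=16, n_target_params=1024)
    code = torch.randn(1, 16)

    # In a full training loop these two estimates would come from fitting
    # the target model on real data versus on its own samples; the
    # perturbed code here is only a stand-in to exercise the penalty.
    w_real = hyper(code)
    w_synth = hyper(code + 0.1 * torch.randn_like(code))

    lam = 0.1  # assumed regularization strength
    reg = lam * weight_consistency_penalty(w_real, w_synth)
    print(f"consistency penalty: {reg.item():.6f}")
```

In practice this penalty would be added to the model's usual training loss, so that weights fit on self-generated data are pulled back toward weights fit on real data, which is the mechanism the abstract credits with mitigating MADness.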
