A New Paradigm for Generative Adversarial Networks based on Randomized Decision Rules (2306.13641v1)
Abstract: The Generative Adversarial Network (GAN) was recently introduced in the literature as a novel machine learning method for training generative models. It has many applications in statistics, such as nonparametric clustering and nonparametric conditional independence testing. However, training a GAN is notoriously difficult due to the issue of mode collapse, which refers to a lack of diversity among the generated data. In this paper, we identify the reasons why the GAN suffers from this issue and, to address it, propose a new formulation for the GAN based on randomized decision rules. In the new formulation, the discriminator converges to a fixed point while the generator converges to a distribution at the Nash equilibrium. We propose to train the GAN with an empirical Bayes-like method that treats the discriminator as a hyper-parameter of the posterior distribution of the generator. Specifically, we simulate generators from their posterior distribution conditioned on the discriminator using a stochastic gradient Markov chain Monte Carlo (MCMC) algorithm, and we update the discriminator using stochastic gradient descent based on the simulated generators. We establish the convergence of the proposed method to the Nash equilibrium. Apart from image generation, we apply the proposed method to nonparametric clustering and nonparametric conditional independence testing. A portion of the numerical results is presented in the supplementary material.
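To make the alternating scheme in the abstract concrete, below is a minimal sketch of one training iteration in PyTorch, assuming the standard log-loss GAN objective with a Gaussian prior on the generator weights. The function names, the prior choice, and the single-generator simplification are illustrative assumptions, not the authors' exact implementation: the generator parameters receive a stochastic gradient Langevin dynamics (SGLD) update, i.e., a draw from their posterior conditioned on the current discriminator, while the discriminator receives an ordinary SGD update driven by fresh generator simulations.

```python
import torch

def sgld_generator_step(G, D, z, lr, prior_std=1.0):
    """One SGLD step on the generator parameters: an approximate draw from
    the posterior of G conditioned on the current discriminator D."""
    fake = G(z)
    # Likelihood term: the generator tries to make D score fake data as real.
    loss = -torch.log(torch.sigmoid(D(fake)) + 1e-8).mean()
    # Gaussian prior on the generator weights (illustrative assumption).
    loss = loss + sum((p ** 2).sum() for p in G.parameters()) / (2 * prior_std ** 2)
    G.zero_grad()
    loss.backward()
    with torch.no_grad():
        for p in G.parameters():
            # Gradient step plus injected Gaussian noise of scale sqrt(2 * lr),
            # which is what distinguishes SGLD from plain SGD.
            noise = torch.randn_like(p) * (2 * lr) ** 0.5
            p.add_(-lr * p.grad + noise)

def sgd_discriminator_step(D, G, x_real, z, opt_D):
    """SGD update of D, treated as a hyper-parameter of the generator's
    posterior, using fresh simulations from the generator."""
    with torch.no_grad():
        fake = G(z)  # detach generator samples from the graph
    loss = -(torch.log(torch.sigmoid(D(x_real)) + 1e-8).mean()
             + torch.log(1 - torch.sigmoid(D(fake)) + 1e-8).mean())
    opt_D.zero_grad()
    loss.backward()
    opt_D.step()
```

In the paper's formulation the generator converges to a distribution rather than a point, so in practice one would run several such Langevin chains (or retain past draws along a single chain) and treat the resulting collection of generators as a mixture; the sketch above shows only one chain for brevity.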
Authors: Sehwan Kim, Qifan Song, Faming Liang