Penalized Generative Variable Selection (2402.16661v1)

Published 26 Feb 2024 in stat.ML, cs.LG, and stat.ME

Abstract: Deep networks are increasingly applied to a wide variety of data, including data with high-dimensional predictors. In such analyses, variable selection may be needed along with estimation/model building. Many existing deep network studies that incorporate variable selection have been limited to methodological and numerical developments. In this study, we consider modeling/estimation using conditional Wasserstein Generative Adversarial Networks. Group Lasso penalization is applied for variable selection, which may improve model estimation/prediction, interpretability, and stability. Significantly advancing beyond the existing literature, the analysis of censored survival data is also considered. We establish the convergence rate for variable selection while accounting for the approximation error, and obtain a more efficient distribution estimation. Simulations and the analysis of real experimental data demonstrate the satisfactory practical utility of the proposed approach.
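
The abstract describes a conditional generator that takes the predictors (plus noise) as input, trained adversarially in the Wasserstein sense, with a group Lasso penalty that groups the generator's first-layer weights by predictor so that an entire predictor can be zeroed out. Below is a minimal sketch of that idea, assuming a PyTorch implementation; the class names, network sizes, and the penalty weight lambda_pen are illustrative choices for exposition, not the authors' code.

```python
# Illustrative sketch (not the paper's implementation) of group-Lasso-penalized
# generative variable selection with a conditional WGAN-style generator.
import torch
import torch.nn as nn

class ConditionalGenerator(nn.Module):
    """G(x, z): maps predictors x (dim p) and noise z to a sample of the response."""
    def __init__(self, p, noise_dim=10, hidden=64):
        super().__init__()
        self.p = p
        # Columns 0..p-1 of the first-layer weight matrix correspond to the predictors.
        self.input_layer = nn.Linear(p + noise_dim, hidden)
        self.body = nn.Sequential(
            nn.ReLU(), nn.Linear(hidden, hidden), nn.ReLU(), nn.Linear(hidden, 1)
        )

    def forward(self, x, z):
        return self.body(self.input_layer(torch.cat([x, z], dim=1)))

def group_lasso_penalty(gen):
    """Sum of L2 norms of the first-layer weight columns, one group per predictor.

    Driving a whole column to zero removes that predictor from the fitted
    generator, which is how the penalty performs variable selection."""
    W = gen.input_layer.weight            # shape: (hidden, p + noise_dim)
    return sum(torch.linalg.norm(W[:, j]) for j in range(gen.p))

# Schematic generator step inside WGAN training (critic update omitted):
#   fake = gen(x, torch.randn(x.size(0), noise_dim))
#   g_loss = -critic(x, fake).mean() + lambda_pen * group_lasso_penalty(gen)
#   g_loss.backward(); optimizer_g.step()
```

In practice one would typically pair this penalty with a proximal or thresholding step so that selected weight groups become exactly zero; the penalized loss above is only meant to convey the selection mechanism described in the abstract.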
