A least distance estimator for a multivariate regression model using deep neural networks (2401.03123v1)
Abstract: We propose a deep neural network (DNN) based least distance (LD) estimator (DNN-LD) for the multivariate regression problem, addressing limitations of conventional methods. Owing to the flexibility of the DNN structure, both linear and nonlinear conditional mean functions can be modeled easily, and a multivariate regression model is realized by simply adding extra nodes at the output layer. The least distance loss captures the dependency structure among responses more efficiently than the least squares loss and is robust to outliers. In addition, we consider $L_1$-type penalization for variable selection, which is crucial in analyzing high-dimensional data. Specifically, we propose the (A)GDNN-LD estimator, which performs variable selection and model estimation simultaneously by applying the (adaptive) group Lasso penalty to the weight parameters of the DNN. For computation, we propose a quadratic smoothing approximation that facilitates optimizing the non-smooth objective function based on the least distance loss. Simulation studies and a real data analysis demonstrate the promising performance of the proposed method.
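The two ingredients described above can be sketched in a few lines: the least distance loss sums the Euclidean norm of each sample's multivariate residual, and the group Lasso penalty groups each input variable's outgoing first-layer weights so that a zeroed group drops that variable. This is a minimal NumPy sketch under assumptions: `sqrt(||r||^2 + delta^2) - delta` is one standard quadratic smoother and the `weights` argument for the adaptive variant is illustrative; the paper's exact forms may differ.

```python
import numpy as np

def smoothed_ld_loss(residuals, delta=1e-3):
    """Least distance loss: sum over samples of the Euclidean norm of the
    q-dimensional residual vector, quadratically smoothed near zero so the
    objective is differentiable. sqrt(||r||^2 + delta^2) - delta is one
    common smoother; the paper's exact choice may differ."""
    norms = np.sqrt(np.sum(residuals ** 2, axis=1) + delta ** 2) - delta
    return norms.sum()

def group_lasso_penalty(W, lam=0.1, weights=None):
    """Group Lasso on the input-layer weight matrix W (p variables x h
    hidden units): row j collects variable j's outgoing weights, so
    shrinking a whole row to zero removes that variable. Passing data-driven
    `weights` (e.g. inverse group norms from an initial fit) gives the
    adaptive version."""
    group_norms = np.linalg.norm(W, axis=1)   # one norm per input variable
    if weights is None:
        weights = np.ones_like(group_norms)
    return lam * np.sum(weights * group_norms)

# Toy objective: 5 samples, 3 responses, 4 input variables, 8 hidden units.
rng = np.random.default_rng(0)
R = rng.normal(size=(5, 3))   # residuals y_i - f(x_i) from some network f
W = rng.normal(size=(4, 8))   # first-layer weights subject to the penalty
objective = smoothed_ld_loss(R) + group_lasso_penalty(W)
```

In a full implementation the residuals would come from the DNN forward pass, and `objective` would be minimized over all network parameters by gradient descent; the smoothing is what makes that feasible despite the non-smooth norm at zero.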
In: 2017 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), IEEE, pp 5485–5489 Rumelhart et al [1986] Rumelhart DE, Hinton GE, Williams RJ (1986) Learning representations by back-propagating errors. Nature 323:533–536. 10.1038/323533a0 Scardapane et al [2017] Scardapane S, Comminiello D, Hussain A, et al (2017) Group sparse regularization for deep neural networks. Neurocomputing 241:81–89 Sohn et al [2012] Sohn S, Jung B, Jhun M (2012) Permutation tests using least distance estimator in the multivariate regression model. Computational Statistics - COMPUTATION STAT 27:1–11. 10.1007/s00180-011-0247-3 Sze et al [2017] Sze V, Chen YH, Yang TJ, et al (2017) Efficient processing of deep neural networks: A tutorial and survey. Proceedings of the IEEE 105(12):2295–2329 Szekely et al [2008] Szekely G, Rizzo M, Bakirov N (2008) Measuring and testing dependence by correlation of distances. The Annals of Statistics 35. 10.1214/009053607000000505 Tibshirani [1996] Tibshirani R (1996) Regression shrinkage and selection via the lasso. Journal of the Royal Statistical Society: Series B (Methodological) 58(1):267–288 Wahba [1992] Wahba G (1992) Multivariate function and operator estimation, based on smoothing splines and reproducing kernels. In: SANTA FE INSTITUTE STUDIES IN THE SCIENCES OF COMPLEXITY-PROCEEDINGS VOLUME-, Addison-Wesley Publishing Co, pp 95–95 Wang and Leng [2008] Wang H, Leng C (2008) A note on adaptive group lasso. Computational Statistics and Data Analysis 52(12):5277–5286 Wang et al [2016] Wang J, Qingling C, Chang Q, et al (2016) Convergence analyses on sparse feedforward neural networks via group lasso regularization. Information Sciences 381. 10.1016/j.ins.2016.11.020 Wang and Cao [2022] Wang S, Cao G (2022) Robust deep neural network estimation for multi-dimensional functional data. 
Electronic Journal of Statistics 16(2):6461–6488 Yuan and Lin [2006] Yuan M, Lin Y (2006) Model selection and estimation in regression with grouped variables. Journal of the Royal Statistical Society: Series B (Statistical Methodology) 68(1):49–67 Zhang et al [2019] Zhang H, Wang J, Sun Z, et al (2019) Feature selection for neural networks using group lasso regularization. IEEE Transactions on Knowledge and Data Engineering PP:1–1. 10.1109/TKDE.2019.2893266 Zhao et al [2015] Zhao L, Hu Q, Wang W (2015) Heterogeneous feature selection with multi-modal deep neural networks and sparse group lasso. IEEE Transactions on Multimedia 17(11):1936–1948 Zou [2006] Zou H (2006) The adaptive lasso and its oracle properties. Journal of the American Statistical Association 101(476):1418–1429 Hornik K (1991) Approximation capabilities of multilayer feedforward networks. Neural Networks 4(2):251–257. https://doi.org/10.1016/0893-6080(91)90009-T Huber [1992] Huber PJ (1992) Robust estimation of a location parameter. In: Breakthroughs in statistics: Methodology and distribution. Springer, p 492–518 I-Cheng [2009] I-Cheng Y (2009) Concrete slump test data. Tech. rep., Department of Information Management, Chung-Hua University, URL https://archive.ics.uci.edu/ml/datasets/concrete+slump+test Imaizumi and Fukumizu [2019] Imaizumi M, Fukumizu K (2019) Deep neural networks learn non-smooth functions effectively. In: The 22nd international conference on artificial intelligence and statistics, PMLR, pp 869–878 Jhun and Choi [2009] Jhun M, Choi I (2009) Bootstrapping least distance estimator in the multivariate regression model. Computational Statistics and Data Analysis 53(12):4221–4227 Jiang et al [2022] Jiang C, Jiang C, Chen D, et al (2022) Densely connected neural networks for nonlinear regression. Entropy 24:876. 10.3390/e24070876 Li et al [2017] Li F, Zurada J, Liu Y, et al (2017) Input layer regularization of multilayer feedforward neural networks. IEEE Access PP:1–1. 
10.1109/ACCESS.2017.2713389 Ochiai et al [2017] Ochiai T, Matsuda S, Watanabe H, et al (2017) Automatic node selection for deep neural networks using group lasso regularization. In: 2017 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), IEEE, pp 5485–5489 Rumelhart et al [1986] Rumelhart DE, Hinton GE, Williams RJ (1986) Learning representations by back-propagating errors. Nature 323:533–536. 10.1038/323533a0 Scardapane et al [2017] Scardapane S, Comminiello D, Hussain A, et al (2017) Group sparse regularization for deep neural networks. Neurocomputing 241:81–89 Sohn et al [2012] Sohn S, Jung B, Jhun M (2012) Permutation tests using least distance estimator in the multivariate regression model. Computational Statistics - COMPUTATION STAT 27:1–11. 10.1007/s00180-011-0247-3 Sze et al [2017] Sze V, Chen YH, Yang TJ, et al (2017) Efficient processing of deep neural networks: A tutorial and survey. Proceedings of the IEEE 105(12):2295–2329 Szekely et al [2008] Szekely G, Rizzo M, Bakirov N (2008) Measuring and testing dependence by correlation of distances. The Annals of Statistics 35. 10.1214/009053607000000505 Tibshirani [1996] Tibshirani R (1996) Regression shrinkage and selection via the lasso. Journal of the Royal Statistical Society: Series B (Methodological) 58(1):267–288 Wahba [1992] Wahba G (1992) Multivariate function and operator estimation, based on smoothing splines and reproducing kernels. In: SANTA FE INSTITUTE STUDIES IN THE SCIENCES OF COMPLEXITY-PROCEEDINGS VOLUME-, Addison-Wesley Publishing Co, pp 95–95 Wang and Leng [2008] Wang H, Leng C (2008) A note on adaptive group lasso. Computational Statistics and Data Analysis 52(12):5277–5286 Wang et al [2016] Wang J, Qingling C, Chang Q, et al (2016) Convergence analyses on sparse feedforward neural networks via group lasso regularization. Information Sciences 381. 
10.1016/j.ins.2016.11.020 Wang and Cao [2022] Wang S, Cao G (2022) Robust deep neural network estimation for multi-dimensional functional data. Electronic Journal of Statistics 16(2):6461–6488 Yuan and Lin [2006] Yuan M, Lin Y (2006) Model selection and estimation in regression with grouped variables. Journal of the Royal Statistical Society: Series B (Statistical Methodology) 68(1):49–67 Zhang et al [2019] Zhang H, Wang J, Sun Z, et al (2019) Feature selection for neural networks using group lasso regularization. IEEE Transactions on Knowledge and Data Engineering PP:1–1. 10.1109/TKDE.2019.2893266 Zhao et al [2015] Zhao L, Hu Q, Wang W (2015) Heterogeneous feature selection with multi-modal deep neural networks and sparse group lasso. IEEE Transactions on Multimedia 17(11):1936–1948 Zou [2006] Zou H (2006) The adaptive lasso and its oracle properties. Journal of the American Statistical Association 101(476):1418–1429 Huber PJ (1992) Robust estimation of a location parameter. In: Breakthroughs in statistics: Methodology and distribution. Springer, p 492–518 I-Cheng [2009] I-Cheng Y (2009) Concrete slump test data. Tech. rep., Department of Information Management, Chung-Hua University, URL https://archive.ics.uci.edu/ml/datasets/concrete+slump+test Imaizumi and Fukumizu [2019] Imaizumi M, Fukumizu K (2019) Deep neural networks learn non-smooth functions effectively. In: The 22nd international conference on artificial intelligence and statistics, PMLR, pp 869–878 Jhun and Choi [2009] Jhun M, Choi I (2009) Bootstrapping least distance estimator in the multivariate regression model. Computational Statistics and Data Analysis 53(12):4221–4227 Jiang et al [2022] Jiang C, Jiang C, Chen D, et al (2022) Densely connected neural networks for nonlinear regression. Entropy 24:876. 10.3390/e24070876 Li et al [2017] Li F, Zurada J, Liu Y, et al (2017) Input layer regularization of multilayer feedforward neural networks. IEEE Access PP:1–1. 
10.1109/ACCESS.2017.2713389 Ochiai et al [2017] Ochiai T, Matsuda S, Watanabe H, et al (2017) Automatic node selection for deep neural networks using group lasso regularization. In: 2017 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), IEEE, pp 5485–5489 Rumelhart et al [1986] Rumelhart DE, Hinton GE, Williams RJ (1986) Learning representations by back-propagating errors. Nature 323:533–536. 10.1038/323533a0 Scardapane et al [2017] Scardapane S, Comminiello D, Hussain A, et al (2017) Group sparse regularization for deep neural networks. Neurocomputing 241:81–89 Sohn et al [2012] Sohn S, Jung B, Jhun M (2012) Permutation tests using least distance estimator in the multivariate regression model. Computational Statistics - COMPUTATION STAT 27:1–11. 10.1007/s00180-011-0247-3 Sze et al [2017] Sze V, Chen YH, Yang TJ, et al (2017) Efficient processing of deep neural networks: A tutorial and survey. Proceedings of the IEEE 105(12):2295–2329 Szekely et al [2008] Szekely G, Rizzo M, Bakirov N (2008) Measuring and testing dependence by correlation of distances. The Annals of Statistics 35. 10.1214/009053607000000505 Tibshirani [1996] Tibshirani R (1996) Regression shrinkage and selection via the lasso. Journal of the Royal Statistical Society: Series B (Methodological) 58(1):267–288 Wahba [1992] Wahba G (1992) Multivariate function and operator estimation, based on smoothing splines and reproducing kernels. In: SANTA FE INSTITUTE STUDIES IN THE SCIENCES OF COMPLEXITY-PROCEEDINGS VOLUME-, Addison-Wesley Publishing Co, pp 95–95 Wang and Leng [2008] Wang H, Leng C (2008) A note on adaptive group lasso. Computational Statistics and Data Analysis 52(12):5277–5286 Wang et al [2016] Wang J, Qingling C, Chang Q, et al (2016) Convergence analyses on sparse feedforward neural networks via group lasso regularization. Information Sciences 381. 
10.1016/j.ins.2016.11.020 Wang and Cao [2022] Wang S, Cao G (2022) Robust deep neural network estimation for multi-dimensional functional data. Electronic Journal of Statistics 16(2):6461–6488 Yuan and Lin [2006] Yuan M, Lin Y (2006) Model selection and estimation in regression with grouped variables. Journal of the Royal Statistical Society: Series B (Statistical Methodology) 68(1):49–67 Zhang et al [2019] Zhang H, Wang J, Sun Z, et al (2019) Feature selection for neural networks using group lasso regularization. IEEE Transactions on Knowledge and Data Engineering PP:1–1. 10.1109/TKDE.2019.2893266 Zhao et al [2015] Zhao L, Hu Q, Wang W (2015) Heterogeneous feature selection with multi-modal deep neural networks and sparse group lasso. IEEE Transactions on Multimedia 17(11):1936–1948 Zou [2006] Zou H (2006) The adaptive lasso and its oracle properties. Journal of the American Statistical Association 101(476):1418–1429 I-Cheng Y (2009) Concrete slump test data. Tech. rep., Department of Information Management, Chung-Hua University, URL https://archive.ics.uci.edu/ml/datasets/concrete+slump+test Imaizumi and Fukumizu [2019] Imaizumi M, Fukumizu K (2019) Deep neural networks learn non-smooth functions effectively. In: The 22nd international conference on artificial intelligence and statistics, PMLR, pp 869–878 Jhun and Choi [2009] Jhun M, Choi I (2009) Bootstrapping least distance estimator in the multivariate regression model. Computational Statistics and Data Analysis 53(12):4221–4227 Jiang et al [2022] Jiang C, Jiang C, Chen D, et al (2022) Densely connected neural networks for nonlinear regression. Entropy 24:876. 10.3390/e24070876 Li et al [2017] Li F, Zurada J, Liu Y, et al (2017) Input layer regularization of multilayer feedforward neural networks. IEEE Access PP:1–1. 10.1109/ACCESS.2017.2713389 Ochiai et al [2017] Ochiai T, Matsuda S, Watanabe H, et al (2017) Automatic node selection for deep neural networks using group lasso regularization. 
In: 2017 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), IEEE, pp 5485–5489 Rumelhart et al [1986] Rumelhart DE, Hinton GE, Williams RJ (1986) Learning representations by back-propagating errors. Nature 323:533–536. 10.1038/323533a0 Scardapane et al [2017] Scardapane S, Comminiello D, Hussain A, et al (2017) Group sparse regularization for deep neural networks. Neurocomputing 241:81–89 Sohn et al [2012] Sohn S, Jung B, Jhun M (2012) Permutation tests using least distance estimator in the multivariate regression model. Computational Statistics - COMPUTATION STAT 27:1–11. 10.1007/s00180-011-0247-3 Sze et al [2017] Sze V, Chen YH, Yang TJ, et al (2017) Efficient processing of deep neural networks: A tutorial and survey. Proceedings of the IEEE 105(12):2295–2329 Szekely et al [2008] Szekely G, Rizzo M, Bakirov N (2008) Measuring and testing dependence by correlation of distances. The Annals of Statistics 35. 10.1214/009053607000000505 Tibshirani [1996] Tibshirani R (1996) Regression shrinkage and selection via the lasso. Journal of the Royal Statistical Society: Series B (Methodological) 58(1):267–288 Wahba [1992] Wahba G (1992) Multivariate function and operator estimation, based on smoothing splines and reproducing kernels. In: SANTA FE INSTITUTE STUDIES IN THE SCIENCES OF COMPLEXITY-PROCEEDINGS VOLUME-, Addison-Wesley Publishing Co, pp 95–95 Wang and Leng [2008] Wang H, Leng C (2008) A note on adaptive group lasso. Computational Statistics and Data Analysis 52(12):5277–5286 Wang et al [2016] Wang J, Qingling C, Chang Q, et al (2016) Convergence analyses on sparse feedforward neural networks via group lasso regularization. Information Sciences 381. 10.1016/j.ins.2016.11.020 Wang and Cao [2022] Wang S, Cao G (2022) Robust deep neural network estimation for multi-dimensional functional data. 
Electronic Journal of Statistics 16(2):6461–6488 Yuan and Lin [2006] Yuan M, Lin Y (2006) Model selection and estimation in regression with grouped variables. Journal of the Royal Statistical Society: Series B (Statistical Methodology) 68(1):49–67 Zhang et al [2019] Zhang H, Wang J, Sun Z, et al (2019) Feature selection for neural networks using group lasso regularization. IEEE Transactions on Knowledge and Data Engineering PP:1–1. 10.1109/TKDE.2019.2893266 Zhao et al [2015] Zhao L, Hu Q, Wang W (2015) Heterogeneous feature selection with multi-modal deep neural networks and sparse group lasso. IEEE Transactions on Multimedia 17(11):1936–1948 Zou [2006] Zou H (2006) The adaptive lasso and its oracle properties. Journal of the American Statistical Association 101(476):1418–1429 Imaizumi M, Fukumizu K (2019) Deep neural networks learn non-smooth functions effectively. In: The 22nd international conference on artificial intelligence and statistics, PMLR, pp 869–878 Jhun and Choi [2009] Jhun M, Choi I (2009) Bootstrapping least distance estimator in the multivariate regression model. Computational Statistics and Data Analysis 53(12):4221–4227 Jiang et al [2022] Jiang C, Jiang C, Chen D, et al (2022) Densely connected neural networks for nonlinear regression. Entropy 24:876. 10.3390/e24070876 Li et al [2017] Li F, Zurada J, Liu Y, et al (2017) Input layer regularization of multilayer feedforward neural networks. IEEE Access PP:1–1. 10.1109/ACCESS.2017.2713389 Ochiai et al [2017] Ochiai T, Matsuda S, Watanabe H, et al (2017) Automatic node selection for deep neural networks using group lasso regularization. In: 2017 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), IEEE, pp 5485–5489 Rumelhart et al [1986] Rumelhart DE, Hinton GE, Williams RJ (1986) Learning representations by back-propagating errors. Nature 323:533–536. 
10.1038/323533a0 Scardapane et al [2017] Scardapane S, Comminiello D, Hussain A, et al (2017) Group sparse regularization for deep neural networks. Neurocomputing 241:81–89 Sohn et al [2012] Sohn S, Jung B, Jhun M (2012) Permutation tests using least distance estimator in the multivariate regression model. Computational Statistics - COMPUTATION STAT 27:1–11. 10.1007/s00180-011-0247-3 Sze et al [2017] Sze V, Chen YH, Yang TJ, et al (2017) Efficient processing of deep neural networks: A tutorial and survey. Proceedings of the IEEE 105(12):2295–2329 Szekely et al [2008] Szekely G, Rizzo M, Bakirov N (2008) Measuring and testing dependence by correlation of distances. The Annals of Statistics 35. 10.1214/009053607000000505 Tibshirani [1996] Tibshirani R (1996) Regression shrinkage and selection via the lasso. Journal of the Royal Statistical Society: Series B (Methodological) 58(1):267–288 Wahba [1992] Wahba G (1992) Multivariate function and operator estimation, based on smoothing splines and reproducing kernels. In: SANTA FE INSTITUTE STUDIES IN THE SCIENCES OF COMPLEXITY-PROCEEDINGS VOLUME-, Addison-Wesley Publishing Co, pp 95–95 Wang and Leng [2008] Wang H, Leng C (2008) A note on adaptive group lasso. Computational Statistics and Data Analysis 52(12):5277–5286 Wang et al [2016] Wang J, Qingling C, Chang Q, et al (2016) Convergence analyses on sparse feedforward neural networks via group lasso regularization. Information Sciences 381. 10.1016/j.ins.2016.11.020 Wang and Cao [2022] Wang S, Cao G (2022) Robust deep neural network estimation for multi-dimensional functional data. Electronic Journal of Statistics 16(2):6461–6488 Yuan and Lin [2006] Yuan M, Lin Y (2006) Model selection and estimation in regression with grouped variables. Journal of the Royal Statistical Society: Series B (Statistical Methodology) 68(1):49–67 Zhang et al [2019] Zhang H, Wang J, Sun Z, et al (2019) Feature selection for neural networks using group lasso regularization. 
IEEE Transactions on Knowledge and Data Engineering PP:1–1. 10.1109/TKDE.2019.2893266 Zhao et al [2015] Zhao L, Hu Q, Wang W (2015) Heterogeneous feature selection with multi-modal deep neural networks and sparse group lasso. IEEE Transactions on Multimedia 17(11):1936–1948 Zou [2006] Zou H (2006) The adaptive lasso and its oracle properties. Journal of the American Statistical Association 101(476):1418–1429 Jhun M, Choi I (2009) Bootstrapping least distance estimator in the multivariate regression model. Computational Statistics and Data Analysis 53(12):4221–4227 Jiang et al [2022] Jiang C, Jiang C, Chen D, et al (2022) Densely connected neural networks for nonlinear regression. Entropy 24:876. 10.3390/e24070876 Li et al [2017] Li F, Zurada J, Liu Y, et al (2017) Input layer regularization of multilayer feedforward neural networks. IEEE Access PP:1–1. 10.1109/ACCESS.2017.2713389 Ochiai et al [2017] Ochiai T, Matsuda S, Watanabe H, et al (2017) Automatic node selection for deep neural networks using group lasso regularization. In: 2017 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), IEEE, pp 5485–5489 Rumelhart et al [1986] Rumelhart DE, Hinton GE, Williams RJ (1986) Learning representations by back-propagating errors. Nature 323:533–536. 10.1038/323533a0 Scardapane et al [2017] Scardapane S, Comminiello D, Hussain A, et al (2017) Group sparse regularization for deep neural networks. Neurocomputing 241:81–89 Sohn et al [2012] Sohn S, Jung B, Jhun M (2012) Permutation tests using least distance estimator in the multivariate regression model. Computational Statistics - COMPUTATION STAT 27:1–11. 10.1007/s00180-011-0247-3 Sze et al [2017] Sze V, Chen YH, Yang TJ, et al (2017) Efficient processing of deep neural networks: A tutorial and survey. Proceedings of the IEEE 105(12):2295–2329 Szekely et al [2008] Szekely G, Rizzo M, Bakirov N (2008) Measuring and testing dependence by correlation of distances. The Annals of Statistics 35. 
10.1214/009053607000000505 Tibshirani [1996] Tibshirani R (1996) Regression shrinkage and selection via the lasso. Journal of the Royal Statistical Society: Series B (Methodological) 58(1):267–288 Wahba [1992] Wahba G (1992) Multivariate function and operator estimation, based on smoothing splines and reproducing kernels. In: SANTA FE INSTITUTE STUDIES IN THE SCIENCES OF COMPLEXITY-PROCEEDINGS VOLUME-, Addison-Wesley Publishing Co, pp 95–95 Wang and Leng [2008] Wang H, Leng C (2008) A note on adaptive group lasso. Computational Statistics and Data Analysis 52(12):5277–5286 Wang et al [2016] Wang J, Qingling C, Chang Q, et al (2016) Convergence analyses on sparse feedforward neural networks via group lasso regularization. Information Sciences 381. 10.1016/j.ins.2016.11.020 Wang and Cao [2022] Wang S, Cao G (2022) Robust deep neural network estimation for multi-dimensional functional data. Electronic Journal of Statistics 16(2):6461–6488 Yuan and Lin [2006] Yuan M, Lin Y (2006) Model selection and estimation in regression with grouped variables. Journal of the Royal Statistical Society: Series B (Statistical Methodology) 68(1):49–67 Zhang et al [2019] Zhang H, Wang J, Sun Z, et al (2019) Feature selection for neural networks using group lasso regularization. IEEE Transactions on Knowledge and Data Engineering PP:1–1. 10.1109/TKDE.2019.2893266 Zhao et al [2015] Zhao L, Hu Q, Wang W (2015) Heterogeneous feature selection with multi-modal deep neural networks and sparse group lasso. IEEE Transactions on Multimedia 17(11):1936–1948 Zou [2006] Zou H (2006) The adaptive lasso and its oracle properties. Journal of the American Statistical Association 101(476):1418–1429 Jiang C, Jiang C, Chen D, et al (2022) Densely connected neural networks for nonlinear regression. Entropy 24:876. 10.3390/e24070876 Li et al [2017] Li F, Zurada J, Liu Y, et al (2017) Input layer regularization of multilayer feedforward neural networks. IEEE Access PP:1–1. 
10.1109/ACCESS.2017.2713389 Ochiai et al [2017] Ochiai T, Matsuda S, Watanabe H, et al (2017) Automatic node selection for deep neural networks using group lasso regularization. In: 2017 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), IEEE, pp 5485–5489 Rumelhart et al [1986] Rumelhart DE, Hinton GE, Williams RJ (1986) Learning representations by back-propagating errors. Nature 323:533–536. 10.1038/323533a0 Scardapane et al [2017] Scardapane S, Comminiello D, Hussain A, et al (2017) Group sparse regularization for deep neural networks. Neurocomputing 241:81–89 Sohn et al [2012] Sohn S, Jung B, Jhun M (2012) Permutation tests using least distance estimator in the multivariate regression model. Computational Statistics - COMPUTATION STAT 27:1–11. 10.1007/s00180-011-0247-3 Sze et al [2017] Sze V, Chen YH, Yang TJ, et al (2017) Efficient processing of deep neural networks: A tutorial and survey. Proceedings of the IEEE 105(12):2295–2329 Szekely et al [2008] Szekely G, Rizzo M, Bakirov N (2008) Measuring and testing dependence by correlation of distances. The Annals of Statistics 35. 10.1214/009053607000000505 Tibshirani [1996] Tibshirani R (1996) Regression shrinkage and selection via the lasso. Journal of the Royal Statistical Society: Series B (Methodological) 58(1):267–288 Wahba [1992] Wahba G (1992) Multivariate function and operator estimation, based on smoothing splines and reproducing kernels. In: SANTA FE INSTITUTE STUDIES IN THE SCIENCES OF COMPLEXITY-PROCEEDINGS VOLUME-, Addison-Wesley Publishing Co, pp 95–95 Wang and Leng [2008] Wang H, Leng C (2008) A note on adaptive group lasso. Computational Statistics and Data Analysis 52(12):5277–5286 Wang et al [2016] Wang J, Qingling C, Chang Q, et al (2016) Convergence analyses on sparse feedforward neural networks via group lasso regularization. Information Sciences 381. 
10.1016/j.ins.2016.11.020 Wang and Cao [2022] Wang S, Cao G (2022) Robust deep neural network estimation for multi-dimensional functional data. Electronic Journal of Statistics 16(2):6461–6488 Yuan and Lin [2006] Yuan M, Lin Y (2006) Model selection and estimation in regression with grouped variables. Journal of the Royal Statistical Society: Series B (Statistical Methodology) 68(1):49–67 Zhang et al [2019] Zhang H, Wang J, Sun Z, et al (2019) Feature selection for neural networks using group lasso regularization. IEEE Transactions on Knowledge and Data Engineering PP:1–1. 10.1109/TKDE.2019.2893266 Zhao et al [2015] Zhao L, Hu Q, Wang W (2015) Heterogeneous feature selection with multi-modal deep neural networks and sparse group lasso. IEEE Transactions on Multimedia 17(11):1936–1948 Zou [2006] Zou H (2006) The adaptive lasso and its oracle properties. Journal of the American Statistical Association 101(476):1418–1429 Li F, Zurada J, Liu Y, et al (2017) Input layer regularization of multilayer feedforward neural networks. IEEE Access PP:1–1. 10.1109/ACCESS.2017.2713389 Ochiai et al [2017] Ochiai T, Matsuda S, Watanabe H, et al (2017) Automatic node selection for deep neural networks using group lasso regularization. In: 2017 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), IEEE, pp 5485–5489 Rumelhart et al [1986] Rumelhart DE, Hinton GE, Williams RJ (1986) Learning representations by back-propagating errors. Nature 323:533–536. 10.1038/323533a0 Scardapane et al [2017] Scardapane S, Comminiello D, Hussain A, et al (2017) Group sparse regularization for deep neural networks. Neurocomputing 241:81–89 Sohn et al [2012] Sohn S, Jung B, Jhun M (2012) Permutation tests using least distance estimator in the multivariate regression model. Computational Statistics - COMPUTATION STAT 27:1–11. 10.1007/s00180-011-0247-3 Sze et al [2017] Sze V, Chen YH, Yang TJ, et al (2017) Efficient processing of deep neural networks: A tutorial and survey. 
rep., Department of Information Management, Chung-Hua University, URL https://archive.ics.uci.edu/ml/datasets/concrete+slump+test
Imaizumi and Fukumizu [2019] Imaizumi M, Fukumizu K (2019) Deep neural networks learn non-smooth functions effectively. In: The 22nd International Conference on Artificial Intelligence and Statistics, PMLR, pp 869–878
Jhun and Choi [2009] Jhun M, Choi I (2009) Bootstrapping least distance estimator in the multivariate regression model. Computational Statistics and Data Analysis 53(12):4221–4227
Jiang et al [2022] Jiang C, Jiang C, Chen D, et al (2022) Densely connected neural networks for nonlinear regression. Entropy 24:876. 10.3390/e24070876
Li et al [2017] Li F, Zurada J, Liu Y, et al (2017) Input layer regularization of multilayer feedforward neural networks. IEEE Access PP:1–1. 10.1109/ACCESS.2017.2713389
Ochiai et al [2017] Ochiai T, Matsuda S, Watanabe H, et al (2017) Automatic node selection for deep neural networks using group lasso regularization. In: 2017 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), IEEE, pp 5485–5489
Rumelhart et al [1986] Rumelhart DE, Hinton GE, Williams RJ (1986) Learning representations by back-propagating errors. Nature 323:533–536. 10.1038/323533a0
Scardapane et al [2017] Scardapane S, Comminiello D, Hussain A, et al (2017) Group sparse regularization for deep neural networks. Neurocomputing 241:81–89
Sohn et al [2012] Sohn S, Jung B, Jhun M (2012) Permutation tests using least distance estimator in the multivariate regression model. Computational Statistics 27:1–11. 10.1007/s00180-011-0247-3
Sze et al [2017] Sze V, Chen YH, Yang TJ, et al (2017) Efficient processing of deep neural networks: A tutorial and survey. Proceedings of the IEEE 105(12):2295–2329
Szekely et al [2008] Szekely G, Rizzo M, Bakirov N (2008) Measuring and testing dependence by correlation of distances. The Annals of Statistics 35. 10.1214/009053607000000505
Tibshirani [1996] Tibshirani R (1996) Regression shrinkage and selection via the lasso. Journal of the Royal Statistical Society: Series B (Methodological) 58(1):267–288
Wahba [1992] Wahba G (1992) Multivariate function and operator estimation, based on smoothing splines and reproducing kernels. In: Santa Fe Institute Studies in the Sciences of Complexity Proceedings, Addison-Wesley Publishing Co, pp 95–95
Wang and Leng [2008] Wang H, Leng C (2008) A note on adaptive group lasso. Computational Statistics and Data Analysis 52(12):5277–5286
Wang et al [2016] Wang J, Qingling C, Chang Q, et al (2016) Convergence analyses on sparse feedforward neural networks via group lasso regularization. Information Sciences 381. 10.1016/j.ins.2016.11.020
Wang and Cao [2022] Wang S, Cao G (2022) Robust deep neural network estimation for multi-dimensional functional data. Electronic Journal of Statistics 16(2):6461–6488
Yuan and Lin [2006] Yuan M, Lin Y (2006) Model selection and estimation in regression with grouped variables. Journal of the Royal Statistical Society: Series B (Statistical Methodology) 68(1):49–67
Zhang et al [2019] Zhang H, Wang J, Sun Z, et al (2019) Feature selection for neural networks using group lasso regularization. IEEE Transactions on Knowledge and Data Engineering PP:1–1. 10.1109/TKDE.2019.2893266
Zhao et al [2015] Zhao L, Hu Q, Wang W (2015) Heterogeneous feature selection with multi-modal deep neural networks and sparse group lasso. IEEE Transactions on Multimedia 17(11):1936–1948
Zou [2006] Zou H (2006) The adaptive lasso and its oracle properties. Journal of the American Statistical Association 101(476):1418–1429
10.3390/e24070876 Li et al [2017] Li F, Zurada J, Liu Y, et al (2017) Input layer regularization of multilayer feedforward neural networks. IEEE Access PP:1–1. 10.1109/ACCESS.2017.2713389 Ochiai et al [2017] Ochiai T, Matsuda S, Watanabe H, et al (2017) Automatic node selection for deep neural networks using group lasso regularization. In: 2017 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), IEEE, pp 5485–5489 Rumelhart et al [1986] Rumelhart DE, Hinton GE, Williams RJ (1986) Learning representations by back-propagating errors. Nature 323:533–536. 10.1038/323533a0 Scardapane et al [2017] Scardapane S, Comminiello D, Hussain A, et al (2017) Group sparse regularization for deep neural networks. Neurocomputing 241:81–89 Sohn et al [2012] Sohn S, Jung B, Jhun M (2012) Permutation tests using least distance estimator in the multivariate regression model. Computational Statistics - COMPUTATION STAT 27:1–11. 10.1007/s00180-011-0247-3 Sze et al [2017] Sze V, Chen YH, Yang TJ, et al (2017) Efficient processing of deep neural networks: A tutorial and survey. Proceedings of the IEEE 105(12):2295–2329 Szekely et al [2008] Szekely G, Rizzo M, Bakirov N (2008) Measuring and testing dependence by correlation of distances. The Annals of Statistics 35. 10.1214/009053607000000505 Tibshirani [1996] Tibshirani R (1996) Regression shrinkage and selection via the lasso. Journal of the Royal Statistical Society: Series B (Methodological) 58(1):267–288 Wahba [1992] Wahba G (1992) Multivariate function and operator estimation, based on smoothing splines and reproducing kernels. In: SANTA FE INSTITUTE STUDIES IN THE SCIENCES OF COMPLEXITY-PROCEEDINGS VOLUME-, Addison-Wesley Publishing Co, pp 95–95 Wang and Leng [2008] Wang H, Leng C (2008) A note on adaptive group lasso. 
Computational Statistics and Data Analysis 52(12):5277–5286 Wang et al [2016] Wang J, Qingling C, Chang Q, et al (2016) Convergence analyses on sparse feedforward neural networks via group lasso regularization. Information Sciences 381. 10.1016/j.ins.2016.11.020 Wang and Cao [2022] Wang S, Cao G (2022) Robust deep neural network estimation for multi-dimensional functional data. Electronic Journal of Statistics 16(2):6461–6488 Yuan and Lin [2006] Yuan M, Lin Y (2006) Model selection and estimation in regression with grouped variables. Journal of the Royal Statistical Society: Series B (Statistical Methodology) 68(1):49–67 Zhang et al [2019] Zhang H, Wang J, Sun Z, et al (2019) Feature selection for neural networks using group lasso regularization. IEEE Transactions on Knowledge and Data Engineering PP:1–1. 10.1109/TKDE.2019.2893266 Zhao et al [2015] Zhao L, Hu Q, Wang W (2015) Heterogeneous feature selection with multi-modal deep neural networks and sparse group lasso. IEEE Transactions on Multimedia 17(11):1936–1948 Zou [2006] Zou H (2006) The adaptive lasso and its oracle properties. Journal of the American Statistical Association 101(476):1418–1429 Diskin T, Draskovic G, Pascal F, et al (2017) Deep robust regression. In: 2017 IEEE 7th International Workshop on Computational Advances in Multi-Sensor Adaptive Processing (CAMSAP), IEEE, pp 1–5 Fan and Li [2001] Fan J, Li R (2001) Variable selection via nonconcave penalized likelihood and its oracle properties. Journal of the American Statistical Association 96:1348–1360 Haldane [1948] Haldane JBS (1948) The theory of a cline. Journal of Genetics 48:277–284. 10.1007/BF02986626 Hornik [1991] Hornik K (1991) Approximation capabilities of multilayer feedforward networks. Neural Networks 4(2):251–257. https://doi.org/10.1016/0893-6080(91)90009-T Huber [1992] Huber PJ (1992) Robust estimation of a location parameter. In: Breakthroughs in statistics: Methodology and distribution. 
Springer, p 492–518 I-Cheng [2009] I-Cheng Y (2009) Concrete slump test data. Tech. rep., Department of Information Management, Chung-Hua University, URL https://archive.ics.uci.edu/ml/datasets/concrete+slump+test Imaizumi and Fukumizu [2019] Imaizumi M, Fukumizu K (2019) Deep neural networks learn non-smooth functions effectively. In: The 22nd international conference on artificial intelligence and statistics, PMLR, pp 869–878 Jhun and Choi [2009] Jhun M, Choi I (2009) Bootstrapping least distance estimator in the multivariate regression model. Computational Statistics and Data Analysis 53(12):4221–4227 Jiang et al [2022] Jiang C, Jiang C, Chen D, et al (2022) Densely connected neural networks for nonlinear regression. Entropy 24:876. 10.3390/e24070876 Li et al [2017] Li F, Zurada J, Liu Y, et al (2017) Input layer regularization of multilayer feedforward neural networks. IEEE Access PP:1–1. 10.1109/ACCESS.2017.2713389 Ochiai et al [2017] Ochiai T, Matsuda S, Watanabe H, et al (2017) Automatic node selection for deep neural networks using group lasso regularization. In: 2017 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), IEEE, pp 5485–5489 Rumelhart et al [1986] Rumelhart DE, Hinton GE, Williams RJ (1986) Learning representations by back-propagating errors. Nature 323:533–536. 10.1038/323533a0 Scardapane et al [2017] Scardapane S, Comminiello D, Hussain A, et al (2017) Group sparse regularization for deep neural networks. Neurocomputing 241:81–89 Sohn et al [2012] Sohn S, Jung B, Jhun M (2012) Permutation tests using least distance estimator in the multivariate regression model. Computational Statistics - COMPUTATION STAT 27:1–11. 10.1007/s00180-011-0247-3 Sze et al [2017] Sze V, Chen YH, Yang TJ, et al (2017) Efficient processing of deep neural networks: A tutorial and survey. 
Proceedings of the IEEE 105(12):2295–2329 Szekely et al [2008] Szekely G, Rizzo M, Bakirov N (2008) Measuring and testing dependence by correlation of distances. The Annals of Statistics 35. 10.1214/009053607000000505 Tibshirani [1996] Tibshirani R (1996) Regression shrinkage and selection via the lasso. Journal of the Royal Statistical Society: Series B (Methodological) 58(1):267–288 Wahba [1992] Wahba G (1992) Multivariate function and operator estimation, based on smoothing splines and reproducing kernels. In: SANTA FE INSTITUTE STUDIES IN THE SCIENCES OF COMPLEXITY-PROCEEDINGS VOLUME-, Addison-Wesley Publishing Co, pp 95–95 Wang and Leng [2008] Wang H, Leng C (2008) A note on adaptive group lasso. Computational Statistics and Data Analysis 52(12):5277–5286 Wang et al [2016] Wang J, Qingling C, Chang Q, et al (2016) Convergence analyses on sparse feedforward neural networks via group lasso regularization. Information Sciences 381. 10.1016/j.ins.2016.11.020 Wang and Cao [2022] Wang S, Cao G (2022) Robust deep neural network estimation for multi-dimensional functional data. Electronic Journal of Statistics 16(2):6461–6488 Yuan and Lin [2006] Yuan M, Lin Y (2006) Model selection and estimation in regression with grouped variables. Journal of the Royal Statistical Society: Series B (Statistical Methodology) 68(1):49–67 Zhang et al [2019] Zhang H, Wang J, Sun Z, et al (2019) Feature selection for neural networks using group lasso regularization. IEEE Transactions on Knowledge and Data Engineering PP:1–1. 10.1109/TKDE.2019.2893266 Zhao et al [2015] Zhao L, Hu Q, Wang W (2015) Heterogeneous feature selection with multi-modal deep neural networks and sparse group lasso. IEEE Transactions on Multimedia 17(11):1936–1948 Zou [2006] Zou H (2006) The adaptive lasso and its oracle properties. Journal of the American Statistical Association 101(476):1418–1429 Fan J, Li R (2001) Variable selection via nonconcave penalized likelihood and its oracle properties. 
Journal of the American Statistical Association 96:1348–1360 Haldane [1948] Haldane JBS (1948) The theory of a cline. Journal of Genetics 48:277–284. 10.1007/BF02986626 Hornik [1991] Hornik K (1991) Approximation capabilities of multilayer feedforward networks. Neural Networks 4(2):251–257. https://doi.org/10.1016/0893-6080(91)90009-T Huber [1992] Huber PJ (1992) Robust estimation of a location parameter. In: Breakthroughs in statistics: Methodology and distribution. Springer, p 492–518 I-Cheng [2009] I-Cheng Y (2009) Concrete slump test data. Tech. rep., Department of Information Management, Chung-Hua University, URL https://archive.ics.uci.edu/ml/datasets/concrete+slump+test Imaizumi and Fukumizu [2019] Imaizumi M, Fukumizu K (2019) Deep neural networks learn non-smooth functions effectively. In: The 22nd international conference on artificial intelligence and statistics, PMLR, pp 869–878 Jhun and Choi [2009] Jhun M, Choi I (2009) Bootstrapping least distance estimator in the multivariate regression model. Computational Statistics and Data Analysis 53(12):4221–4227 Jiang et al [2022] Jiang C, Jiang C, Chen D, et al (2022) Densely connected neural networks for nonlinear regression. Entropy 24:876. 10.3390/e24070876 Li et al [2017] Li F, Zurada J, Liu Y, et al (2017) Input layer regularization of multilayer feedforward neural networks. IEEE Access PP:1–1. 10.1109/ACCESS.2017.2713389 Ochiai et al [2017] Ochiai T, Matsuda S, Watanabe H, et al (2017) Automatic node selection for deep neural networks using group lasso regularization. In: 2017 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), IEEE, pp 5485–5489 Rumelhart et al [1986] Rumelhart DE, Hinton GE, Williams RJ (1986) Learning representations by back-propagating errors. Nature 323:533–536. 10.1038/323533a0 Scardapane et al [2017] Scardapane S, Comminiello D, Hussain A, et al (2017) Group sparse regularization for deep neural networks. 
Neurocomputing 241:81–89 Sohn et al [2012] Sohn S, Jung B, Jhun M (2012) Permutation tests using least distance estimator in the multivariate regression model. Computational Statistics - COMPUTATION STAT 27:1–11. 10.1007/s00180-011-0247-3 Sze et al [2017] Sze V, Chen YH, Yang TJ, et al (2017) Efficient processing of deep neural networks: A tutorial and survey. Proceedings of the IEEE 105(12):2295–2329 Szekely et al [2008] Szekely G, Rizzo M, Bakirov N (2008) Measuring and testing dependence by correlation of distances. The Annals of Statistics 35. 10.1214/009053607000000505 Tibshirani [1996] Tibshirani R (1996) Regression shrinkage and selection via the lasso. Journal of the Royal Statistical Society: Series B (Methodological) 58(1):267–288 Wahba [1992] Wahba G (1992) Multivariate function and operator estimation, based on smoothing splines and reproducing kernels. In: SANTA FE INSTITUTE STUDIES IN THE SCIENCES OF COMPLEXITY-PROCEEDINGS VOLUME-, Addison-Wesley Publishing Co, pp 95–95 Wang and Leng [2008] Wang H, Leng C (2008) A note on adaptive group lasso. Computational Statistics and Data Analysis 52(12):5277–5286 Wang et al [2016] Wang J, Qingling C, Chang Q, et al (2016) Convergence analyses on sparse feedforward neural networks via group lasso regularization. Information Sciences 381. 10.1016/j.ins.2016.11.020 Wang and Cao [2022] Wang S, Cao G (2022) Robust deep neural network estimation for multi-dimensional functional data. Electronic Journal of Statistics 16(2):6461–6488 Yuan and Lin [2006] Yuan M, Lin Y (2006) Model selection and estimation in regression with grouped variables. Journal of the Royal Statistical Society: Series B (Statistical Methodology) 68(1):49–67 Zhang et al [2019] Zhang H, Wang J, Sun Z, et al (2019) Feature selection for neural networks using group lasso regularization. IEEE Transactions on Knowledge and Data Engineering PP:1–1. 
10.1109/TKDE.2019.2893266 Zhao et al [2015] Zhao L, Hu Q, Wang W (2015) Heterogeneous feature selection with multi-modal deep neural networks and sparse group lasso. IEEE Transactions on Multimedia 17(11):1936–1948 Zou [2006] Zou H (2006) The adaptive lasso and its oracle properties. Journal of the American Statistical Association 101(476):1418–1429 Haldane JBS (1948) The theory of a cline. Journal of Genetics 48:277–284. 10.1007/BF02986626 Hornik [1991] Hornik K (1991) Approximation capabilities of multilayer feedforward networks. Neural Networks 4(2):251–257. https://doi.org/10.1016/0893-6080(91)90009-T Huber [1992] Huber PJ (1992) Robust estimation of a location parameter. In: Breakthroughs in statistics: Methodology and distribution. Springer, p 492–518 I-Cheng [2009] I-Cheng Y (2009) Concrete slump test data. Tech. rep., Department of Information Management, Chung-Hua University, URL https://archive.ics.uci.edu/ml/datasets/concrete+slump+test Imaizumi and Fukumizu [2019] Imaizumi M, Fukumizu K (2019) Deep neural networks learn non-smooth functions effectively. In: The 22nd international conference on artificial intelligence and statistics, PMLR, pp 869–878 Jhun and Choi [2009] Jhun M, Choi I (2009) Bootstrapping least distance estimator in the multivariate regression model. Computational Statistics and Data Analysis 53(12):4221–4227 Jiang et al [2022] Jiang C, Jiang C, Chen D, et al (2022) Densely connected neural networks for nonlinear regression. Entropy 24:876. 10.3390/e24070876 Li et al [2017] Li F, Zurada J, Liu Y, et al (2017) Input layer regularization of multilayer feedforward neural networks. IEEE Access PP:1–1. 10.1109/ACCESS.2017.2713389 Ochiai et al [2017] Ochiai T, Matsuda S, Watanabe H, et al (2017) Automatic node selection for deep neural networks using group lasso regularization. 
In: 2017 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), IEEE, pp 5485–5489 Rumelhart et al [1986] Rumelhart DE, Hinton GE, Williams RJ (1986) Learning representations by back-propagating errors. Nature 323:533–536. 10.1038/323533a0 Scardapane et al [2017] Scardapane S, Comminiello D, Hussain A, et al (2017) Group sparse regularization for deep neural networks. Neurocomputing 241:81–89 Sohn et al [2012] Sohn S, Jung B, Jhun M (2012) Permutation tests using least distance estimator in the multivariate regression model. Computational Statistics - COMPUTATION STAT 27:1–11. 10.1007/s00180-011-0247-3 Sze et al [2017] Sze V, Chen YH, Yang TJ, et al (2017) Efficient processing of deep neural networks: A tutorial and survey. Proceedings of the IEEE 105(12):2295–2329 Szekely et al [2008] Szekely G, Rizzo M, Bakirov N (2008) Measuring and testing dependence by correlation of distances. The Annals of Statistics 35. 10.1214/009053607000000505 Tibshirani [1996] Tibshirani R (1996) Regression shrinkage and selection via the lasso. Journal of the Royal Statistical Society: Series B (Methodological) 58(1):267–288 Wahba [1992] Wahba G (1992) Multivariate function and operator estimation, based on smoothing splines and reproducing kernels. In: SANTA FE INSTITUTE STUDIES IN THE SCIENCES OF COMPLEXITY-PROCEEDINGS VOLUME-, Addison-Wesley Publishing Co, pp 95–95 Wang and Leng [2008] Wang H, Leng C (2008) A note on adaptive group lasso. Computational Statistics and Data Analysis 52(12):5277–5286 Wang et al [2016] Wang J, Qingling C, Chang Q, et al (2016) Convergence analyses on sparse feedforward neural networks via group lasso regularization. Information Sciences 381. 10.1016/j.ins.2016.11.020 Wang and Cao [2022] Wang S, Cao G (2022) Robust deep neural network estimation for multi-dimensional functional data. 
Electronic Journal of Statistics 16(2):6461–6488 Yuan and Lin [2006] Yuan M, Lin Y (2006) Model selection and estimation in regression with grouped variables. Journal of the Royal Statistical Society: Series B (Statistical Methodology) 68(1):49–67 Zhang et al [2019] Zhang H, Wang J, Sun Z, et al (2019) Feature selection for neural networks using group lasso regularization. IEEE Transactions on Knowledge and Data Engineering PP:1–1. 10.1109/TKDE.2019.2893266 Zhao et al [2015] Zhao L, Hu Q, Wang W (2015) Heterogeneous feature selection with multi-modal deep neural networks and sparse group lasso. IEEE Transactions on Multimedia 17(11):1936–1948 Zou [2006] Zou H (2006) The adaptive lasso and its oracle properties. Journal of the American Statistical Association 101(476):1418–1429 Hornik K (1991) Approximation capabilities of multilayer feedforward networks. Neural Networks 4(2):251–257. https://doi.org/10.1016/0893-6080(91)90009-T Huber [1992] Huber PJ (1992) Robust estimation of a location parameter. In: Breakthroughs in statistics: Methodology and distribution. Springer, p 492–518 I-Cheng [2009] I-Cheng Y (2009) Concrete slump test data. Tech. rep., Department of Information Management, Chung-Hua University, URL https://archive.ics.uci.edu/ml/datasets/concrete+slump+test Imaizumi and Fukumizu [2019] Imaizumi M, Fukumizu K (2019) Deep neural networks learn non-smooth functions effectively. In: The 22nd international conference on artificial intelligence and statistics, PMLR, pp 869–878 Jhun and Choi [2009] Jhun M, Choi I (2009) Bootstrapping least distance estimator in the multivariate regression model. Computational Statistics and Data Analysis 53(12):4221–4227 Jiang et al [2022] Jiang C, Jiang C, Chen D, et al (2022) Densely connected neural networks for nonlinear regression. Entropy 24:876. 10.3390/e24070876 Li et al [2017] Li F, Zurada J, Liu Y, et al (2017) Input layer regularization of multilayer feedforward neural networks. IEEE Access PP:1–1. 
10.1109/ACCESS.2017.2713389 Ochiai et al [2017] Ochiai T, Matsuda S, Watanabe H, et al (2017) Automatic node selection for deep neural networks using group lasso regularization. In: 2017 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), IEEE, pp 5485–5489 Rumelhart et al [1986] Rumelhart DE, Hinton GE, Williams RJ (1986) Learning representations by back-propagating errors. Nature 323:533–536. 10.1038/323533a0 Scardapane et al [2017] Scardapane S, Comminiello D, Hussain A, et al (2017) Group sparse regularization for deep neural networks. Neurocomputing 241:81–89 Sohn et al [2012] Sohn S, Jung B, Jhun M (2012) Permutation tests using least distance estimator in the multivariate regression model. Computational Statistics - COMPUTATION STAT 27:1–11. 10.1007/s00180-011-0247-3 Sze et al [2017] Sze V, Chen YH, Yang TJ, et al (2017) Efficient processing of deep neural networks: A tutorial and survey. Proceedings of the IEEE 105(12):2295–2329 Szekely et al [2008] Szekely G, Rizzo M, Bakirov N (2008) Measuring and testing dependence by correlation of distances. The Annals of Statistics 35. 10.1214/009053607000000505 Tibshirani [1996] Tibshirani R (1996) Regression shrinkage and selection via the lasso. Journal of the Royal Statistical Society: Series B (Methodological) 58(1):267–288 Wahba [1992] Wahba G (1992) Multivariate function and operator estimation, based on smoothing splines and reproducing kernels. In: SANTA FE INSTITUTE STUDIES IN THE SCIENCES OF COMPLEXITY-PROCEEDINGS VOLUME-, Addison-Wesley Publishing Co, pp 95–95 Wang and Leng [2008] Wang H, Leng C (2008) A note on adaptive group lasso. Computational Statistics and Data Analysis 52(12):5277–5286 Wang et al [2016] Wang J, Qingling C, Chang Q, et al (2016) Convergence analyses on sparse feedforward neural networks via group lasso regularization. Information Sciences 381. 
10.1016/j.ins.2016.11.020 Wang and Cao [2022] Wang S, Cao G (2022) Robust deep neural network estimation for multi-dimensional functional data. Electronic Journal of Statistics 16(2):6461–6488 Yuan and Lin [2006] Yuan M, Lin Y (2006) Model selection and estimation in regression with grouped variables. Journal of the Royal Statistical Society: Series B (Statistical Methodology) 68(1):49–67 Zhang et al [2019] Zhang H, Wang J, Sun Z, et al (2019) Feature selection for neural networks using group lasso regularization. IEEE Transactions on Knowledge and Data Engineering PP:1–1. 10.1109/TKDE.2019.2893266 Zhao et al [2015] Zhao L, Hu Q, Wang W (2015) Heterogeneous feature selection with multi-modal deep neural networks and sparse group lasso. IEEE Transactions on Multimedia 17(11):1936–1948 Zou [2006] Zou H (2006) The adaptive lasso and its oracle properties. Journal of the American Statistical Association 101(476):1418–1429 Huber PJ (1992) Robust estimation of a location parameter. In: Breakthroughs in statistics: Methodology and distribution. Springer, p 492–518 I-Cheng [2009] I-Cheng Y (2009) Concrete slump test data. Tech. rep., Department of Information Management, Chung-Hua University, URL https://archive.ics.uci.edu/ml/datasets/concrete+slump+test Imaizumi and Fukumizu [2019] Imaizumi M, Fukumizu K (2019) Deep neural networks learn non-smooth functions effectively. In: The 22nd international conference on artificial intelligence and statistics, PMLR, pp 869–878 Jhun and Choi [2009] Jhun M, Choi I (2009) Bootstrapping least distance estimator in the multivariate regression model. Computational Statistics and Data Analysis 53(12):4221–4227 Jiang et al [2022] Jiang C, Jiang C, Chen D, et al (2022) Densely connected neural networks for nonlinear regression. Entropy 24:876. 10.3390/e24070876 Li et al [2017] Li F, Zurada J, Liu Y, et al (2017) Input layer regularization of multilayer feedforward neural networks. IEEE Access PP:1–1. 
10.1109/ACCESS.2017.2713389 Ochiai et al [2017] Ochiai T, Matsuda S, Watanabe H, et al (2017) Automatic node selection for deep neural networks using group lasso regularization. In: 2017 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), IEEE, pp 5485–5489 Rumelhart et al [1986] Rumelhart DE, Hinton GE, Williams RJ (1986) Learning representations by back-propagating errors. Nature 323:533–536. 10.1038/323533a0 Scardapane et al [2017] Scardapane S, Comminiello D, Hussain A, et al (2017) Group sparse regularization for deep neural networks. Neurocomputing 241:81–89 Sohn et al [2012] Sohn S, Jung B, Jhun M (2012) Permutation tests using least distance estimator in the multivariate regression model. Computational Statistics - COMPUTATION STAT 27:1–11. 10.1007/s00180-011-0247-3 Sze et al [2017] Sze V, Chen YH, Yang TJ, et al (2017) Efficient processing of deep neural networks: A tutorial and survey. Proceedings of the IEEE 105(12):2295–2329 Szekely et al [2008] Szekely G, Rizzo M, Bakirov N (2008) Measuring and testing dependence by correlation of distances. The Annals of Statistics 35. 10.1214/009053607000000505 Tibshirani [1996] Tibshirani R (1996) Regression shrinkage and selection via the lasso. Journal of the Royal Statistical Society: Series B (Methodological) 58(1):267–288 Wahba [1992] Wahba G (1992) Multivariate function and operator estimation, based on smoothing splines and reproducing kernels. In: SANTA FE INSTITUTE STUDIES IN THE SCIENCES OF COMPLEXITY-PROCEEDINGS VOLUME-, Addison-Wesley Publishing Co, pp 95–95 Wang and Leng [2008] Wang H, Leng C (2008) A note on adaptive group lasso. Computational Statistics and Data Analysis 52(12):5277–5286 Wang et al [2016] Wang J, Qingling C, Chang Q, et al (2016) Convergence analyses on sparse feedforward neural networks via group lasso regularization. Information Sciences 381. 
10.1016/j.ins.2016.11.020 Wang and Cao [2022] Wang S, Cao G (2022) Robust deep neural network estimation for multi-dimensional functional data. Electronic Journal of Statistics 16(2):6461–6488 Yuan and Lin [2006] Yuan M, Lin Y (2006) Model selection and estimation in regression with grouped variables. Journal of the Royal Statistical Society: Series B (Statistical Methodology) 68(1):49–67 Zhang et al [2019] Zhang H, Wang J, Sun Z, et al (2019) Feature selection for neural networks using group lasso regularization. IEEE Transactions on Knowledge and Data Engineering PP:1–1. 10.1109/TKDE.2019.2893266 Zhao et al [2015] Zhao L, Hu Q, Wang W (2015) Heterogeneous feature selection with multi-modal deep neural networks and sparse group lasso. IEEE Transactions on Multimedia 17(11):1936–1948 Zou [2006] Zou H (2006) The adaptive lasso and its oracle properties. Journal of the American Statistical Association 101(476):1418–1429 I-Cheng Y (2009) Concrete slump test data. Tech. rep., Department of Information Management, Chung-Hua University, URL https://archive.ics.uci.edu/ml/datasets/concrete+slump+test Imaizumi and Fukumizu [2019] Imaizumi M, Fukumizu K (2019) Deep neural networks learn non-smooth functions effectively. In: The 22nd international conference on artificial intelligence and statistics, PMLR, pp 869–878 Jhun and Choi [2009] Jhun M, Choi I (2009) Bootstrapping least distance estimator in the multivariate regression model. Computational Statistics and Data Analysis 53(12):4221–4227 Jiang et al [2022] Jiang C, Jiang C, Chen D, et al (2022) Densely connected neural networks for nonlinear regression. Entropy 24:876. 10.3390/e24070876 Li et al [2017] Li F, Zurada J, Liu Y, et al (2017) Input layer regularization of multilayer feedforward neural networks. IEEE Access PP:1–1. 10.1109/ACCESS.2017.2713389 Ochiai et al [2017] Ochiai T, Matsuda S, Watanabe H, et al (2017) Automatic node selection for deep neural networks using group lasso regularization. 
In: 2017 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), IEEE, pp 5485–5489 Rumelhart et al [1986] Rumelhart DE, Hinton GE, Williams RJ (1986) Learning representations by back-propagating errors. Nature 323:533–536. 10.1038/323533a0 Scardapane et al [2017] Scardapane S, Comminiello D, Hussain A, et al (2017) Group sparse regularization for deep neural networks. Neurocomputing 241:81–89 Sohn et al [2012] Sohn S, Jung B, Jhun M (2012) Permutation tests using least distance estimator in the multivariate regression model. Computational Statistics - COMPUTATION STAT 27:1–11. 10.1007/s00180-011-0247-3 Sze et al [2017] Sze V, Chen YH, Yang TJ, et al (2017) Efficient processing of deep neural networks: A tutorial and survey. Proceedings of the IEEE 105(12):2295–2329 Szekely et al [2008] Szekely G, Rizzo M, Bakirov N (2008) Measuring and testing dependence by correlation of distances. The Annals of Statistics 35. 10.1214/009053607000000505 Tibshirani [1996] Tibshirani R (1996) Regression shrinkage and selection via the lasso. Journal of the Royal Statistical Society: Series B (Methodological) 58(1):267–288 Wahba [1992] Wahba G (1992) Multivariate function and operator estimation, based on smoothing splines and reproducing kernels. In: SANTA FE INSTITUTE STUDIES IN THE SCIENCES OF COMPLEXITY-PROCEEDINGS VOLUME-, Addison-Wesley Publishing Co, pp 95–95 Wang and Leng [2008] Wang H, Leng C (2008) A note on adaptive group lasso. Computational Statistics and Data Analysis 52(12):5277–5286 Wang et al [2016] Wang J, Qingling C, Chang Q, et al (2016) Convergence analyses on sparse feedforward neural networks via group lasso regularization. Information Sciences 381. 10.1016/j.ins.2016.11.020 Wang and Cao [2022] Wang S, Cao G (2022) Robust deep neural network estimation for multi-dimensional functional data. 
Electronic Journal of Statistics 16(2):6461–6488 Yuan and Lin [2006] Yuan M, Lin Y (2006) Model selection and estimation in regression with grouped variables. Journal of the Royal Statistical Society: Series B (Statistical Methodology) 68(1):49–67 Zhang et al [2019] Zhang H, Wang J, Sun Z, et al (2019) Feature selection for neural networks using group lasso regularization. IEEE Transactions on Knowledge and Data Engineering PP:1–1. 10.1109/TKDE.2019.2893266 Zhao et al [2015] Zhao L, Hu Q, Wang W (2015) Heterogeneous feature selection with multi-modal deep neural networks and sparse group lasso. IEEE Transactions on Multimedia 17(11):1936–1948 Zou [2006] Zou H (2006) The adaptive lasso and its oracle properties. Journal of the American Statistical Association 101(476):1418–1429 Imaizumi M, Fukumizu K (2019) Deep neural networks learn non-smooth functions effectively. In: The 22nd international conference on artificial intelligence and statistics, PMLR, pp 869–878 Jhun and Choi [2009] Jhun M, Choi I (2009) Bootstrapping least distance estimator in the multivariate regression model. Computational Statistics and Data Analysis 53(12):4221–4227 Jiang et al [2022] Jiang C, Jiang C, Chen D, et al (2022) Densely connected neural networks for nonlinear regression. Entropy 24:876. 10.3390/e24070876 Li et al [2017] Li F, Zurada J, Liu Y, et al (2017) Input layer regularization of multilayer feedforward neural networks. IEEE Access PP:1–1. 10.1109/ACCESS.2017.2713389 Ochiai et al [2017] Ochiai T, Matsuda S, Watanabe H, et al (2017) Automatic node selection for deep neural networks using group lasso regularization. In: 2017 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), IEEE, pp 5485–5489 Rumelhart et al [1986] Rumelhart DE, Hinton GE, Williams RJ (1986) Learning representations by back-propagating errors. Nature 323:533–536. 
Computational Statistics and Data Analysis 53(12):4221–4227 Jiang et al [2022] Jiang C, Jiang C, Chen D, et al (2022) Densely connected neural networks for nonlinear regression. Entropy 24:876. 10.3390/e24070876 Li et al [2017] Li F, Zurada J, Liu Y, et al (2017) Input layer regularization of multilayer feedforward neural networks. IEEE Access PP:1–1. 10.1109/ACCESS.2017.2713389 Ochiai et al [2017] Ochiai T, Matsuda S, Watanabe H, et al (2017) Automatic node selection for deep neural networks using group lasso regularization. In: 2017 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), IEEE, pp 5485–5489 Rumelhart et al [1986] Rumelhart DE, Hinton GE, Williams RJ (1986) Learning representations by back-propagating errors. Nature 323:533–536. 10.1038/323533a0 Scardapane et al [2017] Scardapane S, Comminiello D, Hussain A, et al (2017) Group sparse regularization for deep neural networks. Neurocomputing 241:81–89 Sohn et al [2012] Sohn S, Jung B, Jhun M (2012) Permutation tests using least distance estimator in the multivariate regression model. Computational Statistics - COMPUTATION STAT 27:1–11. 10.1007/s00180-011-0247-3 Sze et al [2017] Sze V, Chen YH, Yang TJ, et al (2017) Efficient processing of deep neural networks: A tutorial and survey. Proceedings of the IEEE 105(12):2295–2329 Szekely et al [2008] Szekely G, Rizzo M, Bakirov N (2008) Measuring and testing dependence by correlation of distances. The Annals of Statistics 35. 10.1214/009053607000000505 Tibshirani [1996] Tibshirani R (1996) Regression shrinkage and selection via the lasso. Journal of the Royal Statistical Society: Series B (Methodological) 58(1):267–288 Wahba [1992] Wahba G (1992) Multivariate function and operator estimation, based on smoothing splines and reproducing kernels. 
In: SANTA FE INSTITUTE STUDIES IN THE SCIENCES OF COMPLEXITY-PROCEEDINGS VOLUME-, Addison-Wesley Publishing Co, pp 95–95 Wang and Leng [2008] Wang H, Leng C (2008) A note on adaptive group lasso. Computational Statistics and Data Analysis 52(12):5277–5286 Wang et al [2016] Wang J, Qingling C, Chang Q, et al (2016) Convergence analyses on sparse feedforward neural networks via group lasso regularization. Information Sciences 381. 10.1016/j.ins.2016.11.020 Wang and Cao [2022] Wang S, Cao G (2022) Robust deep neural network estimation for multi-dimensional functional data. Electronic Journal of Statistics 16(2):6461–6488 Yuan and Lin [2006] Yuan M, Lin Y (2006) Model selection and estimation in regression with grouped variables. Journal of the Royal Statistical Society: Series B (Statistical Methodology) 68(1):49–67 Zhang et al [2019] Zhang H, Wang J, Sun Z, et al (2019) Feature selection for neural networks using group lasso regularization. IEEE Transactions on Knowledge and Data Engineering PP:1–1. 10.1109/TKDE.2019.2893266 Zhao et al [2015] Zhao L, Hu Q, Wang W (2015) Heterogeneous feature selection with multi-modal deep neural networks and sparse group lasso. IEEE Transactions on Multimedia 17(11):1936–1948 Zou [2006] Zou H (2006) The adaptive lasso and its oracle properties. Journal of the American Statistical Association 101(476):1418–1429 Cybenko G (1989) Approximation by superpositions of a sigmoidal function. Mathematics of Control, Signals and Systems 2:303–314 Dinh and Ho [2020] Dinh VC, Ho LS (2020) Consistent feature selection for analytic deep neural networks. In: Advances in Neural Information Processing Systems, vol 33. Curran Associates, Inc., pp 2420–2431, URL https://proceedings.neurips.cc/paper_files/paper/2020/file/1959eb9d5a0f7ebc58ebde81d5df400d-Paper.pdf Diskin et al [2017] Diskin T, Draskovic G, Pascal F, et al (2017) Deep robust regression. 
In: 2017 IEEE 7th International Workshop on Computational Advances in Multi-Sensor Adaptive Processing (CAMSAP), IEEE, pp 1–5 Fan and Li [2001] Fan J, Li R (2001) Variable selection via nonconcave penalized likelihood and its oracle properties. Journal of the American Statistical Association 96:1348–1360 Haldane [1948] Haldane JBS (1948) The theory of a cline. Journal of Genetics 48:277–284. 10.1007/BF02986626 Hornik [1991] Hornik K (1991) Approximation capabilities of multilayer feedforward networks. Neural Networks 4(2):251–257. https://doi.org/10.1016/0893-6080(91)90009-T Huber [1992] Huber PJ (1992) Robust estimation of a location parameter. In: Breakthroughs in statistics: Methodology and distribution. Springer, p 492–518 I-Cheng [2009] I-Cheng Y (2009) Concrete slump test data. Tech. rep., Department of Information Management, Chung-Hua University, URL https://archive.ics.uci.edu/ml/datasets/concrete+slump+test Imaizumi and Fukumizu [2019] Imaizumi M, Fukumizu K (2019) Deep neural networks learn non-smooth functions effectively. In: The 22nd international conference on artificial intelligence and statistics, PMLR, pp 869–878 Jhun and Choi [2009] Jhun M, Choi I (2009) Bootstrapping least distance estimator in the multivariate regression model. Computational Statistics and Data Analysis 53(12):4221–4227 Jiang et al [2022] Jiang C, Jiang C, Chen D, et al (2022) Densely connected neural networks for nonlinear regression. Entropy 24:876. 10.3390/e24070876 Li et al [2017] Li F, Zurada J, Liu Y, et al (2017) Input layer regularization of multilayer feedforward neural networks. IEEE Access PP:1–1. 10.1109/ACCESS.2017.2713389 Ochiai et al [2017] Ochiai T, Matsuda S, Watanabe H, et al (2017) Automatic node selection for deep neural networks using group lasso regularization. 
In: 2017 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), IEEE, pp 5485–5489 Rumelhart et al [1986] Rumelhart DE, Hinton GE, Williams RJ (1986) Learning representations by back-propagating errors. Nature 323:533–536. 10.1038/323533a0 Scardapane et al [2017] Scardapane S, Comminiello D, Hussain A, et al (2017) Group sparse regularization for deep neural networks. Neurocomputing 241:81–89 Sohn et al [2012] Sohn S, Jung B, Jhun M (2012) Permutation tests using least distance estimator in the multivariate regression model. Computational Statistics - COMPUTATION STAT 27:1–11. 10.1007/s00180-011-0247-3 Sze et al [2017] Sze V, Chen YH, Yang TJ, et al (2017) Efficient processing of deep neural networks: A tutorial and survey. Proceedings of the IEEE 105(12):2295–2329 Szekely et al [2008] Szekely G, Rizzo M, Bakirov N (2008) Measuring and testing dependence by correlation of distances. The Annals of Statistics 35. 10.1214/009053607000000505 Tibshirani [1996] Tibshirani R (1996) Regression shrinkage and selection via the lasso. Journal of the Royal Statistical Society: Series B (Methodological) 58(1):267–288 Wahba [1992] Wahba G (1992) Multivariate function and operator estimation, based on smoothing splines and reproducing kernels. In: SANTA FE INSTITUTE STUDIES IN THE SCIENCES OF COMPLEXITY-PROCEEDINGS VOLUME-, Addison-Wesley Publishing Co, pp 95–95 Wang and Leng [2008] Wang H, Leng C (2008) A note on adaptive group lasso. Computational Statistics and Data Analysis 52(12):5277–5286 Wang et al [2016] Wang J, Qingling C, Chang Q, et al (2016) Convergence analyses on sparse feedforward neural networks via group lasso regularization. Information Sciences 381. 10.1016/j.ins.2016.11.020 Wang and Cao [2022] Wang S, Cao G (2022) Robust deep neural network estimation for multi-dimensional functional data. 
Electronic Journal of Statistics 16(2):6461–6488 Yuan and Lin [2006] Yuan M, Lin Y (2006) Model selection and estimation in regression with grouped variables. Journal of the Royal Statistical Society: Series B (Statistical Methodology) 68(1):49–67 Zhang et al [2019] Zhang H, Wang J, Sun Z, et al (2019) Feature selection for neural networks using group lasso regularization. IEEE Transactions on Knowledge and Data Engineering PP:1–1. 10.1109/TKDE.2019.2893266 Zhao et al [2015] Zhao L, Hu Q, Wang W (2015) Heterogeneous feature selection with multi-modal deep neural networks and sparse group lasso. IEEE Transactions on Multimedia 17(11):1936–1948 Zou [2006] Zou H (2006) The adaptive lasso and its oracle properties. Journal of the American Statistical Association 101(476):1418–1429 Dinh VC, Ho LS (2020) Consistent feature selection for analytic deep neural networks. In: Advances in Neural Information Processing Systems, vol 33. Curran Associates, Inc., pp 2420–2431, URL https://proceedings.neurips.cc/paper_files/paper/2020/file/1959eb9d5a0f7ebc58ebde81d5df400d-Paper.pdf Diskin et al [2017] Diskin T, Draskovic G, Pascal F, et al (2017) Deep robust regression. In: 2017 IEEE 7th International Workshop on Computational Advances in Multi-Sensor Adaptive Processing (CAMSAP), IEEE, pp 1–5 Fan and Li [2001] Fan J, Li R (2001) Variable selection via nonconcave penalized likelihood and its oracle properties. Journal of the American Statistical Association 96:1348–1360 Haldane [1948] Haldane JBS (1948) The theory of a cline. Journal of Genetics 48:277–284. 10.1007/BF02986626 Hornik [1991] Hornik K (1991) Approximation capabilities of multilayer feedforward networks. Neural Networks 4(2):251–257. https://doi.org/10.1016/0893-6080(91)90009-T Huber [1992] Huber PJ (1992) Robust estimation of a location parameter. In: Breakthroughs in statistics: Methodology and distribution. Springer, p 492–518 I-Cheng [2009] I-Cheng Y (2009) Concrete slump test data. Tech. 
rep., Department of Information Management, Chung-Hua University, URL https://archive.ics.uci.edu/ml/datasets/concrete+slump+test Imaizumi and Fukumizu [2019] Imaizumi M, Fukumizu K (2019) Deep neural networks learn non-smooth functions effectively. In: The 22nd international conference on artificial intelligence and statistics, PMLR, pp 869–878 Jhun and Choi [2009] Jhun M, Choi I (2009) Bootstrapping least distance estimator in the multivariate regression model. Computational Statistics and Data Analysis 53(12):4221–4227 Jiang et al [2022] Jiang C, Jiang C, Chen D, et al (2022) Densely connected neural networks for nonlinear regression. Entropy 24:876. 10.3390/e24070876 Li et al [2017] Li F, Zurada J, Liu Y, et al (2017) Input layer regularization of multilayer feedforward neural networks. IEEE Access PP:1–1. 10.1109/ACCESS.2017.2713389 Ochiai et al [2017] Ochiai T, Matsuda S, Watanabe H, et al (2017) Automatic node selection for deep neural networks using group lasso regularization. In: 2017 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), IEEE, pp 5485–5489 Rumelhart et al [1986] Rumelhart DE, Hinton GE, Williams RJ (1986) Learning representations by back-propagating errors. Nature 323:533–536. 10.1038/323533a0 Scardapane et al [2017] Scardapane S, Comminiello D, Hussain A, et al (2017) Group sparse regularization for deep neural networks. Neurocomputing 241:81–89 Sohn et al [2012] Sohn S, Jung B, Jhun M (2012) Permutation tests using least distance estimator in the multivariate regression model. Computational Statistics - COMPUTATION STAT 27:1–11. 10.1007/s00180-011-0247-3 Sze et al [2017] Sze V, Chen YH, Yang TJ, et al (2017) Efficient processing of deep neural networks: A tutorial and survey. Proceedings of the IEEE 105(12):2295–2329 Szekely et al [2008] Szekely G, Rizzo M, Bakirov N (2008) Measuring and testing dependence by correlation of distances. The Annals of Statistics 35. 
10.1214/009053607000000505 Tibshirani [1996] Tibshirani R (1996) Regression shrinkage and selection via the lasso. Journal of the Royal Statistical Society: Series B (Methodological) 58(1):267–288 Wahba [1992] Wahba G (1992) Multivariate function and operator estimation, based on smoothing splines and reproducing kernels. In: SANTA FE INSTITUTE STUDIES IN THE SCIENCES OF COMPLEXITY-PROCEEDINGS VOLUME-, Addison-Wesley Publishing Co, pp 95–95 Wang and Leng [2008] Wang H, Leng C (2008) A note on adaptive group lasso. Computational Statistics and Data Analysis 52(12):5277–5286 Wang et al [2016] Wang J, Qingling C, Chang Q, et al (2016) Convergence analyses on sparse feedforward neural networks via group lasso regularization. Information Sciences 381. 10.1016/j.ins.2016.11.020 Wang and Cao [2022] Wang S, Cao G (2022) Robust deep neural network estimation for multi-dimensional functional data. Electronic Journal of Statistics 16(2):6461–6488 Yuan and Lin [2006] Yuan M, Lin Y (2006) Model selection and estimation in regression with grouped variables. Journal of the Royal Statistical Society: Series B (Statistical Methodology) 68(1):49–67 Zhang et al [2019] Zhang H, Wang J, Sun Z, et al (2019) Feature selection for neural networks using group lasso regularization. IEEE Transactions on Knowledge and Data Engineering PP:1–1. 10.1109/TKDE.2019.2893266 Zhao et al [2015] Zhao L, Hu Q, Wang W (2015) Heterogeneous feature selection with multi-modal deep neural networks and sparse group lasso. IEEE Transactions on Multimedia 17(11):1936–1948 Zou [2006] Zou H (2006) The adaptive lasso and its oracle properties. Journal of the American Statistical Association 101(476):1418–1429 Diskin T, Draskovic G, Pascal F, et al (2017) Deep robust regression. 
In: 2017 IEEE 7th International Workshop on Computational Advances in Multi-Sensor Adaptive Processing (CAMSAP), IEEE, pp 1–5 Fan and Li [2001] Fan J, Li R (2001) Variable selection via nonconcave penalized likelihood and its oracle properties. Journal of the American Statistical Association 96:1348–1360 Haldane [1948] Haldane JBS (1948) The theory of a cline. Journal of Genetics 48:277–284. 10.1007/BF02986626 Hornik [1991] Hornik K (1991) Approximation capabilities of multilayer feedforward networks. Neural Networks 4(2):251–257. https://doi.org/10.1016/0893-6080(91)90009-T Huber [1992] Huber PJ (1992) Robust estimation of a location parameter. In: Breakthroughs in statistics: Methodology and distribution. Springer, p 492–518 I-Cheng [2009] I-Cheng Y (2009) Concrete slump test data. Tech. rep., Department of Information Management, Chung-Hua University, URL https://archive.ics.uci.edu/ml/datasets/concrete+slump+test Imaizumi and Fukumizu [2019] Imaizumi M, Fukumizu K (2019) Deep neural networks learn non-smooth functions effectively. In: The 22nd international conference on artificial intelligence and statistics, PMLR, pp 869–878 Jhun and Choi [2009] Jhun M, Choi I (2009) Bootstrapping least distance estimator in the multivariate regression model. Computational Statistics and Data Analysis 53(12):4221–4227 Jiang et al [2022] Jiang C, Jiang C, Chen D, et al (2022) Densely connected neural networks for nonlinear regression. Entropy 24:876. 10.3390/e24070876 Li et al [2017] Li F, Zurada J, Liu Y, et al (2017) Input layer regularization of multilayer feedforward neural networks. IEEE Access PP:1–1. 10.1109/ACCESS.2017.2713389 Ochiai et al [2017] Ochiai T, Matsuda S, Watanabe H, et al (2017) Automatic node selection for deep neural networks using group lasso regularization. 
In: 2017 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), IEEE, pp 5485–5489 Rumelhart et al [1986] Rumelhart DE, Hinton GE, Williams RJ (1986) Learning representations by back-propagating errors. Nature 323:533–536. 10.1038/323533a0 Scardapane et al [2017] Scardapane S, Comminiello D, Hussain A, et al (2017) Group sparse regularization for deep neural networks. Neurocomputing 241:81–89 Sohn et al [2012] Sohn S, Jung B, Jhun M (2012) Permutation tests using least distance estimator in the multivariate regression model. Computational Statistics - COMPUTATION STAT 27:1–11. 10.1007/s00180-011-0247-3 Sze et al [2017] Sze V, Chen YH, Yang TJ, et al (2017) Efficient processing of deep neural networks: A tutorial and survey. Proceedings of the IEEE 105(12):2295–2329 Szekely et al [2008] Szekely G, Rizzo M, Bakirov N (2008) Measuring and testing dependence by correlation of distances. The Annals of Statistics 35. 10.1214/009053607000000505 Tibshirani [1996] Tibshirani R (1996) Regression shrinkage and selection via the lasso. Journal of the Royal Statistical Society: Series B (Methodological) 58(1):267–288 Wahba [1992] Wahba G (1992) Multivariate function and operator estimation, based on smoothing splines and reproducing kernels. In: SANTA FE INSTITUTE STUDIES IN THE SCIENCES OF COMPLEXITY-PROCEEDINGS VOLUME-, Addison-Wesley Publishing Co, pp 95–95 Wang and Leng [2008] Wang H, Leng C (2008) A note on adaptive group lasso. Computational Statistics and Data Analysis 52(12):5277–5286 Wang et al [2016] Wang J, Qingling C, Chang Q, et al (2016) Convergence analyses on sparse feedforward neural networks via group lasso regularization. Information Sciences 381. 10.1016/j.ins.2016.11.020 Wang and Cao [2022] Wang S, Cao G (2022) Robust deep neural network estimation for multi-dimensional functional data. 
Electronic Journal of Statistics 16(2):6461–6488 Yuan and Lin [2006] Yuan M, Lin Y (2006) Model selection and estimation in regression with grouped variables. Journal of the Royal Statistical Society: Series B (Statistical Methodology) 68(1):49–67 Zhang et al [2019] Zhang H, Wang J, Sun Z, et al (2019) Feature selection for neural networks using group lasso regularization. IEEE Transactions on Knowledge and Data Engineering PP:1–1. 10.1109/TKDE.2019.2893266 Zhao et al [2015] Zhao L, Hu Q, Wang W (2015) Heterogeneous feature selection with multi-modal deep neural networks and sparse group lasso. IEEE Transactions on Multimedia 17(11):1936–1948 Zou [2006] Zou H (2006) The adaptive lasso and its oracle properties. Journal of the American Statistical Association 101(476):1418–1429 Fan J, Li R (2001) Variable selection via nonconcave penalized likelihood and its oracle properties. Journal of the American Statistical Association 96:1348–1360 Haldane [1948] Haldane JBS (1948) The theory of a cline. Journal of Genetics 48:277–284. 10.1007/BF02986626 Hornik [1991] Hornik K (1991) Approximation capabilities of multilayer feedforward networks. Neural Networks 4(2):251–257. https://doi.org/10.1016/0893-6080(91)90009-T Huber [1992] Huber PJ (1992) Robust estimation of a location parameter. In: Breakthroughs in statistics: Methodology and distribution. Springer, p 492–518 I-Cheng [2009] I-Cheng Y (2009) Concrete slump test data. Tech. rep., Department of Information Management, Chung-Hua University, URL https://archive.ics.uci.edu/ml/datasets/concrete+slump+test Imaizumi and Fukumizu [2019] Imaizumi M, Fukumizu K (2019) Deep neural networks learn non-smooth functions effectively. In: The 22nd international conference on artificial intelligence and statistics, PMLR, pp 869–878 Jhun and Choi [2009] Jhun M, Choi I (2009) Bootstrapping least distance estimator in the multivariate regression model. 
Computational Statistics and Data Analysis 53(12):4221–4227 Jiang et al [2022] Jiang C, Jiang C, Chen D, et al (2022) Densely connected neural networks for nonlinear regression. Entropy 24:876. 10.3390/e24070876 Li et al [2017] Li F, Zurada J, Liu Y, et al (2017) Input layer regularization of multilayer feedforward neural networks. IEEE Access PP:1–1. 10.1109/ACCESS.2017.2713389 Ochiai et al [2017] Ochiai T, Matsuda S, Watanabe H, et al (2017) Automatic node selection for deep neural networks using group lasso regularization. In: 2017 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), IEEE, pp 5485–5489 Rumelhart et al [1986] Rumelhart DE, Hinton GE, Williams RJ (1986) Learning representations by back-propagating errors. Nature 323:533–536. 10.1038/323533a0 Scardapane et al [2017] Scardapane S, Comminiello D, Hussain A, et al (2017) Group sparse regularization for deep neural networks. Neurocomputing 241:81–89 Sohn et al [2012] Sohn S, Jung B, Jhun M (2012) Permutation tests using least distance estimator in the multivariate regression model. Computational Statistics - COMPUTATION STAT 27:1–11. 10.1007/s00180-011-0247-3 Sze et al [2017] Sze V, Chen YH, Yang TJ, et al (2017) Efficient processing of deep neural networks: A tutorial and survey. Proceedings of the IEEE 105(12):2295–2329 Szekely et al [2008] Szekely G, Rizzo M, Bakirov N (2008) Measuring and testing dependence by correlation of distances. The Annals of Statistics 35. 10.1214/009053607000000505 Tibshirani [1996] Tibshirani R (1996) Regression shrinkage and selection via the lasso. Journal of the Royal Statistical Society: Series B (Methodological) 58(1):267–288 Wahba [1992] Wahba G (1992) Multivariate function and operator estimation, based on smoothing splines and reproducing kernels. 
In: SANTA FE INSTITUTE STUDIES IN THE SCIENCES OF COMPLEXITY-PROCEEDINGS VOLUME-, Addison-Wesley Publishing Co, pp 95–95 Wang and Leng [2008] Wang H, Leng C (2008) A note on adaptive group lasso. Computational Statistics and Data Analysis 52(12):5277–5286 Wang et al [2016] Wang J, Qingling C, Chang Q, et al (2016) Convergence analyses on sparse feedforward neural networks via group lasso regularization. Information Sciences 381. 10.1016/j.ins.2016.11.020 Wang and Cao [2022] Wang S, Cao G (2022) Robust deep neural network estimation for multi-dimensional functional data. Electronic Journal of Statistics 16(2):6461–6488 Yuan and Lin [2006] Yuan M, Lin Y (2006) Model selection and estimation in regression with grouped variables. Journal of the Royal Statistical Society: Series B (Statistical Methodology) 68(1):49–67 Zhang et al [2019] Zhang H, Wang J, Sun Z, et al (2019) Feature selection for neural networks using group lasso regularization. IEEE Transactions on Knowledge and Data Engineering PP:1–1. 10.1109/TKDE.2019.2893266 Zhao et al [2015] Zhao L, Hu Q, Wang W (2015) Heterogeneous feature selection with multi-modal deep neural networks and sparse group lasso. IEEE Transactions on Multimedia 17(11):1936–1948 Zou [2006] Zou H (2006) The adaptive lasso and its oracle properties. Journal of the American Statistical Association 101(476):1418–1429 Haldane JBS (1948) The theory of a cline. Journal of Genetics 48:277–284. 10.1007/BF02986626 Hornik [1991] Hornik K (1991) Approximation capabilities of multilayer feedforward networks. Neural Networks 4(2):251–257. https://doi.org/10.1016/0893-6080(91)90009-T Huber [1992] Huber PJ (1992) Robust estimation of a location parameter. In: Breakthroughs in statistics: Methodology and distribution. Springer, p 492–518 I-Cheng [2009] I-Cheng Y (2009) Concrete slump test data. Tech. 
rep., Department of Information Management, Chung-Hua University, URL https://archive.ics.uci.edu/ml/datasets/concrete+slump+test Imaizumi and Fukumizu [2019] Imaizumi M, Fukumizu K (2019) Deep neural networks learn non-smooth functions effectively. In: The 22nd international conference on artificial intelligence and statistics, PMLR, pp 869–878 Jhun and Choi [2009] Jhun M, Choi I (2009) Bootstrapping least distance estimator in the multivariate regression model. Computational Statistics and Data Analysis 53(12):4221–4227 Jiang et al [2022] Jiang C, Jiang C, Chen D, et al (2022) Densely connected neural networks for nonlinear regression. Entropy 24:876. 10.3390/e24070876 Li et al [2017] Li F, Zurada J, Liu Y, et al (2017) Input layer regularization of multilayer feedforward neural networks. IEEE Access PP:1–1. 10.1109/ACCESS.2017.2713389 Ochiai et al [2017] Ochiai T, Matsuda S, Watanabe H, et al (2017) Automatic node selection for deep neural networks using group lasso regularization. In: 2017 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), IEEE, pp 5485–5489 Rumelhart et al [1986] Rumelhart DE, Hinton GE, Williams RJ (1986) Learning representations by back-propagating errors. Nature 323:533–536. 10.1038/323533a0 Scardapane et al [2017] Scardapane S, Comminiello D, Hussain A, et al (2017) Group sparse regularization for deep neural networks. Neurocomputing 241:81–89 Sohn et al [2012] Sohn S, Jung B, Jhun M (2012) Permutation tests using least distance estimator in the multivariate regression model. Computational Statistics - COMPUTATION STAT 27:1–11. 10.1007/s00180-011-0247-3 Sze et al [2017] Sze V, Chen YH, Yang TJ, et al (2017) Efficient processing of deep neural networks: A tutorial and survey. Proceedings of the IEEE 105(12):2295–2329 Szekely et al [2008] Szekely G, Rizzo M, Bakirov N (2008) Measuring and testing dependence by correlation of distances. The Annals of Statistics 35. 
10.1214/009053607000000505 Tibshirani [1996] Tibshirani R (1996) Regression shrinkage and selection via the lasso. Journal of the Royal Statistical Society: Series B (Methodological) 58(1):267–288 Wahba [1992] Wahba G (1992) Multivariate function and operator estimation, based on smoothing splines and reproducing kernels. In: SANTA FE INSTITUTE STUDIES IN THE SCIENCES OF COMPLEXITY-PROCEEDINGS VOLUME-, Addison-Wesley Publishing Co, pp 95–95 Wang and Leng [2008] Wang H, Leng C (2008) A note on adaptive group lasso. Computational Statistics and Data Analysis 52(12):5277–5286 Wang et al [2016] Wang J, Qingling C, Chang Q, et al (2016) Convergence analyses on sparse feedforward neural networks via group lasso regularization. Information Sciences 381. 10.1016/j.ins.2016.11.020 Wang and Cao [2022] Wang S, Cao G (2022) Robust deep neural network estimation for multi-dimensional functional data. Electronic Journal of Statistics 16(2):6461–6488 Yuan and Lin [2006] Yuan M, Lin Y (2006) Model selection and estimation in regression with grouped variables. Journal of the Royal Statistical Society: Series B (Statistical Methodology) 68(1):49–67 Zhang et al [2019] Zhang H, Wang J, Sun Z, et al (2019) Feature selection for neural networks using group lasso regularization. IEEE Transactions on Knowledge and Data Engineering PP:1–1. 10.1109/TKDE.2019.2893266 Zhao et al [2015] Zhao L, Hu Q, Wang W (2015) Heterogeneous feature selection with multi-modal deep neural networks and sparse group lasso. IEEE Transactions on Multimedia 17(11):1936–1948 Zou [2006] Zou H (2006) The adaptive lasso and its oracle properties. Journal of the American Statistical Association 101(476):1418–1429 Hornik K (1991) Approximation capabilities of multilayer feedforward networks. Neural Networks 4(2):251–257. https://doi.org/10.1016/0893-6080(91)90009-T Huber [1992] Huber PJ (1992) Robust estimation of a location parameter. In: Breakthroughs in statistics: Methodology and distribution. 
Springer, p 492–518 I-Cheng [2009] I-Cheng Y (2009) Concrete slump test data. Tech. rep., Department of Information Management, Chung-Hua University, URL https://archive.ics.uci.edu/ml/datasets/concrete+slump+test Imaizumi and Fukumizu [2019] Imaizumi M, Fukumizu K (2019) Deep neural networks learn non-smooth functions effectively. In: The 22nd international conference on artificial intelligence and statistics, PMLR, pp 869–878 Jhun and Choi [2009] Jhun M, Choi I (2009) Bootstrapping least distance estimator in the multivariate regression model. Computational Statistics and Data Analysis 53(12):4221–4227 Jiang et al [2022] Jiang C, Jiang C, Chen D, et al (2022) Densely connected neural networks for nonlinear regression. Entropy 24:876. 10.3390/e24070876 Li et al [2017] Li F, Zurada J, Liu Y, et al (2017) Input layer regularization of multilayer feedforward neural networks. IEEE Access PP:1–1. 10.1109/ACCESS.2017.2713389 Ochiai et al [2017] Ochiai T, Matsuda S, Watanabe H, et al (2017) Automatic node selection for deep neural networks using group lasso regularization. In: 2017 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), IEEE, pp 5485–5489 Rumelhart et al [1986] Rumelhart DE, Hinton GE, Williams RJ (1986) Learning representations by back-propagating errors. Nature 323:533–536. 10.1038/323533a0 Scardapane et al [2017] Scardapane S, Comminiello D, Hussain A, et al (2017) Group sparse regularization for deep neural networks. Neurocomputing 241:81–89 Sohn et al [2012] Sohn S, Jung B, Jhun M (2012) Permutation tests using least distance estimator in the multivariate regression model. Computational Statistics - COMPUTATION STAT 27:1–11. 10.1007/s00180-011-0247-3 Sze et al [2017] Sze V, Chen YH, Yang TJ, et al (2017) Efficient processing of deep neural networks: A tutorial and survey. 
Proceedings of the IEEE 105(12):2295–2329 Szekely et al [2008] Szekely G, Rizzo M, Bakirov N (2008) Measuring and testing dependence by correlation of distances. The Annals of Statistics 35. 10.1214/009053607000000505 Tibshirani [1996] Tibshirani R (1996) Regression shrinkage and selection via the lasso. Journal of the Royal Statistical Society: Series B (Methodological) 58(1):267–288 Wahba [1992] Wahba G (1992) Multivariate function and operator estimation, based on smoothing splines and reproducing kernels. In: SANTA FE INSTITUTE STUDIES IN THE SCIENCES OF COMPLEXITY-PROCEEDINGS VOLUME-, Addison-Wesley Publishing Co, pp 95–95 Wang and Leng [2008] Wang H, Leng C (2008) A note on adaptive group lasso. Computational Statistics and Data Analysis 52(12):5277–5286 Wang et al [2016] Wang J, Qingling C, Chang Q, et al (2016) Convergence analyses on sparse feedforward neural networks via group lasso regularization. Information Sciences 381. 10.1016/j.ins.2016.11.020 Wang and Cao [2022] Wang S, Cao G (2022) Robust deep neural network estimation for multi-dimensional functional data. Electronic Journal of Statistics 16(2):6461–6488 Yuan and Lin [2006] Yuan M, Lin Y (2006) Model selection and estimation in regression with grouped variables. Journal of the Royal Statistical Society: Series B (Statistical Methodology) 68(1):49–67 Zhang et al [2019] Zhang H, Wang J, Sun Z, et al (2019) Feature selection for neural networks using group lasso regularization. IEEE Transactions on Knowledge and Data Engineering PP:1–1. 10.1109/TKDE.2019.2893266 Zhao et al [2015] Zhao L, Hu Q, Wang W (2015) Heterogeneous feature selection with multi-modal deep neural networks and sparse group lasso. IEEE Transactions on Multimedia 17(11):1936–1948 Zou [2006] Zou H (2006) The adaptive lasso and its oracle properties. Journal of the American Statistical Association 101(476):1418–1429 Huber PJ (1992) Robust estimation of a location parameter. 
10.1016/j.ins.2016.11.020 Wang and Cao [2022] Wang S, Cao G (2022) Robust deep neural network estimation for multi-dimensional functional data. Electronic Journal of Statistics 16(2):6461–6488 Yuan and Lin [2006] Yuan M, Lin Y (2006) Model selection and estimation in regression with grouped variables. Journal of the Royal Statistical Society: Series B (Statistical Methodology) 68(1):49–67 Zhang et al [2019] Zhang H, Wang J, Sun Z, et al (2019) Feature selection for neural networks using group lasso regularization. IEEE Transactions on Knowledge and Data Engineering PP:1–1. 10.1109/TKDE.2019.2893266 Zhao et al [2015] Zhao L, Hu Q, Wang W (2015) Heterogeneous feature selection with multi-modal deep neural networks and sparse group lasso. IEEE Transactions on Multimedia 17(11):1936–1948 Zou [2006] Zou H (2006) The adaptive lasso and its oracle properties. Journal of the American Statistical Association 101(476):1418–1429 Wang H, Leng C (2008) A note on adaptive group lasso. Computational Statistics and Data Analysis 52(12):5277–5286 Wang et al [2016] Wang J, Qingling C, Chang Q, et al (2016) Convergence analyses on sparse feedforward neural networks via group lasso regularization. Information Sciences 381. 10.1016/j.ins.2016.11.020 Wang and Cao [2022] Wang S, Cao G (2022) Robust deep neural network estimation for multi-dimensional functional data. Electronic Journal of Statistics 16(2):6461–6488 Yuan and Lin [2006] Yuan M, Lin Y (2006) Model selection and estimation in regression with grouped variables. Journal of the Royal Statistical Society: Series B (Statistical Methodology) 68(1):49–67 Zhang et al [2019] Zhang H, Wang J, Sun Z, et al (2019) Feature selection for neural networks using group lasso regularization. IEEE Transactions on Knowledge and Data Engineering PP:1–1. 10.1109/TKDE.2019.2893266 Zhao et al [2015] Zhao L, Hu Q, Wang W (2015) Heterogeneous feature selection with multi-modal deep neural networks and sparse group lasso. 
IEEE Transactions on Multimedia 17(11):1936–1948 Zou [2006] Zou H (2006) The adaptive lasso and its oracle properties. Journal of the American Statistical Association 101(476):1418–1429 Wang J, Qingling C, Chang Q, et al (2016) Convergence analyses on sparse feedforward neural networks via group lasso regularization. Information Sciences 381. 10.1016/j.ins.2016.11.020 Wang and Cao [2022] Wang S, Cao G (2022) Robust deep neural network estimation for multi-dimensional functional data. Electronic Journal of Statistics 16(2):6461–6488 Yuan and Lin [2006] Yuan M, Lin Y (2006) Model selection and estimation in regression with grouped variables. Journal of the Royal Statistical Society: Series B (Statistical Methodology) 68(1):49–67 Zhang et al [2019] Zhang H, Wang J, Sun Z, et al (2019) Feature selection for neural networks using group lasso regularization. IEEE Transactions on Knowledge and Data Engineering PP:1–1. 10.1109/TKDE.2019.2893266 Zhao et al [2015] Zhao L, Hu Q, Wang W (2015) Heterogeneous feature selection with multi-modal deep neural networks and sparse group lasso. IEEE Transactions on Multimedia 17(11):1936–1948 Zou [2006] Zou H (2006) The adaptive lasso and its oracle properties. Journal of the American Statistical Association 101(476):1418–1429 Wang S, Cao G (2022) Robust deep neural network estimation for multi-dimensional functional data. Electronic Journal of Statistics 16(2):6461–6488 Yuan and Lin [2006] Yuan M, Lin Y (2006) Model selection and estimation in regression with grouped variables. Journal of the Royal Statistical Society: Series B (Statistical Methodology) 68(1):49–67 Zhang et al [2019] Zhang H, Wang J, Sun Z, et al (2019) Feature selection for neural networks using group lasso regularization. IEEE Transactions on Knowledge and Data Engineering PP:1–1. 10.1109/TKDE.2019.2893266 Zhao et al [2015] Zhao L, Hu Q, Wang W (2015) Heterogeneous feature selection with multi-modal deep neural networks and sparse group lasso. 
IEEE Transactions on Multimedia 17(11):1936–1948 Zou [2006] Zou H (2006) The adaptive lasso and its oracle properties. Journal of the American Statistical Association 101(476):1418–1429 Yuan M, Lin Y (2006) Model selection and estimation in regression with grouped variables. Journal of the Royal Statistical Society: Series B (Statistical Methodology) 68(1):49–67 Zhang et al [2019] Zhang H, Wang J, Sun Z, et al (2019) Feature selection for neural networks using group lasso regularization. IEEE Transactions on Knowledge and Data Engineering PP:1–1. 10.1109/TKDE.2019.2893266 Zhao et al [2015] Zhao L, Hu Q, Wang W (2015) Heterogeneous feature selection with multi-modal deep neural networks and sparse group lasso. IEEE Transactions on Multimedia 17(11):1936–1948 Zou [2006] Zou H (2006) The adaptive lasso and its oracle properties. Journal of the American Statistical Association 101(476):1418–1429 Zhang H, Wang J, Sun Z, et al (2019) Feature selection for neural networks using group lasso regularization. IEEE Transactions on Knowledge and Data Engineering PP:1–1. 10.1109/TKDE.2019.2893266 Zhao et al [2015] Zhao L, Hu Q, Wang W (2015) Heterogeneous feature selection with multi-modal deep neural networks and sparse group lasso. IEEE Transactions on Multimedia 17(11):1936–1948 Zou [2006] Zou H (2006) The adaptive lasso and its oracle properties. Journal of the American Statistical Association 101(476):1418–1429 Zhao L, Hu Q, Wang W (2015) Heterogeneous feature selection with multi-modal deep neural networks and sparse group lasso. IEEE Transactions on Multimedia 17(11):1936–1948 Zou [2006] Zou H (2006) The adaptive lasso and its oracle properties. Journal of the American Statistical Association 101(476):1418–1429 Zou H (2006) The adaptive lasso and its oracle properties. Journal of the American Statistical Association 101(476):1418–1429
- Cybenko G (1989) Approximation by superpositions of a sigmoidal function. Mathematics of Control, Signals and Systems 2:303–314 Dinh and Ho [2020] Dinh VC, Ho LS (2020) Consistent feature selection for analytic deep neural networks. In: Advances in Neural Information Processing Systems, vol 33. Curran Associates, Inc., pp 2420–2431, URL https://proceedings.neurips.cc/paper_files/paper/2020/file/1959eb9d5a0f7ebc58ebde81d5df400d-Paper.pdf Diskin et al [2017] Diskin T, Draskovic G, Pascal F, et al (2017) Deep robust regression. In: 2017 IEEE 7th International Workshop on Computational Advances in Multi-Sensor Adaptive Processing (CAMSAP), IEEE, pp 1–5 Fan and Li [2001] Fan J, Li R (2001) Variable selection via nonconcave penalized likelihood and its oracle properties. Journal of the American Statistical Association 96:1348–1360 Haldane [1948] Haldane JBS (1948) The theory of a cline. Journal of Genetics 48:277–284. 10.1007/BF02986626 Hornik [1991] Hornik K (1991) Approximation capabilities of multilayer feedforward networks. Neural Networks 4(2):251–257. https://doi.org/10.1016/0893-6080(91)90009-T Huber [1992] Huber PJ (1992) Robust estimation of a location parameter. In: Breakthroughs in statistics: Methodology and distribution. Springer, p 492–518 I-Cheng [2009] I-Cheng Y (2009) Concrete slump test data. Tech. rep., Department of Information Management, Chung-Hua University, URL https://archive.ics.uci.edu/ml/datasets/concrete+slump+test Imaizumi and Fukumizu [2019] Imaizumi M, Fukumizu K (2019) Deep neural networks learn non-smooth functions effectively. In: The 22nd international conference on artificial intelligence and statistics, PMLR, pp 869–878 Jhun and Choi [2009] Jhun M, Choi I (2009) Bootstrapping least distance estimator in the multivariate regression model. Computational Statistics and Data Analysis 53(12):4221–4227 Jiang et al [2022] Jiang C, Jiang C, Chen D, et al (2022) Densely connected neural networks for nonlinear regression. Entropy 24:876. 
10.3390/e24070876 Li et al [2017] Li F, Zurada J, Liu Y, et al (2017) Input layer regularization of multilayer feedforward neural networks. IEEE Access PP:1–1. 10.1109/ACCESS.2017.2713389 Ochiai et al [2017] Ochiai T, Matsuda S, Watanabe H, et al (2017) Automatic node selection for deep neural networks using group lasso regularization. In: 2017 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), IEEE, pp 5485–5489 Rumelhart et al [1986] Rumelhart DE, Hinton GE, Williams RJ (1986) Learning representations by back-propagating errors. Nature 323:533–536. 10.1038/323533a0 Scardapane et al [2017] Scardapane S, Comminiello D, Hussain A, et al (2017) Group sparse regularization for deep neural networks. Neurocomputing 241:81–89 Sohn et al [2012] Sohn S, Jung B, Jhun M (2012) Permutation tests using least distance estimator in the multivariate regression model. Computational Statistics - COMPUTATION STAT 27:1–11. 10.1007/s00180-011-0247-3 Sze et al [2017] Sze V, Chen YH, Yang TJ, et al (2017) Efficient processing of deep neural networks: A tutorial and survey. Proceedings of the IEEE 105(12):2295–2329 Szekely et al [2008] Szekely G, Rizzo M, Bakirov N (2008) Measuring and testing dependence by correlation of distances. The Annals of Statistics 35. 10.1214/009053607000000505 Tibshirani [1996] Tibshirani R (1996) Regression shrinkage and selection via the lasso. Journal of the Royal Statistical Society: Series B (Methodological) 58(1):267–288 Wahba [1992] Wahba G (1992) Multivariate function and operator estimation, based on smoothing splines and reproducing kernels. In: SANTA FE INSTITUTE STUDIES IN THE SCIENCES OF COMPLEXITY-PROCEEDINGS VOLUME-, Addison-Wesley Publishing Co, pp 95–95 Wang and Leng [2008] Wang H, Leng C (2008) A note on adaptive group lasso. 
Computational Statistics and Data Analysis 52(12):5277–5286 Wang et al [2016] Wang J, Qingling C, Chang Q, et al (2016) Convergence analyses on sparse feedforward neural networks via group lasso regularization. Information Sciences 381. 10.1016/j.ins.2016.11.020 Wang and Cao [2022] Wang S, Cao G (2022) Robust deep neural network estimation for multi-dimensional functional data. Electronic Journal of Statistics 16(2):6461–6488 Yuan and Lin [2006] Yuan M, Lin Y (2006) Model selection and estimation in regression with grouped variables. Journal of the Royal Statistical Society: Series B (Statistical Methodology) 68(1):49–67 Zhang et al [2019] Zhang H, Wang J, Sun Z, et al (2019) Feature selection for neural networks using group lasso regularization. IEEE Transactions on Knowledge and Data Engineering PP:1–1. 10.1109/TKDE.2019.2893266 Zhao et al [2015] Zhao L, Hu Q, Wang W (2015) Heterogeneous feature selection with multi-modal deep neural networks and sparse group lasso. IEEE Transactions on Multimedia 17(11):1936–1948 Zou [2006] Zou H (2006) The adaptive lasso and its oracle properties. Journal of the American Statistical Association 101(476):1418–1429 Dinh VC, Ho LS (2020) Consistent feature selection for analytic deep neural networks. In: Advances in Neural Information Processing Systems, vol 33. Curran Associates, Inc., pp 2420–2431, URL https://proceedings.neurips.cc/paper_files/paper/2020/file/1959eb9d5a0f7ebc58ebde81d5df400d-Paper.pdf Diskin et al [2017] Diskin T, Draskovic G, Pascal F, et al (2017) Deep robust regression. In: 2017 IEEE 7th International Workshop on Computational Advances in Multi-Sensor Adaptive Processing (CAMSAP), IEEE, pp 1–5 Fan and Li [2001] Fan J, Li R (2001) Variable selection via nonconcave penalized likelihood and its oracle properties. Journal of the American Statistical Association 96:1348–1360 Haldane [1948] Haldane JBS (1948) The theory of a cline. Journal of Genetics 48:277–284. 
10.1007/BF02986626 Hornik [1991] Hornik K (1991) Approximation capabilities of multilayer feedforward networks. Neural Networks 4(2):251–257. https://doi.org/10.1016/0893-6080(91)90009-T Huber [1992] Huber PJ (1992) Robust estimation of a location parameter. In: Breakthroughs in statistics: Methodology and distribution. Springer, p 492–518 I-Cheng [2009] I-Cheng Y (2009) Concrete slump test data. Tech. rep., Department of Information Management, Chung-Hua University, URL https://archive.ics.uci.edu/ml/datasets/concrete+slump+test Imaizumi and Fukumizu [2019] Imaizumi M, Fukumizu K (2019) Deep neural networks learn non-smooth functions effectively. In: The 22nd international conference on artificial intelligence and statistics, PMLR, pp 869–878 Jhun and Choi [2009] Jhun M, Choi I (2009) Bootstrapping least distance estimator in the multivariate regression model. Computational Statistics and Data Analysis 53(12):4221–4227 Jiang et al [2022] Jiang C, Jiang C, Chen D, et al (2022) Densely connected neural networks for nonlinear regression. Entropy 24:876. 10.3390/e24070876 Li et al [2017] Li F, Zurada J, Liu Y, et al (2017) Input layer regularization of multilayer feedforward neural networks. IEEE Access PP:1–1. 10.1109/ACCESS.2017.2713389 Ochiai et al [2017] Ochiai T, Matsuda S, Watanabe H, et al (2017) Automatic node selection for deep neural networks using group lasso regularization. In: 2017 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), IEEE, pp 5485–5489 Rumelhart et al [1986] Rumelhart DE, Hinton GE, Williams RJ (1986) Learning representations by back-propagating errors. Nature 323:533–536. 10.1038/323533a0 Scardapane et al [2017] Scardapane S, Comminiello D, Hussain A, et al (2017) Group sparse regularization for deep neural networks. Neurocomputing 241:81–89 Sohn et al [2012] Sohn S, Jung B, Jhun M (2012) Permutation tests using least distance estimator in the multivariate regression model. 
Computational Statistics - COMPUTATION STAT 27:1–11. 10.1007/s00180-011-0247-3 Sze et al [2017] Sze V, Chen YH, Yang TJ, et al (2017) Efficient processing of deep neural networks: A tutorial and survey. Proceedings of the IEEE 105(12):2295–2329 Szekely et al [2008] Szekely G, Rizzo M, Bakirov N (2008) Measuring and testing dependence by correlation of distances. The Annals of Statistics 35. 10.1214/009053607000000505 Tibshirani [1996] Tibshirani R (1996) Regression shrinkage and selection via the lasso. Journal of the Royal Statistical Society: Series B (Methodological) 58(1):267–288 Wahba [1992] Wahba G (1992) Multivariate function and operator estimation, based on smoothing splines and reproducing kernels. In: SANTA FE INSTITUTE STUDIES IN THE SCIENCES OF COMPLEXITY-PROCEEDINGS VOLUME-, Addison-Wesley Publishing Co, pp 95–95 Wang and Leng [2008] Wang H, Leng C (2008) A note on adaptive group lasso. Computational Statistics and Data Analysis 52(12):5277–5286 Wang et al [2016] Wang J, Qingling C, Chang Q, et al (2016) Convergence analyses on sparse feedforward neural networks via group lasso regularization. Information Sciences 381. 10.1016/j.ins.2016.11.020 Wang and Cao [2022] Wang S, Cao G (2022) Robust deep neural network estimation for multi-dimensional functional data. Electronic Journal of Statistics 16(2):6461–6488 Yuan and Lin [2006] Yuan M, Lin Y (2006) Model selection and estimation in regression with grouped variables. Journal of the Royal Statistical Society: Series B (Statistical Methodology) 68(1):49–67 Zhang et al [2019] Zhang H, Wang J, Sun Z, et al (2019) Feature selection for neural networks using group lasso regularization. IEEE Transactions on Knowledge and Data Engineering PP:1–1. 10.1109/TKDE.2019.2893266 Zhao et al [2015] Zhao L, Hu Q, Wang W (2015) Heterogeneous feature selection with multi-modal deep neural networks and sparse group lasso. 
IEEE Transactions on Multimedia 17(11):1936–1948 Zou [2006] Zou H (2006) The adaptive lasso and its oracle properties. Journal of the American Statistical Association 101(476):1418–1429 Diskin T, Draskovic G, Pascal F, et al (2017) Deep robust regression. In: 2017 IEEE 7th International Workshop on Computational Advances in Multi-Sensor Adaptive Processing (CAMSAP), IEEE, pp 1–5 Fan and Li [2001] Fan J, Li R (2001) Variable selection via nonconcave penalized likelihood and its oracle properties. Journal of the American Statistical Association 96:1348–1360 Haldane [1948] Haldane JBS (1948) The theory of a cline. Journal of Genetics 48:277–284. 10.1007/BF02986626 Hornik [1991] Hornik K (1991) Approximation capabilities of multilayer feedforward networks. Neural Networks 4(2):251–257. https://doi.org/10.1016/0893-6080(91)90009-T Huber [1992] Huber PJ (1992) Robust estimation of a location parameter. In: Breakthroughs in statistics: Methodology and distribution. Springer, p 492–518 I-Cheng [2009] I-Cheng Y (2009) Concrete slump test data. Tech. rep., Department of Information Management, Chung-Hua University, URL https://archive.ics.uci.edu/ml/datasets/concrete+slump+test Imaizumi and Fukumizu [2019] Imaizumi M, Fukumizu K (2019) Deep neural networks learn non-smooth functions effectively. In: The 22nd international conference on artificial intelligence and statistics, PMLR, pp 869–878 Jhun and Choi [2009] Jhun M, Choi I (2009) Bootstrapping least distance estimator in the multivariate regression model. Computational Statistics and Data Analysis 53(12):4221–4227 Jiang et al [2022] Jiang C, Jiang C, Chen D, et al (2022) Densely connected neural networks for nonlinear regression. Entropy 24:876. 10.3390/e24070876 Li et al [2017] Li F, Zurada J, Liu Y, et al (2017) Input layer regularization of multilayer feedforward neural networks. IEEE Access PP:1–1. 
10.1109/ACCESS.2017.2713389 Ochiai et al [2017] Ochiai T, Matsuda S, Watanabe H, et al (2017) Automatic node selection for deep neural networks using group lasso regularization. In: 2017 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), IEEE, pp 5485–5489 Rumelhart et al [1986] Rumelhart DE, Hinton GE, Williams RJ (1986) Learning representations by back-propagating errors. Nature 323:533–536. 10.1038/323533a0 Scardapane et al [2017] Scardapane S, Comminiello D, Hussain A, et al (2017) Group sparse regularization for deep neural networks. Neurocomputing 241:81–89 Sohn et al [2012] Sohn S, Jung B, Jhun M (2012) Permutation tests using least distance estimator in the multivariate regression model. Computational Statistics - COMPUTATION STAT 27:1–11. 10.1007/s00180-011-0247-3 Sze et al [2017] Sze V, Chen YH, Yang TJ, et al (2017) Efficient processing of deep neural networks: A tutorial and survey. Proceedings of the IEEE 105(12):2295–2329 Szekely et al [2008] Szekely G, Rizzo M, Bakirov N (2008) Measuring and testing dependence by correlation of distances. The Annals of Statistics 35. 10.1214/009053607000000505 Tibshirani [1996] Tibshirani R (1996) Regression shrinkage and selection via the lasso. Journal of the Royal Statistical Society: Series B (Methodological) 58(1):267–288 Wahba [1992] Wahba G (1992) Multivariate function and operator estimation, based on smoothing splines and reproducing kernels. In: SANTA FE INSTITUTE STUDIES IN THE SCIENCES OF COMPLEXITY-PROCEEDINGS VOLUME-, Addison-Wesley Publishing Co, pp 95–95 Wang and Leng [2008] Wang H, Leng C (2008) A note on adaptive group lasso. Computational Statistics and Data Analysis 52(12):5277–5286 Wang et al [2016] Wang J, Qingling C, Chang Q, et al (2016) Convergence analyses on sparse feedforward neural networks via group lasso regularization. Information Sciences 381. 
10.1016/j.ins.2016.11.020 Wang and Cao [2022] Wang S, Cao G (2022) Robust deep neural network estimation for multi-dimensional functional data. Electronic Journal of Statistics 16(2):6461–6488 Yuan and Lin [2006] Yuan M, Lin Y (2006) Model selection and estimation in regression with grouped variables. Journal of the Royal Statistical Society: Series B (Statistical Methodology) 68(1):49–67 Zhang et al [2019] Zhang H, Wang J, Sun Z, et al (2019) Feature selection for neural networks using group lasso regularization. IEEE Transactions on Knowledge and Data Engineering PP:1–1. 10.1109/TKDE.2019.2893266 Zhao et al [2015] Zhao L, Hu Q, Wang W (2015) Heterogeneous feature selection with multi-modal deep neural networks and sparse group lasso. IEEE Transactions on Multimedia 17(11):1936–1948 Zou [2006] Zou H (2006) The adaptive lasso and its oracle properties. Journal of the American Statistical Association 101(476):1418–1429 Fan J, Li R (2001) Variable selection via nonconcave penalized likelihood and its oracle properties. Journal of the American Statistical Association 96:1348–1360 Haldane [1948] Haldane JBS (1948) The theory of a cline. Journal of Genetics 48:277–284. 10.1007/BF02986626 Hornik [1991] Hornik K (1991) Approximation capabilities of multilayer feedforward networks. Neural Networks 4(2):251–257. https://doi.org/10.1016/0893-6080(91)90009-T Huber [1992] Huber PJ (1992) Robust estimation of a location parameter. In: Breakthroughs in statistics: Methodology and distribution. Springer, p 492–518 I-Cheng [2009] I-Cheng Y (2009) Concrete slump test data. Tech. rep., Department of Information Management, Chung-Hua University, URL https://archive.ics.uci.edu/ml/datasets/concrete+slump+test Imaizumi and Fukumizu [2019] Imaizumi M, Fukumizu K (2019) Deep neural networks learn non-smooth functions effectively. 
In: The 22nd international conference on artificial intelligence and statistics, PMLR, pp 869–878 Jhun and Choi [2009] Jhun M, Choi I (2009) Bootstrapping least distance estimator in the multivariate regression model. Computational Statistics and Data Analysis 53(12):4221–4227 Jiang et al [2022] Jiang C, Jiang C, Chen D, et al (2022) Densely connected neural networks for nonlinear regression. Entropy 24:876. 10.3390/e24070876 Li et al [2017] Li F, Zurada J, Liu Y, et al (2017) Input layer regularization of multilayer feedforward neural networks. IEEE Access PP:1–1. 10.1109/ACCESS.2017.2713389 Ochiai et al [2017] Ochiai T, Matsuda S, Watanabe H, et al (2017) Automatic node selection for deep neural networks using group lasso regularization. In: 2017 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), IEEE, pp 5485–5489 Rumelhart et al [1986] Rumelhart DE, Hinton GE, Williams RJ (1986) Learning representations by back-propagating errors. Nature 323:533–536. 10.1038/323533a0 Scardapane et al [2017] Scardapane S, Comminiello D, Hussain A, et al (2017) Group sparse regularization for deep neural networks. Neurocomputing 241:81–89 Sohn et al [2012] Sohn S, Jung B, Jhun M (2012) Permutation tests using least distance estimator in the multivariate regression model. Computational Statistics - COMPUTATION STAT 27:1–11. 10.1007/s00180-011-0247-3 Sze et al [2017] Sze V, Chen YH, Yang TJ, et al (2017) Efficient processing of deep neural networks: A tutorial and survey. Proceedings of the IEEE 105(12):2295–2329 Szekely et al [2008] Szekely G, Rizzo M, Bakirov N (2008) Measuring and testing dependence by correlation of distances. The Annals of Statistics 35. 10.1214/009053607000000505 Tibshirani [1996] Tibshirani R (1996) Regression shrinkage and selection via the lasso. 
Journal of the Royal Statistical Society: Series B (Methodological) 58(1):267–288 Wahba [1992] Wahba G (1992) Multivariate function and operator estimation, based on smoothing splines and reproducing kernels. In: SANTA FE INSTITUTE STUDIES IN THE SCIENCES OF COMPLEXITY-PROCEEDINGS VOLUME-, Addison-Wesley Publishing Co, pp 95–95 Wang and Leng [2008] Wang H, Leng C (2008) A note on adaptive group lasso. Computational Statistics and Data Analysis 52(12):5277–5286 Wang et al [2016] Wang J, Qingling C, Chang Q, et al (2016) Convergence analyses on sparse feedforward neural networks via group lasso regularization. Information Sciences 381. 10.1016/j.ins.2016.11.020 Wang and Cao [2022] Wang S, Cao G (2022) Robust deep neural network estimation for multi-dimensional functional data. Electronic Journal of Statistics 16(2):6461–6488 Yuan and Lin [2006] Yuan M, Lin Y (2006) Model selection and estimation in regression with grouped variables. Journal of the Royal Statistical Society: Series B (Statistical Methodology) 68(1):49–67 Zhang et al [2019] Zhang H, Wang J, Sun Z, et al (2019) Feature selection for neural networks using group lasso regularization. IEEE Transactions on Knowledge and Data Engineering PP:1–1. 10.1109/TKDE.2019.2893266 Zhao et al [2015] Zhao L, Hu Q, Wang W (2015) Heterogeneous feature selection with multi-modal deep neural networks and sparse group lasso. IEEE Transactions on Multimedia 17(11):1936–1948 Zou [2006] Zou H (2006) The adaptive lasso and its oracle properties. Journal of the American Statistical Association 101(476):1418–1429 Haldane JBS (1948) The theory of a cline. Journal of Genetics 48:277–284. 10.1007/BF02986626 Hornik [1991] Hornik K (1991) Approximation capabilities of multilayer feedforward networks. Neural Networks 4(2):251–257. https://doi.org/10.1016/0893-6080(91)90009-T Huber [1992] Huber PJ (1992) Robust estimation of a location parameter. In: Breakthroughs in statistics: Methodology and distribution. 
Springer, p 492–518 I-Cheng [2009] I-Cheng Y (2009) Concrete slump test data. Tech. rep., Department of Information Management, Chung-Hua University, URL https://archive.ics.uci.edu/ml/datasets/concrete+slump+test Imaizumi and Fukumizu [2019] Imaizumi M, Fukumizu K (2019) Deep neural networks learn non-smooth functions effectively. In: The 22nd international conference on artificial intelligence and statistics, PMLR, pp 869–878 Jhun and Choi [2009] Jhun M, Choi I (2009) Bootstrapping least distance estimator in the multivariate regression model. Computational Statistics and Data Analysis 53(12):4221–4227 Jiang et al [2022] Jiang C, Jiang C, Chen D, et al (2022) Densely connected neural networks for nonlinear regression. Entropy 24:876. 10.3390/e24070876 Li et al [2017] Li F, Zurada J, Liu Y, et al (2017) Input layer regularization of multilayer feedforward neural networks. IEEE Access PP:1–1. 10.1109/ACCESS.2017.2713389 Ochiai et al [2017] Ochiai T, Matsuda S, Watanabe H, et al (2017) Automatic node selection for deep neural networks using group lasso regularization. In: 2017 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), IEEE, pp 5485–5489 Rumelhart et al [1986] Rumelhart DE, Hinton GE, Williams RJ (1986) Learning representations by back-propagating errors. Nature 323:533–536. 10.1038/323533a0 Scardapane et al [2017] Scardapane S, Comminiello D, Hussain A, et al (2017) Group sparse regularization for deep neural networks. Neurocomputing 241:81–89 Sohn et al [2012] Sohn S, Jung B, Jhun M (2012) Permutation tests using least distance estimator in the multivariate regression model. Computational Statistics - COMPUTATION STAT 27:1–11. 10.1007/s00180-011-0247-3 Sze et al [2017] Sze V, Chen YH, Yang TJ, et al (2017) Efficient processing of deep neural networks: A tutorial and survey. 
Proceedings of the IEEE 105(12):2295–2329 Szekely et al [2008] Szekely G, Rizzo M, Bakirov N (2008) Measuring and testing dependence by correlation of distances. The Annals of Statistics 35. 10.1214/009053607000000505 Tibshirani [1996] Tibshirani R (1996) Regression shrinkage and selection via the lasso. Journal of the Royal Statistical Society: Series B (Methodological) 58(1):267–288 Wahba [1992] Wahba G (1992) Multivariate function and operator estimation, based on smoothing splines and reproducing kernels. In: SANTA FE INSTITUTE STUDIES IN THE SCIENCES OF COMPLEXITY-PROCEEDINGS VOLUME-, Addison-Wesley Publishing Co, pp 95–95 Wang and Leng [2008] Wang H, Leng C (2008) A note on adaptive group lasso. Computational Statistics and Data Analysis 52(12):5277–5286 Wang et al [2016] Wang J, Qingling C, Chang Q, et al (2016) Convergence analyses on sparse feedforward neural networks via group lasso regularization. Information Sciences 381. 10.1016/j.ins.2016.11.020 Wang and Cao [2022] Wang S, Cao G (2022) Robust deep neural network estimation for multi-dimensional functional data. Electronic Journal of Statistics 16(2):6461–6488 Yuan and Lin [2006] Yuan M, Lin Y (2006) Model selection and estimation in regression with grouped variables. Journal of the Royal Statistical Society: Series B (Statistical Methodology) 68(1):49–67 Zhang et al [2019] Zhang H, Wang J, Sun Z, et al (2019) Feature selection for neural networks using group lasso regularization. IEEE Transactions on Knowledge and Data Engineering PP:1–1. 10.1109/TKDE.2019.2893266 Zhao et al [2015] Zhao L, Hu Q, Wang W (2015) Heterogeneous feature selection with multi-modal deep neural networks and sparse group lasso. IEEE Transactions on Multimedia 17(11):1936–1948 Zou [2006] Zou H (2006) The adaptive lasso and its oracle properties. Journal of the American Statistical Association 101(476):1418–1429 Hornik K (1991) Approximation capabilities of multilayer feedforward networks. Neural Networks 4(2):251–257. 
Journal of the Royal Statistical Society: Series B (Methodological) 58(1):267–288 Wahba [1992] Wahba G (1992) Multivariate function and operator estimation, based on smoothing splines and reproducing kernels. In: SANTA FE INSTITUTE STUDIES IN THE SCIENCES OF COMPLEXITY-PROCEEDINGS VOLUME-, Addison-Wesley Publishing Co, pp 95–95 Wang and Leng [2008] Wang H, Leng C (2008) A note on adaptive group lasso. Computational Statistics and Data Analysis 52(12):5277–5286 Wang et al [2016] Wang J, Qingling C, Chang Q, et al (2016) Convergence analyses on sparse feedforward neural networks via group lasso regularization. Information Sciences 381. 10.1016/j.ins.2016.11.020 Wang and Cao [2022] Wang S, Cao G (2022) Robust deep neural network estimation for multi-dimensional functional data. Electronic Journal of Statistics 16(2):6461–6488 Yuan and Lin [2006] Yuan M, Lin Y (2006) Model selection and estimation in regression with grouped variables. Journal of the Royal Statistical Society: Series B (Statistical Methodology) 68(1):49–67 Zhang et al [2019] Zhang H, Wang J, Sun Z, et al (2019) Feature selection for neural networks using group lasso regularization. IEEE Transactions on Knowledge and Data Engineering PP:1–1. 10.1109/TKDE.2019.2893266 Zhao et al [2015] Zhao L, Hu Q, Wang W (2015) Heterogeneous feature selection with multi-modal deep neural networks and sparse group lasso. IEEE Transactions on Multimedia 17(11):1936–1948 Zou [2006] Zou H (2006) The adaptive lasso and its oracle properties. Journal of the American Statistical Association 101(476):1418–1429 Tibshirani R (1996) Regression shrinkage and selection via the lasso. Journal of the Royal Statistical Society: Series B (Methodological) 58(1):267–288 Wahba [1992] Wahba G (1992) Multivariate function and operator estimation, based on smoothing splines and reproducing kernels. 
In: SANTA FE INSTITUTE STUDIES IN THE SCIENCES OF COMPLEXITY-PROCEEDINGS VOLUME-, Addison-Wesley Publishing Co, pp 95–95 Wang and Leng [2008] Wang H, Leng C (2008) A note on adaptive group lasso. Computational Statistics and Data Analysis 52(12):5277–5286 Wang et al [2016] Wang J, Qingling C, Chang Q, et al (2016) Convergence analyses on sparse feedforward neural networks via group lasso regularization. Information Sciences 381. 10.1016/j.ins.2016.11.020 Wang and Cao [2022] Wang S, Cao G (2022) Robust deep neural network estimation for multi-dimensional functional data. Electronic Journal of Statistics 16(2):6461–6488 Yuan and Lin [2006] Yuan M, Lin Y (2006) Model selection and estimation in regression with grouped variables. Journal of the Royal Statistical Society: Series B (Statistical Methodology) 68(1):49–67 Zhang et al [2019] Zhang H, Wang J, Sun Z, et al (2019) Feature selection for neural networks using group lasso regularization. IEEE Transactions on Knowledge and Data Engineering PP:1–1. 10.1109/TKDE.2019.2893266 Zhao et al [2015] Zhao L, Hu Q, Wang W (2015) Heterogeneous feature selection with multi-modal deep neural networks and sparse group lasso. IEEE Transactions on Multimedia 17(11):1936–1948 Zou [2006] Zou H (2006) The adaptive lasso and its oracle properties. Journal of the American Statistical Association 101(476):1418–1429 Wahba G (1992) Multivariate function and operator estimation, based on smoothing splines and reproducing kernels. In: SANTA FE INSTITUTE STUDIES IN THE SCIENCES OF COMPLEXITY-PROCEEDINGS VOLUME-, Addison-Wesley Publishing Co, pp 95–95 Wang and Leng [2008] Wang H, Leng C (2008) A note on adaptive group lasso. Computational Statistics and Data Analysis 52(12):5277–5286 Wang et al [2016] Wang J, Qingling C, Chang Q, et al (2016) Convergence analyses on sparse feedforward neural networks via group lasso regularization. Information Sciences 381. 
10.1016/j.ins.2016.11.020 Wang and Cao [2022] Wang S, Cao G (2022) Robust deep neural network estimation for multi-dimensional functional data. Electronic Journal of Statistics 16(2):6461–6488 Yuan and Lin [2006] Yuan M, Lin Y (2006) Model selection and estimation in regression with grouped variables. Journal of the Royal Statistical Society: Series B (Statistical Methodology) 68(1):49–67 Zhang et al [2019] Zhang H, Wang J, Sun Z, et al (2019) Feature selection for neural networks using group lasso regularization. IEEE Transactions on Knowledge and Data Engineering PP:1–1. 10.1109/TKDE.2019.2893266 Zhao et al [2015] Zhao L, Hu Q, Wang W (2015) Heterogeneous feature selection with multi-modal deep neural networks and sparse group lasso. IEEE Transactions on Multimedia 17(11):1936–1948 Zou [2006] Zou H (2006) The adaptive lasso and its oracle properties. Journal of the American Statistical Association 101(476):1418–1429 Wang H, Leng C (2008) A note on adaptive group lasso. Computational Statistics and Data Analysis 52(12):5277–5286 Wang et al [2016] Wang J, Qingling C, Chang Q, et al (2016) Convergence analyses on sparse feedforward neural networks via group lasso regularization. Information Sciences 381. 10.1016/j.ins.2016.11.020 Wang and Cao [2022] Wang S, Cao G (2022) Robust deep neural network estimation for multi-dimensional functional data. Electronic Journal of Statistics 16(2):6461–6488 Yuan and Lin [2006] Yuan M, Lin Y (2006) Model selection and estimation in regression with grouped variables. Journal of the Royal Statistical Society: Series B (Statistical Methodology) 68(1):49–67 Zhang et al [2019] Zhang H, Wang J, Sun Z, et al (2019) Feature selection for neural networks using group lasso regularization. IEEE Transactions on Knowledge and Data Engineering PP:1–1. 10.1109/TKDE.2019.2893266 Zhao et al [2015] Zhao L, Hu Q, Wang W (2015) Heterogeneous feature selection with multi-modal deep neural networks and sparse group lasso. 
IEEE Transactions on Multimedia 17(11):1936–1948 Zou [2006] Zou H (2006) The adaptive lasso and its oracle properties. Journal of the American Statistical Association 101(476):1418–1429 Wang J, Qingling C, Chang Q, et al (2016) Convergence analyses on sparse feedforward neural networks via group lasso regularization. Information Sciences 381. 10.1016/j.ins.2016.11.020 Wang and Cao [2022] Wang S, Cao G (2022) Robust deep neural network estimation for multi-dimensional functional data. Electronic Journal of Statistics 16(2):6461–6488 Yuan and Lin [2006] Yuan M, Lin Y (2006) Model selection and estimation in regression with grouped variables. Journal of the Royal Statistical Society: Series B (Statistical Methodology) 68(1):49–67 Zhang et al [2019] Zhang H, Wang J, Sun Z, et al (2019) Feature selection for neural networks using group lasso regularization. IEEE Transactions on Knowledge and Data Engineering PP:1–1. 10.1109/TKDE.2019.2893266 Zhao et al [2015] Zhao L, Hu Q, Wang W (2015) Heterogeneous feature selection with multi-modal deep neural networks and sparse group lasso. IEEE Transactions on Multimedia 17(11):1936–1948 Zou [2006] Zou H (2006) The adaptive lasso and its oracle properties. Journal of the American Statistical Association 101(476):1418–1429 Wang S, Cao G (2022) Robust deep neural network estimation for multi-dimensional functional data. Electronic Journal of Statistics 16(2):6461–6488 Yuan and Lin [2006] Yuan M, Lin Y (2006) Model selection and estimation in regression with grouped variables. Journal of the Royal Statistical Society: Series B (Statistical Methodology) 68(1):49–67 Zhang et al [2019] Zhang H, Wang J, Sun Z, et al (2019) Feature selection for neural networks using group lasso regularization. IEEE Transactions on Knowledge and Data Engineering PP:1–1. 10.1109/TKDE.2019.2893266 Zhao et al [2015] Zhao L, Hu Q, Wang W (2015) Heterogeneous feature selection with multi-modal deep neural networks and sparse group lasso. 
IEEE Transactions on Multimedia 17(11):1936–1948 Zou [2006] Zou H (2006) The adaptive lasso and its oracle properties. Journal of the American Statistical Association 101(476):1418–1429 Yuan M, Lin Y (2006) Model selection and estimation in regression with grouped variables. Journal of the Royal Statistical Society: Series B (Statistical Methodology) 68(1):49–67 Zhang et al [2019] Zhang H, Wang J, Sun Z, et al (2019) Feature selection for neural networks using group lasso regularization. IEEE Transactions on Knowledge and Data Engineering PP:1–1. 10.1109/TKDE.2019.2893266 Zhao et al [2015] Zhao L, Hu Q, Wang W (2015) Heterogeneous feature selection with multi-modal deep neural networks and sparse group lasso. IEEE Transactions on Multimedia 17(11):1936–1948 Zou [2006] Zou H (2006) The adaptive lasso and its oracle properties. Journal of the American Statistical Association 101(476):1418–1429 Zhang H, Wang J, Sun Z, et al (2019) Feature selection for neural networks using group lasso regularization. IEEE Transactions on Knowledge and Data Engineering PP:1–1. 10.1109/TKDE.2019.2893266 Zhao et al [2015] Zhao L, Hu Q, Wang W (2015) Heterogeneous feature selection with multi-modal deep neural networks and sparse group lasso. IEEE Transactions on Multimedia 17(11):1936–1948 Zou [2006] Zou H (2006) The adaptive lasso and its oracle properties. Journal of the American Statistical Association 101(476):1418–1429 Zhao L, Hu Q, Wang W (2015) Heterogeneous feature selection with multi-modal deep neural networks and sparse group lasso. IEEE Transactions on Multimedia 17(11):1936–1948 Zou [2006] Zou H (2006) The adaptive lasso and its oracle properties. Journal of the American Statistical Association 101(476):1418–1429 Zou H (2006) The adaptive lasso and its oracle properties. Journal of the American Statistical Association 101(476):1418–1429
- Dinh VC, Ho LS (2020) Consistent feature selection for analytic deep neural networks. In: Advances in Neural Information Processing Systems, vol 33. Curran Associates, Inc., pp 2420–2431, URL https://proceedings.neurips.cc/paper_files/paper/2020/file/1959eb9d5a0f7ebc58ebde81d5df400d-Paper.pdf Diskin et al [2017] Diskin T, Draskovic G, Pascal F, et al (2017) Deep robust regression. In: 2017 IEEE 7th International Workshop on Computational Advances in Multi-Sensor Adaptive Processing (CAMSAP), IEEE, pp 1–5 Fan and Li [2001] Fan J, Li R (2001) Variable selection via nonconcave penalized likelihood and its oracle properties. Journal of the American Statistical Association 96:1348–1360 Haldane [1948] Haldane JBS (1948) The theory of a cline. Journal of Genetics 48:277–284. 10.1007/BF02986626 Hornik [1991] Hornik K (1991) Approximation capabilities of multilayer feedforward networks. Neural Networks 4(2):251–257. https://doi.org/10.1016/0893-6080(91)90009-T Huber [1992] Huber PJ (1992) Robust estimation of a location parameter. In: Breakthroughs in statistics: Methodology and distribution. Springer, p 492–518 I-Cheng [2009] I-Cheng Y (2009) Concrete slump test data. Tech. rep., Department of Information Management, Chung-Hua University, URL https://archive.ics.uci.edu/ml/datasets/concrete+slump+test Imaizumi and Fukumizu [2019] Imaizumi M, Fukumizu K (2019) Deep neural networks learn non-smooth functions effectively. In: The 22nd international conference on artificial intelligence and statistics, PMLR, pp 869–878 Jhun and Choi [2009] Jhun M, Choi I (2009) Bootstrapping least distance estimator in the multivariate regression model. Computational Statistics and Data Analysis 53(12):4221–4227 Jiang et al [2022] Jiang C, Jiang C, Chen D, et al (2022) Densely connected neural networks for nonlinear regression. Entropy 24:876. 10.3390/e24070876 Li et al [2017] Li F, Zurada J, Liu Y, et al (2017) Input layer regularization of multilayer feedforward neural networks. 
IEEE Access PP:1–1. 10.1109/ACCESS.2017.2713389 Ochiai et al [2017] Ochiai T, Matsuda S, Watanabe H, et al (2017) Automatic node selection for deep neural networks using group lasso regularization. In: 2017 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), IEEE, pp 5485–5489 Rumelhart et al [1986] Rumelhart DE, Hinton GE, Williams RJ (1986) Learning representations by back-propagating errors. Nature 323:533–536. 10.1038/323533a0 Scardapane et al [2017] Scardapane S, Comminiello D, Hussain A, et al (2017) Group sparse regularization for deep neural networks. Neurocomputing 241:81–89 Sohn et al [2012] Sohn S, Jung B, Jhun M (2012) Permutation tests using least distance estimator in the multivariate regression model. Computational Statistics - COMPUTATION STAT 27:1–11. 10.1007/s00180-011-0247-3 Sze et al [2017] Sze V, Chen YH, Yang TJ, et al (2017) Efficient processing of deep neural networks: A tutorial and survey. Proceedings of the IEEE 105(12):2295–2329 Szekely et al [2008] Szekely G, Rizzo M, Bakirov N (2008) Measuring and testing dependence by correlation of distances. The Annals of Statistics 35. 10.1214/009053607000000505 Tibshirani [1996] Tibshirani R (1996) Regression shrinkage and selection via the lasso. Journal of the Royal Statistical Society: Series B (Methodological) 58(1):267–288 Wahba [1992] Wahba G (1992) Multivariate function and operator estimation, based on smoothing splines and reproducing kernels. In: SANTA FE INSTITUTE STUDIES IN THE SCIENCES OF COMPLEXITY-PROCEEDINGS VOLUME-, Addison-Wesley Publishing Co, pp 95–95 Wang and Leng [2008] Wang H, Leng C (2008) A note on adaptive group lasso. Computational Statistics and Data Analysis 52(12):5277–5286 Wang et al [2016] Wang J, Qingling C, Chang Q, et al (2016) Convergence analyses on sparse feedforward neural networks via group lasso regularization. Information Sciences 381. 
10.1016/j.ins.2016.11.020 Wang and Cao [2022] Wang S, Cao G (2022) Robust deep neural network estimation for multi-dimensional functional data. Electronic Journal of Statistics 16(2):6461–6488 Yuan and Lin [2006] Yuan M, Lin Y (2006) Model selection and estimation in regression with grouped variables. Journal of the Royal Statistical Society: Series B (Statistical Methodology) 68(1):49–67 Zhang et al [2019] Zhang H, Wang J, Sun Z, et al (2019) Feature selection for neural networks using group lasso regularization. IEEE Transactions on Knowledge and Data Engineering PP:1–1. 10.1109/TKDE.2019.2893266 Zhao et al [2015] Zhao L, Hu Q, Wang W (2015) Heterogeneous feature selection with multi-modal deep neural networks and sparse group lasso. IEEE Transactions on Multimedia 17(11):1936–1948 Zou [2006] Zou H (2006) The adaptive lasso and its oracle properties. Journal of the American Statistical Association 101(476):1418–1429 Diskin T, Draskovic G, Pascal F, et al (2017) Deep robust regression. In: 2017 IEEE 7th International Workshop on Computational Advances in Multi-Sensor Adaptive Processing (CAMSAP), IEEE, pp 1–5 Fan and Li [2001] Fan J, Li R (2001) Variable selection via nonconcave penalized likelihood and its oracle properties. Journal of the American Statistical Association 96:1348–1360 Haldane [1948] Haldane JBS (1948) The theory of a cline. Journal of Genetics 48:277–284. 10.1007/BF02986626 Hornik [1991] Hornik K (1991) Approximation capabilities of multilayer feedforward networks. Neural Networks 4(2):251–257. https://doi.org/10.1016/0893-6080(91)90009-T Huber [1992] Huber PJ (1992) Robust estimation of a location parameter. In: Breakthroughs in statistics: Methodology and distribution. Springer, p 492–518 I-Cheng [2009] I-Cheng Y (2009) Concrete slump test data. Tech. 
rep., Department of Information Management, Chung-Hua University, URL https://archive.ics.uci.edu/ml/datasets/concrete+slump+test Imaizumi and Fukumizu [2019] Imaizumi M, Fukumizu K (2019) Deep neural networks learn non-smooth functions effectively. In: The 22nd international conference on artificial intelligence and statistics, PMLR, pp 869–878 Jhun and Choi [2009] Jhun M, Choi I (2009) Bootstrapping least distance estimator in the multivariate regression model. Computational Statistics and Data Analysis 53(12):4221–4227 Jiang et al [2022] Jiang C, Jiang C, Chen D, et al (2022) Densely connected neural networks for nonlinear regression. Entropy 24:876. 10.3390/e24070876 Li et al [2017] Li F, Zurada J, Liu Y, et al (2017) Input layer regularization of multilayer feedforward neural networks. IEEE Access PP:1–1. 10.1109/ACCESS.2017.2713389 Ochiai et al [2017] Ochiai T, Matsuda S, Watanabe H, et al (2017) Automatic node selection for deep neural networks using group lasso regularization. In: 2017 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), IEEE, pp 5485–5489 Rumelhart et al [1986] Rumelhart DE, Hinton GE, Williams RJ (1986) Learning representations by back-propagating errors. Nature 323:533–536. 10.1038/323533a0 Scardapane et al [2017] Scardapane S, Comminiello D, Hussain A, et al (2017) Group sparse regularization for deep neural networks. Neurocomputing 241:81–89 Sohn et al [2012] Sohn S, Jung B, Jhun M (2012) Permutation tests using least distance estimator in the multivariate regression model. Computational Statistics - COMPUTATION STAT 27:1–11. 10.1007/s00180-011-0247-3 Sze et al [2017] Sze V, Chen YH, Yang TJ, et al (2017) Efficient processing of deep neural networks: A tutorial and survey. Proceedings of the IEEE 105(12):2295–2329 Szekely et al [2008] Szekely G, Rizzo M, Bakirov N (2008) Measuring and testing dependence by correlation of distances. The Annals of Statistics 35. 
10.1214/009053607000000505 Tibshirani [1996] Tibshirani R (1996) Regression shrinkage and selection via the lasso. Journal of the Royal Statistical Society: Series B (Methodological) 58(1):267–288 Wahba [1992] Wahba G (1992) Multivariate function and operator estimation, based on smoothing splines and reproducing kernels. In: SANTA FE INSTITUTE STUDIES IN THE SCIENCES OF COMPLEXITY-PROCEEDINGS VOLUME-, Addison-Wesley Publishing Co, pp 95–95 Wang and Leng [2008] Wang H, Leng C (2008) A note on adaptive group lasso. Computational Statistics and Data Analysis 52(12):5277–5286 Wang et al [2016] Wang J, Qingling C, Chang Q, et al (2016) Convergence analyses on sparse feedforward neural networks via group lasso regularization. Information Sciences 381. 10.1016/j.ins.2016.11.020 Wang and Cao [2022] Wang S, Cao G (2022) Robust deep neural network estimation for multi-dimensional functional data. Electronic Journal of Statistics 16(2):6461–6488 Yuan and Lin [2006] Yuan M, Lin Y (2006) Model selection and estimation in regression with grouped variables. Journal of the Royal Statistical Society: Series B (Statistical Methodology) 68(1):49–67 Zhang et al [2019] Zhang H, Wang J, Sun Z, et al (2019) Feature selection for neural networks using group lasso regularization. IEEE Transactions on Knowledge and Data Engineering PP:1–1. 10.1109/TKDE.2019.2893266 Zhao et al [2015] Zhao L, Hu Q, Wang W (2015) Heterogeneous feature selection with multi-modal deep neural networks and sparse group lasso. IEEE Transactions on Multimedia 17(11):1936–1948 Zou [2006] Zou H (2006) The adaptive lasso and its oracle properties. Journal of the American Statistical Association 101(476):1418–1429 Fan J, Li R (2001) Variable selection via nonconcave penalized likelihood and its oracle properties. Journal of the American Statistical Association 96:1348–1360 Haldane [1948] Haldane JBS (1948) The theory of a cline. Journal of Genetics 48:277–284. 
10.1007/BF02986626 Hornik [1991] Hornik K (1991) Approximation capabilities of multilayer feedforward networks. Neural Networks 4(2):251–257. https://doi.org/10.1016/0893-6080(91)90009-T Huber [1992] Huber PJ (1992) Robust estimation of a location parameter. In: Breakthroughs in statistics: Methodology and distribution. Springer, p 492–518 I-Cheng [2009] I-Cheng Y (2009) Concrete slump test data. Tech. rep., Department of Information Management, Chung-Hua University, URL https://archive.ics.uci.edu/ml/datasets/concrete+slump+test Imaizumi and Fukumizu [2019] Imaizumi M, Fukumizu K (2019) Deep neural networks learn non-smooth functions effectively. In: The 22nd international conference on artificial intelligence and statistics, PMLR, pp 869–878 Jhun and Choi [2009] Jhun M, Choi I (2009) Bootstrapping least distance estimator in the multivariate regression model. Computational Statistics and Data Analysis 53(12):4221–4227 Jiang et al [2022] Jiang C, Jiang C, Chen D, et al (2022) Densely connected neural networks for nonlinear regression. Entropy 24:876. 10.3390/e24070876 Li et al [2017] Li F, Zurada J, Liu Y, et al (2017) Input layer regularization of multilayer feedforward neural networks. IEEE Access PP:1–1. 10.1109/ACCESS.2017.2713389 Ochiai et al [2017] Ochiai T, Matsuda S, Watanabe H, et al (2017) Automatic node selection for deep neural networks using group lasso regularization. In: 2017 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), IEEE, pp 5485–5489 Rumelhart et al [1986] Rumelhart DE, Hinton GE, Williams RJ (1986) Learning representations by back-propagating errors. Nature 323:533–536. 10.1038/323533a0 Scardapane et al [2017] Scardapane S, Comminiello D, Hussain A, et al (2017) Group sparse regularization for deep neural networks. Neurocomputing 241:81–89 Sohn et al [2012] Sohn S, Jung B, Jhun M (2012) Permutation tests using least distance estimator in the multivariate regression model. 
Computational Statistics - COMPUTATION STAT 27:1–11. 10.1007/s00180-011-0247-3 Sze et al [2017] Sze V, Chen YH, Yang TJ, et al (2017) Efficient processing of deep neural networks: A tutorial and survey. Proceedings of the IEEE 105(12):2295–2329 Szekely et al [2008] Szekely G, Rizzo M, Bakirov N (2008) Measuring and testing dependence by correlation of distances. The Annals of Statistics 35. 10.1214/009053607000000505 Tibshirani [1996] Tibshirani R (1996) Regression shrinkage and selection via the lasso. Journal of the Royal Statistical Society: Series B (Methodological) 58(1):267–288 Wahba [1992] Wahba G (1992) Multivariate function and operator estimation, based on smoothing splines and reproducing kernels. In: SANTA FE INSTITUTE STUDIES IN THE SCIENCES OF COMPLEXITY-PROCEEDINGS VOLUME-, Addison-Wesley Publishing Co, pp 95–95 Wang and Leng [2008] Wang H, Leng C (2008) A note on adaptive group lasso. Computational Statistics and Data Analysis 52(12):5277–5286 Wang et al [2016] Wang J, Qingling C, Chang Q, et al (2016) Convergence analyses on sparse feedforward neural networks via group lasso regularization. Information Sciences 381. 10.1016/j.ins.2016.11.020 Wang and Cao [2022] Wang S, Cao G (2022) Robust deep neural network estimation for multi-dimensional functional data. Electronic Journal of Statistics 16(2):6461–6488 Yuan and Lin [2006] Yuan M, Lin Y (2006) Model selection and estimation in regression with grouped variables. Journal of the Royal Statistical Society: Series B (Statistical Methodology) 68(1):49–67 Zhang et al [2019] Zhang H, Wang J, Sun Z, et al (2019) Feature selection for neural networks using group lasso regularization. IEEE Transactions on Knowledge and Data Engineering PP:1–1. 10.1109/TKDE.2019.2893266 Zhao et al [2015] Zhao L, Hu Q, Wang W (2015) Heterogeneous feature selection with multi-modal deep neural networks and sparse group lasso. 
IEEE Transactions on Multimedia 17(11):1936–1948 Zou [2006] Zou H (2006) The adaptive lasso and its oracle properties. Journal of the American Statistical Association 101(476):1418–1429 Haldane JBS (1948) The theory of a cline. Journal of Genetics 48:277–284. 10.1007/BF02986626 Hornik [1991] Hornik K (1991) Approximation capabilities of multilayer feedforward networks. Neural Networks 4(2):251–257. https://doi.org/10.1016/0893-6080(91)90009-T Huber [1992] Huber PJ (1992) Robust estimation of a location parameter. In: Breakthroughs in statistics: Methodology and distribution. Springer, p 492–518 I-Cheng [2009] I-Cheng Y (2009) Concrete slump test data. Tech. rep., Department of Information Management, Chung-Hua University, URL https://archive.ics.uci.edu/ml/datasets/concrete+slump+test Imaizumi and Fukumizu [2019] Imaizumi M, Fukumizu K (2019) Deep neural networks learn non-smooth functions effectively. In: The 22nd international conference on artificial intelligence and statistics, PMLR, pp 869–878 Jhun and Choi [2009] Jhun M, Choi I (2009) Bootstrapping least distance estimator in the multivariate regression model. Computational Statistics and Data Analysis 53(12):4221–4227 Jiang et al [2022] Jiang C, Jiang C, Chen D, et al (2022) Densely connected neural networks for nonlinear regression. Entropy 24:876. 10.3390/e24070876 Li et al [2017] Li F, Zurada J, Liu Y, et al (2017) Input layer regularization of multilayer feedforward neural networks. IEEE Access PP:1–1. 10.1109/ACCESS.2017.2713389 Ochiai et al [2017] Ochiai T, Matsuda S, Watanabe H, et al (2017) Automatic node selection for deep neural networks using group lasso regularization. In: 2017 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), IEEE, pp 5485–5489 Rumelhart et al [1986] Rumelhart DE, Hinton GE, Williams RJ (1986) Learning representations by back-propagating errors. Nature 323:533–536. 
10.1038/323533a0 Scardapane et al [2017] Scardapane S, Comminiello D, Hussain A, et al (2017) Group sparse regularization for deep neural networks. Neurocomputing 241:81–89 Sohn et al [2012] Sohn S, Jung B, Jhun M (2012) Permutation tests using least distance estimator in the multivariate regression model. Computational Statistics - COMPUTATION STAT 27:1–11. 10.1007/s00180-011-0247-3 Sze et al [2017] Sze V, Chen YH, Yang TJ, et al (2017) Efficient processing of deep neural networks: A tutorial and survey. Proceedings of the IEEE 105(12):2295–2329 Szekely et al [2008] Szekely G, Rizzo M, Bakirov N (2008) Measuring and testing dependence by correlation of distances. The Annals of Statistics 35. 10.1214/009053607000000505 Tibshirani [1996] Tibshirani R (1996) Regression shrinkage and selection via the lasso. Journal of the Royal Statistical Society: Series B (Methodological) 58(1):267–288 Wahba [1992] Wahba G (1992) Multivariate function and operator estimation, based on smoothing splines and reproducing kernels. In: SANTA FE INSTITUTE STUDIES IN THE SCIENCES OF COMPLEXITY-PROCEEDINGS VOLUME-, Addison-Wesley Publishing Co, pp 95–95 Wang and Leng [2008] Wang H, Leng C (2008) A note on adaptive group lasso. Computational Statistics and Data Analysis 52(12):5277–5286 Wang et al [2016] Wang J, Qingling C, Chang Q, et al (2016) Convergence analyses on sparse feedforward neural networks via group lasso regularization. Information Sciences 381. 10.1016/j.ins.2016.11.020 Wang and Cao [2022] Wang S, Cao G (2022) Robust deep neural network estimation for multi-dimensional functional data. Electronic Journal of Statistics 16(2):6461–6488 Yuan and Lin [2006] Yuan M, Lin Y (2006) Model selection and estimation in regression with grouped variables. Journal of the Royal Statistical Society: Series B (Statistical Methodology) 68(1):49–67 Zhang et al [2019] Zhang H, Wang J, Sun Z, et al (2019) Feature selection for neural networks using group lasso regularization. 
IEEE Transactions on Knowledge and Data Engineering PP:1–1. 10.1109/TKDE.2019.2893266 Zhao et al [2015] Zhao L, Hu Q, Wang W (2015) Heterogeneous feature selection with multi-modal deep neural networks and sparse group lasso. IEEE Transactions on Multimedia 17(11):1936–1948 Zou [2006] Zou H (2006) The adaptive lasso and its oracle properties. Journal of the American Statistical Association 101(476):1418–1429 Hornik K (1991) Approximation capabilities of multilayer feedforward networks. Neural Networks 4(2):251–257. https://doi.org/10.1016/0893-6080(91)90009-T Huber [1992] Huber PJ (1992) Robust estimation of a location parameter. In: Breakthroughs in statistics: Methodology and distribution. Springer, p 492–518 I-Cheng [2009] I-Cheng Y (2009) Concrete slump test data. Tech. rep., Department of Information Management, Chung-Hua University, URL https://archive.ics.uci.edu/ml/datasets/concrete+slump+test Imaizumi and Fukumizu [2019] Imaizumi M, Fukumizu K (2019) Deep neural networks learn non-smooth functions effectively. In: The 22nd international conference on artificial intelligence and statistics, PMLR, pp 869–878 Jhun and Choi [2009] Jhun M, Choi I (2009) Bootstrapping least distance estimator in the multivariate regression model. Computational Statistics and Data Analysis 53(12):4221–4227 Jiang et al [2022] Jiang C, Jiang C, Chen D, et al (2022) Densely connected neural networks for nonlinear regression. Entropy 24:876. 10.3390/e24070876 Li et al [2017] Li F, Zurada J, Liu Y, et al (2017) Input layer regularization of multilayer feedforward neural networks. IEEE Access PP:1–1. 10.1109/ACCESS.2017.2713389 Ochiai et al [2017] Ochiai T, Matsuda S, Watanabe H, et al (2017) Automatic node selection for deep neural networks using group lasso regularization. 
In: 2017 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), IEEE, pp 5485–5489 Rumelhart et al [1986] Rumelhart DE, Hinton GE, Williams RJ (1986) Learning representations by back-propagating errors. Nature 323:533–536. 10.1038/323533a0 Scardapane et al [2017] Scardapane S, Comminiello D, Hussain A, et al (2017) Group sparse regularization for deep neural networks. Neurocomputing 241:81–89 Sohn et al [2012] Sohn S, Jung B, Jhun M (2012) Permutation tests using least distance estimator in the multivariate regression model. Computational Statistics - COMPUTATION STAT 27:1–11. 10.1007/s00180-011-0247-3 Sze et al [2017] Sze V, Chen YH, Yang TJ, et al (2017) Efficient processing of deep neural networks: A tutorial and survey. Proceedings of the IEEE 105(12):2295–2329 Szekely et al [2008] Szekely G, Rizzo M, Bakirov N (2008) Measuring and testing dependence by correlation of distances. The Annals of Statistics 35. 10.1214/009053607000000505 Tibshirani [1996] Tibshirani R (1996) Regression shrinkage and selection via the lasso. Journal of the Royal Statistical Society: Series B (Methodological) 58(1):267–288 Wahba [1992] Wahba G (1992) Multivariate function and operator estimation, based on smoothing splines and reproducing kernels. In: SANTA FE INSTITUTE STUDIES IN THE SCIENCES OF COMPLEXITY-PROCEEDINGS VOLUME-, Addison-Wesley Publishing Co, pp 95–95 Wang and Leng [2008] Wang H, Leng C (2008) A note on adaptive group lasso. Computational Statistics and Data Analysis 52(12):5277–5286 Wang et al [2016] Wang J, Qingling C, Chang Q, et al (2016) Convergence analyses on sparse feedforward neural networks via group lasso regularization. Information Sciences 381. 10.1016/j.ins.2016.11.020 Wang and Cao [2022] Wang S, Cao G (2022) Robust deep neural network estimation for multi-dimensional functional data. 
Electronic Journal of Statistics 16(2):6461–6488 Yuan and Lin [2006] Yuan M, Lin Y (2006) Model selection and estimation in regression with grouped variables. Journal of the Royal Statistical Society: Series B (Statistical Methodology) 68(1):49–67 Zhang et al [2019] Zhang H, Wang J, Sun Z, et al (2019) Feature selection for neural networks using group lasso regularization. IEEE Transactions on Knowledge and Data Engineering PP:1–1. 10.1109/TKDE.2019.2893266 Zhao et al [2015] Zhao L, Hu Q, Wang W (2015) Heterogeneous feature selection with multi-modal deep neural networks and sparse group lasso. IEEE Transactions on Multimedia 17(11):1936–1948 Zou [2006] Zou H (2006) The adaptive lasso and its oracle properties. Journal of the American Statistical Association 101(476):1418–1429 Huber PJ (1992) Robust estimation of a location parameter. In: Breakthroughs in statistics: Methodology and distribution. Springer, p 492–518 I-Cheng [2009] I-Cheng Y (2009) Concrete slump test data. Tech. rep., Department of Information Management, Chung-Hua University, URL https://archive.ics.uci.edu/ml/datasets/concrete+slump+test Imaizumi and Fukumizu [2019] Imaizumi M, Fukumizu K (2019) Deep neural networks learn non-smooth functions effectively. In: The 22nd international conference on artificial intelligence and statistics, PMLR, pp 869–878 Jhun and Choi [2009] Jhun M, Choi I (2009) Bootstrapping least distance estimator in the multivariate regression model. Computational Statistics and Data Analysis 53(12):4221–4227 Jiang et al [2022] Jiang C, Jiang C, Chen D, et al (2022) Densely connected neural networks for nonlinear regression. Entropy 24:876. 10.3390/e24070876 Li et al [2017] Li F, Zurada J, Liu Y, et al (2017) Input layer regularization of multilayer feedforward neural networks. IEEE Access PP:1–1. 10.1109/ACCESS.2017.2713389 Ochiai et al [2017] Ochiai T, Matsuda S, Watanabe H, et al (2017) Automatic node selection for deep neural networks using group lasso regularization. 
In: 2017 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), IEEE, pp 5485–5489 Rumelhart et al [1986] Rumelhart DE, Hinton GE, Williams RJ (1986) Learning representations by back-propagating errors. Nature 323:533–536. 10.1038/323533a0 Scardapane et al [2017] Scardapane S, Comminiello D, Hussain A, et al (2017) Group sparse regularization for deep neural networks. Neurocomputing 241:81–89 Sohn et al [2012] Sohn S, Jung B, Jhun M (2012) Permutation tests using least distance estimator in the multivariate regression model. Computational Statistics - COMPUTATION STAT 27:1–11. 10.1007/s00180-011-0247-3 Sze et al [2017] Sze V, Chen YH, Yang TJ, et al (2017) Efficient processing of deep neural networks: A tutorial and survey. Proceedings of the IEEE 105(12):2295–2329 Szekely et al [2008] Szekely G, Rizzo M, Bakirov N (2008) Measuring and testing dependence by correlation of distances. The Annals of Statistics 35. 10.1214/009053607000000505 Tibshirani [1996] Tibshirani R (1996) Regression shrinkage and selection via the lasso. Journal of the Royal Statistical Society: Series B (Methodological) 58(1):267–288 Wahba [1992] Wahba G (1992) Multivariate function and operator estimation, based on smoothing splines and reproducing kernels. In: SANTA FE INSTITUTE STUDIES IN THE SCIENCES OF COMPLEXITY-PROCEEDINGS VOLUME-, Addison-Wesley Publishing Co, pp 95–95 Wang and Leng [2008] Wang H, Leng C (2008) A note on adaptive group lasso. Computational Statistics and Data Analysis 52(12):5277–5286 Wang et al [2016] Wang J, Qingling C, Chang Q, et al (2016) Convergence analyses on sparse feedforward neural networks via group lasso regularization. Information Sciences 381. 10.1016/j.ins.2016.11.020 Wang and Cao [2022] Wang S, Cao G (2022) Robust deep neural network estimation for multi-dimensional functional data. 
Electronic Journal of Statistics 16(2):6461–6488 Yuan and Lin [2006] Yuan M, Lin Y (2006) Model selection and estimation in regression with grouped variables. Journal of the Royal Statistical Society: Series B (Statistical Methodology) 68(1):49–67 Zhang et al [2019] Zhang H, Wang J, Sun Z, et al (2019) Feature selection for neural networks using group lasso regularization. IEEE Transactions on Knowledge and Data Engineering PP:1–1. 10.1109/TKDE.2019.2893266 Zhao et al [2015] Zhao L, Hu Q, Wang W (2015) Heterogeneous feature selection with multi-modal deep neural networks and sparse group lasso. IEEE Transactions on Multimedia 17(11):1936–1948 Zou [2006] Zou H (2006) The adaptive lasso and its oracle properties. Journal of the American Statistical Association 101(476):1418–1429 I-Cheng Y (2009) Concrete slump test data. Tech. rep., Department of Information Management, Chung-Hua University, URL https://archive.ics.uci.edu/ml/datasets/concrete+slump+test Imaizumi and Fukumizu [2019] Imaizumi M, Fukumizu K (2019) Deep neural networks learn non-smooth functions effectively. In: The 22nd international conference on artificial intelligence and statistics, PMLR, pp 869–878 Jhun and Choi [2009] Jhun M, Choi I (2009) Bootstrapping least distance estimator in the multivariate regression model. Computational Statistics and Data Analysis 53(12):4221–4227 Jiang et al [2022] Jiang C, Jiang C, Chen D, et al (2022) Densely connected neural networks for nonlinear regression. Entropy 24:876. 10.3390/e24070876 Li et al [2017] Li F, Zurada J, Liu Y, et al (2017) Input layer regularization of multilayer feedforward neural networks. IEEE Access PP:1–1. 10.1109/ACCESS.2017.2713389 Ochiai et al [2017] Ochiai T, Matsuda S, Watanabe H, et al (2017) Automatic node selection for deep neural networks using group lasso regularization. 
In: 2017 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), IEEE, pp 5485–5489 Rumelhart et al [1986] Rumelhart DE, Hinton GE, Williams RJ (1986) Learning representations by back-propagating errors. Nature 323:533–536. 10.1038/323533a0 Scardapane et al [2017] Scardapane S, Comminiello D, Hussain A, et al (2017) Group sparse regularization for deep neural networks. Neurocomputing 241:81–89 Sohn et al [2012] Sohn S, Jung B, Jhun M (2012) Permutation tests using least distance estimator in the multivariate regression model. Computational Statistics - COMPUTATION STAT 27:1–11. 10.1007/s00180-011-0247-3 Sze et al [2017] Sze V, Chen YH, Yang TJ, et al (2017) Efficient processing of deep neural networks: A tutorial and survey. Proceedings of the IEEE 105(12):2295–2329 Szekely et al [2008] Szekely G, Rizzo M, Bakirov N (2008) Measuring and testing dependence by correlation of distances. The Annals of Statistics 35. 10.1214/009053607000000505 Tibshirani [1996] Tibshirani R (1996) Regression shrinkage and selection via the lasso. Journal of the Royal Statistical Society: Series B (Methodological) 58(1):267–288 Wahba [1992] Wahba G (1992) Multivariate function and operator estimation, based on smoothing splines and reproducing kernels. In: SANTA FE INSTITUTE STUDIES IN THE SCIENCES OF COMPLEXITY-PROCEEDINGS VOLUME-, Addison-Wesley Publishing Co, pp 95–95 Wang and Leng [2008] Wang H, Leng C (2008) A note on adaptive group lasso. Computational Statistics and Data Analysis 52(12):5277–5286 Wang et al [2016] Wang J, Qingling C, Chang Q, et al (2016) Convergence analyses on sparse feedforward neural networks via group lasso regularization. Information Sciences 381. 10.1016/j.ins.2016.11.020 Wang and Cao [2022] Wang S, Cao G (2022) Robust deep neural network estimation for multi-dimensional functional data. 
Electronic Journal of Statistics 16(2):6461–6488 Yuan and Lin [2006] Yuan M, Lin Y (2006) Model selection and estimation in regression with grouped variables. Journal of the Royal Statistical Society: Series B (Statistical Methodology) 68(1):49–67 Zhang et al [2019] Zhang H, Wang J, Sun Z, et al (2019) Feature selection for neural networks using group lasso regularization. IEEE Transactions on Knowledge and Data Engineering PP:1–1. 10.1109/TKDE.2019.2893266 Zhao et al [2015] Zhao L, Hu Q, Wang W (2015) Heterogeneous feature selection with multi-modal deep neural networks and sparse group lasso. IEEE Transactions on Multimedia 17(11):1936–1948 Zou [2006] Zou H (2006) The adaptive lasso and its oracle properties. Journal of the American Statistical Association 101(476):1418–1429 Imaizumi M, Fukumizu K (2019) Deep neural networks learn non-smooth functions effectively. In: The 22nd international conference on artificial intelligence and statistics, PMLR, pp 869–878 Jhun and Choi [2009] Jhun M, Choi I (2009) Bootstrapping least distance estimator in the multivariate regression model. Computational Statistics and Data Analysis 53(12):4221–4227 Jiang et al [2022] Jiang C, Jiang C, Chen D, et al (2022) Densely connected neural networks for nonlinear regression. Entropy 24:876. 10.3390/e24070876 Li et al [2017] Li F, Zurada J, Liu Y, et al (2017) Input layer regularization of multilayer feedforward neural networks. IEEE Access PP:1–1. 10.1109/ACCESS.2017.2713389 Ochiai et al [2017] Ochiai T, Matsuda S, Watanabe H, et al (2017) Automatic node selection for deep neural networks using group lasso regularization. In: 2017 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), IEEE, pp 5485–5489 Rumelhart et al [1986] Rumelhart DE, Hinton GE, Williams RJ (1986) Learning representations by back-propagating errors. Nature 323:533–536. 
10.1038/323533a0 Scardapane et al [2017] Scardapane S, Comminiello D, Hussain A, et al (2017) Group sparse regularization for deep neural networks. Neurocomputing 241:81–89 Sohn et al [2012] Sohn S, Jung B, Jhun M (2012) Permutation tests using least distance estimator in the multivariate regression model. Computational Statistics - COMPUTATION STAT 27:1–11. 10.1007/s00180-011-0247-3 Sze et al [2017] Sze V, Chen YH, Yang TJ, et al (2017) Efficient processing of deep neural networks: A tutorial and survey. Proceedings of the IEEE 105(12):2295–2329 Szekely et al [2008] Szekely G, Rizzo M, Bakirov N (2008) Measuring and testing dependence by correlation of distances. The Annals of Statistics 35. 10.1214/009053607000000505 Tibshirani [1996] Tibshirani R (1996) Regression shrinkage and selection via the lasso. Journal of the Royal Statistical Society: Series B (Methodological) 58(1):267–288 Wahba [1992] Wahba G (1992) Multivariate function and operator estimation, based on smoothing splines and reproducing kernels. In: SANTA FE INSTITUTE STUDIES IN THE SCIENCES OF COMPLEXITY-PROCEEDINGS VOLUME-, Addison-Wesley Publishing Co, pp 95–95 Wang and Leng [2008] Wang H, Leng C (2008) A note on adaptive group lasso. Computational Statistics and Data Analysis 52(12):5277–5286 Wang et al [2016] Wang J, Qingling C, Chang Q, et al (2016) Convergence analyses on sparse feedforward neural networks via group lasso regularization. Information Sciences 381. 10.1016/j.ins.2016.11.020 Wang and Cao [2022] Wang S, Cao G (2022) Robust deep neural network estimation for multi-dimensional functional data. Electronic Journal of Statistics 16(2):6461–6488 Yuan and Lin [2006] Yuan M, Lin Y (2006) Model selection and estimation in regression with grouped variables. Journal of the Royal Statistical Society: Series B (Statistical Methodology) 68(1):49–67 Zhang et al [2019] Zhang H, Wang J, Sun Z, et al (2019) Feature selection for neural networks using group lasso regularization. 
IEEE Transactions on Knowledge and Data Engineering PP:1–1. 10.1109/TKDE.2019.2893266 Zhao et al [2015] Zhao L, Hu Q, Wang W (2015) Heterogeneous feature selection with multi-modal deep neural networks and sparse group lasso. IEEE Transactions on Multimedia 17(11):1936–1948 Zou [2006] Zou H (2006) The adaptive lasso and its oracle properties. Journal of the American Statistical Association 101(476):1418–1429 Jhun M, Choi I (2009) Bootstrapping least distance estimator in the multivariate regression model. Computational Statistics and Data Analysis 53(12):4221–4227 Jiang et al [2022] Jiang C, Jiang C, Chen D, et al (2022) Densely connected neural networks for nonlinear regression. Entropy 24:876. 10.3390/e24070876 Li et al [2017] Li F, Zurada J, Liu Y, et al (2017) Input layer regularization of multilayer feedforward neural networks. IEEE Access PP:1–1. 10.1109/ACCESS.2017.2713389 Ochiai et al [2017] Ochiai T, Matsuda S, Watanabe H, et al (2017) Automatic node selection for deep neural networks using group lasso regularization. In: 2017 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), IEEE, pp 5485–5489 Rumelhart et al [1986] Rumelhart DE, Hinton GE, Williams RJ (1986) Learning representations by back-propagating errors. Nature 323:533–536. 10.1038/323533a0 Scardapane et al [2017] Scardapane S, Comminiello D, Hussain A, et al (2017) Group sparse regularization for deep neural networks. Neurocomputing 241:81–89 Sohn et al [2012] Sohn S, Jung B, Jhun M (2012) Permutation tests using least distance estimator in the multivariate regression model. Computational Statistics - COMPUTATION STAT 27:1–11. 10.1007/s00180-011-0247-3 Sze et al [2017] Sze V, Chen YH, Yang TJ, et al (2017) Efficient processing of deep neural networks: A tutorial and survey. Proceedings of the IEEE 105(12):2295–2329 Szekely et al [2008] Szekely G, Rizzo M, Bakirov N (2008) Measuring and testing dependence by correlation of distances. The Annals of Statistics 35. 
10.1214/009053607000000505 Tibshirani [1996] Tibshirani R (1996) Regression shrinkage and selection via the lasso. Journal of the Royal Statistical Society: Series B (Methodological) 58(1):267–288 Wahba [1992] Wahba G (1992) Multivariate function and operator estimation, based on smoothing splines and reproducing kernels. In: SANTA FE INSTITUTE STUDIES IN THE SCIENCES OF COMPLEXITY-PROCEEDINGS VOLUME-, Addison-Wesley Publishing Co, pp 95–95 Wang and Leng [2008] Wang H, Leng C (2008) A note on adaptive group lasso. Computational Statistics and Data Analysis 52(12):5277–5286 Wang et al [2016] Wang J, Qingling C, Chang Q, et al (2016) Convergence analyses on sparse feedforward neural networks via group lasso regularization. Information Sciences 381. 10.1016/j.ins.2016.11.020 Wang and Cao [2022] Wang S, Cao G (2022) Robust deep neural network estimation for multi-dimensional functional data. Electronic Journal of Statistics 16(2):6461–6488 Yuan and Lin [2006] Yuan M, Lin Y (2006) Model selection and estimation in regression with grouped variables. Journal of the Royal Statistical Society: Series B (Statistical Methodology) 68(1):49–67 Zhang et al [2019] Zhang H, Wang J, Sun Z, et al (2019) Feature selection for neural networks using group lasso regularization. IEEE Transactions on Knowledge and Data Engineering PP:1–1. 10.1109/TKDE.2019.2893266 Zhao et al [2015] Zhao L, Hu Q, Wang W (2015) Heterogeneous feature selection with multi-modal deep neural networks and sparse group lasso. IEEE Transactions on Multimedia 17(11):1936–1948 Zou [2006] Zou H (2006) The adaptive lasso and its oracle properties. Journal of the American Statistical Association 101(476):1418–1429 Jiang C, Jiang C, Chen D, et al (2022) Densely connected neural networks for nonlinear regression. Entropy 24:876. 10.3390/e24070876 Li et al [2017] Li F, Zurada J, Liu Y, et al (2017) Input layer regularization of multilayer feedforward neural networks. IEEE Access PP:1–1. 
10.1109/ACCESS.2017.2713389 Ochiai et al [2017] Ochiai T, Matsuda S, Watanabe H, et al (2017) Automatic node selection for deep neural networks using group lasso regularization. In: 2017 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), IEEE, pp 5485–5489 Rumelhart et al [1986] Rumelhart DE, Hinton GE, Williams RJ (1986) Learning representations by back-propagating errors. Nature 323:533–536. 10.1038/323533a0 Scardapane et al [2017] Scardapane S, Comminiello D, Hussain A, et al (2017) Group sparse regularization for deep neural networks. Neurocomputing 241:81–89 Sohn et al [2012] Sohn S, Jung B, Jhun M (2012) Permutation tests using least distance estimator in the multivariate regression model. Computational Statistics - COMPUTATION STAT 27:1–11. 10.1007/s00180-011-0247-3 Sze et al [2017] Sze V, Chen YH, Yang TJ, et al (2017) Efficient processing of deep neural networks: A tutorial and survey. Proceedings of the IEEE 105(12):2295–2329 Szekely et al [2008] Szekely G, Rizzo M, Bakirov N (2008) Measuring and testing dependence by correlation of distances. The Annals of Statistics 35. 10.1214/009053607000000505 Tibshirani [1996] Tibshirani R (1996) Regression shrinkage and selection via the lasso. Journal of the Royal Statistical Society: Series B (Methodological) 58(1):267–288 Wahba [1992] Wahba G (1992) Multivariate function and operator estimation, based on smoothing splines and reproducing kernels. In: SANTA FE INSTITUTE STUDIES IN THE SCIENCES OF COMPLEXITY-PROCEEDINGS VOLUME-, Addison-Wesley Publishing Co, pp 95–95 Wang and Leng [2008] Wang H, Leng C (2008) A note on adaptive group lasso. Computational Statistics and Data Analysis 52(12):5277–5286 Wang et al [2016] Wang J, Qingling C, Chang Q, et al (2016) Convergence analyses on sparse feedforward neural networks via group lasso regularization. Information Sciences 381. 
10.1016/j.ins.2016.11.020 Wang and Cao [2022] Wang S, Cao G (2022) Robust deep neural network estimation for multi-dimensional functional data. Electronic Journal of Statistics 16(2):6461–6488 Yuan and Lin [2006] Yuan M, Lin Y (2006) Model selection and estimation in regression with grouped variables. Journal of the Royal Statistical Society: Series B (Statistical Methodology) 68(1):49–67 Zhang et al [2019] Zhang H, Wang J, Sun Z, et al (2019) Feature selection for neural networks using group lasso regularization. IEEE Transactions on Knowledge and Data Engineering PP:1–1. 10.1109/TKDE.2019.2893266 Zhao et al [2015] Zhao L, Hu Q, Wang W (2015) Heterogeneous feature selection with multi-modal deep neural networks and sparse group lasso. IEEE Transactions on Multimedia 17(11):1936–1948 Zou [2006] Zou H (2006) The adaptive lasso and its oracle properties. Journal of the American Statistical Association 101(476):1418–1429 Li F, Zurada J, Liu Y, et al (2017) Input layer regularization of multilayer feedforward neural networks. IEEE Access PP:1–1. 10.1109/ACCESS.2017.2713389 Ochiai et al [2017] Ochiai T, Matsuda S, Watanabe H, et al (2017) Automatic node selection for deep neural networks using group lasso regularization. In: 2017 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), IEEE, pp 5485–5489 Rumelhart et al [1986] Rumelhart DE, Hinton GE, Williams RJ (1986) Learning representations by back-propagating errors. Nature 323:533–536. 10.1038/323533a0 Scardapane et al [2017] Scardapane S, Comminiello D, Hussain A, et al (2017) Group sparse regularization for deep neural networks. Neurocomputing 241:81–89 Sohn et al [2012] Sohn S, Jung B, Jhun M (2012) Permutation tests using least distance estimator in the multivariate regression model. Computational Statistics - COMPUTATION STAT 27:1–11. 10.1007/s00180-011-0247-3 Sze et al [2017] Sze V, Chen YH, Yang TJ, et al (2017) Efficient processing of deep neural networks: A tutorial and survey. 
Proceedings of the IEEE 105(12):2295–2329 Szekely et al [2008] Szekely G, Rizzo M, Bakirov N (2008) Measuring and testing dependence by correlation of distances. The Annals of Statistics 35. 10.1214/009053607000000505 Tibshirani [1996] Tibshirani R (1996) Regression shrinkage and selection via the lasso. Journal of the Royal Statistical Society: Series B (Methodological) 58(1):267–288 Wahba [1992] Wahba G (1992) Multivariate function and operator estimation, based on smoothing splines and reproducing kernels. In: SANTA FE INSTITUTE STUDIES IN THE SCIENCES OF COMPLEXITY-PROCEEDINGS VOLUME-, Addison-Wesley Publishing Co, pp 95–95 Wang and Leng [2008] Wang H, Leng C (2008) A note on adaptive group lasso. Computational Statistics and Data Analysis 52(12):5277–5286 Wang et al [2016] Wang J, Qingling C, Chang Q, et al (2016) Convergence analyses on sparse feedforward neural networks via group lasso regularization. Information Sciences 381. 10.1016/j.ins.2016.11.020 Wang and Cao [2022] Wang S, Cao G (2022) Robust deep neural network estimation for multi-dimensional functional data. Electronic Journal of Statistics 16(2):6461–6488 Yuan and Lin [2006] Yuan M, Lin Y (2006) Model selection and estimation in regression with grouped variables. Journal of the Royal Statistical Society: Series B (Statistical Methodology) 68(1):49–67 Zhang et al [2019] Zhang H, Wang J, Sun Z, et al (2019) Feature selection for neural networks using group lasso regularization. IEEE Transactions on Knowledge and Data Engineering PP:1–1. 10.1109/TKDE.2019.2893266 Zhao et al [2015] Zhao L, Hu Q, Wang W (2015) Heterogeneous feature selection with multi-modal deep neural networks and sparse group lasso. IEEE Transactions on Multimedia 17(11):1936–1948 Zou [2006] Zou H (2006) The adaptive lasso and its oracle properties. 
Journal of the American Statistical Association 101(476):1418–1429 Ochiai T, Matsuda S, Watanabe H, et al (2017) Automatic node selection for deep neural networks using group lasso regularization. In: 2017 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), IEEE, pp 5485–5489 Rumelhart et al [1986] Rumelhart DE, Hinton GE, Williams RJ (1986) Learning representations by back-propagating errors. Nature 323:533–536. 10.1038/323533a0 Scardapane et al [2017] Scardapane S, Comminiello D, Hussain A, et al (2017) Group sparse regularization for deep neural networks. Neurocomputing 241:81–89 Sohn et al [2012] Sohn S, Jung B, Jhun M (2012) Permutation tests using least distance estimator in the multivariate regression model. Computational Statistics - COMPUTATION STAT 27:1–11. 10.1007/s00180-011-0247-3 Sze et al [2017] Sze V, Chen YH, Yang TJ, et al (2017) Efficient processing of deep neural networks: A tutorial and survey. Proceedings of the IEEE 105(12):2295–2329 Szekely et al [2008] Szekely G, Rizzo M, Bakirov N (2008) Measuring and testing dependence by correlation of distances. The Annals of Statistics 35. 10.1214/009053607000000505 Tibshirani [1996] Tibshirani R (1996) Regression shrinkage and selection via the lasso. Journal of the Royal Statistical Society: Series B (Methodological) 58(1):267–288 Wahba [1992] Wahba G (1992) Multivariate function and operator estimation, based on smoothing splines and reproducing kernels. In: SANTA FE INSTITUTE STUDIES IN THE SCIENCES OF COMPLEXITY-PROCEEDINGS VOLUME-, Addison-Wesley Publishing Co, pp 95–95 Wang and Leng [2008] Wang H, Leng C (2008) A note on adaptive group lasso. Computational Statistics and Data Analysis 52(12):5277–5286 Wang et al [2016] Wang J, Qingling C, Chang Q, et al (2016) Convergence analyses on sparse feedforward neural networks via group lasso regularization. Information Sciences 381. 
10.1016/j.ins.2016.11.020 Wang and Cao [2022] Wang S, Cao G (2022) Robust deep neural network estimation for multi-dimensional functional data. Electronic Journal of Statistics 16(2):6461–6488 Yuan and Lin [2006] Yuan M, Lin Y (2006) Model selection and estimation in regression with grouped variables. Journal of the Royal Statistical Society: Series B (Statistical Methodology) 68(1):49–67 Zhang et al [2019] Zhang H, Wang J, Sun Z, et al (2019) Feature selection for neural networks using group lasso regularization. IEEE Transactions on Knowledge and Data Engineering PP:1–1. 10.1109/TKDE.2019.2893266 Zhao et al [2015] Zhao L, Hu Q, Wang W (2015) Heterogeneous feature selection with multi-modal deep neural networks and sparse group lasso. IEEE Transactions on Multimedia 17(11):1936–1948 Zou [2006] Zou H (2006) The adaptive lasso and its oracle properties. Journal of the American Statistical Association 101(476):1418–1429 Rumelhart DE, Hinton GE, Williams RJ (1986) Learning representations by back-propagating errors. Nature 323:533–536. 10.1038/323533a0 Scardapane et al [2017] Scardapane S, Comminiello D, Hussain A, et al (2017) Group sparse regularization for deep neural networks. Neurocomputing 241:81–89 Sohn et al [2012] Sohn S, Jung B, Jhun M (2012) Permutation tests using least distance estimator in the multivariate regression model. Computational Statistics - COMPUTATION STAT 27:1–11. 10.1007/s00180-011-0247-3 Sze et al [2017] Sze V, Chen YH, Yang TJ, et al (2017) Efficient processing of deep neural networks: A tutorial and survey. Proceedings of the IEEE 105(12):2295–2329 Szekely et al [2008] Szekely G, Rizzo M, Bakirov N (2008) Measuring and testing dependence by correlation of distances. The Annals of Statistics 35. 10.1214/009053607000000505 Tibshirani [1996] Tibshirani R (1996) Regression shrinkage and selection via the lasso. 
Journal of the Royal Statistical Society: Series B (Methodological) 58(1):267–288 Wahba [1992] Wahba G (1992) Multivariate function and operator estimation, based on smoothing splines and reproducing kernels. In: SANTA FE INSTITUTE STUDIES IN THE SCIENCES OF COMPLEXITY-PROCEEDINGS VOLUME-, Addison-Wesley Publishing Co, pp 95–95 Wang and Leng [2008] Wang H, Leng C (2008) A note on adaptive group lasso. Computational Statistics and Data Analysis 52(12):5277–5286 Wang et al [2016] Wang J, Qingling C, Chang Q, et al (2016) Convergence analyses on sparse feedforward neural networks via group lasso regularization. Information Sciences 381. 10.1016/j.ins.2016.11.020 Wang and Cao [2022] Wang S, Cao G (2022) Robust deep neural network estimation for multi-dimensional functional data. Electronic Journal of Statistics 16(2):6461–6488 Yuan and Lin [2006] Yuan M, Lin Y (2006) Model selection and estimation in regression with grouped variables. Journal of the Royal Statistical Society: Series B (Statistical Methodology) 68(1):49–67 Zhang et al [2019] Zhang H, Wang J, Sun Z, et al (2019) Feature selection for neural networks using group lasso regularization. IEEE Transactions on Knowledge and Data Engineering PP:1–1. 10.1109/TKDE.2019.2893266 Zhao et al [2015] Zhao L, Hu Q, Wang W (2015) Heterogeneous feature selection with multi-modal deep neural networks and sparse group lasso. IEEE Transactions on Multimedia 17(11):1936–1948 Zou [2006] Zou H (2006) The adaptive lasso and its oracle properties. Journal of the American Statistical Association 101(476):1418–1429 Scardapane S, Comminiello D, Hussain A, et al (2017) Group sparse regularization for deep neural networks. Neurocomputing 241:81–89 Sohn et al [2012] Sohn S, Jung B, Jhun M (2012) Permutation tests using least distance estimator in the multivariate regression model. Computational Statistics - COMPUTATION STAT 27:1–11. 
10.1007/s00180-011-0247-3 Sze et al [2017] Sze V, Chen YH, Yang TJ, et al (2017) Efficient processing of deep neural networks: A tutorial and survey. Proceedings of the IEEE 105(12):2295–2329 Szekely et al [2008] Szekely G, Rizzo M, Bakirov N (2008) Measuring and testing dependence by correlation of distances. The Annals of Statistics 35. 10.1214/009053607000000505 Tibshirani [1996] Tibshirani R (1996) Regression shrinkage and selection via the lasso. Journal of the Royal Statistical Society: Series B (Methodological) 58(1):267–288 Wahba [1992] Wahba G (1992) Multivariate function and operator estimation, based on smoothing splines and reproducing kernels. In: SANTA FE INSTITUTE STUDIES IN THE SCIENCES OF COMPLEXITY-PROCEEDINGS VOLUME-, Addison-Wesley Publishing Co, pp 95–95 Wang and Leng [2008] Wang H, Leng C (2008) A note on adaptive group lasso. Computational Statistics and Data Analysis 52(12):5277–5286 Wang et al [2016] Wang J, Qingling C, Chang Q, et al (2016) Convergence analyses on sparse feedforward neural networks via group lasso regularization. Information Sciences 381. 10.1016/j.ins.2016.11.020 Wang and Cao [2022] Wang S, Cao G (2022) Robust deep neural network estimation for multi-dimensional functional data. Electronic Journal of Statistics 16(2):6461–6488 Yuan and Lin [2006] Yuan M, Lin Y (2006) Model selection and estimation in regression with grouped variables. Journal of the Royal Statistical Society: Series B (Statistical Methodology) 68(1):49–67 Zhang et al [2019] Zhang H, Wang J, Sun Z, et al (2019) Feature selection for neural networks using group lasso regularization. IEEE Transactions on Knowledge and Data Engineering PP:1–1. 10.1109/TKDE.2019.2893266 Zhao et al [2015] Zhao L, Hu Q, Wang W (2015) Heterogeneous feature selection with multi-modal deep neural networks and sparse group lasso. IEEE Transactions on Multimedia 17(11):1936–1948 Zou [2006] Zou H (2006) The adaptive lasso and its oracle properties. 
Journal of the American Statistical Association 101(476):1418–1429 Sohn S, Jung B, Jhun M (2012) Permutation tests using least distance estimator in the multivariate regression model. Computational Statistics - COMPUTATION STAT 27:1–11. 10.1007/s00180-011-0247-3 Sze et al [2017] Sze V, Chen YH, Yang TJ, et al (2017) Efficient processing of deep neural networks: A tutorial and survey. Proceedings of the IEEE 105(12):2295–2329 Szekely et al [2008] Szekely G, Rizzo M, Bakirov N (2008) Measuring and testing dependence by correlation of distances. The Annals of Statistics 35. 10.1214/009053607000000505 Tibshirani [1996] Tibshirani R (1996) Regression shrinkage and selection via the lasso. Journal of the Royal Statistical Society: Series B (Methodological) 58(1):267–288 Wahba [1992] Wahba G (1992) Multivariate function and operator estimation, based on smoothing splines and reproducing kernels. In: SANTA FE INSTITUTE STUDIES IN THE SCIENCES OF COMPLEXITY-PROCEEDINGS VOLUME-, Addison-Wesley Publishing Co, pp 95–95 Wang and Leng [2008] Wang H, Leng C (2008) A note on adaptive group lasso. Computational Statistics and Data Analysis 52(12):5277–5286 Wang et al [2016] Wang J, Qingling C, Chang Q, et al (2016) Convergence analyses on sparse feedforward neural networks via group lasso regularization. Information Sciences 381. 10.1016/j.ins.2016.11.020 Wang and Cao [2022] Wang S, Cao G (2022) Robust deep neural network estimation for multi-dimensional functional data. Electronic Journal of Statistics 16(2):6461–6488 Yuan and Lin [2006] Yuan M, Lin Y (2006) Model selection and estimation in regression with grouped variables. Journal of the Royal Statistical Society: Series B (Statistical Methodology) 68(1):49–67 Zhang et al [2019] Zhang H, Wang J, Sun Z, et al (2019) Feature selection for neural networks using group lasso regularization. IEEE Transactions on Knowledge and Data Engineering PP:1–1. 
Computational Statistics and Data Analysis 53(12):4221–4227 Jiang et al [2022] Jiang C, Jiang C, Chen D, et al (2022) Densely connected neural networks for nonlinear regression. Entropy 24:876. 10.3390/e24070876 Li et al [2017] Li F, Zurada J, Liu Y, et al (2017) Input layer regularization of multilayer feedforward neural networks. IEEE Access PP:1–1. 10.1109/ACCESS.2017.2713389 Ochiai et al [2017] Ochiai T, Matsuda S, Watanabe H, et al (2017) Automatic node selection for deep neural networks using group lasso regularization. In: 2017 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), IEEE, pp 5485–5489 Rumelhart et al [1986] Rumelhart DE, Hinton GE, Williams RJ (1986) Learning representations by back-propagating errors. Nature 323:533–536. 10.1038/323533a0 Scardapane et al [2017] Scardapane S, Comminiello D, Hussain A, et al (2017) Group sparse regularization for deep neural networks. Neurocomputing 241:81–89 Sohn et al [2012] Sohn S, Jung B, Jhun M (2012) Permutation tests using least distance estimator in the multivariate regression model. Computational Statistics - COMPUTATION STAT 27:1–11. 10.1007/s00180-011-0247-3 Sze et al [2017] Sze V, Chen YH, Yang TJ, et al (2017) Efficient processing of deep neural networks: A tutorial and survey. Proceedings of the IEEE 105(12):2295–2329 Szekely et al [2008] Szekely G, Rizzo M, Bakirov N (2008) Measuring and testing dependence by correlation of distances. The Annals of Statistics 35. 10.1214/009053607000000505 Tibshirani [1996] Tibshirani R (1996) Regression shrinkage and selection via the lasso. Journal of the Royal Statistical Society: Series B (Methodological) 58(1):267–288 Wahba [1992] Wahba G (1992) Multivariate function and operator estimation, based on smoothing splines and reproducing kernels. 
In: SANTA FE INSTITUTE STUDIES IN THE SCIENCES OF COMPLEXITY-PROCEEDINGS VOLUME-, Addison-Wesley Publishing Co, pp 95–95 Wang and Leng [2008] Wang H, Leng C (2008) A note on adaptive group lasso. Computational Statistics and Data Analysis 52(12):5277–5286 Wang et al [2016] Wang J, Qingling C, Chang Q, et al (2016) Convergence analyses on sparse feedforward neural networks via group lasso regularization. Information Sciences 381. 10.1016/j.ins.2016.11.020 Wang and Cao [2022] Wang S, Cao G (2022) Robust deep neural network estimation for multi-dimensional functional data. Electronic Journal of Statistics 16(2):6461–6488 Yuan and Lin [2006] Yuan M, Lin Y (2006) Model selection and estimation in regression with grouped variables. Journal of the Royal Statistical Society: Series B (Statistical Methodology) 68(1):49–67 Zhang et al [2019] Zhang H, Wang J, Sun Z, et al (2019) Feature selection for neural networks using group lasso regularization. IEEE Transactions on Knowledge and Data Engineering PP:1–1. 10.1109/TKDE.2019.2893266 Zhao et al [2015] Zhao L, Hu Q, Wang W (2015) Heterogeneous feature selection with multi-modal deep neural networks and sparse group lasso. IEEE Transactions on Multimedia 17(11):1936–1948 Zou [2006] Zou H (2006) The adaptive lasso and its oracle properties. Journal of the American Statistical Association 101(476):1418–1429 Jhun M, Choi I (2009) Bootstrapping least distance estimator in the multivariate regression model. Computational Statistics and Data Analysis 53(12):4221–4227 Jiang et al [2022] Jiang C, Jiang C, Chen D, et al (2022) Densely connected neural networks for nonlinear regression. Entropy 24:876. 10.3390/e24070876 Li et al [2017] Li F, Zurada J, Liu Y, et al (2017) Input layer regularization of multilayer feedforward neural networks. IEEE Access PP:1–1. 10.1109/ACCESS.2017.2713389 Ochiai et al [2017] Ochiai T, Matsuda S, Watanabe H, et al (2017) Automatic node selection for deep neural networks using group lasso regularization. 
In: 2017 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), IEEE, pp 5485–5489 Rumelhart et al [1986] Rumelhart DE, Hinton GE, Williams RJ (1986) Learning representations by back-propagating errors. Nature 323:533–536. 10.1038/323533a0 Scardapane et al [2017] Scardapane S, Comminiello D, Hussain A, et al (2017) Group sparse regularization for deep neural networks. Neurocomputing 241:81–89 Sohn et al [2012] Sohn S, Jung B, Jhun M (2012) Permutation tests using least distance estimator in the multivariate regression model. Computational Statistics - COMPUTATION STAT 27:1–11. 10.1007/s00180-011-0247-3 Sze et al [2017] Sze V, Chen YH, Yang TJ, et al (2017) Efficient processing of deep neural networks: A tutorial and survey. Proceedings of the IEEE 105(12):2295–2329 Szekely et al [2008] Szekely G, Rizzo M, Bakirov N (2008) Measuring and testing dependence by correlation of distances. The Annals of Statistics 35. 10.1214/009053607000000505 Tibshirani [1996] Tibshirani R (1996) Regression shrinkage and selection via the lasso. Journal of the Royal Statistical Society: Series B (Methodological) 58(1):267–288 Wahba [1992] Wahba G (1992) Multivariate function and operator estimation, based on smoothing splines and reproducing kernels. In: SANTA FE INSTITUTE STUDIES IN THE SCIENCES OF COMPLEXITY-PROCEEDINGS VOLUME-, Addison-Wesley Publishing Co, pp 95–95 Wang and Leng [2008] Wang H, Leng C (2008) A note on adaptive group lasso. Computational Statistics and Data Analysis 52(12):5277–5286 Wang et al [2016] Wang J, Qingling C, Chang Q, et al (2016) Convergence analyses on sparse feedforward neural networks via group lasso regularization. Information Sciences 381. 10.1016/j.ins.2016.11.020 Wang and Cao [2022] Wang S, Cao G (2022) Robust deep neural network estimation for multi-dimensional functional data. 
Electronic Journal of Statistics 16(2):6461–6488 Yuan and Lin [2006] Yuan M, Lin Y (2006) Model selection and estimation in regression with grouped variables. Journal of the Royal Statistical Society: Series B (Statistical Methodology) 68(1):49–67 Zhang et al [2019] Zhang H, Wang J, Sun Z, et al (2019) Feature selection for neural networks using group lasso regularization. IEEE Transactions on Knowledge and Data Engineering PP:1–1. 10.1109/TKDE.2019.2893266 Zhao et al [2015] Zhao L, Hu Q, Wang W (2015) Heterogeneous feature selection with multi-modal deep neural networks and sparse group lasso. IEEE Transactions on Multimedia 17(11):1936–1948 Zou [2006] Zou H (2006) The adaptive lasso and its oracle properties. Journal of the American Statistical Association 101(476):1418–1429 Jiang C, Jiang C, Chen D, et al (2022) Densely connected neural networks for nonlinear regression. Entropy 24:876. 10.3390/e24070876 Li et al [2017] Li F, Zurada J, Liu Y, et al (2017) Input layer regularization of multilayer feedforward neural networks. IEEE Access PP:1–1. 10.1109/ACCESS.2017.2713389 Ochiai et al [2017] Ochiai T, Matsuda S, Watanabe H, et al (2017) Automatic node selection for deep neural networks using group lasso regularization. In: 2017 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), IEEE, pp 5485–5489 Rumelhart et al [1986] Rumelhart DE, Hinton GE, Williams RJ (1986) Learning representations by back-propagating errors. Nature 323:533–536. 10.1038/323533a0 Scardapane et al [2017] Scardapane S, Comminiello D, Hussain A, et al (2017) Group sparse regularization for deep neural networks. Neurocomputing 241:81–89 Sohn et al [2012] Sohn S, Jung B, Jhun M (2012) Permutation tests using least distance estimator in the multivariate regression model. Computational Statistics - COMPUTATION STAT 27:1–11. 
10.1007/s00180-011-0247-3 Sze et al [2017] Sze V, Chen YH, Yang TJ, et al (2017) Efficient processing of deep neural networks: A tutorial and survey. Proceedings of the IEEE 105(12):2295–2329 Szekely et al [2008] Szekely G, Rizzo M, Bakirov N (2008) Measuring and testing dependence by correlation of distances. The Annals of Statistics 35. 10.1214/009053607000000505 Tibshirani [1996] Tibshirani R (1996) Regression shrinkage and selection via the lasso. Journal of the Royal Statistical Society: Series B (Methodological) 58(1):267–288 Wahba [1992] Wahba G (1992) Multivariate function and operator estimation, based on smoothing splines and reproducing kernels. In: SANTA FE INSTITUTE STUDIES IN THE SCIENCES OF COMPLEXITY-PROCEEDINGS VOLUME-, Addison-Wesley Publishing Co, pp 95–95 Wang and Leng [2008] Wang H, Leng C (2008) A note on adaptive group lasso. Computational Statistics and Data Analysis 52(12):5277–5286 Wang et al [2016] Wang J, Qingling C, Chang Q, et al (2016) Convergence analyses on sparse feedforward neural networks via group lasso regularization. Information Sciences 381. 10.1016/j.ins.2016.11.020 Wang and Cao [2022] Wang S, Cao G (2022) Robust deep neural network estimation for multi-dimensional functional data. Electronic Journal of Statistics 16(2):6461–6488 Yuan and Lin [2006] Yuan M, Lin Y (2006) Model selection and estimation in regression with grouped variables. Journal of the Royal Statistical Society: Series B (Statistical Methodology) 68(1):49–67 Zhang et al [2019] Zhang H, Wang J, Sun Z, et al (2019) Feature selection for neural networks using group lasso regularization. IEEE Transactions on Knowledge and Data Engineering PP:1–1. 10.1109/TKDE.2019.2893266 Zhao et al [2015] Zhao L, Hu Q, Wang W (2015) Heterogeneous feature selection with multi-modal deep neural networks and sparse group lasso. IEEE Transactions on Multimedia 17(11):1936–1948 Zou [2006] Zou H (2006) The adaptive lasso and its oracle properties. 
Journal of the American Statistical Association 101(476):1418–1429 Li F, Zurada J, Liu Y, et al (2017) Input layer regularization of multilayer feedforward neural networks. IEEE Access PP:1–1. 10.1109/ACCESS.2017.2713389 Ochiai et al [2017] Ochiai T, Matsuda S, Watanabe H, et al (2017) Automatic node selection for deep neural networks using group lasso regularization. In: 2017 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), IEEE, pp 5485–5489 Rumelhart et al [1986] Rumelhart DE, Hinton GE, Williams RJ (1986) Learning representations by back-propagating errors. Nature 323:533–536. 10.1038/323533a0 Scardapane et al [2017] Scardapane S, Comminiello D, Hussain A, et al (2017) Group sparse regularization for deep neural networks. Neurocomputing 241:81–89 Sohn et al [2012] Sohn S, Jung B, Jhun M (2012) Permutation tests using least distance estimator in the multivariate regression model. Computational Statistics - COMPUTATION STAT 27:1–11. 10.1007/s00180-011-0247-3 Sze et al [2017] Sze V, Chen YH, Yang TJ, et al (2017) Efficient processing of deep neural networks: A tutorial and survey. Proceedings of the IEEE 105(12):2295–2329 Szekely et al [2008] Szekely G, Rizzo M, Bakirov N (2008) Measuring and testing dependence by correlation of distances. The Annals of Statistics 35. 10.1214/009053607000000505 Tibshirani [1996] Tibshirani R (1996) Regression shrinkage and selection via the lasso. Journal of the Royal Statistical Society: Series B (Methodological) 58(1):267–288 Wahba [1992] Wahba G (1992) Multivariate function and operator estimation, based on smoothing splines and reproducing kernels. In: SANTA FE INSTITUTE STUDIES IN THE SCIENCES OF COMPLEXITY-PROCEEDINGS VOLUME-, Addison-Wesley Publishing Co, pp 95–95 Wang and Leng [2008] Wang H, Leng C (2008) A note on adaptive group lasso. 
Computational Statistics and Data Analysis 52(12):5277–5286 Wang et al [2016] Wang J, Qingling C, Chang Q, et al (2016) Convergence analyses on sparse feedforward neural networks via group lasso regularization. Information Sciences 381. 10.1016/j.ins.2016.11.020 Wang and Cao [2022] Wang S, Cao G (2022) Robust deep neural network estimation for multi-dimensional functional data. Electronic Journal of Statistics 16(2):6461–6488 Yuan and Lin [2006] Yuan M, Lin Y (2006) Model selection and estimation in regression with grouped variables. Journal of the Royal Statistical Society: Series B (Statistical Methodology) 68(1):49–67 Zhang et al [2019] Zhang H, Wang J, Sun Z, et al (2019) Feature selection for neural networks using group lasso regularization. IEEE Transactions on Knowledge and Data Engineering PP:1–1. 10.1109/TKDE.2019.2893266 Zhao et al [2015] Zhao L, Hu Q, Wang W (2015) Heterogeneous feature selection with multi-modal deep neural networks and sparse group lasso. IEEE Transactions on Multimedia 17(11):1936–1948 Zou [2006] Zou H (2006) The adaptive lasso and its oracle properties. Journal of the American Statistical Association 101(476):1418–1429 Ochiai T, Matsuda S, Watanabe H, et al (2017) Automatic node selection for deep neural networks using group lasso regularization. In: 2017 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), IEEE, pp 5485–5489 Rumelhart et al [1986] Rumelhart DE, Hinton GE, Williams RJ (1986) Learning representations by back-propagating errors. Nature 323:533–536. 10.1038/323533a0 Scardapane et al [2017] Scardapane S, Comminiello D, Hussain A, et al (2017) Group sparse regularization for deep neural networks. Neurocomputing 241:81–89 Sohn et al [2012] Sohn S, Jung B, Jhun M (2012) Permutation tests using least distance estimator in the multivariate regression model. Computational Statistics - COMPUTATION STAT 27:1–11. 
10.1007/s00180-011-0247-3 Sze et al [2017] Sze V, Chen YH, Yang TJ, et al (2017) Efficient processing of deep neural networks: A tutorial and survey. Proceedings of the IEEE 105(12):2295–2329 Szekely et al [2008] Szekely G, Rizzo M, Bakirov N (2008) Measuring and testing dependence by correlation of distances. The Annals of Statistics 35. 10.1214/009053607000000505 Tibshirani [1996] Tibshirani R (1996) Regression shrinkage and selection via the lasso. Journal of the Royal Statistical Society: Series B (Methodological) 58(1):267–288 Wahba [1992] Wahba G (1992) Multivariate function and operator estimation, based on smoothing splines and reproducing kernels. In: SANTA FE INSTITUTE STUDIES IN THE SCIENCES OF COMPLEXITY-PROCEEDINGS VOLUME-, Addison-Wesley Publishing Co, pp 95–95 Wang and Leng [2008] Wang H, Leng C (2008) A note on adaptive group lasso. Computational Statistics and Data Analysis 52(12):5277–5286 Wang et al [2016] Wang J, Qingling C, Chang Q, et al (2016) Convergence analyses on sparse feedforward neural networks via group lasso regularization. Information Sciences 381. 10.1016/j.ins.2016.11.020 Wang and Cao [2022] Wang S, Cao G (2022) Robust deep neural network estimation for multi-dimensional functional data. Electronic Journal of Statistics 16(2):6461–6488 Yuan and Lin [2006] Yuan M, Lin Y (2006) Model selection and estimation in regression with grouped variables. Journal of the Royal Statistical Society: Series B (Statistical Methodology) 68(1):49–67 Zhang et al [2019] Zhang H, Wang J, Sun Z, et al (2019) Feature selection for neural networks using group lasso regularization. IEEE Transactions on Knowledge and Data Engineering PP:1–1. 10.1109/TKDE.2019.2893266 Zhao et al [2015] Zhao L, Hu Q, Wang W (2015) Heterogeneous feature selection with multi-modal deep neural networks and sparse group lasso. IEEE Transactions on Multimedia 17(11):1936–1948 Zou [2006] Zou H (2006) The adaptive lasso and its oracle properties. 
Journal of the American Statistical Association 101(476):1418–1429 Rumelhart DE, Hinton GE, Williams RJ (1986) Learning representations by back-propagating errors. Nature 323:533–536. 10.1038/323533a0 Scardapane et al [2017] Scardapane S, Comminiello D, Hussain A, et al (2017) Group sparse regularization for deep neural networks. Neurocomputing 241:81–89 Sohn et al [2012] Sohn S, Jung B, Jhun M (2012) Permutation tests using least distance estimator in the multivariate regression model. Computational Statistics - COMPUTATION STAT 27:1–11. 10.1007/s00180-011-0247-3 Sze et al [2017] Sze V, Chen YH, Yang TJ, et al (2017) Efficient processing of deep neural networks: A tutorial and survey. Proceedings of the IEEE 105(12):2295–2329 Szekely et al [2008] Szekely G, Rizzo M, Bakirov N (2008) Measuring and testing dependence by correlation of distances. The Annals of Statistics 35. 10.1214/009053607000000505 Tibshirani [1996] Tibshirani R (1996) Regression shrinkage and selection via the lasso. Journal of the Royal Statistical Society: Series B (Methodological) 58(1):267–288 Wahba [1992] Wahba G (1992) Multivariate function and operator estimation, based on smoothing splines and reproducing kernels. In: SANTA FE INSTITUTE STUDIES IN THE SCIENCES OF COMPLEXITY-PROCEEDINGS VOLUME-, Addison-Wesley Publishing Co, pp 95–95 Wang and Leng [2008] Wang H, Leng C (2008) A note on adaptive group lasso. Computational Statistics and Data Analysis 52(12):5277–5286 Wang et al [2016] Wang J, Qingling C, Chang Q, et al (2016) Convergence analyses on sparse feedforward neural networks via group lasso regularization. Information Sciences 381. 10.1016/j.ins.2016.11.020 Wang and Cao [2022] Wang S, Cao G (2022) Robust deep neural network estimation for multi-dimensional functional data. Electronic Journal of Statistics 16(2):6461–6488 Yuan and Lin [2006] Yuan M, Lin Y (2006) Model selection and estimation in regression with grouped variables. 
Journal of the Royal Statistical Society: Series B (Statistical Methodology) 68(1):49–67 Zhang et al [2019] Zhang H, Wang J, Sun Z, et al (2019) Feature selection for neural networks using group lasso regularization. IEEE Transactions on Knowledge and Data Engineering PP:1–1. 10.1109/TKDE.2019.2893266 Zhao et al [2015] Zhao L, Hu Q, Wang W (2015) Heterogeneous feature selection with multi-modal deep neural networks and sparse group lasso. IEEE Transactions on Multimedia 17(11):1936–1948 Zou [2006] Zou H (2006) The adaptive lasso and its oracle properties. Journal of the American Statistical Association 101(476):1418–1429 Scardapane S, Comminiello D, Hussain A, et al (2017) Group sparse regularization for deep neural networks. Neurocomputing 241:81–89 Sohn et al [2012] Sohn S, Jung B, Jhun M (2012) Permutation tests using least distance estimator in the multivariate regression model. Computational Statistics - COMPUTATION STAT 27:1–11. 10.1007/s00180-011-0247-3 Sze et al [2017] Sze V, Chen YH, Yang TJ, et al (2017) Efficient processing of deep neural networks: A tutorial and survey. Proceedings of the IEEE 105(12):2295–2329 Szekely et al [2008] Szekely G, Rizzo M, Bakirov N (2008) Measuring and testing dependence by correlation of distances. The Annals of Statistics 35. 10.1214/009053607000000505 Tibshirani [1996] Tibshirani R (1996) Regression shrinkage and selection via the lasso. Journal of the Royal Statistical Society: Series B (Methodological) 58(1):267–288 Wahba [1992] Wahba G (1992) Multivariate function and operator estimation, based on smoothing splines and reproducing kernels. In: SANTA FE INSTITUTE STUDIES IN THE SCIENCES OF COMPLEXITY-PROCEEDINGS VOLUME-, Addison-Wesley Publishing Co, pp 95–95 Wang and Leng [2008] Wang H, Leng C (2008) A note on adaptive group lasso. 
Computational Statistics and Data Analysis 52(12):5277–5286 Wang et al [2016] Wang J, Qingling C, Chang Q, et al (2016) Convergence analyses on sparse feedforward neural networks via group lasso regularization. Information Sciences 381. 10.1016/j.ins.2016.11.020 Wang and Cao [2022] Wang S, Cao G (2022) Robust deep neural network estimation for multi-dimensional functional data. Electronic Journal of Statistics 16(2):6461–6488 Yuan and Lin [2006] Yuan M, Lin Y (2006) Model selection and estimation in regression with grouped variables. Journal of the Royal Statistical Society: Series B (Statistical Methodology) 68(1):49–67 Zhang et al [2019] Zhang H, Wang J, Sun Z, et al (2019) Feature selection for neural networks using group lasso regularization. IEEE Transactions on Knowledge and Data Engineering PP:1–1. 10.1109/TKDE.2019.2893266 Zhao et al [2015] Zhao L, Hu Q, Wang W (2015) Heterogeneous feature selection with multi-modal deep neural networks and sparse group lasso. IEEE Transactions on Multimedia 17(11):1936–1948 Zou [2006] Zou H (2006) The adaptive lasso and its oracle properties. Journal of the American Statistical Association 101(476):1418–1429 Sohn S, Jung B, Jhun M (2012) Permutation tests using least distance estimator in the multivariate regression model. Computational Statistics - COMPUTATION STAT 27:1–11. 10.1007/s00180-011-0247-3 Sze et al [2017] Sze V, Chen YH, Yang TJ, et al (2017) Efficient processing of deep neural networks: A tutorial and survey. Proceedings of the IEEE 105(12):2295–2329 Szekely et al [2008] Szekely G, Rizzo M, Bakirov N (2008) Measuring and testing dependence by correlation of distances. The Annals of Statistics 35. 10.1214/009053607000000505 Tibshirani [1996] Tibshirani R (1996) Regression shrinkage and selection via the lasso. 
Journal of the Royal Statistical Society: Series B (Methodological) 58(1):267–288 Wahba [1992] Wahba G (1992) Multivariate function and operator estimation, based on smoothing splines and reproducing kernels. In: SANTA FE INSTITUTE STUDIES IN THE SCIENCES OF COMPLEXITY-PROCEEDINGS VOLUME-, Addison-Wesley Publishing Co, pp 95–95 Wang and Leng [2008] Wang H, Leng C (2008) A note on adaptive group lasso. Computational Statistics and Data Analysis 52(12):5277–5286 Wang et al [2016] Wang J, Qingling C, Chang Q, et al (2016) Convergence analyses on sparse feedforward neural networks via group lasso regularization. Information Sciences 381. 10.1016/j.ins.2016.11.020 Wang and Cao [2022] Wang S, Cao G (2022) Robust deep neural network estimation for multi-dimensional functional data. Electronic Journal of Statistics 16(2):6461–6488 Yuan and Lin [2006] Yuan M, Lin Y (2006) Model selection and estimation in regression with grouped variables. Journal of the Royal Statistical Society: Series B (Statistical Methodology) 68(1):49–67 Zhang et al [2019] Zhang H, Wang J, Sun Z, et al (2019) Feature selection for neural networks using group lasso regularization. IEEE Transactions on Knowledge and Data Engineering PP:1–1. 10.1109/TKDE.2019.2893266 Zhao et al [2015] Zhao L, Hu Q, Wang W (2015) Heterogeneous feature selection with multi-modal deep neural networks and sparse group lasso. IEEE Transactions on Multimedia 17(11):1936–1948 Zou [2006] Zou H (2006) The adaptive lasso and its oracle properties. Journal of the American Statistical Association 101(476):1418–1429 Sze V, Chen YH, Yang TJ, et al (2017) Efficient processing of deep neural networks: A tutorial and survey. Proceedings of the IEEE 105(12):2295–2329 Szekely et al [2008] Szekely G, Rizzo M, Bakirov N (2008) Measuring and testing dependence by correlation of distances. The Annals of Statistics 35. 10.1214/009053607000000505 Tibshirani [1996] Tibshirani R (1996) Regression shrinkage and selection via the lasso. 
Journal of the Royal Statistical Society: Series B (Methodological) 58(1):267–288 Wahba [1992] Wahba G (1992) Multivariate function and operator estimation, based on smoothing splines and reproducing kernels. In: SANTA FE INSTITUTE STUDIES IN THE SCIENCES OF COMPLEXITY-PROCEEDINGS VOLUME-, Addison-Wesley Publishing Co, pp 95–95 Wang and Leng [2008] Wang H, Leng C (2008) A note on adaptive group lasso. Computational Statistics and Data Analysis 52(12):5277–5286 Wang et al [2016] Wang J, Qingling C, Chang Q, et al (2016) Convergence analyses on sparse feedforward neural networks via group lasso regularization. Information Sciences 381. 10.1016/j.ins.2016.11.020 Wang and Cao [2022] Wang S, Cao G (2022) Robust deep neural network estimation for multi-dimensional functional data. Electronic Journal of Statistics 16(2):6461–6488 Yuan and Lin [2006] Yuan M, Lin Y (2006) Model selection and estimation in regression with grouped variables. Journal of the Royal Statistical Society: Series B (Statistical Methodology) 68(1):49–67 Zhang et al [2019] Zhang H, Wang J, Sun Z, et al (2019) Feature selection for neural networks using group lasso regularization. IEEE Transactions on Knowledge and Data Engineering PP:1–1. 10.1109/TKDE.2019.2893266 Zhao et al [2015] Zhao L, Hu Q, Wang W (2015) Heterogeneous feature selection with multi-modal deep neural networks and sparse group lasso. IEEE Transactions on Multimedia 17(11):1936–1948 Zou [2006] Zou H (2006) The adaptive lasso and its oracle properties. Journal of the American Statistical Association 101(476):1418–1429 Szekely G, Rizzo M, Bakirov N (2008) Measuring and testing dependence by correlation of distances. The Annals of Statistics 35. 10.1214/009053607000000505 Tibshirani [1996] Tibshirani R (1996) Regression shrinkage and selection via the lasso. 
Journal of the Royal Statistical Society: Series B (Methodological) 58(1):267–288 Wahba [1992] Wahba G (1992) Multivariate function and operator estimation, based on smoothing splines and reproducing kernels. In: SANTA FE INSTITUTE STUDIES IN THE SCIENCES OF COMPLEXITY-PROCEEDINGS VOLUME-, Addison-Wesley Publishing Co, pp 95–95 Wang and Leng [2008] Wang H, Leng C (2008) A note on adaptive group lasso. Computational Statistics and Data Analysis 52(12):5277–5286 Wang et al [2016] Wang J, Qingling C, Chang Q, et al (2016) Convergence analyses on sparse feedforward neural networks via group lasso regularization. Information Sciences 381. 10.1016/j.ins.2016.11.020 Wang and Cao [2022] Wang S, Cao G (2022) Robust deep neural network estimation for multi-dimensional functional data. Electronic Journal of Statistics 16(2):6461–6488 Yuan and Lin [2006] Yuan M, Lin Y (2006) Model selection and estimation in regression with grouped variables. Journal of the Royal Statistical Society: Series B (Statistical Methodology) 68(1):49–67 Zhang et al [2019] Zhang H, Wang J, Sun Z, et al (2019) Feature selection for neural networks using group lasso regularization. IEEE Transactions on Knowledge and Data Engineering PP:1–1. 10.1109/TKDE.2019.2893266 Zhao et al [2015] Zhao L, Hu Q, Wang W (2015) Heterogeneous feature selection with multi-modal deep neural networks and sparse group lasso. IEEE Transactions on Multimedia 17(11):1936–1948 Zou [2006] Zou H (2006) The adaptive lasso and its oracle properties. Journal of the American Statistical Association 101(476):1418–1429 Tibshirani R (1996) Regression shrinkage and selection via the lasso. Journal of the Royal Statistical Society: Series B (Methodological) 58(1):267–288 Wahba [1992] Wahba G (1992) Multivariate function and operator estimation, based on smoothing splines and reproducing kernels. 
In: SANTA FE INSTITUTE STUDIES IN THE SCIENCES OF COMPLEXITY-PROCEEDINGS VOLUME-, Addison-Wesley Publishing Co, pp 95–95 Wang and Leng [2008] Wang H, Leng C (2008) A note on adaptive group lasso. Computational Statistics and Data Analysis 52(12):5277–5286 Wang et al [2016] Wang J, Qingling C, Chang Q, et al (2016) Convergence analyses on sparse feedforward neural networks via group lasso regularization. Information Sciences 381. 10.1016/j.ins.2016.11.020 Wang and Cao [2022] Wang S, Cao G (2022) Robust deep neural network estimation for multi-dimensional functional data. Electronic Journal of Statistics 16(2):6461–6488 Yuan and Lin [2006] Yuan M, Lin Y (2006) Model selection and estimation in regression with grouped variables. Journal of the Royal Statistical Society: Series B (Statistical Methodology) 68(1):49–67 Zhang et al [2019] Zhang H, Wang J, Sun Z, et al (2019) Feature selection for neural networks using group lasso regularization. IEEE Transactions on Knowledge and Data Engineering PP:1–1. 10.1109/TKDE.2019.2893266 Zhao et al [2015] Zhao L, Hu Q, Wang W (2015) Heterogeneous feature selection with multi-modal deep neural networks and sparse group lasso. IEEE Transactions on Multimedia 17(11):1936–1948 Zou [2006] Zou H (2006) The adaptive lasso and its oracle properties. Journal of the American Statistical Association 101(476):1418–1429 Wahba G (1992) Multivariate function and operator estimation, based on smoothing splines and reproducing kernels. In: SANTA FE INSTITUTE STUDIES IN THE SCIENCES OF COMPLEXITY-PROCEEDINGS VOLUME-, Addison-Wesley Publishing Co, pp 95–95 Wang and Leng [2008] Wang H, Leng C (2008) A note on adaptive group lasso. Computational Statistics and Data Analysis 52(12):5277–5286 Wang et al [2016] Wang J, Qingling C, Chang Q, et al (2016) Convergence analyses on sparse feedforward neural networks via group lasso regularization. Information Sciences 381. 
10.1016/j.ins.2016.11.020 Wang and Cao [2022] Wang S, Cao G (2022) Robust deep neural network estimation for multi-dimensional functional data. Electronic Journal of Statistics 16(2):6461–6488 Yuan and Lin [2006] Yuan M, Lin Y (2006) Model selection and estimation in regression with grouped variables. Journal of the Royal Statistical Society: Series B (Statistical Methodology) 68(1):49–67 Zhang et al [2019] Zhang H, Wang J, Sun Z, et al (2019) Feature selection for neural networks using group lasso regularization. IEEE Transactions on Knowledge and Data Engineering PP:1–1. 10.1109/TKDE.2019.2893266 Zhao et al [2015] Zhao L, Hu Q, Wang W (2015) Heterogeneous feature selection with multi-modal deep neural networks and sparse group lasso. IEEE Transactions on Multimedia 17(11):1936–1948 Zou [2006] Zou H (2006) The adaptive lasso and its oracle properties. Journal of the American Statistical Association 101(476):1418–1429 Wang H, Leng C (2008) A note on adaptive group lasso. Computational Statistics and Data Analysis 52(12):5277–5286 Wang et al [2016] Wang J, Qingling C, Chang Q, et al (2016) Convergence analyses on sparse feedforward neural networks via group lasso regularization. Information Sciences 381. 10.1016/j.ins.2016.11.020 Wang and Cao [2022] Wang S, Cao G (2022) Robust deep neural network estimation for multi-dimensional functional data. Electronic Journal of Statistics 16(2):6461–6488 Yuan and Lin [2006] Yuan M, Lin Y (2006) Model selection and estimation in regression with grouped variables. Journal of the Royal Statistical Society: Series B (Statistical Methodology) 68(1):49–67 Zhang et al [2019] Zhang H, Wang J, Sun Z, et al (2019) Feature selection for neural networks using group lasso regularization. IEEE Transactions on Knowledge and Data Engineering PP:1–1. 10.1109/TKDE.2019.2893266 Zhao et al [2015] Zhao L, Hu Q, Wang W (2015) Heterogeneous feature selection with multi-modal deep neural networks and sparse group lasso. 
IEEE Transactions on Multimedia 17(11):1936–1948 Zou [2006] Zou H (2006) The adaptive lasso and its oracle properties. Journal of the American Statistical Association 101(476):1418–1429 Wang J, Qingling C, Chang Q, et al (2016) Convergence analyses on sparse feedforward neural networks via group lasso regularization. Information Sciences 381. 10.1016/j.ins.2016.11.020 Wang and Cao [2022] Wang S, Cao G (2022) Robust deep neural network estimation for multi-dimensional functional data. Electronic Journal of Statistics 16(2):6461–6488 Yuan and Lin [2006] Yuan M, Lin Y (2006) Model selection and estimation in regression with grouped variables. Journal of the Royal Statistical Society: Series B (Statistical Methodology) 68(1):49–67 Zhang et al [2019] Zhang H, Wang J, Sun Z, et al (2019) Feature selection for neural networks using group lasso regularization. IEEE Transactions on Knowledge and Data Engineering PP:1–1. 10.1109/TKDE.2019.2893266 Zhao et al [2015] Zhao L, Hu Q, Wang W (2015) Heterogeneous feature selection with multi-modal deep neural networks and sparse group lasso. IEEE Transactions on Multimedia 17(11):1936–1948 Zou [2006] Zou H (2006) The adaptive lasso and its oracle properties. Journal of the American Statistical Association 101(476):1418–1429 Wang S, Cao G (2022) Robust deep neural network estimation for multi-dimensional functional data. Electronic Journal of Statistics 16(2):6461–6488 Yuan and Lin [2006] Yuan M, Lin Y (2006) Model selection and estimation in regression with grouped variables. Journal of the Royal Statistical Society: Series B (Statistical Methodology) 68(1):49–67 Zhang et al [2019] Zhang H, Wang J, Sun Z, et al (2019) Feature selection for neural networks using group lasso regularization. IEEE Transactions on Knowledge and Data Engineering PP:1–1. 10.1109/TKDE.2019.2893266 Zhao et al [2015] Zhao L, Hu Q, Wang W (2015) Heterogeneous feature selection with multi-modal deep neural networks and sparse group lasso. 
IEEE Transactions on Multimedia 17(11):1936–1948 Zou [2006] Zou H (2006) The adaptive lasso and its oracle properties. Journal of the American Statistical Association 101(476):1418–1429 Yuan M, Lin Y (2006) Model selection and estimation in regression with grouped variables. Journal of the Royal Statistical Society: Series B (Statistical Methodology) 68(1):49–67 Zhang et al [2019] Zhang H, Wang J, Sun Z, et al (2019) Feature selection for neural networks using group lasso regularization. IEEE Transactions on Knowledge and Data Engineering PP:1–1. 10.1109/TKDE.2019.2893266 Zhao et al [2015] Zhao L, Hu Q, Wang W (2015) Heterogeneous feature selection with multi-modal deep neural networks and sparse group lasso. IEEE Transactions on Multimedia 17(11):1936–1948 Zou [2006] Zou H (2006) The adaptive lasso and its oracle properties. Journal of the American Statistical Association 101(476):1418–1429 Zhang H, Wang J, Sun Z, et al (2019) Feature selection for neural networks using group lasso regularization. IEEE Transactions on Knowledge and Data Engineering PP:1–1. 10.1109/TKDE.2019.2893266 Zhao et al [2015] Zhao L, Hu Q, Wang W (2015) Heterogeneous feature selection with multi-modal deep neural networks and sparse group lasso. IEEE Transactions on Multimedia 17(11):1936–1948 Zou [2006] Zou H (2006) The adaptive lasso and its oracle properties. Journal of the American Statistical Association 101(476):1418–1429 Zhao L, Hu Q, Wang W (2015) Heterogeneous feature selection with multi-modal deep neural networks and sparse group lasso. IEEE Transactions on Multimedia 17(11):1936–1948 Zou [2006] Zou H (2006) The adaptive lasso and its oracle properties. Journal of the American Statistical Association 101(476):1418–1429 Zou H (2006) The adaptive lasso and its oracle properties. Journal of the American Statistical Association 101(476):1418–1429
Proceedings of the IEEE 105(12):2295–2329 Szekely et al [2008] Szekely G, Rizzo M, Bakirov N (2008) Measuring and testing dependence by correlation of distances. The Annals of Statistics 35. 10.1214/009053607000000505 Tibshirani [1996] Tibshirani R (1996) Regression shrinkage and selection via the lasso. Journal of the Royal Statistical Society: Series B (Methodological) 58(1):267–288 Wahba [1992] Wahba G (1992) Multivariate function and operator estimation, based on smoothing splines and reproducing kernels. In: SANTA FE INSTITUTE STUDIES IN THE SCIENCES OF COMPLEXITY-PROCEEDINGS VOLUME-, Addison-Wesley Publishing Co, pp 95–95 Wang and Leng [2008] Wang H, Leng C (2008) A note on adaptive group lasso. Computational Statistics and Data Analysis 52(12):5277–5286 Wang et al [2016] Wang J, Qingling C, Chang Q, et al (2016) Convergence analyses on sparse feedforward neural networks via group lasso regularization. Information Sciences 381. 10.1016/j.ins.2016.11.020 Wang and Cao [2022] Wang S, Cao G (2022) Robust deep neural network estimation for multi-dimensional functional data. Electronic Journal of Statistics 16(2):6461–6488 Yuan and Lin [2006] Yuan M, Lin Y (2006) Model selection and estimation in regression with grouped variables. Journal of the Royal Statistical Society: Series B (Statistical Methodology) 68(1):49–67 Zhang et al [2019] Zhang H, Wang J, Sun Z, et al (2019) Feature selection for neural networks using group lasso regularization. IEEE Transactions on Knowledge and Data Engineering PP:1–1. 10.1109/TKDE.2019.2893266 Zhao et al [2015] Zhao L, Hu Q, Wang W (2015) Heterogeneous feature selection with multi-modal deep neural networks and sparse group lasso. IEEE Transactions on Multimedia 17(11):1936–1948 Zou [2006] Zou H (2006) The adaptive lasso and its oracle properties. 
Journal of the American Statistical Association 101(476):1418–1429 Ochiai T, Matsuda S, Watanabe H, et al (2017) Automatic node selection for deep neural networks using group lasso regularization. In: 2017 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), IEEE, pp 5485–5489 Rumelhart et al [1986] Rumelhart DE, Hinton GE, Williams RJ (1986) Learning representations by back-propagating errors. Nature 323:533–536. 10.1038/323533a0 Scardapane et al [2017] Scardapane S, Comminiello D, Hussain A, et al (2017) Group sparse regularization for deep neural networks. Neurocomputing 241:81–89 Sohn et al [2012] Sohn S, Jung B, Jhun M (2012) Permutation tests using least distance estimator in the multivariate regression model. Computational Statistics - COMPUTATION STAT 27:1–11. 10.1007/s00180-011-0247-3 Sze et al [2017] Sze V, Chen YH, Yang TJ, et al (2017) Efficient processing of deep neural networks: A tutorial and survey. Proceedings of the IEEE 105(12):2295–2329 Szekely et al [2008] Szekely G, Rizzo M, Bakirov N (2008) Measuring and testing dependence by correlation of distances. The Annals of Statistics 35. 10.1214/009053607000000505 Tibshirani [1996] Tibshirani R (1996) Regression shrinkage and selection via the lasso. Journal of the Royal Statistical Society: Series B (Methodological) 58(1):267–288 Wahba [1992] Wahba G (1992) Multivariate function and operator estimation, based on smoothing splines and reproducing kernels. In: SANTA FE INSTITUTE STUDIES IN THE SCIENCES OF COMPLEXITY-PROCEEDINGS VOLUME-, Addison-Wesley Publishing Co, pp 95–95 Wang and Leng [2008] Wang H, Leng C (2008) A note on adaptive group lasso. Computational Statistics and Data Analysis 52(12):5277–5286 Wang et al [2016] Wang J, Qingling C, Chang Q, et al (2016) Convergence analyses on sparse feedforward neural networks via group lasso regularization. Information Sciences 381. 
10.1016/j.ins.2016.11.020 Wang and Cao [2022] Wang S, Cao G (2022) Robust deep neural network estimation for multi-dimensional functional data. Electronic Journal of Statistics 16(2):6461–6488 Yuan and Lin [2006] Yuan M, Lin Y (2006) Model selection and estimation in regression with grouped variables. Journal of the Royal Statistical Society: Series B (Statistical Methodology) 68(1):49–67 Zhang et al [2019] Zhang H, Wang J, Sun Z, et al (2019) Feature selection for neural networks using group lasso regularization. IEEE Transactions on Knowledge and Data Engineering PP:1–1. 10.1109/TKDE.2019.2893266 Zhao et al [2015] Zhao L, Hu Q, Wang W (2015) Heterogeneous feature selection with multi-modal deep neural networks and sparse group lasso. IEEE Transactions on Multimedia 17(11):1936–1948 Zou [2006] Zou H (2006) The adaptive lasso and its oracle properties. Journal of the American Statistical Association 101(476):1418–1429 Rumelhart DE, Hinton GE, Williams RJ (1986) Learning representations by back-propagating errors. Nature 323:533–536. 10.1038/323533a0 Scardapane et al [2017] Scardapane S, Comminiello D, Hussain A, et al (2017) Group sparse regularization for deep neural networks. Neurocomputing 241:81–89 Sohn et al [2012] Sohn S, Jung B, Jhun M (2012) Permutation tests using least distance estimator in the multivariate regression model. Computational Statistics - COMPUTATION STAT 27:1–11. 10.1007/s00180-011-0247-3 Sze et al [2017] Sze V, Chen YH, Yang TJ, et al (2017) Efficient processing of deep neural networks: A tutorial and survey. Proceedings of the IEEE 105(12):2295–2329 Szekely et al [2008] Szekely G, Rizzo M, Bakirov N (2008) Measuring and testing dependence by correlation of distances. The Annals of Statistics 35. 10.1214/009053607000000505 Tibshirani [1996] Tibshirani R (1996) Regression shrinkage and selection via the lasso. 
Journal of the Royal Statistical Society: Series B (Methodological) 58(1):267–288 Wahba [1992] Wahba G (1992) Multivariate function and operator estimation, based on smoothing splines and reproducing kernels. In: SANTA FE INSTITUTE STUDIES IN THE SCIENCES OF COMPLEXITY-PROCEEDINGS VOLUME-, Addison-Wesley Publishing Co, pp 95–95 Wang and Leng [2008] Wang H, Leng C (2008) A note on adaptive group lasso. Computational Statistics and Data Analysis 52(12):5277–5286 Wang et al [2016] Wang J, Qingling C, Chang Q, et al (2016) Convergence analyses on sparse feedforward neural networks via group lasso regularization. Information Sciences 381. 10.1016/j.ins.2016.11.020 Wang and Cao [2022] Wang S, Cao G (2022) Robust deep neural network estimation for multi-dimensional functional data. Electronic Journal of Statistics 16(2):6461–6488 Yuan and Lin [2006] Yuan M, Lin Y (2006) Model selection and estimation in regression with grouped variables. Journal of the Royal Statistical Society: Series B (Statistical Methodology) 68(1):49–67 Zhang et al [2019] Zhang H, Wang J, Sun Z, et al (2019) Feature selection for neural networks using group lasso regularization. IEEE Transactions on Knowledge and Data Engineering PP:1–1. 10.1109/TKDE.2019.2893266 Zhao et al [2015] Zhao L, Hu Q, Wang W (2015) Heterogeneous feature selection with multi-modal deep neural networks and sparse group lasso. IEEE Transactions on Multimedia 17(11):1936–1948 Zou [2006] Zou H (2006) The adaptive lasso and its oracle properties. Journal of the American Statistical Association 101(476):1418–1429 Scardapane S, Comminiello D, Hussain A, et al (2017) Group sparse regularization for deep neural networks. Neurocomputing 241:81–89 Sohn et al [2012] Sohn S, Jung B, Jhun M (2012) Permutation tests using least distance estimator in the multivariate regression model. Computational Statistics - COMPUTATION STAT 27:1–11. 
10.1007/s00180-011-0247-3 Sze et al [2017] Sze V, Chen YH, Yang TJ, et al (2017) Efficient processing of deep neural networks: A tutorial and survey. Proceedings of the IEEE 105(12):2295–2329 Szekely et al [2008] Szekely G, Rizzo M, Bakirov N (2008) Measuring and testing dependence by correlation of distances. The Annals of Statistics 35. 10.1214/009053607000000505 Tibshirani [1996] Tibshirani R (1996) Regression shrinkage and selection via the lasso. Journal of the Royal Statistical Society: Series B (Methodological) 58(1):267–288 Wahba [1992] Wahba G (1992) Multivariate function and operator estimation, based on smoothing splines and reproducing kernels. In: SANTA FE INSTITUTE STUDIES IN THE SCIENCES OF COMPLEXITY-PROCEEDINGS VOLUME-, Addison-Wesley Publishing Co, pp 95–95 Wang and Leng [2008] Wang H, Leng C (2008) A note on adaptive group lasso. Computational Statistics and Data Analysis 52(12):5277–5286 Wang et al [2016] Wang J, Qingling C, Chang Q, et al (2016) Convergence analyses on sparse feedforward neural networks via group lasso regularization. Information Sciences 381. 10.1016/j.ins.2016.11.020 Wang and Cao [2022] Wang S, Cao G (2022) Robust deep neural network estimation for multi-dimensional functional data. Electronic Journal of Statistics 16(2):6461–6488 Yuan and Lin [2006] Yuan M, Lin Y (2006) Model selection and estimation in regression with grouped variables. Journal of the Royal Statistical Society: Series B (Statistical Methodology) 68(1):49–67 Zhang et al [2019] Zhang H, Wang J, Sun Z, et al (2019) Feature selection for neural networks using group lasso regularization. IEEE Transactions on Knowledge and Data Engineering PP:1–1. 10.1109/TKDE.2019.2893266 Zhao et al [2015] Zhao L, Hu Q, Wang W (2015) Heterogeneous feature selection with multi-modal deep neural networks and sparse group lasso. IEEE Transactions on Multimedia 17(11):1936–1948 Zou [2006] Zou H (2006) The adaptive lasso and its oracle properties. 
Journal of the American Statistical Association 101(476):1418–1429 Sohn S, Jung B, Jhun M (2012) Permutation tests using least distance estimator in the multivariate regression model. Computational Statistics - COMPUTATION STAT 27:1–11. 10.1007/s00180-011-0247-3 Sze et al [2017] Sze V, Chen YH, Yang TJ, et al (2017) Efficient processing of deep neural networks: A tutorial and survey. Proceedings of the IEEE 105(12):2295–2329 Szekely et al [2008] Szekely G, Rizzo M, Bakirov N (2008) Measuring and testing dependence by correlation of distances. The Annals of Statistics 35. 10.1214/009053607000000505 Tibshirani [1996] Tibshirani R (1996) Regression shrinkage and selection via the lasso. Journal of the Royal Statistical Society: Series B (Methodological) 58(1):267–288 Wahba [1992] Wahba G (1992) Multivariate function and operator estimation, based on smoothing splines and reproducing kernels. In: SANTA FE INSTITUTE STUDIES IN THE SCIENCES OF COMPLEXITY-PROCEEDINGS VOLUME-, Addison-Wesley Publishing Co, pp 95–95 Wang and Leng [2008] Wang H, Leng C (2008) A note on adaptive group lasso. Computational Statistics and Data Analysis 52(12):5277–5286 Wang et al [2016] Wang J, Qingling C, Chang Q, et al (2016) Convergence analyses on sparse feedforward neural networks via group lasso regularization. Information Sciences 381. 10.1016/j.ins.2016.11.020 Wang and Cao [2022] Wang S, Cao G (2022) Robust deep neural network estimation for multi-dimensional functional data. Electronic Journal of Statistics 16(2):6461–6488 Yuan and Lin [2006] Yuan M, Lin Y (2006) Model selection and estimation in regression with grouped variables. Journal of the Royal Statistical Society: Series B (Statistical Methodology) 68(1):49–67 Zhang et al [2019] Zhang H, Wang J, Sun Z, et al (2019) Feature selection for neural networks using group lasso regularization. IEEE Transactions on Knowledge and Data Engineering PP:1–1. 
10.1109/TKDE.2019.2893266 Zhao et al [2015] Zhao L, Hu Q, Wang W (2015) Heterogeneous feature selection with multi-modal deep neural networks and sparse group lasso. IEEE Transactions on Multimedia 17(11):1936–1948 Zou [2006] Zou H (2006) The adaptive lasso and its oracle properties. Journal of the American Statistical Association 101(476):1418–1429 Sze V, Chen YH, Yang TJ, et al (2017) Efficient processing of deep neural networks: A tutorial and survey. Proceedings of the IEEE 105(12):2295–2329 Szekely et al [2008] Szekely G, Rizzo M, Bakirov N (2008) Measuring and testing dependence by correlation of distances. The Annals of Statistics 35. 10.1214/009053607000000505 Tibshirani [1996] Tibshirani R (1996) Regression shrinkage and selection via the lasso. Journal of the Royal Statistical Society: Series B (Methodological) 58(1):267–288 Wahba [1992] Wahba G (1992) Multivariate function and operator estimation, based on smoothing splines and reproducing kernels. In: SANTA FE INSTITUTE STUDIES IN THE SCIENCES OF COMPLEXITY-PROCEEDINGS VOLUME-, Addison-Wesley Publishing Co, pp 95–95 Wang and Leng [2008] Wang H, Leng C (2008) A note on adaptive group lasso. Computational Statistics and Data Analysis 52(12):5277–5286 Wang et al [2016] Wang J, Qingling C, Chang Q, et al (2016) Convergence analyses on sparse feedforward neural networks via group lasso regularization. Information Sciences 381. 10.1016/j.ins.2016.11.020 Wang and Cao [2022] Wang S, Cao G (2022) Robust deep neural network estimation for multi-dimensional functional data. Electronic Journal of Statistics 16(2):6461–6488 Yuan and Lin [2006] Yuan M, Lin Y (2006) Model selection and estimation in regression with grouped variables. Journal of the Royal Statistical Society: Series B (Statistical Methodology) 68(1):49–67 Zhang et al [2019] Zhang H, Wang J, Sun Z, et al (2019) Feature selection for neural networks using group lasso regularization. IEEE Transactions on Knowledge and Data Engineering PP:1–1. 
10.1109/TKDE.2019.2893266 Zhao et al [2015] Zhao L, Hu Q, Wang W (2015) Heterogeneous feature selection with multi-modal deep neural networks and sparse group lasso. IEEE Transactions on Multimedia 17(11):1936–1948 Zou [2006] Zou H (2006) The adaptive lasso and its oracle properties. Journal of the American Statistical Association 101(476):1418–1429 Szekely G, Rizzo M, Bakirov N (2008) Measuring and testing dependence by correlation of distances. The Annals of Statistics 35. 10.1214/009053607000000505 Tibshirani [1996] Tibshirani R (1996) Regression shrinkage and selection via the lasso. Journal of the Royal Statistical Society: Series B (Methodological) 58(1):267–288 Wahba [1992] Wahba G (1992) Multivariate function and operator estimation, based on smoothing splines and reproducing kernels. In: SANTA FE INSTITUTE STUDIES IN THE SCIENCES OF COMPLEXITY-PROCEEDINGS VOLUME-, Addison-Wesley Publishing Co, pp 95–95 Wang and Leng [2008] Wang H, Leng C (2008) A note on adaptive group lasso. Computational Statistics and Data Analysis 52(12):5277–5286 Wang et al [2016] Wang J, Qingling C, Chang Q, et al (2016) Convergence analyses on sparse feedforward neural networks via group lasso regularization. Information Sciences 381. 10.1016/j.ins.2016.11.020 Wang and Cao [2022] Wang S, Cao G (2022) Robust deep neural network estimation for multi-dimensional functional data. Electronic Journal of Statistics 16(2):6461–6488 Yuan and Lin [2006] Yuan M, Lin Y (2006) Model selection and estimation in regression with grouped variables. Journal of the Royal Statistical Society: Series B (Statistical Methodology) 68(1):49–67 Zhang et al [2019] Zhang H, Wang J, Sun Z, et al (2019) Feature selection for neural networks using group lasso regularization. IEEE Transactions on Knowledge and Data Engineering PP:1–1. 10.1109/TKDE.2019.2893266 Zhao et al [2015] Zhao L, Hu Q, Wang W (2015) Heterogeneous feature selection with multi-modal deep neural networks and sparse group lasso. 
IEEE Transactions on Multimedia 17(11):1936–1948 Zou [2006] Zou H (2006) The adaptive lasso and its oracle properties. Journal of the American Statistical Association 101(476):1418–1429 Tibshirani R (1996) Regression shrinkage and selection via the lasso. Journal of the Royal Statistical Society: Series B (Methodological) 58(1):267–288 Wahba [1992] Wahba G (1992) Multivariate function and operator estimation, based on smoothing splines and reproducing kernels. In: SANTA FE INSTITUTE STUDIES IN THE SCIENCES OF COMPLEXITY-PROCEEDINGS VOLUME-, Addison-Wesley Publishing Co, pp 95–95 Wang and Leng [2008] Wang H, Leng C (2008) A note on adaptive group lasso. Computational Statistics and Data Analysis 52(12):5277–5286 Wang et al [2016] Wang J, Qingling C, Chang Q, et al (2016) Convergence analyses on sparse feedforward neural networks via group lasso regularization. Information Sciences 381. 10.1016/j.ins.2016.11.020 Wang and Cao [2022] Wang S, Cao G (2022) Robust deep neural network estimation for multi-dimensional functional data. Electronic Journal of Statistics 16(2):6461–6488 Yuan and Lin [2006] Yuan M, Lin Y (2006) Model selection and estimation in regression with grouped variables. Journal of the Royal Statistical Society: Series B (Statistical Methodology) 68(1):49–67 Zhang et al [2019] Zhang H, Wang J, Sun Z, et al (2019) Feature selection for neural networks using group lasso regularization. IEEE Transactions on Knowledge and Data Engineering PP:1–1. 10.1109/TKDE.2019.2893266 Zhao et al [2015] Zhao L, Hu Q, Wang W (2015) Heterogeneous feature selection with multi-modal deep neural networks and sparse group lasso. IEEE Transactions on Multimedia 17(11):1936–1948 Zou [2006] Zou H (2006) The adaptive lasso and its oracle properties. Journal of the American Statistical Association 101(476):1418–1429 Wahba G (1992) Multivariate function and operator estimation, based on smoothing splines and reproducing kernels. 
In: SANTA FE INSTITUTE STUDIES IN THE SCIENCES OF COMPLEXITY-PROCEEDINGS VOLUME-, Addison-Wesley Publishing Co, pp 95–95 Wang and Leng [2008] Wang H, Leng C (2008) A note on adaptive group lasso. Computational Statistics and Data Analysis 52(12):5277–5286 Wang et al [2016] Wang J, Qingling C, Chang Q, et al (2016) Convergence analyses on sparse feedforward neural networks via group lasso regularization. Information Sciences 381. 10.1016/j.ins.2016.11.020 Wang and Cao [2022] Wang S, Cao G (2022) Robust deep neural network estimation for multi-dimensional functional data. Electronic Journal of Statistics 16(2):6461–6488 Yuan and Lin [2006] Yuan M, Lin Y (2006) Model selection and estimation in regression with grouped variables. Journal of the Royal Statistical Society: Series B (Statistical Methodology) 68(1):49–67 Zhang et al [2019] Zhang H, Wang J, Sun Z, et al (2019) Feature selection for neural networks using group lasso regularization. IEEE Transactions on Knowledge and Data Engineering PP:1–1. 10.1109/TKDE.2019.2893266 Zhao et al [2015] Zhao L, Hu Q, Wang W (2015) Heterogeneous feature selection with multi-modal deep neural networks and sparse group lasso. IEEE Transactions on Multimedia 17(11):1936–1948 Zou [2006] Zou H (2006) The adaptive lasso and its oracle properties. Journal of the American Statistical Association 101(476):1418–1429 Wang H, Leng C (2008) A note on adaptive group lasso. Computational Statistics and Data Analysis 52(12):5277–5286 Wang et al [2016] Wang J, Qingling C, Chang Q, et al (2016) Convergence analyses on sparse feedforward neural networks via group lasso regularization. Information Sciences 381. 10.1016/j.ins.2016.11.020 Wang and Cao [2022] Wang S, Cao G (2022) Robust deep neural network estimation for multi-dimensional functional data. Electronic Journal of Statistics 16(2):6461–6488 Yuan and Lin [2006] Yuan M, Lin Y (2006) Model selection and estimation in regression with grouped variables. 
Journal of the Royal Statistical Society: Series B (Statistical Methodology) 68(1):49–67 Zhang et al [2019] Zhang H, Wang J, Sun Z, et al (2019) Feature selection for neural networks using group lasso regularization. IEEE Transactions on Knowledge and Data Engineering PP:1–1. 10.1109/TKDE.2019.2893266 Zhao et al [2015] Zhao L, Hu Q, Wang W (2015) Heterogeneous feature selection with multi-modal deep neural networks and sparse group lasso. IEEE Transactions on Multimedia 17(11):1936–1948 Zou [2006] Zou H (2006) The adaptive lasso and its oracle properties. Journal of the American Statistical Association 101(476):1418–1429 Wang J, Qingling C, Chang Q, et al (2016) Convergence analyses on sparse feedforward neural networks via group lasso regularization. Information Sciences 381. 10.1016/j.ins.2016.11.020 Wang and Cao [2022] Wang S, Cao G (2022) Robust deep neural network estimation for multi-dimensional functional data. Electronic Journal of Statistics 16(2):6461–6488 Yuan and Lin [2006] Yuan M, Lin Y (2006) Model selection and estimation in regression with grouped variables. Journal of the Royal Statistical Society: Series B (Statistical Methodology) 68(1):49–67 Zhang et al [2019] Zhang H, Wang J, Sun Z, et al (2019) Feature selection for neural networks using group lasso regularization. IEEE Transactions on Knowledge and Data Engineering PP:1–1. 10.1109/TKDE.2019.2893266 Zhao et al [2015] Zhao L, Hu Q, Wang W (2015) Heterogeneous feature selection with multi-modal deep neural networks and sparse group lasso. IEEE Transactions on Multimedia 17(11):1936–1948 Zou [2006] Zou H (2006) The adaptive lasso and its oracle properties. Journal of the American Statistical Association 101(476):1418–1429 Wang S, Cao G (2022) Robust deep neural network estimation for multi-dimensional functional data. Electronic Journal of Statistics 16(2):6461–6488 Yuan and Lin [2006] Yuan M, Lin Y (2006) Model selection and estimation in regression with grouped variables. 
Journal of the Royal Statistical Society: Series B (Statistical Methodology) 68(1):49–67 Zhang et al [2019] Zhang H, Wang J, Sun Z, et al (2019) Feature selection for neural networks using group lasso regularization. IEEE Transactions on Knowledge and Data Engineering PP:1–1. 10.1109/TKDE.2019.2893266 Zhao et al [2015] Zhao L, Hu Q, Wang W (2015) Heterogeneous feature selection with multi-modal deep neural networks and sparse group lasso. IEEE Transactions on Multimedia 17(11):1936–1948 Zou [2006] Zou H (2006) The adaptive lasso and its oracle properties. Journal of the American Statistical Association 101(476):1418–1429 Yuan M, Lin Y (2006) Model selection and estimation in regression with grouped variables. Journal of the Royal Statistical Society: Series B (Statistical Methodology) 68(1):49–67 Zhang et al [2019] Zhang H, Wang J, Sun Z, et al (2019) Feature selection for neural networks using group lasso regularization. IEEE Transactions on Knowledge and Data Engineering PP:1–1. 10.1109/TKDE.2019.2893266 Zhao et al [2015] Zhao L, Hu Q, Wang W (2015) Heterogeneous feature selection with multi-modal deep neural networks and sparse group lasso. IEEE Transactions on Multimedia 17(11):1936–1948 Zou [2006] Zou H (2006) The adaptive lasso and its oracle properties. Journal of the American Statistical Association 101(476):1418–1429 Zhang H, Wang J, Sun Z, et al (2019) Feature selection for neural networks using group lasso regularization. IEEE Transactions on Knowledge and Data Engineering PP:1–1. 10.1109/TKDE.2019.2893266 Zhao et al [2015] Zhao L, Hu Q, Wang W (2015) Heterogeneous feature selection with multi-modal deep neural networks and sparse group lasso. IEEE Transactions on Multimedia 17(11):1936–1948 Zou [2006] Zou H (2006) The adaptive lasso and its oracle properties. Journal of the American Statistical Association 101(476):1418–1429 Zhao L, Hu Q, Wang W (2015) Heterogeneous feature selection with multi-modal deep neural networks and sparse group lasso. 
IEEE Transactions on Multimedia 17(11):1936–1948 Zou [2006] Zou H (2006) The adaptive lasso and its oracle properties. Journal of the American Statistical Association 101(476):1418–1429 Zou H (2006) The adaptive lasso and its oracle properties. Journal of the American Statistical Association 101(476):1418–1429
- Haldane JBS (1948) The theory of a cline. Journal of Genetics 48:277–284. 10.1007/BF02986626 Hornik [1991] Hornik K (1991) Approximation capabilities of multilayer feedforward networks. Neural Networks 4(2):251–257. https://doi.org/10.1016/0893-6080(91)90009-T Huber [1992] Huber PJ (1992) Robust estimation of a location parameter. In: Breakthroughs in statistics: Methodology and distribution. Springer, p 492–518 I-Cheng [2009] I-Cheng Y (2009) Concrete slump test data. Tech. rep., Department of Information Management, Chung-Hua University, URL https://archive.ics.uci.edu/ml/datasets/concrete+slump+test Imaizumi and Fukumizu [2019] Imaizumi M, Fukumizu K (2019) Deep neural networks learn non-smooth functions effectively. In: The 22nd international conference on artificial intelligence and statistics, PMLR, pp 869–878 Jhun and Choi [2009] Jhun M, Choi I (2009) Bootstrapping least distance estimator in the multivariate regression model. Computational Statistics and Data Analysis 53(12):4221–4227 Jiang et al [2022] Jiang C, Jiang C, Chen D, et al (2022) Densely connected neural networks for nonlinear regression. Entropy 24:876. 10.3390/e24070876 Li et al [2017] Li F, Zurada J, Liu Y, et al (2017) Input layer regularization of multilayer feedforward neural networks. IEEE Access PP:1–1. 10.1109/ACCESS.2017.2713389 Ochiai et al [2017] Ochiai T, Matsuda S, Watanabe H, et al (2017) Automatic node selection for deep neural networks using group lasso regularization. In: 2017 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), IEEE, pp 5485–5489 Rumelhart et al [1986] Rumelhart DE, Hinton GE, Williams RJ (1986) Learning representations by back-propagating errors. Nature 323:533–536. 10.1038/323533a0 Scardapane et al [2017] Scardapane S, Comminiello D, Hussain A, et al (2017) Group sparse regularization for deep neural networks. 
Neurocomputing 241:81–89 Sohn et al [2012] Sohn S, Jung B, Jhun M (2012) Permutation tests using least distance estimator in the multivariate regression model. Computational Statistics - COMPUTATION STAT 27:1–11. 10.1007/s00180-011-0247-3 Sze et al [2017] Sze V, Chen YH, Yang TJ, et al (2017) Efficient processing of deep neural networks: A tutorial and survey. Proceedings of the IEEE 105(12):2295–2329 Szekely et al [2008] Szekely G, Rizzo M, Bakirov N (2008) Measuring and testing dependence by correlation of distances. The Annals of Statistics 35. 10.1214/009053607000000505 Tibshirani [1996] Tibshirani R (1996) Regression shrinkage and selection via the lasso. Journal of the Royal Statistical Society: Series B (Methodological) 58(1):267–288 Wahba [1992] Wahba G (1992) Multivariate function and operator estimation, based on smoothing splines and reproducing kernels. In: SANTA FE INSTITUTE STUDIES IN THE SCIENCES OF COMPLEXITY-PROCEEDINGS VOLUME-, Addison-Wesley Publishing Co, pp 95–95 Wang and Leng [2008] Wang H, Leng C (2008) A note on adaptive group lasso. Computational Statistics and Data Analysis 52(12):5277–5286 Wang et al [2016] Wang J, Qingling C, Chang Q, et al (2016) Convergence analyses on sparse feedforward neural networks via group lasso regularization. Information Sciences 381. 10.1016/j.ins.2016.11.020 Wang and Cao [2022] Wang S, Cao G (2022) Robust deep neural network estimation for multi-dimensional functional data. Electronic Journal of Statistics 16(2):6461–6488 Yuan and Lin [2006] Yuan M, Lin Y (2006) Model selection and estimation in regression with grouped variables. Journal of the Royal Statistical Society: Series B (Statistical Methodology) 68(1):49–67 Zhang et al [2019] Zhang H, Wang J, Sun Z, et al (2019) Feature selection for neural networks using group lasso regularization. IEEE Transactions on Knowledge and Data Engineering PP:1–1. 
10.1109/TKDE.2019.2893266
Zhao L, Hu Q, Wang W (2015) Heterogeneous feature selection with multi-modal deep neural networks and sparse group lasso. IEEE Transactions on Multimedia 17(11):1936–1948
Zou H (2006) The adaptive lasso and its oracle properties. Journal of the American Statistical Association 101(476):1418–1429
10.1109/TKDE.2019.2893266 Zhao et al [2015] Zhao L, Hu Q, Wang W (2015) Heterogeneous feature selection with multi-modal deep neural networks and sparse group lasso. IEEE Transactions on Multimedia 17(11):1936–1948 Zou [2006] Zou H (2006) The adaptive lasso and its oracle properties. Journal of the American Statistical Association 101(476):1418–1429 Szekely G, Rizzo M, Bakirov N (2008) Measuring and testing dependence by correlation of distances. The Annals of Statistics 35. 10.1214/009053607000000505 Tibshirani [1996] Tibshirani R (1996) Regression shrinkage and selection via the lasso. Journal of the Royal Statistical Society: Series B (Methodological) 58(1):267–288 Wahba [1992] Wahba G (1992) Multivariate function and operator estimation, based on smoothing splines and reproducing kernels. In: SANTA FE INSTITUTE STUDIES IN THE SCIENCES OF COMPLEXITY-PROCEEDINGS VOLUME-, Addison-Wesley Publishing Co, pp 95–95 Wang and Leng [2008] Wang H, Leng C (2008) A note on adaptive group lasso. Computational Statistics and Data Analysis 52(12):5277–5286 Wang et al [2016] Wang J, Qingling C, Chang Q, et al (2016) Convergence analyses on sparse feedforward neural networks via group lasso regularization. Information Sciences 381. 10.1016/j.ins.2016.11.020 Wang and Cao [2022] Wang S, Cao G (2022) Robust deep neural network estimation for multi-dimensional functional data. Electronic Journal of Statistics 16(2):6461–6488 Yuan and Lin [2006] Yuan M, Lin Y (2006) Model selection and estimation in regression with grouped variables. Journal of the Royal Statistical Society: Series B (Statistical Methodology) 68(1):49–67 Zhang et al [2019] Zhang H, Wang J, Sun Z, et al (2019) Feature selection for neural networks using group lasso regularization. IEEE Transactions on Knowledge and Data Engineering PP:1–1. 10.1109/TKDE.2019.2893266 Zhao et al [2015] Zhao L, Hu Q, Wang W (2015) Heterogeneous feature selection with multi-modal deep neural networks and sparse group lasso. 
IEEE Transactions on Multimedia 17(11):1936–1948 Zou [2006] Zou H (2006) The adaptive lasso and its oracle properties. Journal of the American Statistical Association 101(476):1418–1429 Tibshirani R (1996) Regression shrinkage and selection via the lasso. Journal of the Royal Statistical Society: Series B (Methodological) 58(1):267–288 Wahba [1992] Wahba G (1992) Multivariate function and operator estimation, based on smoothing splines and reproducing kernels. In: SANTA FE INSTITUTE STUDIES IN THE SCIENCES OF COMPLEXITY-PROCEEDINGS VOLUME-, Addison-Wesley Publishing Co, pp 95–95 Wang and Leng [2008] Wang H, Leng C (2008) A note on adaptive group lasso. Computational Statistics and Data Analysis 52(12):5277–5286 Wang et al [2016] Wang J, Qingling C, Chang Q, et al (2016) Convergence analyses on sparse feedforward neural networks via group lasso regularization. Information Sciences 381. 10.1016/j.ins.2016.11.020 Wang and Cao [2022] Wang S, Cao G (2022) Robust deep neural network estimation for multi-dimensional functional data. Electronic Journal of Statistics 16(2):6461–6488 Yuan and Lin [2006] Yuan M, Lin Y (2006) Model selection and estimation in regression with grouped variables. Journal of the Royal Statistical Society: Series B (Statistical Methodology) 68(1):49–67 Zhang et al [2019] Zhang H, Wang J, Sun Z, et al (2019) Feature selection for neural networks using group lasso regularization. IEEE Transactions on Knowledge and Data Engineering PP:1–1. 10.1109/TKDE.2019.2893266 Zhao et al [2015] Zhao L, Hu Q, Wang W (2015) Heterogeneous feature selection with multi-modal deep neural networks and sparse group lasso. IEEE Transactions on Multimedia 17(11):1936–1948 Zou [2006] Zou H (2006) The adaptive lasso and its oracle properties. Journal of the American Statistical Association 101(476):1418–1429 Wahba G (1992) Multivariate function and operator estimation, based on smoothing splines and reproducing kernels. 
In: SANTA FE INSTITUTE STUDIES IN THE SCIENCES OF COMPLEXITY-PROCEEDINGS VOLUME-, Addison-Wesley Publishing Co, pp 95–95 Wang and Leng [2008] Wang H, Leng C (2008) A note on adaptive group lasso. Computational Statistics and Data Analysis 52(12):5277–5286 Wang et al [2016] Wang J, Qingling C, Chang Q, et al (2016) Convergence analyses on sparse feedforward neural networks via group lasso regularization. Information Sciences 381. 10.1016/j.ins.2016.11.020 Wang and Cao [2022] Wang S, Cao G (2022) Robust deep neural network estimation for multi-dimensional functional data. Electronic Journal of Statistics 16(2):6461–6488 Yuan and Lin [2006] Yuan M, Lin Y (2006) Model selection and estimation in regression with grouped variables. Journal of the Royal Statistical Society: Series B (Statistical Methodology) 68(1):49–67 Zhang et al [2019] Zhang H, Wang J, Sun Z, et al (2019) Feature selection for neural networks using group lasso regularization. IEEE Transactions on Knowledge and Data Engineering PP:1–1. 10.1109/TKDE.2019.2893266 Zhao et al [2015] Zhao L, Hu Q, Wang W (2015) Heterogeneous feature selection with multi-modal deep neural networks and sparse group lasso. IEEE Transactions on Multimedia 17(11):1936–1948 Zou [2006] Zou H (2006) The adaptive lasso and its oracle properties. Journal of the American Statistical Association 101(476):1418–1429 Wang H, Leng C (2008) A note on adaptive group lasso. Computational Statistics and Data Analysis 52(12):5277–5286 Wang et al [2016] Wang J, Qingling C, Chang Q, et al (2016) Convergence analyses on sparse feedforward neural networks via group lasso regularization. Information Sciences 381. 10.1016/j.ins.2016.11.020 Wang and Cao [2022] Wang S, Cao G (2022) Robust deep neural network estimation for multi-dimensional functional data. Electronic Journal of Statistics 16(2):6461–6488 Yuan and Lin [2006] Yuan M, Lin Y (2006) Model selection and estimation in regression with grouped variables. 
Journal of the Royal Statistical Society: Series B (Statistical Methodology) 68(1):49–67 Zhang et al [2019] Zhang H, Wang J, Sun Z, et al (2019) Feature selection for neural networks using group lasso regularization. IEEE Transactions on Knowledge and Data Engineering PP:1–1. 10.1109/TKDE.2019.2893266 Zhao et al [2015] Zhao L, Hu Q, Wang W (2015) Heterogeneous feature selection with multi-modal deep neural networks and sparse group lasso. IEEE Transactions on Multimedia 17(11):1936–1948 Zou [2006] Zou H (2006) The adaptive lasso and its oracle properties. Journal of the American Statistical Association 101(476):1418–1429 Wang J, Qingling C, Chang Q, et al (2016) Convergence analyses on sparse feedforward neural networks via group lasso regularization. Information Sciences 381. 10.1016/j.ins.2016.11.020 Wang and Cao [2022] Wang S, Cao G (2022) Robust deep neural network estimation for multi-dimensional functional data. Electronic Journal of Statistics 16(2):6461–6488 Yuan and Lin [2006] Yuan M, Lin Y (2006) Model selection and estimation in regression with grouped variables. Journal of the Royal Statistical Society: Series B (Statistical Methodology) 68(1):49–67 Zhang et al [2019] Zhang H, Wang J, Sun Z, et al (2019) Feature selection for neural networks using group lasso regularization. IEEE Transactions on Knowledge and Data Engineering PP:1–1. 10.1109/TKDE.2019.2893266 Zhao et al [2015] Zhao L, Hu Q, Wang W (2015) Heterogeneous feature selection with multi-modal deep neural networks and sparse group lasso. IEEE Transactions on Multimedia 17(11):1936–1948 Zou [2006] Zou H (2006) The adaptive lasso and its oracle properties. Journal of the American Statistical Association 101(476):1418–1429 Wang S, Cao G (2022) Robust deep neural network estimation for multi-dimensional functional data. Electronic Journal of Statistics 16(2):6461–6488 Yuan and Lin [2006] Yuan M, Lin Y (2006) Model selection and estimation in regression with grouped variables. 
Journal of the Royal Statistical Society: Series B (Statistical Methodology) 68(1):49–67 Zhang et al [2019] Zhang H, Wang J, Sun Z, et al (2019) Feature selection for neural networks using group lasso regularization. IEEE Transactions on Knowledge and Data Engineering PP:1–1. 10.1109/TKDE.2019.2893266 Zhao et al [2015] Zhao L, Hu Q, Wang W (2015) Heterogeneous feature selection with multi-modal deep neural networks and sparse group lasso. IEEE Transactions on Multimedia 17(11):1936–1948 Zou [2006] Zou H (2006) The adaptive lasso and its oracle properties. Journal of the American Statistical Association 101(476):1418–1429 Yuan M, Lin Y (2006) Model selection and estimation in regression with grouped variables. Journal of the Royal Statistical Society: Series B (Statistical Methodology) 68(1):49–67 Zhang et al [2019] Zhang H, Wang J, Sun Z, et al (2019) Feature selection for neural networks using group lasso regularization. IEEE Transactions on Knowledge and Data Engineering PP:1–1. 10.1109/TKDE.2019.2893266 Zhao et al [2015] Zhao L, Hu Q, Wang W (2015) Heterogeneous feature selection with multi-modal deep neural networks and sparse group lasso. IEEE Transactions on Multimedia 17(11):1936–1948 Zou [2006] Zou H (2006) The adaptive lasso and its oracle properties. Journal of the American Statistical Association 101(476):1418–1429 Zhang H, Wang J, Sun Z, et al (2019) Feature selection for neural networks using group lasso regularization. IEEE Transactions on Knowledge and Data Engineering PP:1–1. 10.1109/TKDE.2019.2893266 Zhao et al [2015] Zhao L, Hu Q, Wang W (2015) Heterogeneous feature selection with multi-modal deep neural networks and sparse group lasso. IEEE Transactions on Multimedia 17(11):1936–1948 Zou [2006] Zou H (2006) The adaptive lasso and its oracle properties. Journal of the American Statistical Association 101(476):1418–1429 Zhao L, Hu Q, Wang W (2015) Heterogeneous feature selection with multi-modal deep neural networks and sparse group lasso. 
IEEE Transactions on Multimedia 17(11):1936–1948 Zou [2006] Zou H (2006) The adaptive lasso and its oracle properties. Journal of the American Statistical Association 101(476):1418–1429 Zou H (2006) The adaptive lasso and its oracle properties. Journal of the American Statistical Association 101(476):1418–1429
- Hornik K (1991) Approximation capabilities of multilayer feedforward networks. Neural Networks 4(2):251–257. https://doi.org/10.1016/0893-6080(91)90009-T Huber [1992] Huber PJ (1992) Robust estimation of a location parameter. In: Breakthroughs in statistics: Methodology and distribution. Springer, p 492–518 I-Cheng [2009] I-Cheng Y (2009) Concrete slump test data. Tech. rep., Department of Information Management, Chung-Hua University, URL https://archive.ics.uci.edu/ml/datasets/concrete+slump+test Imaizumi and Fukumizu [2019] Imaizumi M, Fukumizu K (2019) Deep neural networks learn non-smooth functions effectively. In: The 22nd international conference on artificial intelligence and statistics, PMLR, pp 869–878 Jhun and Choi [2009] Jhun M, Choi I (2009) Bootstrapping least distance estimator in the multivariate regression model. Computational Statistics and Data Analysis 53(12):4221–4227 Jiang et al [2022] Jiang C, Jiang C, Chen D, et al (2022) Densely connected neural networks for nonlinear regression. Entropy 24:876. 10.3390/e24070876 Li et al [2017] Li F, Zurada J, Liu Y, et al (2017) Input layer regularization of multilayer feedforward neural networks. IEEE Access PP:1–1. 10.1109/ACCESS.2017.2713389 Ochiai et al [2017] Ochiai T, Matsuda S, Watanabe H, et al (2017) Automatic node selection for deep neural networks using group lasso regularization. In: 2017 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), IEEE, pp 5485–5489 Rumelhart et al [1986] Rumelhart DE, Hinton GE, Williams RJ (1986) Learning representations by back-propagating errors. Nature 323:533–536. 10.1038/323533a0 Scardapane et al [2017] Scardapane S, Comminiello D, Hussain A, et al (2017) Group sparse regularization for deep neural networks. Neurocomputing 241:81–89 Sohn et al [2012] Sohn S, Jung B, Jhun M (2012) Permutation tests using least distance estimator in the multivariate regression model. Computational Statistics - COMPUTATION STAT 27:1–11. 
10.1007/s00180-011-0247-3 Sze et al [2017] Sze V, Chen YH, Yang TJ, et al (2017) Efficient processing of deep neural networks: A tutorial and survey. Proceedings of the IEEE 105(12):2295–2329 Szekely et al [2008] Szekely G, Rizzo M, Bakirov N (2008) Measuring and testing dependence by correlation of distances. The Annals of Statistics 35. 10.1214/009053607000000505 Tibshirani [1996] Tibshirani R (1996) Regression shrinkage and selection via the lasso. Journal of the Royal Statistical Society: Series B (Methodological) 58(1):267–288 Wahba [1992] Wahba G (1992) Multivariate function and operator estimation, based on smoothing splines and reproducing kernels. In: SANTA FE INSTITUTE STUDIES IN THE SCIENCES OF COMPLEXITY-PROCEEDINGS VOLUME-, Addison-Wesley Publishing Co, pp 95–95 Wang and Leng [2008] Wang H, Leng C (2008) A note on adaptive group lasso. Computational Statistics and Data Analysis 52(12):5277–5286 Wang et al [2016] Wang J, Qingling C, Chang Q, et al (2016) Convergence analyses on sparse feedforward neural networks via group lasso regularization. Information Sciences 381. 10.1016/j.ins.2016.11.020 Wang and Cao [2022] Wang S, Cao G (2022) Robust deep neural network estimation for multi-dimensional functional data. Electronic Journal of Statistics 16(2):6461–6488 Yuan and Lin [2006] Yuan M, Lin Y (2006) Model selection and estimation in regression with grouped variables. Journal of the Royal Statistical Society: Series B (Statistical Methodology) 68(1):49–67 Zhang et al [2019] Zhang H, Wang J, Sun Z, et al (2019) Feature selection for neural networks using group lasso regularization. IEEE Transactions on Knowledge and Data Engineering PP:1–1. 10.1109/TKDE.2019.2893266 Zhao et al [2015] Zhao L, Hu Q, Wang W (2015) Heterogeneous feature selection with multi-modal deep neural networks and sparse group lasso. IEEE Transactions on Multimedia 17(11):1936–1948 Zou [2006] Zou H (2006) The adaptive lasso and its oracle properties. 
Journal of the American Statistical Association 101(476):1418–1429 Huber PJ (1992) Robust estimation of a location parameter. In: Breakthroughs in statistics: Methodology and distribution. Springer, p 492–518 I-Cheng [2009] I-Cheng Y (2009) Concrete slump test data. Tech. rep., Department of Information Management, Chung-Hua University, URL https://archive.ics.uci.edu/ml/datasets/concrete+slump+test Imaizumi and Fukumizu [2019] Imaizumi M, Fukumizu K (2019) Deep neural networks learn non-smooth functions effectively. In: The 22nd international conference on artificial intelligence and statistics, PMLR, pp 869–878 Jhun and Choi [2009] Jhun M, Choi I (2009) Bootstrapping least distance estimator in the multivariate regression model. Computational Statistics and Data Analysis 53(12):4221–4227 Jiang et al [2022] Jiang C, Jiang C, Chen D, et al (2022) Densely connected neural networks for nonlinear regression. Entropy 24:876. 10.3390/e24070876 Li et al [2017] Li F, Zurada J, Liu Y, et al (2017) Input layer regularization of multilayer feedforward neural networks. IEEE Access PP:1–1. 10.1109/ACCESS.2017.2713389 Ochiai et al [2017] Ochiai T, Matsuda S, Watanabe H, et al (2017) Automatic node selection for deep neural networks using group lasso regularization. In: 2017 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), IEEE, pp 5485–5489 Rumelhart et al [1986] Rumelhart DE, Hinton GE, Williams RJ (1986) Learning representations by back-propagating errors. Nature 323:533–536. 10.1038/323533a0 Scardapane et al [2017] Scardapane S, Comminiello D, Hussain A, et al (2017) Group sparse regularization for deep neural networks. Neurocomputing 241:81–89 Sohn et al [2012] Sohn S, Jung B, Jhun M (2012) Permutation tests using least distance estimator in the multivariate regression model. Computational Statistics - COMPUTATION STAT 27:1–11. 
10.1007/s00180-011-0247-3 Sze et al [2017] Sze V, Chen YH, Yang TJ, et al (2017) Efficient processing of deep neural networks: A tutorial and survey. Proceedings of the IEEE 105(12):2295–2329 Szekely et al [2008] Szekely G, Rizzo M, Bakirov N (2008) Measuring and testing dependence by correlation of distances. The Annals of Statistics 35. 10.1214/009053607000000505 Tibshirani [1996] Tibshirani R (1996) Regression shrinkage and selection via the lasso. Journal of the Royal Statistical Society: Series B (Methodological) 58(1):267–288 Wahba [1992] Wahba G (1992) Multivariate function and operator estimation, based on smoothing splines and reproducing kernels. In: SANTA FE INSTITUTE STUDIES IN THE SCIENCES OF COMPLEXITY-PROCEEDINGS VOLUME-, Addison-Wesley Publishing Co, pp 95–95 Wang and Leng [2008] Wang H, Leng C (2008) A note on adaptive group lasso. Computational Statistics and Data Analysis 52(12):5277–5286 Wang et al [2016] Wang J, Qingling C, Chang Q, et al (2016) Convergence analyses on sparse feedforward neural networks via group lasso regularization. Information Sciences 381. 10.1016/j.ins.2016.11.020 Wang and Cao [2022] Wang S, Cao G (2022) Robust deep neural network estimation for multi-dimensional functional data. Electronic Journal of Statistics 16(2):6461–6488 Yuan and Lin [2006] Yuan M, Lin Y (2006) Model selection and estimation in regression with grouped variables. Journal of the Royal Statistical Society: Series B (Statistical Methodology) 68(1):49–67 Zhang et al [2019] Zhang H, Wang J, Sun Z, et al (2019) Feature selection for neural networks using group lasso regularization. IEEE Transactions on Knowledge and Data Engineering PP:1–1. 10.1109/TKDE.2019.2893266 Zhao et al [2015] Zhao L, Hu Q, Wang W (2015) Heterogeneous feature selection with multi-modal deep neural networks and sparse group lasso. IEEE Transactions on Multimedia 17(11):1936–1948 Zou [2006] Zou H (2006) The adaptive lasso and its oracle properties. 
Journal of the American Statistical Association 101(476):1418–1429 I-Cheng Y (2009) Concrete slump test data. Tech. rep., Department of Information Management, Chung-Hua University, URL https://archive.ics.uci.edu/ml/datasets/concrete+slump+test Imaizumi and Fukumizu [2019] Imaizumi M, Fukumizu K (2019) Deep neural networks learn non-smooth functions effectively. In: The 22nd international conference on artificial intelligence and statistics, PMLR, pp 869–878 Jhun and Choi [2009] Jhun M, Choi I (2009) Bootstrapping least distance estimator in the multivariate regression model. Computational Statistics and Data Analysis 53(12):4221–4227 Jiang et al [2022] Jiang C, Jiang C, Chen D, et al (2022) Densely connected neural networks for nonlinear regression. Entropy 24:876. 10.3390/e24070876 Li et al [2017] Li F, Zurada J, Liu Y, et al (2017) Input layer regularization of multilayer feedforward neural networks. IEEE Access PP:1–1. 10.1109/ACCESS.2017.2713389 Ochiai et al [2017] Ochiai T, Matsuda S, Watanabe H, et al (2017) Automatic node selection for deep neural networks using group lasso regularization. In: 2017 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), IEEE, pp 5485–5489 Rumelhart et al [1986] Rumelhart DE, Hinton GE, Williams RJ (1986) Learning representations by back-propagating errors. Nature 323:533–536. 10.1038/323533a0 Scardapane et al [2017] Scardapane S, Comminiello D, Hussain A, et al (2017) Group sparse regularization for deep neural networks. Neurocomputing 241:81–89 Sohn et al [2012] Sohn S, Jung B, Jhun M (2012) Permutation tests using least distance estimator in the multivariate regression model. Computational Statistics - COMPUTATION STAT 27:1–11. 10.1007/s00180-011-0247-3 Sze et al [2017] Sze V, Chen YH, Yang TJ, et al (2017) Efficient processing of deep neural networks: A tutorial and survey. 
Proceedings of the IEEE 105(12):2295–2329 Szekely et al [2008] Szekely G, Rizzo M, Bakirov N (2008) Measuring and testing dependence by correlation of distances. The Annals of Statistics 35. 10.1214/009053607000000505 Tibshirani [1996] Tibshirani R (1996) Regression shrinkage and selection via the lasso. Journal of the Royal Statistical Society: Series B (Methodological) 58(1):267–288 Wahba [1992] Wahba G (1992) Multivariate function and operator estimation, based on smoothing splines and reproducing kernels. In: SANTA FE INSTITUTE STUDIES IN THE SCIENCES OF COMPLEXITY-PROCEEDINGS VOLUME-, Addison-Wesley Publishing Co, pp 95–95 Wang and Leng [2008] Wang H, Leng C (2008) A note on adaptive group lasso. Computational Statistics and Data Analysis 52(12):5277–5286 Wang et al [2016] Wang J, Qingling C, Chang Q, et al (2016) Convergence analyses on sparse feedforward neural networks via group lasso regularization. Information Sciences 381. 10.1016/j.ins.2016.11.020 Wang and Cao [2022] Wang S, Cao G (2022) Robust deep neural network estimation for multi-dimensional functional data. Electronic Journal of Statistics 16(2):6461–6488 Yuan and Lin [2006] Yuan M, Lin Y (2006) Model selection and estimation in regression with grouped variables. Journal of the Royal Statistical Society: Series B (Statistical Methodology) 68(1):49–67 Zhang et al [2019] Zhang H, Wang J, Sun Z, et al (2019) Feature selection for neural networks using group lasso regularization. IEEE Transactions on Knowledge and Data Engineering PP:1–1. 10.1109/TKDE.2019.2893266 Zhao et al [2015] Zhao L, Hu Q, Wang W (2015) Heterogeneous feature selection with multi-modal deep neural networks and sparse group lasso. IEEE Transactions on Multimedia 17(11):1936–1948 Zou [2006] Zou H (2006) The adaptive lasso and its oracle properties. Journal of the American Statistical Association 101(476):1418–1429 Imaizumi M, Fukumizu K (2019) Deep neural networks learn non-smooth functions effectively. 
In: The 22nd international conference on artificial intelligence and statistics, PMLR, pp 869–878 Jhun and Choi [2009] Jhun M, Choi I (2009) Bootstrapping least distance estimator in the multivariate regression model. Computational Statistics and Data Analysis 53(12):4221–4227 Jiang et al [2022] Jiang C, Jiang C, Chen D, et al (2022) Densely connected neural networks for nonlinear regression. Entropy 24:876. 10.3390/e24070876 Li et al [2017] Li F, Zurada J, Liu Y, et al (2017) Input layer regularization of multilayer feedforward neural networks. IEEE Access PP:1–1. 10.1109/ACCESS.2017.2713389 Ochiai et al [2017] Ochiai T, Matsuda S, Watanabe H, et al (2017) Automatic node selection for deep neural networks using group lasso regularization. In: 2017 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), IEEE, pp 5485–5489 Rumelhart et al [1986] Rumelhart DE, Hinton GE, Williams RJ (1986) Learning representations by back-propagating errors. Nature 323:533–536. 10.1038/323533a0 Scardapane et al [2017] Scardapane S, Comminiello D, Hussain A, et al (2017) Group sparse regularization for deep neural networks. Neurocomputing 241:81–89 Sohn et al [2012] Sohn S, Jung B, Jhun M (2012) Permutation tests using least distance estimator in the multivariate regression model. Computational Statistics - COMPUTATION STAT 27:1–11. 10.1007/s00180-011-0247-3 Sze et al [2017] Sze V, Chen YH, Yang TJ, et al (2017) Efficient processing of deep neural networks: A tutorial and survey. Proceedings of the IEEE 105(12):2295–2329 Szekely et al [2008] Szekely G, Rizzo M, Bakirov N (2008) Measuring and testing dependence by correlation of distances. The Annals of Statistics 35. 10.1214/009053607000000505 Tibshirani [1996] Tibshirani R (1996) Regression shrinkage and selection via the lasso. 
Journal of the Royal Statistical Society: Series B (Methodological) 58(1):267–288 Wahba [1992] Wahba G (1992) Multivariate function and operator estimation, based on smoothing splines and reproducing kernels. In: SANTA FE INSTITUTE STUDIES IN THE SCIENCES OF COMPLEXITY-PROCEEDINGS VOLUME-, Addison-Wesley Publishing Co, pp 95–95 Wang and Leng [2008] Wang H, Leng C (2008) A note on adaptive group lasso. Computational Statistics and Data Analysis 52(12):5277–5286 Wang et al [2016] Wang J, Qingling C, Chang Q, et al (2016) Convergence analyses on sparse feedforward neural networks via group lasso regularization. Information Sciences 381. 10.1016/j.ins.2016.11.020 Wang and Cao [2022] Wang S, Cao G (2022) Robust deep neural network estimation for multi-dimensional functional data. Electronic Journal of Statistics 16(2):6461–6488 Yuan and Lin [2006] Yuan M, Lin Y (2006) Model selection and estimation in regression with grouped variables. Journal of the Royal Statistical Society: Series B (Statistical Methodology) 68(1):49–67 Zhang et al [2019] Zhang H, Wang J, Sun Z, et al (2019) Feature selection for neural networks using group lasso regularization. IEEE Transactions on Knowledge and Data Engineering PP:1–1. 10.1109/TKDE.2019.2893266 Zhao et al [2015] Zhao L, Hu Q, Wang W (2015) Heterogeneous feature selection with multi-modal deep neural networks and sparse group lasso. IEEE Transactions on Multimedia 17(11):1936–1948 Zou [2006] Zou H (2006) The adaptive lasso and its oracle properties. Journal of the American Statistical Association 101(476):1418–1429 Jhun M, Choi I (2009) Bootstrapping least distance estimator in the multivariate regression model. Computational Statistics and Data Analysis 53(12):4221–4227 Jiang et al [2022] Jiang C, Jiang C, Chen D, et al (2022) Densely connected neural networks for nonlinear regression. Entropy 24:876. 
10.3390/e24070876 Li et al [2017] Li F, Zurada J, Liu Y, et al (2017) Input layer regularization of multilayer feedforward neural networks. IEEE Access PP:1–1. 10.1109/ACCESS.2017.2713389 Ochiai et al [2017] Ochiai T, Matsuda S, Watanabe H, et al (2017) Automatic node selection for deep neural networks using group lasso regularization. In: 2017 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), IEEE, pp 5485–5489 Rumelhart et al [1986] Rumelhart DE, Hinton GE, Williams RJ (1986) Learning representations by back-propagating errors. Nature 323:533–536. 10.1038/323533a0 Scardapane et al [2017] Scardapane S, Comminiello D, Hussain A, et al (2017) Group sparse regularization for deep neural networks. Neurocomputing 241:81–89 Sohn et al [2012] Sohn S, Jung B, Jhun M (2012) Permutation tests using least distance estimator in the multivariate regression model. Computational Statistics - COMPUTATION STAT 27:1–11. 10.1007/s00180-011-0247-3 Sze et al [2017] Sze V, Chen YH, Yang TJ, et al (2017) Efficient processing of deep neural networks: A tutorial and survey. Proceedings of the IEEE 105(12):2295–2329 Szekely et al [2008] Szekely G, Rizzo M, Bakirov N (2008) Measuring and testing dependence by correlation of distances. The Annals of Statistics 35. 10.1214/009053607000000505 Tibshirani [1996] Tibshirani R (1996) Regression shrinkage and selection via the lasso. Journal of the Royal Statistical Society: Series B (Methodological) 58(1):267–288 Wahba [1992] Wahba G (1992) Multivariate function and operator estimation, based on smoothing splines and reproducing kernels. In: SANTA FE INSTITUTE STUDIES IN THE SCIENCES OF COMPLEXITY-PROCEEDINGS VOLUME-, Addison-Wesley Publishing Co, pp 95–95 Wang and Leng [2008] Wang H, Leng C (2008) A note on adaptive group lasso. 
Computational Statistics and Data Analysis 52(12):5277–5286
Wang et al [2016] Wang J, Qingling C, Chang Q, et al (2016) Convergence analyses on sparse feedforward neural networks via group lasso regularization. Information Sciences 381. 10.1016/j.ins.2016.11.020
Wang and Cao [2022] Wang S, Cao G (2022) Robust deep neural network estimation for multi-dimensional functional data. Electronic Journal of Statistics 16(2):6461–6488
Yuan and Lin [2006] Yuan M, Lin Y (2006) Model selection and estimation in regression with grouped variables. Journal of the Royal Statistical Society: Series B (Statistical Methodology) 68(1):49–67
Zhang et al [2019] Zhang H, Wang J, Sun Z, et al (2019) Feature selection for neural networks using group lasso regularization. IEEE Transactions on Knowledge and Data Engineering PP:1–1. 10.1109/TKDE.2019.2893266
Zhao et al [2015] Zhao L, Hu Q, Wang W (2015) Heterogeneous feature selection with multi-modal deep neural networks and sparse group lasso. IEEE Transactions on Multimedia 17(11):1936–1948
Zou [2006] Zou H (2006) The adaptive lasso and its oracle properties. Journal of the American Statistical Association 101(476):1418–1429
In: 2017 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), IEEE, pp 5485–5489 Rumelhart et al [1986] Rumelhart DE, Hinton GE, Williams RJ (1986) Learning representations by back-propagating errors. Nature 323:533–536. 10.1038/323533a0 Scardapane et al [2017] Scardapane S, Comminiello D, Hussain A, et al (2017) Group sparse regularization for deep neural networks. Neurocomputing 241:81–89 Sohn et al [2012] Sohn S, Jung B, Jhun M (2012) Permutation tests using least distance estimator in the multivariate regression model. Computational Statistics - COMPUTATION STAT 27:1–11. 10.1007/s00180-011-0247-3 Sze et al [2017] Sze V, Chen YH, Yang TJ, et al (2017) Efficient processing of deep neural networks: A tutorial and survey. Proceedings of the IEEE 105(12):2295–2329 Szekely et al [2008] Szekely G, Rizzo M, Bakirov N (2008) Measuring and testing dependence by correlation of distances. The Annals of Statistics 35. 10.1214/009053607000000505 Tibshirani [1996] Tibshirani R (1996) Regression shrinkage and selection via the lasso. Journal of the Royal Statistical Society: Series B (Methodological) 58(1):267–288 Wahba [1992] Wahba G (1992) Multivariate function and operator estimation, based on smoothing splines and reproducing kernels. In: SANTA FE INSTITUTE STUDIES IN THE SCIENCES OF COMPLEXITY-PROCEEDINGS VOLUME-, Addison-Wesley Publishing Co, pp 95–95 Wang and Leng [2008] Wang H, Leng C (2008) A note on adaptive group lasso. Computational Statistics and Data Analysis 52(12):5277–5286 Wang et al [2016] Wang J, Qingling C, Chang Q, et al (2016) Convergence analyses on sparse feedforward neural networks via group lasso regularization. Information Sciences 381. 10.1016/j.ins.2016.11.020 Wang and Cao [2022] Wang S, Cao G (2022) Robust deep neural network estimation for multi-dimensional functional data. 
Electronic Journal of Statistics 16(2):6461–6488 Yuan and Lin [2006] Yuan M, Lin Y (2006) Model selection and estimation in regression with grouped variables. Journal of the Royal Statistical Society: Series B (Statistical Methodology) 68(1):49–67 Zhang et al [2019] Zhang H, Wang J, Sun Z, et al (2019) Feature selection for neural networks using group lasso regularization. IEEE Transactions on Knowledge and Data Engineering PP:1–1. 10.1109/TKDE.2019.2893266 Zhao et al [2015] Zhao L, Hu Q, Wang W (2015) Heterogeneous feature selection with multi-modal deep neural networks and sparse group lasso. IEEE Transactions on Multimedia 17(11):1936–1948 Zou [2006] Zou H (2006) The adaptive lasso and its oracle properties. Journal of the American Statistical Association 101(476):1418–1429 Jiang C, Jiang C, Chen D, et al (2022) Densely connected neural networks for nonlinear regression. Entropy 24:876. 10.3390/e24070876 Li et al [2017] Li F, Zurada J, Liu Y, et al (2017) Input layer regularization of multilayer feedforward neural networks. IEEE Access PP:1–1. 10.1109/ACCESS.2017.2713389 Ochiai et al [2017] Ochiai T, Matsuda S, Watanabe H, et al (2017) Automatic node selection for deep neural networks using group lasso regularization. In: 2017 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), IEEE, pp 5485–5489 Rumelhart et al [1986] Rumelhart DE, Hinton GE, Williams RJ (1986) Learning representations by back-propagating errors. Nature 323:533–536. 10.1038/323533a0 Scardapane et al [2017] Scardapane S, Comminiello D, Hussain A, et al (2017) Group sparse regularization for deep neural networks. Neurocomputing 241:81–89 Sohn et al [2012] Sohn S, Jung B, Jhun M (2012) Permutation tests using least distance estimator in the multivariate regression model. Computational Statistics - COMPUTATION STAT 27:1–11. 
10.1007/s00180-011-0247-3 Sze et al [2017] Sze V, Chen YH, Yang TJ, et al (2017) Efficient processing of deep neural networks: A tutorial and survey. Proceedings of the IEEE 105(12):2295–2329 Szekely et al [2008] Szekely G, Rizzo M, Bakirov N (2008) Measuring and testing dependence by correlation of distances. The Annals of Statistics 35. 10.1214/009053607000000505 Tibshirani [1996] Tibshirani R (1996) Regression shrinkage and selection via the lasso. Journal of the Royal Statistical Society: Series B (Methodological) 58(1):267–288 Wahba [1992] Wahba G (1992) Multivariate function and operator estimation, based on smoothing splines and reproducing kernels. In: SANTA FE INSTITUTE STUDIES IN THE SCIENCES OF COMPLEXITY-PROCEEDINGS VOLUME-, Addison-Wesley Publishing Co, pp 95–95 Wang and Leng [2008] Wang H, Leng C (2008) A note on adaptive group lasso. Computational Statistics and Data Analysis 52(12):5277–5286 Wang et al [2016] Wang J, Qingling C, Chang Q, et al (2016) Convergence analyses on sparse feedforward neural networks via group lasso regularization. Information Sciences 381. 10.1016/j.ins.2016.11.020 Wang and Cao [2022] Wang S, Cao G (2022) Robust deep neural network estimation for multi-dimensional functional data. Electronic Journal of Statistics 16(2):6461–6488 Yuan and Lin [2006] Yuan M, Lin Y (2006) Model selection and estimation in regression with grouped variables. Journal of the Royal Statistical Society: Series B (Statistical Methodology) 68(1):49–67 Zhang et al [2019] Zhang H, Wang J, Sun Z, et al (2019) Feature selection for neural networks using group lasso regularization. IEEE Transactions on Knowledge and Data Engineering PP:1–1. 10.1109/TKDE.2019.2893266 Zhao et al [2015] Zhao L, Hu Q, Wang W (2015) Heterogeneous feature selection with multi-modal deep neural networks and sparse group lasso. IEEE Transactions on Multimedia 17(11):1936–1948 Zou [2006] Zou H (2006) The adaptive lasso and its oracle properties. 
Journal of the American Statistical Association 101(476):1418–1429 Li F, Zurada J, Liu Y, et al (2017) Input layer regularization of multilayer feedforward neural networks. IEEE Access PP:1–1. 10.1109/ACCESS.2017.2713389 Ochiai et al [2017] Ochiai T, Matsuda S, Watanabe H, et al (2017) Automatic node selection for deep neural networks using group lasso regularization. In: 2017 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), IEEE, pp 5485–5489 Rumelhart et al [1986] Rumelhart DE, Hinton GE, Williams RJ (1986) Learning representations by back-propagating errors. Nature 323:533–536. 10.1038/323533a0 Scardapane et al [2017] Scardapane S, Comminiello D, Hussain A, et al (2017) Group sparse regularization for deep neural networks. Neurocomputing 241:81–89 Sohn et al [2012] Sohn S, Jung B, Jhun M (2012) Permutation tests using least distance estimator in the multivariate regression model. Computational Statistics - COMPUTATION STAT 27:1–11. 10.1007/s00180-011-0247-3 Sze et al [2017] Sze V, Chen YH, Yang TJ, et al (2017) Efficient processing of deep neural networks: A tutorial and survey. Proceedings of the IEEE 105(12):2295–2329 Szekely et al [2008] Szekely G, Rizzo M, Bakirov N (2008) Measuring and testing dependence by correlation of distances. The Annals of Statistics 35. 10.1214/009053607000000505 Tibshirani [1996] Tibshirani R (1996) Regression shrinkage and selection via the lasso. Journal of the Royal Statistical Society: Series B (Methodological) 58(1):267–288 Wahba [1992] Wahba G (1992) Multivariate function and operator estimation, based on smoothing splines and reproducing kernels. In: SANTA FE INSTITUTE STUDIES IN THE SCIENCES OF COMPLEXITY-PROCEEDINGS VOLUME-, Addison-Wesley Publishing Co, pp 95–95 Wang and Leng [2008] Wang H, Leng C (2008) A note on adaptive group lasso. 
Computational Statistics and Data Analysis 52(12):5277–5286 Wang et al [2016] Wang J, Qingling C, Chang Q, et al (2016) Convergence analyses on sparse feedforward neural networks via group lasso regularization. Information Sciences 381. 10.1016/j.ins.2016.11.020 Wang and Cao [2022] Wang S, Cao G (2022) Robust deep neural network estimation for multi-dimensional functional data. Electronic Journal of Statistics 16(2):6461–6488 Yuan and Lin [2006] Yuan M, Lin Y (2006) Model selection and estimation in regression with grouped variables. Journal of the Royal Statistical Society: Series B (Statistical Methodology) 68(1):49–67 Zhang et al [2019] Zhang H, Wang J, Sun Z, et al (2019) Feature selection for neural networks using group lasso regularization. IEEE Transactions on Knowledge and Data Engineering PP:1–1. 10.1109/TKDE.2019.2893266 Zhao et al [2015] Zhao L, Hu Q, Wang W (2015) Heterogeneous feature selection with multi-modal deep neural networks and sparse group lasso. IEEE Transactions on Multimedia 17(11):1936–1948 Zou [2006] Zou H (2006) The adaptive lasso and its oracle properties. Journal of the American Statistical Association 101(476):1418–1429 Ochiai T, Matsuda S, Watanabe H, et al (2017) Automatic node selection for deep neural networks using group lasso regularization. In: 2017 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), IEEE, pp 5485–5489 Rumelhart et al [1986] Rumelhart DE, Hinton GE, Williams RJ (1986) Learning representations by back-propagating errors. Nature 323:533–536. 10.1038/323533a0 Scardapane et al [2017] Scardapane S, Comminiello D, Hussain A, et al (2017) Group sparse regularization for deep neural networks. Neurocomputing 241:81–89 Sohn et al [2012] Sohn S, Jung B, Jhun M (2012) Permutation tests using least distance estimator in the multivariate regression model. Computational Statistics - COMPUTATION STAT 27:1–11. 
10.1007/s00180-011-0247-3 Sze et al [2017] Sze V, Chen YH, Yang TJ, et al (2017) Efficient processing of deep neural networks: A tutorial and survey. Proceedings of the IEEE 105(12):2295–2329 Szekely et al [2008] Szekely G, Rizzo M, Bakirov N (2008) Measuring and testing dependence by correlation of distances. The Annals of Statistics 35. 10.1214/009053607000000505 Tibshirani [1996] Tibshirani R (1996) Regression shrinkage and selection via the lasso. Journal of the Royal Statistical Society: Series B (Methodological) 58(1):267–288 Wahba [1992] Wahba G (1992) Multivariate function and operator estimation, based on smoothing splines and reproducing kernels. In: SANTA FE INSTITUTE STUDIES IN THE SCIENCES OF COMPLEXITY-PROCEEDINGS VOLUME-, Addison-Wesley Publishing Co, pp 95–95 Wang and Leng [2008] Wang H, Leng C (2008) A note on adaptive group lasso. Computational Statistics and Data Analysis 52(12):5277–5286 Wang et al [2016] Wang J, Qingling C, Chang Q, et al (2016) Convergence analyses on sparse feedforward neural networks via group lasso regularization. Information Sciences 381. 10.1016/j.ins.2016.11.020 Wang and Cao [2022] Wang S, Cao G (2022) Robust deep neural network estimation for multi-dimensional functional data. Electronic Journal of Statistics 16(2):6461–6488 Yuan and Lin [2006] Yuan M, Lin Y (2006) Model selection and estimation in regression with grouped variables. Journal of the Royal Statistical Society: Series B (Statistical Methodology) 68(1):49–67 Zhang et al [2019] Zhang H, Wang J, Sun Z, et al (2019) Feature selection for neural networks using group lasso regularization. IEEE Transactions on Knowledge and Data Engineering PP:1–1. 10.1109/TKDE.2019.2893266 Zhao et al [2015] Zhao L, Hu Q, Wang W (2015) Heterogeneous feature selection with multi-modal deep neural networks and sparse group lasso. IEEE Transactions on Multimedia 17(11):1936–1948 Zou [2006] Zou H (2006) The adaptive lasso and its oracle properties. 
Journal of the American Statistical Association 101(476):1418–1429 Rumelhart DE, Hinton GE, Williams RJ (1986) Learning representations by back-propagating errors. Nature 323:533–536. 10.1038/323533a0 Scardapane et al [2017] Scardapane S, Comminiello D, Hussain A, et al (2017) Group sparse regularization for deep neural networks. Neurocomputing 241:81–89 Sohn et al [2012] Sohn S, Jung B, Jhun M (2012) Permutation tests using least distance estimator in the multivariate regression model. Computational Statistics - COMPUTATION STAT 27:1–11. 10.1007/s00180-011-0247-3 Sze et al [2017] Sze V, Chen YH, Yang TJ, et al (2017) Efficient processing of deep neural networks: A tutorial and survey. Proceedings of the IEEE 105(12):2295–2329 Szekely et al [2008] Szekely G, Rizzo M, Bakirov N (2008) Measuring and testing dependence by correlation of distances. The Annals of Statistics 35. 10.1214/009053607000000505 Tibshirani [1996] Tibshirani R (1996) Regression shrinkage and selection via the lasso. Journal of the Royal Statistical Society: Series B (Methodological) 58(1):267–288 Wahba [1992] Wahba G (1992) Multivariate function and operator estimation, based on smoothing splines and reproducing kernels. In: SANTA FE INSTITUTE STUDIES IN THE SCIENCES OF COMPLEXITY-PROCEEDINGS VOLUME-, Addison-Wesley Publishing Co, pp 95–95 Wang and Leng [2008] Wang H, Leng C (2008) A note on adaptive group lasso. Computational Statistics and Data Analysis 52(12):5277–5286 Wang et al [2016] Wang J, Qingling C, Chang Q, et al (2016) Convergence analyses on sparse feedforward neural networks via group lasso regularization. Information Sciences 381. 10.1016/j.ins.2016.11.020 Wang and Cao [2022] Wang S, Cao G (2022) Robust deep neural network estimation for multi-dimensional functional data. Electronic Journal of Statistics 16(2):6461–6488 Yuan and Lin [2006] Yuan M, Lin Y (2006) Model selection and estimation in regression with grouped variables. 
Journal of the Royal Statistical Society: Series B (Statistical Methodology) 68(1):49–67 Zhang et al [2019] Zhang H, Wang J, Sun Z, et al (2019) Feature selection for neural networks using group lasso regularization. IEEE Transactions on Knowledge and Data Engineering PP:1–1. 10.1109/TKDE.2019.2893266 Zhao et al [2015] Zhao L, Hu Q, Wang W (2015) Heterogeneous feature selection with multi-modal deep neural networks and sparse group lasso. IEEE Transactions on Multimedia 17(11):1936–1948 Zou [2006] Zou H (2006) The adaptive lasso and its oracle properties. Journal of the American Statistical Association 101(476):1418–1429 Scardapane S, Comminiello D, Hussain A, et al (2017) Group sparse regularization for deep neural networks. Neurocomputing 241:81–89 Sohn et al [2012] Sohn S, Jung B, Jhun M (2012) Permutation tests using least distance estimator in the multivariate regression model. Computational Statistics - COMPUTATION STAT 27:1–11. 10.1007/s00180-011-0247-3 Sze et al [2017] Sze V, Chen YH, Yang TJ, et al (2017) Efficient processing of deep neural networks: A tutorial and survey. Proceedings of the IEEE 105(12):2295–2329 Szekely et al [2008] Szekely G, Rizzo M, Bakirov N (2008) Measuring and testing dependence by correlation of distances. The Annals of Statistics 35. 10.1214/009053607000000505 Tibshirani [1996] Tibshirani R (1996) Regression shrinkage and selection via the lasso. Journal of the Royal Statistical Society: Series B (Methodological) 58(1):267–288 Wahba [1992] Wahba G (1992) Multivariate function and operator estimation, based on smoothing splines and reproducing kernels. In: SANTA FE INSTITUTE STUDIES IN THE SCIENCES OF COMPLEXITY-PROCEEDINGS VOLUME-, Addison-Wesley Publishing Co, pp 95–95 Wang and Leng [2008] Wang H, Leng C (2008) A note on adaptive group lasso. 
Computational Statistics and Data Analysis 52(12):5277–5286 Wang et al [2016] Wang J, Qingling C, Chang Q, et al (2016) Convergence analyses on sparse feedforward neural networks via group lasso regularization. Information Sciences 381. 10.1016/j.ins.2016.11.020 Wang and Cao [2022] Wang S, Cao G (2022) Robust deep neural network estimation for multi-dimensional functional data. Electronic Journal of Statistics 16(2):6461–6488 Yuan and Lin [2006] Yuan M, Lin Y (2006) Model selection and estimation in regression with grouped variables. Journal of the Royal Statistical Society: Series B (Statistical Methodology) 68(1):49–67 Zhang et al [2019] Zhang H, Wang J, Sun Z, et al (2019) Feature selection for neural networks using group lasso regularization. IEEE Transactions on Knowledge and Data Engineering PP:1–1. 10.1109/TKDE.2019.2893266 Zhao et al [2015] Zhao L, Hu Q, Wang W (2015) Heterogeneous feature selection with multi-modal deep neural networks and sparse group lasso. IEEE Transactions on Multimedia 17(11):1936–1948 Zou [2006] Zou H (2006) The adaptive lasso and its oracle properties. Journal of the American Statistical Association 101(476):1418–1429 Sohn S, Jung B, Jhun M (2012) Permutation tests using least distance estimator in the multivariate regression model. Computational Statistics - COMPUTATION STAT 27:1–11. 10.1007/s00180-011-0247-3 Sze et al [2017] Sze V, Chen YH, Yang TJ, et al (2017) Efficient processing of deep neural networks: A tutorial and survey. Proceedings of the IEEE 105(12):2295–2329 Szekely et al [2008] Szekely G, Rizzo M, Bakirov N (2008) Measuring and testing dependence by correlation of distances. The Annals of Statistics 35. 10.1214/009053607000000505 Tibshirani [1996] Tibshirani R (1996) Regression shrinkage and selection via the lasso. 
Journal of the Royal Statistical Society: Series B (Methodological) 58(1):267–288 Wahba [1992] Wahba G (1992) Multivariate function and operator estimation, based on smoothing splines and reproducing kernels. In: SANTA FE INSTITUTE STUDIES IN THE SCIENCES OF COMPLEXITY-PROCEEDINGS VOLUME-, Addison-Wesley Publishing Co, pp 95–95 Wang and Leng [2008] Wang H, Leng C (2008) A note on adaptive group lasso. Computational Statistics and Data Analysis 52(12):5277–5286 Wang et al [2016] Wang J, Qingling C, Chang Q, et al (2016) Convergence analyses on sparse feedforward neural networks via group lasso regularization. Information Sciences 381. 10.1016/j.ins.2016.11.020 Wang and Cao [2022] Wang S, Cao G (2022) Robust deep neural network estimation for multi-dimensional functional data. Electronic Journal of Statistics 16(2):6461–6488 Yuan and Lin [2006] Yuan M, Lin Y (2006) Model selection and estimation in regression with grouped variables. Journal of the Royal Statistical Society: Series B (Statistical Methodology) 68(1):49–67 Zhang et al [2019] Zhang H, Wang J, Sun Z, et al (2019) Feature selection for neural networks using group lasso regularization. IEEE Transactions on Knowledge and Data Engineering PP:1–1. 10.1109/TKDE.2019.2893266 Zhao et al [2015] Zhao L, Hu Q, Wang W (2015) Heterogeneous feature selection with multi-modal deep neural networks and sparse group lasso. IEEE Transactions on Multimedia 17(11):1936–1948 Zou [2006] Zou H (2006) The adaptive lasso and its oracle properties. Journal of the American Statistical Association 101(476):1418–1429 Sze V, Chen YH, Yang TJ, et al (2017) Efficient processing of deep neural networks: A tutorial and survey. Proceedings of the IEEE 105(12):2295–2329 Szekely et al [2008] Szekely G, Rizzo M, Bakirov N (2008) Measuring and testing dependence by correlation of distances. The Annals of Statistics 35. 10.1214/009053607000000505 Tibshirani [1996] Tibshirani R (1996) Regression shrinkage and selection via the lasso. 
Journal of the Royal Statistical Society: Series B (Methodological) 58(1):267–288 Wahba [1992] Wahba G (1992) Multivariate function and operator estimation, based on smoothing splines and reproducing kernels. In: SANTA FE INSTITUTE STUDIES IN THE SCIENCES OF COMPLEXITY-PROCEEDINGS VOLUME-, Addison-Wesley Publishing Co, pp 95–95 Wang and Leng [2008] Wang H, Leng C (2008) A note on adaptive group lasso. Computational Statistics and Data Analysis 52(12):5277–5286 Wang et al [2016] Wang J, Qingling C, Chang Q, et al (2016) Convergence analyses on sparse feedforward neural networks via group lasso regularization. Information Sciences 381. 10.1016/j.ins.2016.11.020 Wang and Cao [2022] Wang S, Cao G (2022) Robust deep neural network estimation for multi-dimensional functional data. Electronic Journal of Statistics 16(2):6461–6488 Yuan and Lin [2006] Yuan M, Lin Y (2006) Model selection and estimation in regression with grouped variables. Journal of the Royal Statistical Society: Series B (Statistical Methodology) 68(1):49–67 Zhang et al [2019] Zhang H, Wang J, Sun Z, et al (2019) Feature selection for neural networks using group lasso regularization. IEEE Transactions on Knowledge and Data Engineering PP:1–1. 10.1109/TKDE.2019.2893266 Zhao et al [2015] Zhao L, Hu Q, Wang W (2015) Heterogeneous feature selection with multi-modal deep neural networks and sparse group lasso. IEEE Transactions on Multimedia 17(11):1936–1948 Zou [2006] Zou H (2006) The adaptive lasso and its oracle properties. Journal of the American Statistical Association 101(476):1418–1429 Szekely G, Rizzo M, Bakirov N (2008) Measuring and testing dependence by correlation of distances. The Annals of Statistics 35. 10.1214/009053607000000505 Tibshirani [1996] Tibshirani R (1996) Regression shrinkage and selection via the lasso. 
Journal of the Royal Statistical Society: Series B (Methodological) 58(1):267–288 Wahba [1992] Wahba G (1992) Multivariate function and operator estimation, based on smoothing splines and reproducing kernels. In: SANTA FE INSTITUTE STUDIES IN THE SCIENCES OF COMPLEXITY-PROCEEDINGS VOLUME-, Addison-Wesley Publishing Co, pp 95–95 Wang and Leng [2008] Wang H, Leng C (2008) A note on adaptive group lasso. Computational Statistics and Data Analysis 52(12):5277–5286 Wang et al [2016] Wang J, Qingling C, Chang Q, et al (2016) Convergence analyses on sparse feedforward neural networks via group lasso regularization. Information Sciences 381. 10.1016/j.ins.2016.11.020 Wang and Cao [2022] Wang S, Cao G (2022) Robust deep neural network estimation for multi-dimensional functional data. Electronic Journal of Statistics 16(2):6461–6488 Yuan and Lin [2006] Yuan M, Lin Y (2006) Model selection and estimation in regression with grouped variables. Journal of the Royal Statistical Society: Series B (Statistical Methodology) 68(1):49–67 Zhang et al [2019] Zhang H, Wang J, Sun Z, et al (2019) Feature selection for neural networks using group lasso regularization. IEEE Transactions on Knowledge and Data Engineering PP:1–1. 10.1109/TKDE.2019.2893266 Zhao et al [2015] Zhao L, Hu Q, Wang W (2015) Heterogeneous feature selection with multi-modal deep neural networks and sparse group lasso. IEEE Transactions on Multimedia 17(11):1936–1948 Zou [2006] Zou H (2006) The adaptive lasso and its oracle properties. Journal of the American Statistical Association 101(476):1418–1429 Tibshirani R (1996) Regression shrinkage and selection via the lasso. Journal of the Royal Statistical Society: Series B (Methodological) 58(1):267–288 Wahba [1992] Wahba G (1992) Multivariate function and operator estimation, based on smoothing splines and reproducing kernels. 
In: SANTA FE INSTITUTE STUDIES IN THE SCIENCES OF COMPLEXITY-PROCEEDINGS VOLUME-, Addison-Wesley Publishing Co, pp 95–95 Wang and Leng [2008] Wang H, Leng C (2008) A note on adaptive group lasso. Computational Statistics and Data Analysis 52(12):5277–5286 Wang et al [2016] Wang J, Qingling C, Chang Q, et al (2016) Convergence analyses on sparse feedforward neural networks via group lasso regularization. Information Sciences 381. 10.1016/j.ins.2016.11.020 Wang and Cao [2022] Wang S, Cao G (2022) Robust deep neural network estimation for multi-dimensional functional data. Electronic Journal of Statistics 16(2):6461–6488 Yuan and Lin [2006] Yuan M, Lin Y (2006) Model selection and estimation in regression with grouped variables. Journal of the Royal Statistical Society: Series B (Statistical Methodology) 68(1):49–67 Zhang et al [2019] Zhang H, Wang J, Sun Z, et al (2019) Feature selection for neural networks using group lasso regularization. IEEE Transactions on Knowledge and Data Engineering PP:1–1. 10.1109/TKDE.2019.2893266 Zhao et al [2015] Zhao L, Hu Q, Wang W (2015) Heterogeneous feature selection with multi-modal deep neural networks and sparse group lasso. IEEE Transactions on Multimedia 17(11):1936–1948 Zou [2006] Zou H (2006) The adaptive lasso and its oracle properties. Journal of the American Statistical Association 101(476):1418–1429 Wahba G (1992) Multivariate function and operator estimation, based on smoothing splines and reproducing kernels. In: SANTA FE INSTITUTE STUDIES IN THE SCIENCES OF COMPLEXITY-PROCEEDINGS VOLUME-, Addison-Wesley Publishing Co, pp 95–95 Wang and Leng [2008] Wang H, Leng C (2008) A note on adaptive group lasso. Computational Statistics and Data Analysis 52(12):5277–5286 Wang et al [2016] Wang J, Qingling C, Chang Q, et al (2016) Convergence analyses on sparse feedforward neural networks via group lasso regularization. Information Sciences 381. 
10.1016/j.ins.2016.11.020 Wang and Cao [2022] Wang S, Cao G (2022) Robust deep neural network estimation for multi-dimensional functional data. Electronic Journal of Statistics 16(2):6461–6488 Yuan and Lin [2006] Yuan M, Lin Y (2006) Model selection and estimation in regression with grouped variables. Journal of the Royal Statistical Society: Series B (Statistical Methodology) 68(1):49–67 Zhang et al [2019] Zhang H, Wang J, Sun Z, et al (2019) Feature selection for neural networks using group lasso regularization. IEEE Transactions on Knowledge and Data Engineering PP:1–1. 10.1109/TKDE.2019.2893266 Zhao et al [2015] Zhao L, Hu Q, Wang W (2015) Heterogeneous feature selection with multi-modal deep neural networks and sparse group lasso. IEEE Transactions on Multimedia 17(11):1936–1948 Zou [2006] Zou H (2006) The adaptive lasso and its oracle properties. Journal of the American Statistical Association 101(476):1418–1429 Wang H, Leng C (2008) A note on adaptive group lasso. Computational Statistics and Data Analysis 52(12):5277–5286 Wang et al [2016] Wang J, Qingling C, Chang Q, et al (2016) Convergence analyses on sparse feedforward neural networks via group lasso regularization. Information Sciences 381. 10.1016/j.ins.2016.11.020 Wang and Cao [2022] Wang S, Cao G (2022) Robust deep neural network estimation for multi-dimensional functional data. Electronic Journal of Statistics 16(2):6461–6488 Yuan and Lin [2006] Yuan M, Lin Y (2006) Model selection and estimation in regression with grouped variables. Journal of the Royal Statistical Society: Series B (Statistical Methodology) 68(1):49–67 Zhang et al [2019] Zhang H, Wang J, Sun Z, et al (2019) Feature selection for neural networks using group lasso regularization. IEEE Transactions on Knowledge and Data Engineering PP:1–1. 10.1109/TKDE.2019.2893266 Zhao et al [2015] Zhao L, Hu Q, Wang W (2015) Heterogeneous feature selection with multi-modal deep neural networks and sparse group lasso. 
IEEE Transactions on Multimedia 17(11):1936–1948 Zou [2006] Zou H (2006) The adaptive lasso and its oracle properties. Journal of the American Statistical Association 101(476):1418–1429 Wang J, Qingling C, Chang Q, et al (2016) Convergence analyses on sparse feedforward neural networks via group lasso regularization. Information Sciences 381. 10.1016/j.ins.2016.11.020 Wang and Cao [2022] Wang S, Cao G (2022) Robust deep neural network estimation for multi-dimensional functional data. Electronic Journal of Statistics 16(2):6461–6488 Yuan and Lin [2006] Yuan M, Lin Y (2006) Model selection and estimation in regression with grouped variables. Journal of the Royal Statistical Society: Series B (Statistical Methodology) 68(1):49–67 Zhang et al [2019] Zhang H, Wang J, Sun Z, et al (2019) Feature selection for neural networks using group lasso regularization. IEEE Transactions on Knowledge and Data Engineering PP:1–1. 10.1109/TKDE.2019.2893266 Zhao et al [2015] Zhao L, Hu Q, Wang W (2015) Heterogeneous feature selection with multi-modal deep neural networks and sparse group lasso. IEEE Transactions on Multimedia 17(11):1936–1948 Zou [2006] Zou H (2006) The adaptive lasso and its oracle properties. Journal of the American Statistical Association 101(476):1418–1429 Wang S, Cao G (2022) Robust deep neural network estimation for multi-dimensional functional data. Electronic Journal of Statistics 16(2):6461–6488 Yuan and Lin [2006] Yuan M, Lin Y (2006) Model selection and estimation in regression with grouped variables. Journal of the Royal Statistical Society: Series B (Statistical Methodology) 68(1):49–67 Zhang et al [2019] Zhang H, Wang J, Sun Z, et al (2019) Feature selection for neural networks using group lasso regularization. IEEE Transactions on Knowledge and Data Engineering PP:1–1. 10.1109/TKDE.2019.2893266 Zhao et al [2015] Zhao L, Hu Q, Wang W (2015) Heterogeneous feature selection with multi-modal deep neural networks and sparse group lasso. 
IEEE Transactions on Multimedia 17(11):1936–1948 Zou [2006] Zou H (2006) The adaptive lasso and its oracle properties. Journal of the American Statistical Association 101(476):1418–1429 Yuan M, Lin Y (2006) Model selection and estimation in regression with grouped variables. Journal of the Royal Statistical Society: Series B (Statistical Methodology) 68(1):49–67 Zhang et al [2019] Zhang H, Wang J, Sun Z, et al (2019) Feature selection for neural networks using group lasso regularization. IEEE Transactions on Knowledge and Data Engineering PP:1–1. 10.1109/TKDE.2019.2893266 Zhao et al [2015] Zhao L, Hu Q, Wang W (2015) Heterogeneous feature selection with multi-modal deep neural networks and sparse group lasso. IEEE Transactions on Multimedia 17(11):1936–1948 Zou [2006] Zou H (2006) The adaptive lasso and its oracle properties. Journal of the American Statistical Association 101(476):1418–1429 Zhang H, Wang J, Sun Z, et al (2019) Feature selection for neural networks using group lasso regularization. IEEE Transactions on Knowledge and Data Engineering PP:1–1. 10.1109/TKDE.2019.2893266 Zhao et al [2015] Zhao L, Hu Q, Wang W (2015) Heterogeneous feature selection with multi-modal deep neural networks and sparse group lasso. IEEE Transactions on Multimedia 17(11):1936–1948 Zou [2006] Zou H (2006) The adaptive lasso and its oracle properties. Journal of the American Statistical Association 101(476):1418–1429 Zhao L, Hu Q, Wang W (2015) Heterogeneous feature selection with multi-modal deep neural networks and sparse group lasso. IEEE Transactions on Multimedia 17(11):1936–1948 Zou [2006] Zou H (2006) The adaptive lasso and its oracle properties. Journal of the American Statistical Association 101(476):1418–1429 Zou H (2006) The adaptive lasso and its oracle properties. Journal of the American Statistical Association 101(476):1418–1429
- I-Cheng Y (2009) Concrete slump test data. Tech. rep., Department of Information Management, Chung-Hua University, URL https://archive.ics.uci.edu/ml/datasets/concrete+slump+test Imaizumi and Fukumizu [2019] Imaizumi M, Fukumizu K (2019) Deep neural networks learn non-smooth functions effectively. In: The 22nd international conference on artificial intelligence and statistics, PMLR, pp 869–878 Jhun and Choi [2009] Jhun M, Choi I (2009) Bootstrapping least distance estimator in the multivariate regression model. Computational Statistics and Data Analysis 53(12):4221–4227 Jiang et al [2022] Jiang C, Jiang C, Chen D, et al (2022) Densely connected neural networks for nonlinear regression. Entropy 24:876. 10.3390/e24070876 Li et al [2017] Li F, Zurada J, Liu Y, et al (2017) Input layer regularization of multilayer feedforward neural networks. IEEE Access PP:1–1. 10.1109/ACCESS.2017.2713389 Ochiai et al [2017] Ochiai T, Matsuda S, Watanabe H, et al (2017) Automatic node selection for deep neural networks using group lasso regularization. In: 2017 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), IEEE, pp 5485–5489 Rumelhart et al [1986] Rumelhart DE, Hinton GE, Williams RJ (1986) Learning representations by back-propagating errors. Nature 323:533–536. 10.1038/323533a0 Scardapane et al [2017] Scardapane S, Comminiello D, Hussain A, et al (2017) Group sparse regularization for deep neural networks. Neurocomputing 241:81–89 Sohn et al [2012] Sohn S, Jung B, Jhun M (2012) Permutation tests using least distance estimator in the multivariate regression model. Computational Statistics - COMPUTATION STAT 27:1–11. 10.1007/s00180-011-0247-3 Sze et al [2017] Sze V, Chen YH, Yang TJ, et al (2017) Efficient processing of deep neural networks: A tutorial and survey. Proceedings of the IEEE 105(12):2295–2329 Szekely et al [2008] Szekely G, Rizzo M, Bakirov N (2008) Measuring and testing dependence by correlation of distances. 
The Annals of Statistics 35. 10.1214/009053607000000505 Tibshirani [1996] Tibshirani R (1996) Regression shrinkage and selection via the lasso. Journal of the Royal Statistical Society: Series B (Methodological) 58(1):267–288 Wahba [1992] Wahba G (1992) Multivariate function and operator estimation, based on smoothing splines and reproducing kernels. In: SANTA FE INSTITUTE STUDIES IN THE SCIENCES OF COMPLEXITY-PROCEEDINGS VOLUME-, Addison-Wesley Publishing Co, pp 95–95 Wang and Leng [2008] Wang H, Leng C (2008) A note on adaptive group lasso. Computational Statistics and Data Analysis 52(12):5277–5286 Wang et al [2016] Wang J, Qingling C, Chang Q, et al (2016) Convergence analyses on sparse feedforward neural networks via group lasso regularization. Information Sciences 381. 10.1016/j.ins.2016.11.020 Wang and Cao [2022] Wang S, Cao G (2022) Robust deep neural network estimation for multi-dimensional functional data. Electronic Journal of Statistics 16(2):6461–6488 Yuan and Lin [2006] Yuan M, Lin Y (2006) Model selection and estimation in regression with grouped variables. Journal of the Royal Statistical Society: Series B (Statistical Methodology) 68(1):49–67 Zhang et al [2019] Zhang H, Wang J, Sun Z, et al (2019) Feature selection for neural networks using group lasso regularization. IEEE Transactions on Knowledge and Data Engineering PP:1–1. 10.1109/TKDE.2019.2893266 Zhao et al [2015] Zhao L, Hu Q, Wang W (2015) Heterogeneous feature selection with multi-modal deep neural networks and sparse group lasso. IEEE Transactions on Multimedia 17(11):1936–1948 Zou [2006] Zou H (2006) The adaptive lasso and its oracle properties. Journal of the American Statistical Association 101(476):1418–1429 Imaizumi M, Fukumizu K (2019) Deep neural networks learn non-smooth functions effectively. 
In: The 22nd international conference on artificial intelligence and statistics, PMLR, pp 869–878 Jhun and Choi [2009] Jhun M, Choi I (2009) Bootstrapping least distance estimator in the multivariate regression model. Computational Statistics and Data Analysis 53(12):4221–4227 Jiang et al [2022] Jiang C, Jiang C, Chen D, et al (2022) Densely connected neural networks for nonlinear regression. Entropy 24:876. 10.3390/e24070876 Li et al [2017] Li F, Zurada J, Liu Y, et al (2017) Input layer regularization of multilayer feedforward neural networks. IEEE Access PP:1–1. 10.1109/ACCESS.2017.2713389 Ochiai et al [2017] Ochiai T, Matsuda S, Watanabe H, et al (2017) Automatic node selection for deep neural networks using group lasso regularization. In: 2017 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), IEEE, pp 5485–5489 Rumelhart et al [1986] Rumelhart DE, Hinton GE, Williams RJ (1986) Learning representations by back-propagating errors. Nature 323:533–536. 10.1038/323533a0 Scardapane et al [2017] Scardapane S, Comminiello D, Hussain A, et al (2017) Group sparse regularization for deep neural networks. Neurocomputing 241:81–89 Sohn et al [2012] Sohn S, Jung B, Jhun M (2012) Permutation tests using least distance estimator in the multivariate regression model. Computational Statistics - COMPUTATION STAT 27:1–11. 10.1007/s00180-011-0247-3 Sze et al [2017] Sze V, Chen YH, Yang TJ, et al (2017) Efficient processing of deep neural networks: A tutorial and survey. Proceedings of the IEEE 105(12):2295–2329 Szekely et al [2008] Szekely G, Rizzo M, Bakirov N (2008) Measuring and testing dependence by correlation of distances. The Annals of Statistics 35. 10.1214/009053607000000505 Tibshirani [1996] Tibshirani R (1996) Regression shrinkage and selection via the lasso. 
Journal of the Royal Statistical Society: Series B (Methodological) 58(1):267–288 Wahba [1992] Wahba G (1992) Multivariate function and operator estimation, based on smoothing splines and reproducing kernels. In: SANTA FE INSTITUTE STUDIES IN THE SCIENCES OF COMPLEXITY-PROCEEDINGS VOLUME-, Addison-Wesley Publishing Co, pp 95–95 Wang and Leng [2008] Wang H, Leng C (2008) A note on adaptive group lasso. Computational Statistics and Data Analysis 52(12):5277–5286 Wang et al [2016] Wang J, Qingling C, Chang Q, et al (2016) Convergence analyses on sparse feedforward neural networks via group lasso regularization. Information Sciences 381. 10.1016/j.ins.2016.11.020 Wang and Cao [2022] Wang S, Cao G (2022) Robust deep neural network estimation for multi-dimensional functional data. Electronic Journal of Statistics 16(2):6461–6488 Yuan and Lin [2006] Yuan M, Lin Y (2006) Model selection and estimation in regression with grouped variables. Journal of the Royal Statistical Society: Series B (Statistical Methodology) 68(1):49–67 Zhang et al [2019] Zhang H, Wang J, Sun Z, et al (2019) Feature selection for neural networks using group lasso regularization. IEEE Transactions on Knowledge and Data Engineering PP:1–1. 10.1109/TKDE.2019.2893266 Zhao et al [2015] Zhao L, Hu Q, Wang W (2015) Heterogeneous feature selection with multi-modal deep neural networks and sparse group lasso. IEEE Transactions on Multimedia 17(11):1936–1948 Zou [2006] Zou H (2006) The adaptive lasso and its oracle properties. Journal of the American Statistical Association 101(476):1418–1429 Jhun M, Choi I (2009) Bootstrapping least distance estimator in the multivariate regression model. Computational Statistics and Data Analysis 53(12):4221–4227 Jiang et al [2022] Jiang C, Jiang C, Chen D, et al (2022) Densely connected neural networks for nonlinear regression. Entropy 24:876. 
10.3390/e24070876 Li et al [2017] Li F, Zurada J, Liu Y, et al (2017) Input layer regularization of multilayer feedforward neural networks. IEEE Access PP:1–1. 10.1109/ACCESS.2017.2713389 Ochiai et al [2017] Ochiai T, Matsuda S, Watanabe H, et al (2017) Automatic node selection for deep neural networks using group lasso regularization. In: 2017 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), IEEE, pp 5485–5489 Rumelhart et al [1986] Rumelhart DE, Hinton GE, Williams RJ (1986) Learning representations by back-propagating errors. Nature 323:533–536. 10.1038/323533a0 Scardapane et al [2017] Scardapane S, Comminiello D, Hussain A, et al (2017) Group sparse regularization for deep neural networks. Neurocomputing 241:81–89 Sohn et al [2012] Sohn S, Jung B, Jhun M (2012) Permutation tests using least distance estimator in the multivariate regression model. Computational Statistics - COMPUTATION STAT 27:1–11. 10.1007/s00180-011-0247-3 Sze et al [2017] Sze V, Chen YH, Yang TJ, et al (2017) Efficient processing of deep neural networks: A tutorial and survey. Proceedings of the IEEE 105(12):2295–2329 Szekely et al [2008] Szekely G, Rizzo M, Bakirov N (2008) Measuring and testing dependence by correlation of distances. The Annals of Statistics 35. 10.1214/009053607000000505 Tibshirani [1996] Tibshirani R (1996) Regression shrinkage and selection via the lasso. Journal of the Royal Statistical Society: Series B (Methodological) 58(1):267–288 Wahba [1992] Wahba G (1992) Multivariate function and operator estimation, based on smoothing splines and reproducing kernels. In: SANTA FE INSTITUTE STUDIES IN THE SCIENCES OF COMPLEXITY-PROCEEDINGS VOLUME-, Addison-Wesley Publishing Co, pp 95–95 Wang and Leng [2008] Wang H, Leng C (2008) A note on adaptive group lasso. 
Computational Statistics and Data Analysis 52(12):5277–5286 Wang et al [2016] Wang J, Qingling C, Chang Q, et al (2016) Convergence analyses on sparse feedforward neural networks via group lasso regularization. Information Sciences 381. 10.1016/j.ins.2016.11.020 Wang and Cao [2022] Wang S, Cao G (2022) Robust deep neural network estimation for multi-dimensional functional data. Electronic Journal of Statistics 16(2):6461–6488 Yuan and Lin [2006] Yuan M, Lin Y (2006) Model selection and estimation in regression with grouped variables. Journal of the Royal Statistical Society: Series B (Statistical Methodology) 68(1):49–67 Zhang et al [2019] Zhang H, Wang J, Sun Z, et al (2019) Feature selection for neural networks using group lasso regularization. IEEE Transactions on Knowledge and Data Engineering PP:1–1. 10.1109/TKDE.2019.2893266 Zhao et al [2015] Zhao L, Hu Q, Wang W (2015) Heterogeneous feature selection with multi-modal deep neural networks and sparse group lasso. IEEE Transactions on Multimedia 17(11):1936–1948 Zou [2006] Zou H (2006) The adaptive lasso and its oracle properties. Journal of the American Statistical Association 101(476):1418–1429 Jiang C, Jiang C, Chen D, et al (2022) Densely connected neural networks for nonlinear regression. Entropy 24:876. 10.3390/e24070876 Li et al [2017] Li F, Zurada J, Liu Y, et al (2017) Input layer regularization of multilayer feedforward neural networks. IEEE Access PP:1–1. 10.1109/ACCESS.2017.2713389 Ochiai et al [2017] Ochiai T, Matsuda S, Watanabe H, et al (2017) Automatic node selection for deep neural networks using group lasso regularization. In: 2017 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), IEEE, pp 5485–5489 Rumelhart et al [1986] Rumelhart DE, Hinton GE, Williams RJ (1986) Learning representations by back-propagating errors. Nature 323:533–536. 
10.1038/323533a0 Scardapane et al [2017] Scardapane S, Comminiello D, Hussain A, et al (2017) Group sparse regularization for deep neural networks. Neurocomputing 241:81–89 Sohn et al [2012] Sohn S, Jung B, Jhun M (2012) Permutation tests using least distance estimator in the multivariate regression model. Computational Statistics - COMPUTATION STAT 27:1–11. 10.1007/s00180-011-0247-3 Sze et al [2017] Sze V, Chen YH, Yang TJ, et al (2017) Efficient processing of deep neural networks: A tutorial and survey. Proceedings of the IEEE 105(12):2295–2329 Szekely et al [2008] Szekely G, Rizzo M, Bakirov N (2008) Measuring and testing dependence by correlation of distances. The Annals of Statistics 35. 10.1214/009053607000000505 Tibshirani [1996] Tibshirani R (1996) Regression shrinkage and selection via the lasso. Journal of the Royal Statistical Society: Series B (Methodological) 58(1):267–288 Wahba [1992] Wahba G (1992) Multivariate function and operator estimation, based on smoothing splines and reproducing kernels. In: SANTA FE INSTITUTE STUDIES IN THE SCIENCES OF COMPLEXITY-PROCEEDINGS VOLUME-, Addison-Wesley Publishing Co, pp 95–95 Wang and Leng [2008] Wang H, Leng C (2008) A note on adaptive group lasso. Computational Statistics and Data Analysis 52(12):5277–5286 Wang et al [2016] Wang J, Qingling C, Chang Q, et al (2016) Convergence analyses on sparse feedforward neural networks via group lasso regularization. Information Sciences 381. 10.1016/j.ins.2016.11.020 Wang and Cao [2022] Wang S, Cao G (2022) Robust deep neural network estimation for multi-dimensional functional data. Electronic Journal of Statistics 16(2):6461–6488 Yuan and Lin [2006] Yuan M, Lin Y (2006) Model selection and estimation in regression with grouped variables. Journal of the Royal Statistical Society: Series B (Statistical Methodology) 68(1):49–67 Zhang et al [2019] Zhang H, Wang J, Sun Z, et al (2019) Feature selection for neural networks using group lasso regularization. 
IEEE Transactions on Knowledge and Data Engineering PP:1–1. 10.1109/TKDE.2019.2893266 Zhao et al [2015] Zhao L, Hu Q, Wang W (2015) Heterogeneous feature selection with multi-modal deep neural networks and sparse group lasso. IEEE Transactions on Multimedia 17(11):1936–1948 Zou [2006] Zou H (2006) The adaptive lasso and its oracle properties. Journal of the American Statistical Association 101(476):1418–1429 Li F, Zurada J, Liu Y, et al (2017) Input layer regularization of multilayer feedforward neural networks. IEEE Access PP:1–1. 10.1109/ACCESS.2017.2713389 Ochiai et al [2017] Ochiai T, Matsuda S, Watanabe H, et al (2017) Automatic node selection for deep neural networks using group lasso regularization. In: 2017 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), IEEE, pp 5485–5489 Rumelhart et al [1986] Rumelhart DE, Hinton GE, Williams RJ (1986) Learning representations by back-propagating errors. Nature 323:533–536. 10.1038/323533a0 Scardapane et al [2017] Scardapane S, Comminiello D, Hussain A, et al (2017) Group sparse regularization for deep neural networks. Neurocomputing 241:81–89 Sohn et al [2012] Sohn S, Jung B, Jhun M (2012) Permutation tests using least distance estimator in the multivariate regression model. Computational Statistics - COMPUTATION STAT 27:1–11. 10.1007/s00180-011-0247-3 Sze et al [2017] Sze V, Chen YH, Yang TJ, et al (2017) Efficient processing of deep neural networks: A tutorial and survey. Proceedings of the IEEE 105(12):2295–2329 Szekely et al [2008] Szekely G, Rizzo M, Bakirov N (2008) Measuring and testing dependence by correlation of distances. The Annals of Statistics 35. 10.1214/009053607000000505 Tibshirani [1996] Tibshirani R (1996) Regression shrinkage and selection via the lasso. Journal of the Royal Statistical Society: Series B (Methodological) 58(1):267–288 Wahba [1992] Wahba G (1992) Multivariate function and operator estimation, based on smoothing splines and reproducing kernels. 
In: SANTA FE INSTITUTE STUDIES IN THE SCIENCES OF COMPLEXITY-PROCEEDINGS VOLUME-, Addison-Wesley Publishing Co, pp 95–95 Wang and Leng [2008] Wang H, Leng C (2008) A note on adaptive group lasso. Computational Statistics and Data Analysis 52(12):5277–5286 Wang et al [2016] Wang J, Qingling C, Chang Q, et al (2016) Convergence analyses on sparse feedforward neural networks via group lasso regularization. Information Sciences 381. 10.1016/j.ins.2016.11.020 Wang and Cao [2022] Wang S, Cao G (2022) Robust deep neural network estimation for multi-dimensional functional data. Electronic Journal of Statistics 16(2):6461–6488 Yuan and Lin [2006] Yuan M, Lin Y (2006) Model selection and estimation in regression with grouped variables. Journal of the Royal Statistical Society: Series B (Statistical Methodology) 68(1):49–67 Zhang et al [2019] Zhang H, Wang J, Sun Z, et al (2019) Feature selection for neural networks using group lasso regularization. IEEE Transactions on Knowledge and Data Engineering PP:1–1. 10.1109/TKDE.2019.2893266 Zhao et al [2015] Zhao L, Hu Q, Wang W (2015) Heterogeneous feature selection with multi-modal deep neural networks and sparse group lasso. IEEE Transactions on Multimedia 17(11):1936–1948 Zou [2006] Zou H (2006) The adaptive lasso and its oracle properties. Journal of the American Statistical Association 101(476):1418–1429 Ochiai T, Matsuda S, Watanabe H, et al (2017) Automatic node selection for deep neural networks using group lasso regularization. In: 2017 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), IEEE, pp 5485–5489 Rumelhart et al [1986] Rumelhart DE, Hinton GE, Williams RJ (1986) Learning representations by back-propagating errors. Nature 323:533–536. 10.1038/323533a0 Scardapane et al [2017] Scardapane S, Comminiello D, Hussain A, et al (2017) Group sparse regularization for deep neural networks. 
Neurocomputing 241:81–89 Sohn et al [2012] Sohn S, Jung B, Jhun M (2012) Permutation tests using least distance estimator in the multivariate regression model. Computational Statistics - COMPUTATION STAT 27:1–11. 10.1007/s00180-011-0247-3 Sze et al [2017] Sze V, Chen YH, Yang TJ, et al (2017) Efficient processing of deep neural networks: A tutorial and survey. Proceedings of the IEEE 105(12):2295–2329 Szekely et al [2008] Szekely G, Rizzo M, Bakirov N (2008) Measuring and testing dependence by correlation of distances. The Annals of Statistics 35. 10.1214/009053607000000505 Tibshirani [1996] Tibshirani R (1996) Regression shrinkage and selection via the lasso. Journal of the Royal Statistical Society: Series B (Methodological) 58(1):267–288 Wahba [1992] Wahba G (1992) Multivariate function and operator estimation, based on smoothing splines and reproducing kernels. In: SANTA FE INSTITUTE STUDIES IN THE SCIENCES OF COMPLEXITY-PROCEEDINGS VOLUME-, Addison-Wesley Publishing Co, pp 95–95 Wang and Leng [2008] Wang H, Leng C (2008) A note on adaptive group lasso. Computational Statistics and Data Analysis 52(12):5277–5286 Wang et al [2016] Wang J, Qingling C, Chang Q, et al (2016) Convergence analyses on sparse feedforward neural networks via group lasso regularization. Information Sciences 381. 10.1016/j.ins.2016.11.020 Wang and Cao [2022] Wang S, Cao G (2022) Robust deep neural network estimation for multi-dimensional functional data. Electronic Journal of Statistics 16(2):6461–6488 Yuan and Lin [2006] Yuan M, Lin Y (2006) Model selection and estimation in regression with grouped variables. Journal of the Royal Statistical Society: Series B (Statistical Methodology) 68(1):49–67 Zhang et al [2019] Zhang H, Wang J, Sun Z, et al (2019) Feature selection for neural networks using group lasso regularization. IEEE Transactions on Knowledge and Data Engineering PP:1–1. 
10.1109/TKDE.2019.2893266 Zhao et al [2015] Zhao L, Hu Q, Wang W (2015) Heterogeneous feature selection with multi-modal deep neural networks and sparse group lasso. IEEE Transactions on Multimedia 17(11):1936–1948 Zou [2006] Zou H (2006) The adaptive lasso and its oracle properties. Journal of the American Statistical Association 101(476):1418–1429 Rumelhart DE, Hinton GE, Williams RJ (1986) Learning representations by back-propagating errors. Nature 323:533–536. 10.1038/323533a0 Scardapane et al [2017] Scardapane S, Comminiello D, Hussain A, et al (2017) Group sparse regularization for deep neural networks. Neurocomputing 241:81–89 Sohn et al [2012] Sohn S, Jung B, Jhun M (2012) Permutation tests using least distance estimator in the multivariate regression model. Computational Statistics - COMPUTATION STAT 27:1–11. 10.1007/s00180-011-0247-3 Sze et al [2017] Sze V, Chen YH, Yang TJ, et al (2017) Efficient processing of deep neural networks: A tutorial and survey. Proceedings of the IEEE 105(12):2295–2329 Szekely et al [2008] Szekely G, Rizzo M, Bakirov N (2008) Measuring and testing dependence by correlation of distances. The Annals of Statistics 35. 10.1214/009053607000000505 Tibshirani [1996] Tibshirani R (1996) Regression shrinkage and selection via the lasso. Journal of the Royal Statistical Society: Series B (Methodological) 58(1):267–288 Wahba [1992] Wahba G (1992) Multivariate function and operator estimation, based on smoothing splines and reproducing kernels. In: SANTA FE INSTITUTE STUDIES IN THE SCIENCES OF COMPLEXITY-PROCEEDINGS VOLUME-, Addison-Wesley Publishing Co, pp 95–95 Wang and Leng [2008] Wang H, Leng C (2008) A note on adaptive group lasso. Computational Statistics and Data Analysis 52(12):5277–5286 Wang et al [2016] Wang J, Qingling C, Chang Q, et al (2016) Convergence analyses on sparse feedforward neural networks via group lasso regularization. Information Sciences 381. 
10.1016/j.ins.2016.11.020 Wang and Cao [2022] Wang S, Cao G (2022) Robust deep neural network estimation for multi-dimensional functional data. Electronic Journal of Statistics 16(2):6461–6488 Yuan and Lin [2006] Yuan M, Lin Y (2006) Model selection and estimation in regression with grouped variables. Journal of the Royal Statistical Society: Series B (Statistical Methodology) 68(1):49–67 Zhang et al [2019] Zhang H, Wang J, Sun Z, et al (2019) Feature selection for neural networks using group lasso regularization. IEEE Transactions on Knowledge and Data Engineering PP:1–1. 10.1109/TKDE.2019.2893266 Zhao et al [2015] Zhao L, Hu Q, Wang W (2015) Heterogeneous feature selection with multi-modal deep neural networks and sparse group lasso. IEEE Transactions on Multimedia 17(11):1936–1948 Zou [2006] Zou H (2006) The adaptive lasso and its oracle properties. Journal of the American Statistical Association 101(476):1418–1429 Scardapane S, Comminiello D, Hussain A, et al (2017) Group sparse regularization for deep neural networks. Neurocomputing 241:81–89 Sohn et al [2012] Sohn S, Jung B, Jhun M (2012) Permutation tests using least distance estimator in the multivariate regression model. Computational Statistics - COMPUTATION STAT 27:1–11. 10.1007/s00180-011-0247-3 Sze et al [2017] Sze V, Chen YH, Yang TJ, et al (2017) Efficient processing of deep neural networks: A tutorial and survey. Proceedings of the IEEE 105(12):2295–2329 Szekely et al [2008] Szekely G, Rizzo M, Bakirov N (2008) Measuring and testing dependence by correlation of distances. The Annals of Statistics 35. 10.1214/009053607000000505 Tibshirani [1996] Tibshirani R (1996) Regression shrinkage and selection via the lasso. Journal of the Royal Statistical Society: Series B (Methodological) 58(1):267–288 Wahba [1992] Wahba G (1992) Multivariate function and operator estimation, based on smoothing splines and reproducing kernels. 
In: SANTA FE INSTITUTE STUDIES IN THE SCIENCES OF COMPLEXITY-PROCEEDINGS VOLUME-, Addison-Wesley Publishing Co, pp 95–95 Wang and Leng [2008] Wang H, Leng C (2008) A note on adaptive group lasso. Computational Statistics and Data Analysis 52(12):5277–5286 Wang et al [2016] Wang J, Qingling C, Chang Q, et al (2016) Convergence analyses on sparse feedforward neural networks via group lasso regularization. Information Sciences 381. 10.1016/j.ins.2016.11.020 Wang and Cao [2022] Wang S, Cao G (2022) Robust deep neural network estimation for multi-dimensional functional data. Electronic Journal of Statistics 16(2):6461–6488 Yuan and Lin [2006] Yuan M, Lin Y (2006) Model selection and estimation in regression with grouped variables. Journal of the Royal Statistical Society: Series B (Statistical Methodology) 68(1):49–67 Zhang et al [2019] Zhang H, Wang J, Sun Z, et al (2019) Feature selection for neural networks using group lasso regularization. IEEE Transactions on Knowledge and Data Engineering PP:1–1. 10.1109/TKDE.2019.2893266 Zhao et al [2015] Zhao L, Hu Q, Wang W (2015) Heterogeneous feature selection with multi-modal deep neural networks and sparse group lasso. IEEE Transactions on Multimedia 17(11):1936–1948 Zou [2006] Zou H (2006) The adaptive lasso and its oracle properties. Journal of the American Statistical Association 101(476):1418–1429 Sohn S, Jung B, Jhun M (2012) Permutation tests using least distance estimator in the multivariate regression model. Computational Statistics - COMPUTATION STAT 27:1–11. 10.1007/s00180-011-0247-3 Sze et al [2017] Sze V, Chen YH, Yang TJ, et al (2017) Efficient processing of deep neural networks: A tutorial and survey. Proceedings of the IEEE 105(12):2295–2329 Szekely et al [2008] Szekely G, Rizzo M, Bakirov N (2008) Measuring and testing dependence by correlation of distances. The Annals of Statistics 35. 10.1214/009053607000000505 Tibshirani [1996] Tibshirani R (1996) Regression shrinkage and selection via the lasso. 
Journal of the Royal Statistical Society: Series B (Methodological) 58(1):267–288 Wahba [1992] Wahba G (1992) Multivariate function and operator estimation, based on smoothing splines and reproducing kernels. In: SANTA FE INSTITUTE STUDIES IN THE SCIENCES OF COMPLEXITY-PROCEEDINGS VOLUME-, Addison-Wesley Publishing Co, pp 95–95 Wang and Leng [2008] Wang H, Leng C (2008) A note on adaptive group lasso. Computational Statistics and Data Analysis 52(12):5277–5286 Wang et al [2016] Wang J, Qingling C, Chang Q, et al (2016) Convergence analyses on sparse feedforward neural networks via group lasso regularization. Information Sciences 381. 10.1016/j.ins.2016.11.020 Wang and Cao [2022] Wang S, Cao G (2022) Robust deep neural network estimation for multi-dimensional functional data. Electronic Journal of Statistics 16(2):6461–6488 Yuan and Lin [2006] Yuan M, Lin Y (2006) Model selection and estimation in regression with grouped variables. Journal of the Royal Statistical Society: Series B (Statistical Methodology) 68(1):49–67 Zhang et al [2019] Zhang H, Wang J, Sun Z, et al (2019) Feature selection for neural networks using group lasso regularization. IEEE Transactions on Knowledge and Data Engineering PP:1–1. 10.1109/TKDE.2019.2893266 Zhao et al [2015] Zhao L, Hu Q, Wang W (2015) Heterogeneous feature selection with multi-modal deep neural networks and sparse group lasso. IEEE Transactions on Multimedia 17(11):1936–1948 Zou [2006] Zou H (2006) The adaptive lasso and its oracle properties. Journal of the American Statistical Association 101(476):1418–1429 Sze V, Chen YH, Yang TJ, et al (2017) Efficient processing of deep neural networks: A tutorial and survey. Proceedings of the IEEE 105(12):2295–2329 Szekely et al [2008] Szekely G, Rizzo M, Bakirov N (2008) Measuring and testing dependence by correlation of distances. The Annals of Statistics 35. 10.1214/009053607000000505 Tibshirani [1996] Tibshirani R (1996) Regression shrinkage and selection via the lasso. 
Journal of the Royal Statistical Society: Series B (Methodological) 58(1):267–288
Wahba G (1992) Multivariate function and operator estimation, based on smoothing splines and reproducing kernels. In: Santa Fe Institute Studies in the Sciences of Complexity, Proceedings Volume, Addison-Wesley Publishing Co, p 95
Wang H, Leng C (2008) A note on adaptive group lasso. Computational Statistics and Data Analysis 52(12):5277–5286
Wang J, Qingling C, Chang Q, et al (2016) Convergence analyses on sparse feedforward neural networks via group lasso regularization. Information Sciences 381. 10.1016/j.ins.2016.11.020
Wang S, Cao G (2022) Robust deep neural network estimation for multi-dimensional functional data. Electronic Journal of Statistics 16(2):6461–6488
Yuan M, Lin Y (2006) Model selection and estimation in regression with grouped variables. Journal of the Royal Statistical Society: Series B (Statistical Methodology) 68(1):49–67
Zhang H, Wang J, Sun Z, et al (2019) Feature selection for neural networks using group lasso regularization. IEEE Transactions on Knowledge and Data Engineering PP:1–1. 10.1109/TKDE.2019.2893266
Zhao L, Hu Q, Wang W (2015) Heterogeneous feature selection with multi-modal deep neural networks and sparse group lasso. IEEE Transactions on Multimedia 17(11):1936–1948
Zou H (2006) The adaptive lasso and its oracle properties. Journal of the American Statistical Association 101(476):1418–1429
- Imaizumi M, Fukumizu K (2019) Deep neural networks learn non-smooth functions effectively. In: The 22nd International Conference on Artificial Intelligence and Statistics, PMLR, pp 869–878
Jhun M, Choi I (2009) Bootstrapping least distance estimator in the multivariate regression model. Computational Statistics and Data Analysis 53(12):4221–4227
Jiang C, Jiang C, Chen D, et al (2022) Densely connected neural networks for nonlinear regression. Entropy 24:876. 10.3390/e24070876
Li F, Zurada J, Liu Y, et al (2017) Input layer regularization of multilayer feedforward neural networks. IEEE Access PP:1–1. 10.1109/ACCESS.2017.2713389
Ochiai T, Matsuda S, Watanabe H, et al (2017) Automatic node selection for deep neural networks using group lasso regularization. In: 2017 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), IEEE, pp 5485–5489
Rumelhart DE, Hinton GE, Williams RJ (1986) Learning representations by back-propagating errors. Nature 323:533–536. 10.1038/323533a0
Scardapane S, Comminiello D, Hussain A, et al (2017) Group sparse regularization for deep neural networks. Neurocomputing 241:81–89
Sohn S, Jung B, Jhun M (2012) Permutation tests using least distance estimator in the multivariate regression model. Computational Statistics 27:1–11. 10.1007/s00180-011-0247-3
Sze V, Chen YH, Yang TJ, et al (2017) Efficient processing of deep neural networks: A tutorial and survey. Proceedings of the IEEE 105(12):2295–2329
Szekely G, Rizzo M, Bakirov N (2008) Measuring and testing dependence by correlation of distances. The Annals of Statistics 35. 10.1214/009053607000000505
Tibshirani R (1996) Regression shrinkage and selection via the lasso. Journal of the Royal Statistical Society: Series B (Methodological) 58(1):267–288
10.1016/j.ins.2016.11.020 Wang and Cao [2022] Wang S, Cao G (2022) Robust deep neural network estimation for multi-dimensional functional data. Electronic Journal of Statistics 16(2):6461–6488 Yuan and Lin [2006] Yuan M, Lin Y (2006) Model selection and estimation in regression with grouped variables. Journal of the Royal Statistical Society: Series B (Statistical Methodology) 68(1):49–67 Zhang et al [2019] Zhang H, Wang J, Sun Z, et al (2019) Feature selection for neural networks using group lasso regularization. IEEE Transactions on Knowledge and Data Engineering PP:1–1. 10.1109/TKDE.2019.2893266 Zhao et al [2015] Zhao L, Hu Q, Wang W (2015) Heterogeneous feature selection with multi-modal deep neural networks and sparse group lasso. IEEE Transactions on Multimedia 17(11):1936–1948 Zou [2006] Zou H (2006) The adaptive lasso and its oracle properties. Journal of the American Statistical Association 101(476):1418–1429 Wang H, Leng C (2008) A note on adaptive group lasso. Computational Statistics and Data Analysis 52(12):5277–5286 Wang et al [2016] Wang J, Qingling C, Chang Q, et al (2016) Convergence analyses on sparse feedforward neural networks via group lasso regularization. Information Sciences 381. 10.1016/j.ins.2016.11.020 Wang and Cao [2022] Wang S, Cao G (2022) Robust deep neural network estimation for multi-dimensional functional data. Electronic Journal of Statistics 16(2):6461–6488 Yuan and Lin [2006] Yuan M, Lin Y (2006) Model selection and estimation in regression with grouped variables. Journal of the Royal Statistical Society: Series B (Statistical Methodology) 68(1):49–67 Zhang et al [2019] Zhang H, Wang J, Sun Z, et al (2019) Feature selection for neural networks using group lasso regularization. IEEE Transactions on Knowledge and Data Engineering PP:1–1. 10.1109/TKDE.2019.2893266 Zhao et al [2015] Zhao L, Hu Q, Wang W (2015) Heterogeneous feature selection with multi-modal deep neural networks and sparse group lasso. 
IEEE Transactions on Multimedia 17(11):1936–1948 Zou [2006] Zou H (2006) The adaptive lasso and its oracle properties. Journal of the American Statistical Association 101(476):1418–1429 Wang J, Qingling C, Chang Q, et al (2016) Convergence analyses on sparse feedforward neural networks via group lasso regularization. Information Sciences 381. 10.1016/j.ins.2016.11.020 Wang and Cao [2022] Wang S, Cao G (2022) Robust deep neural network estimation for multi-dimensional functional data. Electronic Journal of Statistics 16(2):6461–6488 Yuan and Lin [2006] Yuan M, Lin Y (2006) Model selection and estimation in regression with grouped variables. Journal of the Royal Statistical Society: Series B (Statistical Methodology) 68(1):49–67 Zhang et al [2019] Zhang H, Wang J, Sun Z, et al (2019) Feature selection for neural networks using group lasso regularization. IEEE Transactions on Knowledge and Data Engineering PP:1–1. 10.1109/TKDE.2019.2893266 Zhao et al [2015] Zhao L, Hu Q, Wang W (2015) Heterogeneous feature selection with multi-modal deep neural networks and sparse group lasso. IEEE Transactions on Multimedia 17(11):1936–1948 Zou [2006] Zou H (2006) The adaptive lasso and its oracle properties. Journal of the American Statistical Association 101(476):1418–1429 Wang S, Cao G (2022) Robust deep neural network estimation for multi-dimensional functional data. Electronic Journal of Statistics 16(2):6461–6488 Yuan and Lin [2006] Yuan M, Lin Y (2006) Model selection and estimation in regression with grouped variables. Journal of the Royal Statistical Society: Series B (Statistical Methodology) 68(1):49–67 Zhang et al [2019] Zhang H, Wang J, Sun Z, et al (2019) Feature selection for neural networks using group lasso regularization. IEEE Transactions on Knowledge and Data Engineering PP:1–1. 10.1109/TKDE.2019.2893266 Zhao et al [2015] Zhao L, Hu Q, Wang W (2015) Heterogeneous feature selection with multi-modal deep neural networks and sparse group lasso. 
IEEE Transactions on Multimedia 17(11):1936–1948 Zou [2006] Zou H (2006) The adaptive lasso and its oracle properties. Journal of the American Statistical Association 101(476):1418–1429 Yuan M, Lin Y (2006) Model selection and estimation in regression with grouped variables. Journal of the Royal Statistical Society: Series B (Statistical Methodology) 68(1):49–67 Zhang et al [2019] Zhang H, Wang J, Sun Z, et al (2019) Feature selection for neural networks using group lasso regularization. IEEE Transactions on Knowledge and Data Engineering PP:1–1. 10.1109/TKDE.2019.2893266 Zhao et al [2015] Zhao L, Hu Q, Wang W (2015) Heterogeneous feature selection with multi-modal deep neural networks and sparse group lasso. IEEE Transactions on Multimedia 17(11):1936–1948 Zou [2006] Zou H (2006) The adaptive lasso and its oracle properties. Journal of the American Statistical Association 101(476):1418–1429 Zhang H, Wang J, Sun Z, et al (2019) Feature selection for neural networks using group lasso regularization. IEEE Transactions on Knowledge and Data Engineering PP:1–1. 10.1109/TKDE.2019.2893266 Zhao et al [2015] Zhao L, Hu Q, Wang W (2015) Heterogeneous feature selection with multi-modal deep neural networks and sparse group lasso. IEEE Transactions on Multimedia 17(11):1936–1948 Zou [2006] Zou H (2006) The adaptive lasso and its oracle properties. Journal of the American Statistical Association 101(476):1418–1429 Zhao L, Hu Q, Wang W (2015) Heterogeneous feature selection with multi-modal deep neural networks and sparse group lasso. IEEE Transactions on Multimedia 17(11):1936–1948 Zou [2006] Zou H (2006) The adaptive lasso and its oracle properties. Journal of the American Statistical Association 101(476):1418–1429 Zou H (2006) The adaptive lasso and its oracle properties. Journal of the American Statistical Association 101(476):1418–1429
- Jhun M, Choi I (2009) Bootstrapping least distance estimator in the multivariate regression model. Computational Statistics and Data Analysis 53(12):4221–4227
- Jiang C, Jiang C, Chen D, et al (2022) Densely connected neural networks for nonlinear regression. Entropy 24:876. 10.3390/e24070876
- Li F, Zurada J, Liu Y, et al (2017) Input layer regularization of multilayer feedforward neural networks. IEEE Access PP:1–1. 10.1109/ACCESS.2017.2713389
- Ochiai T, Matsuda S, Watanabe H, et al (2017) Automatic node selection for deep neural networks using group lasso regularization. In: 2017 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), IEEE, pp 5485–5489
- Rumelhart DE, Hinton GE, Williams RJ (1986) Learning representations by back-propagating errors. Nature 323:533–536. 10.1038/323533a0
- Scardapane S, Comminiello D, Hussain A, et al (2017) Group sparse regularization for deep neural networks. Neurocomputing 241:81–89
- Sohn S, Jung B, Jhun M (2012) Permutation tests using least distance estimator in the multivariate regression model. Computational Statistics 27:1–11. 10.1007/s00180-011-0247-3
- Sze V, Chen YH, Yang TJ, et al (2017) Efficient processing of deep neural networks: A tutorial and survey. Proceedings of the IEEE 105(12):2295–2329
- Szekely G, Rizzo M, Bakirov N (2008) Measuring and testing dependence by correlation of distances. The Annals of Statistics 35. 10.1214/009053607000000505
- Tibshirani R (1996) Regression shrinkage and selection via the lasso. Journal of the Royal Statistical Society: Series B (Methodological) 58(1):267–288
- Wahba G (1992) Multivariate function and operator estimation, based on smoothing splines and reproducing kernels. In: Santa Fe Institute Studies in the Sciences of Complexity Proceedings, Addison-Wesley Publishing Co, pp 95–95
- Wang H, Leng C (2008) A note on adaptive group lasso. Computational Statistics and Data Analysis 52(12):5277–5286
- Wang J, Qingling C, Chang Q, et al (2016) Convergence analyses on sparse feedforward neural networks via group lasso regularization. Information Sciences 381. 10.1016/j.ins.2016.11.020
- Wang S, Cao G (2022) Robust deep neural network estimation for multi-dimensional functional data. Electronic Journal of Statistics 16(2):6461–6488
- Zhang H, Wang J, Sun Z, et al (2019) Feature selection for neural networks using group lasso regularization. IEEE Transactions on Knowledge and Data Engineering PP:1–1. 10.1109/TKDE.2019.2893266
- Zhao L, Hu Q, Wang W (2015) Heterogeneous feature selection with multi-modal deep neural networks and sparse group lasso. IEEE Transactions on Multimedia 17(11):1936–1948
- Zou H (2006) The adaptive lasso and its oracle properties. Journal of the American Statistical Association 101(476):1418–1429
- Yuan M, Lin Y (2006) Model selection and estimation in regression with grouped variables.
Journal of the Royal Statistical Society: Series B (Statistical Methodology) 68(1):49–67 Zhang et al [2019] Zhang H, Wang J, Sun Z, et al (2019) Feature selection for neural networks using group lasso regularization. IEEE Transactions on Knowledge and Data Engineering PP:1–1. 10.1109/TKDE.2019.2893266 Zhao et al [2015] Zhao L, Hu Q, Wang W (2015) Heterogeneous feature selection with multi-modal deep neural networks and sparse group lasso. IEEE Transactions on Multimedia 17(11):1936–1948 Zou [2006] Zou H (2006) The adaptive lasso and its oracle properties. Journal of the American Statistical Association 101(476):1418–1429 Wang J, Qingling C, Chang Q, et al (2016) Convergence analyses on sparse feedforward neural networks via group lasso regularization. Information Sciences 381. 10.1016/j.ins.2016.11.020 Wang and Cao [2022] Wang S, Cao G (2022) Robust deep neural network estimation for multi-dimensional functional data. Electronic Journal of Statistics 16(2):6461–6488 Yuan and Lin [2006] Yuan M, Lin Y (2006) Model selection and estimation in regression with grouped variables. Journal of the Royal Statistical Society: Series B (Statistical Methodology) 68(1):49–67 Zhang et al [2019] Zhang H, Wang J, Sun Z, et al (2019) Feature selection for neural networks using group lasso regularization. IEEE Transactions on Knowledge and Data Engineering PP:1–1. 10.1109/TKDE.2019.2893266 Zhao et al [2015] Zhao L, Hu Q, Wang W (2015) Heterogeneous feature selection with multi-modal deep neural networks and sparse group lasso. IEEE Transactions on Multimedia 17(11):1936–1948 Zou [2006] Zou H (2006) The adaptive lasso and its oracle properties. Journal of the American Statistical Association 101(476):1418–1429 Wang S, Cao G (2022) Robust deep neural network estimation for multi-dimensional functional data. Electronic Journal of Statistics 16(2):6461–6488 Yuan and Lin [2006] Yuan M, Lin Y (2006) Model selection and estimation in regression with grouped variables. 
Journal of the Royal Statistical Society: Series B (Statistical Methodology) 68(1):49–67 Zhang et al [2019] Zhang H, Wang J, Sun Z, et al (2019) Feature selection for neural networks using group lasso regularization. IEEE Transactions on Knowledge and Data Engineering PP:1–1. 10.1109/TKDE.2019.2893266 Zhao et al [2015] Zhao L, Hu Q, Wang W (2015) Heterogeneous feature selection with multi-modal deep neural networks and sparse group lasso. IEEE Transactions on Multimedia 17(11):1936–1948 Zou [2006] Zou H (2006) The adaptive lasso and its oracle properties. Journal of the American Statistical Association 101(476):1418–1429 Yuan M, Lin Y (2006) Model selection and estimation in regression with grouped variables. Journal of the Royal Statistical Society: Series B (Statistical Methodology) 68(1):49–67 Zhang et al [2019] Zhang H, Wang J, Sun Z, et al (2019) Feature selection for neural networks using group lasso regularization. IEEE Transactions on Knowledge and Data Engineering PP:1–1. 10.1109/TKDE.2019.2893266 Zhao et al [2015] Zhao L, Hu Q, Wang W (2015) Heterogeneous feature selection with multi-modal deep neural networks and sparse group lasso. IEEE Transactions on Multimedia 17(11):1936–1948 Zou [2006] Zou H (2006) The adaptive lasso and its oracle properties. Journal of the American Statistical Association 101(476):1418–1429 Zhang H, Wang J, Sun Z, et al (2019) Feature selection for neural networks using group lasso regularization. IEEE Transactions on Knowledge and Data Engineering PP:1–1. 10.1109/TKDE.2019.2893266 Zhao et al [2015] Zhao L, Hu Q, Wang W (2015) Heterogeneous feature selection with multi-modal deep neural networks and sparse group lasso. IEEE Transactions on Multimedia 17(11):1936–1948 Zou [2006] Zou H (2006) The adaptive lasso and its oracle properties. Journal of the American Statistical Association 101(476):1418–1429 Zhao L, Hu Q, Wang W (2015) Heterogeneous feature selection with multi-modal deep neural networks and sparse group lasso. 
IEEE Transactions on Multimedia 17(11):1936–1948 Zou [2006] Zou H (2006) The adaptive lasso and its oracle properties. Journal of the American Statistical Association 101(476):1418–1429 Zou H (2006) The adaptive lasso and its oracle properties. Journal of the American Statistical Association 101(476):1418–1429
- Jiang C, Jiang C, Chen D, et al (2022) Densely connected neural networks for nonlinear regression. Entropy 24:876. 10.3390/e24070876
- Li F, Zurada J, Liu Y, et al (2017) Input layer regularization of multilayer feedforward neural networks. IEEE Access PP:1–1. 10.1109/ACCESS.2017.2713389
- Ochiai T, Matsuda S, Watanabe H, et al (2017) Automatic node selection for deep neural networks using group lasso regularization. In: 2017 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), IEEE, pp 5485–5489
- Rumelhart DE, Hinton GE, Williams RJ (1986) Learning representations by back-propagating errors. Nature 323:533–536. 10.1038/323533a0
- Scardapane S, Comminiello D, Hussain A, et al (2017) Group sparse regularization for deep neural networks. Neurocomputing 241:81–89
- Sohn S, Jung B, Jhun M (2012) Permutation tests using least distance estimator in the multivariate regression model. Computational Statistics 27:1–11. 10.1007/s00180-011-0247-3
- Sze V, Chen YH, Yang TJ, et al (2017) Efficient processing of deep neural networks: A tutorial and survey. Proceedings of the IEEE 105(12):2295–2329
- Szekely G, Rizzo M, Bakirov N (2008) Measuring and testing dependence by correlation of distances. The Annals of Statistics 35. 10.1214/009053607000000505
- Tibshirani R (1996) Regression shrinkage and selection via the lasso. Journal of the Royal Statistical Society: Series B (Methodological) 58(1):267–288
- Wahba G (1992) Multivariate function and operator estimation, based on smoothing splines and reproducing kernels. In: Santa Fe Institute Studies in the Sciences of Complexity, Proceedings Volume, Addison-Wesley Publishing Co, pp 95–95
- Wang H, Leng C (2008) A note on adaptive group lasso. Computational Statistics and Data Analysis 52(12):5277–5286
- Wang J, Qingling C, Chang Q, et al (2016) Convergence analyses on sparse feedforward neural networks via group lasso regularization. Information Sciences 381. 10.1016/j.ins.2016.11.020
- Wang S, Cao G (2022) Robust deep neural network estimation for multi-dimensional functional data. Electronic Journal of Statistics 16(2):6461–6488
- Yuan M, Lin Y (2006) Model selection and estimation in regression with grouped variables. Journal of the Royal Statistical Society: Series B (Statistical Methodology) 68(1):49–67
- Zhang H, Wang J, Sun Z, et al (2019) Feature selection for neural networks using group lasso regularization. IEEE Transactions on Knowledge and Data Engineering PP:1–1. 10.1109/TKDE.2019.2893266
- Zhao L, Hu Q, Wang W (2015) Heterogeneous feature selection with multi-modal deep neural networks and sparse group lasso. IEEE Transactions on Multimedia 17(11):1936–1948
- Zou H (2006) The adaptive lasso and its oracle properties. Journal of the American Statistical Association 101(476):1418–1429
- Li F, Zurada J, Liu Y, et al (2017) Input layer regularization of multilayer feedforward neural networks. IEEE Access PP:1–1. 10.1109/ACCESS.2017.2713389 Ochiai et al [2017] Ochiai T, Matsuda S, Watanabe H, et al (2017) Automatic node selection for deep neural networks using group lasso regularization. In: 2017 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), IEEE, pp 5485–5489 Rumelhart et al [1986] Rumelhart DE, Hinton GE, Williams RJ (1986) Learning representations by back-propagating errors. Nature 323:533–536. 10.1038/323533a0 Scardapane et al [2017] Scardapane S, Comminiello D, Hussain A, et al (2017) Group sparse regularization for deep neural networks. Neurocomputing 241:81–89 Sohn et al [2012] Sohn S, Jung B, Jhun M (2012) Permutation tests using least distance estimator in the multivariate regression model. Computational Statistics - COMPUTATION STAT 27:1–11. 10.1007/s00180-011-0247-3 Sze et al [2017] Sze V, Chen YH, Yang TJ, et al (2017) Efficient processing of deep neural networks: A tutorial and survey. Proceedings of the IEEE 105(12):2295–2329 Szekely et al [2008] Szekely G, Rizzo M, Bakirov N (2008) Measuring and testing dependence by correlation of distances. The Annals of Statistics 35. 10.1214/009053607000000505 Tibshirani [1996] Tibshirani R (1996) Regression shrinkage and selection via the lasso. Journal of the Royal Statistical Society: Series B (Methodological) 58(1):267–288 Wahba [1992] Wahba G (1992) Multivariate function and operator estimation, based on smoothing splines and reproducing kernels. In: SANTA FE INSTITUTE STUDIES IN THE SCIENCES OF COMPLEXITY-PROCEEDINGS VOLUME-, Addison-Wesley Publishing Co, pp 95–95 Wang and Leng [2008] Wang H, Leng C (2008) A note on adaptive group lasso. Computational Statistics and Data Analysis 52(12):5277–5286 Wang et al [2016] Wang J, Qingling C, Chang Q, et al (2016) Convergence analyses on sparse feedforward neural networks via group lasso regularization. 
Information Sciences 381. 10.1016/j.ins.2016.11.020 Wang and Cao [2022] Wang S, Cao G (2022) Robust deep neural network estimation for multi-dimensional functional data. Electronic Journal of Statistics 16(2):6461–6488 Yuan and Lin [2006] Yuan M, Lin Y (2006) Model selection and estimation in regression with grouped variables. Journal of the Royal Statistical Society: Series B (Statistical Methodology) 68(1):49–67 Zhang et al [2019] Zhang H, Wang J, Sun Z, et al (2019) Feature selection for neural networks using group lasso regularization. IEEE Transactions on Knowledge and Data Engineering PP:1–1. 10.1109/TKDE.2019.2893266 Zhao et al [2015] Zhao L, Hu Q, Wang W (2015) Heterogeneous feature selection with multi-modal deep neural networks and sparse group lasso. IEEE Transactions on Multimedia 17(11):1936–1948 Zou [2006] Zou H (2006) The adaptive lasso and its oracle properties. Journal of the American Statistical Association 101(476):1418–1429 Ochiai T, Matsuda S, Watanabe H, et al (2017) Automatic node selection for deep neural networks using group lasso regularization. In: 2017 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), IEEE, pp 5485–5489 Rumelhart et al [1986] Rumelhart DE, Hinton GE, Williams RJ (1986) Learning representations by back-propagating errors. Nature 323:533–536. 10.1038/323533a0 Scardapane et al [2017] Scardapane S, Comminiello D, Hussain A, et al (2017) Group sparse regularization for deep neural networks. Neurocomputing 241:81–89 Sohn et al [2012] Sohn S, Jung B, Jhun M (2012) Permutation tests using least distance estimator in the multivariate regression model. Computational Statistics - COMPUTATION STAT 27:1–11. 10.1007/s00180-011-0247-3 Sze et al [2017] Sze V, Chen YH, Yang TJ, et al (2017) Efficient processing of deep neural networks: A tutorial and survey. 
Proceedings of the IEEE 105(12):2295–2329 Szekely et al [2008] Szekely G, Rizzo M, Bakirov N (2008) Measuring and testing dependence by correlation of distances. The Annals of Statistics 35. 10.1214/009053607000000505 Tibshirani [1996] Tibshirani R (1996) Regression shrinkage and selection via the lasso. Journal of the Royal Statistical Society: Series B (Methodological) 58(1):267–288 Wahba [1992] Wahba G (1992) Multivariate function and operator estimation, based on smoothing splines and reproducing kernels. In: SANTA FE INSTITUTE STUDIES IN THE SCIENCES OF COMPLEXITY-PROCEEDINGS VOLUME-, Addison-Wesley Publishing Co, pp 95–95 Wang and Leng [2008] Wang H, Leng C (2008) A note on adaptive group lasso. Computational Statistics and Data Analysis 52(12):5277–5286 Wang et al [2016] Wang J, Qingling C, Chang Q, et al (2016) Convergence analyses on sparse feedforward neural networks via group lasso regularization. Information Sciences 381. 10.1016/j.ins.2016.11.020 Wang and Cao [2022] Wang S, Cao G (2022) Robust deep neural network estimation for multi-dimensional functional data. Electronic Journal of Statistics 16(2):6461–6488 Yuan and Lin [2006] Yuan M, Lin Y (2006) Model selection and estimation in regression with grouped variables. Journal of the Royal Statistical Society: Series B (Statistical Methodology) 68(1):49–67 Zhang et al [2019] Zhang H, Wang J, Sun Z, et al (2019) Feature selection for neural networks using group lasso regularization. IEEE Transactions on Knowledge and Data Engineering PP:1–1. 10.1109/TKDE.2019.2893266 Zhao et al [2015] Zhao L, Hu Q, Wang W (2015) Heterogeneous feature selection with multi-modal deep neural networks and sparse group lasso. IEEE Transactions on Multimedia 17(11):1936–1948 Zou [2006] Zou H (2006) The adaptive lasso and its oracle properties. Journal of the American Statistical Association 101(476):1418–1429 Rumelhart DE, Hinton GE, Williams RJ (1986) Learning representations by back-propagating errors. Nature 323:533–536. 
10.1038/323533a0 Scardapane et al [2017] Scardapane S, Comminiello D, Hussain A, et al (2017) Group sparse regularization for deep neural networks. Neurocomputing 241:81–89 Sohn et al [2012] Sohn S, Jung B, Jhun M (2012) Permutation tests using least distance estimator in the multivariate regression model. Computational Statistics - COMPUTATION STAT 27:1–11. 10.1007/s00180-011-0247-3 Sze et al [2017] Sze V, Chen YH, Yang TJ, et al (2017) Efficient processing of deep neural networks: A tutorial and survey. Proceedings of the IEEE 105(12):2295–2329 Szekely et al [2008] Szekely G, Rizzo M, Bakirov N (2008) Measuring and testing dependence by correlation of distances. The Annals of Statistics 35. 10.1214/009053607000000505 Tibshirani [1996] Tibshirani R (1996) Regression shrinkage and selection via the lasso. Journal of the Royal Statistical Society: Series B (Methodological) 58(1):267–288 Wahba [1992] Wahba G (1992) Multivariate function and operator estimation, based on smoothing splines and reproducing kernels. In: SANTA FE INSTITUTE STUDIES IN THE SCIENCES OF COMPLEXITY-PROCEEDINGS VOLUME-, Addison-Wesley Publishing Co, pp 95–95 Wang and Leng [2008] Wang H, Leng C (2008) A note on adaptive group lasso. Computational Statistics and Data Analysis 52(12):5277–5286 Wang et al [2016] Wang J, Qingling C, Chang Q, et al (2016) Convergence analyses on sparse feedforward neural networks via group lasso regularization. Information Sciences 381. 10.1016/j.ins.2016.11.020 Wang and Cao [2022] Wang S, Cao G (2022) Robust deep neural network estimation for multi-dimensional functional data. Electronic Journal of Statistics 16(2):6461–6488 Yuan and Lin [2006] Yuan M, Lin Y (2006) Model selection and estimation in regression with grouped variables. Journal of the Royal Statistical Society: Series B (Statistical Methodology) 68(1):49–67 Zhang et al [2019] Zhang H, Wang J, Sun Z, et al (2019) Feature selection for neural networks using group lasso regularization. 
IEEE Transactions on Knowledge and Data Engineering PP:1–1. 10.1109/TKDE.2019.2893266 Zhao et al [2015] Zhao L, Hu Q, Wang W (2015) Heterogeneous feature selection with multi-modal deep neural networks and sparse group lasso. IEEE Transactions on Multimedia 17(11):1936–1948 Zou [2006] Zou H (2006) The adaptive lasso and its oracle properties. Journal of the American Statistical Association 101(476):1418–1429 Scardapane S, Comminiello D, Hussain A, et al (2017) Group sparse regularization for deep neural networks. Neurocomputing 241:81–89 Sohn et al [2012] Sohn S, Jung B, Jhun M (2012) Permutation tests using least distance estimator in the multivariate regression model. Computational Statistics - COMPUTATION STAT 27:1–11. 10.1007/s00180-011-0247-3 Sze et al [2017] Sze V, Chen YH, Yang TJ, et al (2017) Efficient processing of deep neural networks: A tutorial and survey. Proceedings of the IEEE 105(12):2295–2329 Szekely et al [2008] Szekely G, Rizzo M, Bakirov N (2008) Measuring and testing dependence by correlation of distances. The Annals of Statistics 35. 10.1214/009053607000000505 Tibshirani [1996] Tibshirani R (1996) Regression shrinkage and selection via the lasso. Journal of the Royal Statistical Society: Series B (Methodological) 58(1):267–288 Wahba [1992] Wahba G (1992) Multivariate function and operator estimation, based on smoothing splines and reproducing kernels. In: SANTA FE INSTITUTE STUDIES IN THE SCIENCES OF COMPLEXITY-PROCEEDINGS VOLUME-, Addison-Wesley Publishing Co, pp 95–95 Wang and Leng [2008] Wang H, Leng C (2008) A note on adaptive group lasso. Computational Statistics and Data Analysis 52(12):5277–5286 Wang et al [2016] Wang J, Qingling C, Chang Q, et al (2016) Convergence analyses on sparse feedforward neural networks via group lasso regularization. Information Sciences 381. 10.1016/j.ins.2016.11.020 Wang and Cao [2022] Wang S, Cao G (2022) Robust deep neural network estimation for multi-dimensional functional data. 
Electronic Journal of Statistics 16(2):6461–6488 Yuan and Lin [2006] Yuan M, Lin Y (2006) Model selection and estimation in regression with grouped variables. Journal of the Royal Statistical Society: Series B (Statistical Methodology) 68(1):49–67 Zhang et al [2019] Zhang H, Wang J, Sun Z, et al (2019) Feature selection for neural networks using group lasso regularization. IEEE Transactions on Knowledge and Data Engineering PP:1–1. 10.1109/TKDE.2019.2893266 Zhao et al [2015] Zhao L, Hu Q, Wang W (2015) Heterogeneous feature selection with multi-modal deep neural networks and sparse group lasso. IEEE Transactions on Multimedia 17(11):1936–1948 Zou [2006] Zou H (2006) The adaptive lasso and its oracle properties. Journal of the American Statistical Association 101(476):1418–1429 Sohn S, Jung B, Jhun M (2012) Permutation tests using least distance estimator in the multivariate regression model. Computational Statistics - COMPUTATION STAT 27:1–11. 10.1007/s00180-011-0247-3 Sze et al [2017] Sze V, Chen YH, Yang TJ, et al (2017) Efficient processing of deep neural networks: A tutorial and survey. Proceedings of the IEEE 105(12):2295–2329 Szekely et al [2008] Szekely G, Rizzo M, Bakirov N (2008) Measuring and testing dependence by correlation of distances. The Annals of Statistics 35. 10.1214/009053607000000505 Tibshirani [1996] Tibshirani R (1996) Regression shrinkage and selection via the lasso. Journal of the Royal Statistical Society: Series B (Methodological) 58(1):267–288 Wahba [1992] Wahba G (1992) Multivariate function and operator estimation, based on smoothing splines and reproducing kernels. In: SANTA FE INSTITUTE STUDIES IN THE SCIENCES OF COMPLEXITY-PROCEEDINGS VOLUME-, Addison-Wesley Publishing Co, pp 95–95 Wang and Leng [2008] Wang H, Leng C (2008) A note on adaptive group lasso. 
Computational Statistics and Data Analysis 52(12):5277–5286 Wang et al [2016] Wang J, Qingling C, Chang Q, et al (2016) Convergence analyses on sparse feedforward neural networks via group lasso regularization. Information Sciences 381. 10.1016/j.ins.2016.11.020 Wang and Cao [2022] Wang S, Cao G (2022) Robust deep neural network estimation for multi-dimensional functional data. Electronic Journal of Statistics 16(2):6461–6488 Yuan and Lin [2006] Yuan M, Lin Y (2006) Model selection and estimation in regression with grouped variables. Journal of the Royal Statistical Society: Series B (Statistical Methodology) 68(1):49–67 Zhang et al [2019] Zhang H, Wang J, Sun Z, et al (2019) Feature selection for neural networks using group lasso regularization. IEEE Transactions on Knowledge and Data Engineering PP:1–1. 10.1109/TKDE.2019.2893266 Zhao et al [2015] Zhao L, Hu Q, Wang W (2015) Heterogeneous feature selection with multi-modal deep neural networks and sparse group lasso. IEEE Transactions on Multimedia 17(11):1936–1948 Zou [2006] Zou H (2006) The adaptive lasso and its oracle properties. Journal of the American Statistical Association 101(476):1418–1429 Sze V, Chen YH, Yang TJ, et al (2017) Efficient processing of deep neural networks: A tutorial and survey. Proceedings of the IEEE 105(12):2295–2329 Szekely et al [2008] Szekely G, Rizzo M, Bakirov N (2008) Measuring and testing dependence by correlation of distances. The Annals of Statistics 35. 10.1214/009053607000000505 Tibshirani [1996] Tibshirani R (1996) Regression shrinkage and selection via the lasso. Journal of the Royal Statistical Society: Series B (Methodological) 58(1):267–288 Wahba [1992] Wahba G (1992) Multivariate function and operator estimation, based on smoothing splines and reproducing kernels. In: SANTA FE INSTITUTE STUDIES IN THE SCIENCES OF COMPLEXITY-PROCEEDINGS VOLUME-, Addison-Wesley Publishing Co, pp 95–95 Wang and Leng [2008] Wang H, Leng C (2008) A note on adaptive group lasso. 
Computational Statistics and Data Analysis 52(12):5277–5286 Wang et al [2016] Wang J, Qingling C, Chang Q, et al (2016) Convergence analyses on sparse feedforward neural networks via group lasso regularization. Information Sciences 381. 10.1016/j.ins.2016.11.020 Wang and Cao [2022] Wang S, Cao G (2022) Robust deep neural network estimation for multi-dimensional functional data. Electronic Journal of Statistics 16(2):6461–6488 Yuan and Lin [2006] Yuan M, Lin Y (2006) Model selection and estimation in regression with grouped variables. Journal of the Royal Statistical Society: Series B (Statistical Methodology) 68(1):49–67 Zhang et al [2019] Zhang H, Wang J, Sun Z, et al (2019) Feature selection for neural networks using group lasso regularization. IEEE Transactions on Knowledge and Data Engineering PP:1–1. 10.1109/TKDE.2019.2893266 Zhao et al [2015] Zhao L, Hu Q, Wang W (2015) Heterogeneous feature selection with multi-modal deep neural networks and sparse group lasso. IEEE Transactions on Multimedia 17(11):1936–1948 Zou [2006] Zou H (2006) The adaptive lasso and its oracle properties. Journal of the American Statistical Association 101(476):1418–1429 Szekely G, Rizzo M, Bakirov N (2008) Measuring and testing dependence by correlation of distances. The Annals of Statistics 35. 10.1214/009053607000000505 Tibshirani [1996] Tibshirani R (1996) Regression shrinkage and selection via the lasso. Journal of the Royal Statistical Society: Series B (Methodological) 58(1):267–288 Wahba [1992] Wahba G (1992) Multivariate function and operator estimation, based on smoothing splines and reproducing kernels. In: SANTA FE INSTITUTE STUDIES IN THE SCIENCES OF COMPLEXITY-PROCEEDINGS VOLUME-, Addison-Wesley Publishing Co, pp 95–95 Wang and Leng [2008] Wang H, Leng C (2008) A note on adaptive group lasso. 
Computational Statistics and Data Analysis 52(12):5277–5286 Wang et al [2016] Wang J, Qingling C, Chang Q, et al (2016) Convergence analyses on sparse feedforward neural networks via group lasso regularization. Information Sciences 381. 10.1016/j.ins.2016.11.020 Wang and Cao [2022] Wang S, Cao G (2022) Robust deep neural network estimation for multi-dimensional functional data. Electronic Journal of Statistics 16(2):6461–6488 Yuan and Lin [2006] Yuan M, Lin Y (2006) Model selection and estimation in regression with grouped variables. Journal of the Royal Statistical Society: Series B (Statistical Methodology) 68(1):49–67 Zhang et al [2019] Zhang H, Wang J, Sun Z, et al (2019) Feature selection for neural networks using group lasso regularization. IEEE Transactions on Knowledge and Data Engineering PP:1–1. 10.1109/TKDE.2019.2893266 Zhao et al [2015] Zhao L, Hu Q, Wang W (2015) Heterogeneous feature selection with multi-modal deep neural networks and sparse group lasso. IEEE Transactions on Multimedia 17(11):1936–1948 Zou [2006] Zou H (2006) The adaptive lasso and its oracle properties. Journal of the American Statistical Association 101(476):1418–1429 Tibshirani R (1996) Regression shrinkage and selection via the lasso. Journal of the Royal Statistical Society: Series B (Methodological) 58(1):267–288 Wahba [1992] Wahba G (1992) Multivariate function and operator estimation, based on smoothing splines and reproducing kernels. In: SANTA FE INSTITUTE STUDIES IN THE SCIENCES OF COMPLEXITY-PROCEEDINGS VOLUME-, Addison-Wesley Publishing Co, pp 95–95 Wang and Leng [2008] Wang H, Leng C (2008) A note on adaptive group lasso. Computational Statistics and Data Analysis 52(12):5277–5286 Wang et al [2016] Wang J, Qingling C, Chang Q, et al (2016) Convergence analyses on sparse feedforward neural networks via group lasso regularization. Information Sciences 381. 
10.1016/j.ins.2016.11.020 Wang and Cao [2022] Wang S, Cao G (2022) Robust deep neural network estimation for multi-dimensional functional data. Electronic Journal of Statistics 16(2):6461–6488 Yuan and Lin [2006] Yuan M, Lin Y (2006) Model selection and estimation in regression with grouped variables. Journal of the Royal Statistical Society: Series B (Statistical Methodology) 68(1):49–67 Zhang et al [2019] Zhang H, Wang J, Sun Z, et al (2019) Feature selection for neural networks using group lasso regularization. IEEE Transactions on Knowledge and Data Engineering PP:1–1. 10.1109/TKDE.2019.2893266 Zhao et al [2015] Zhao L, Hu Q, Wang W (2015) Heterogeneous feature selection with multi-modal deep neural networks and sparse group lasso. IEEE Transactions on Multimedia 17(11):1936–1948 Zou [2006] Zou H (2006) The adaptive lasso and its oracle properties. Journal of the American Statistical Association 101(476):1418–1429 Wahba G (1992) Multivariate function and operator estimation, based on smoothing splines and reproducing kernels. In: SANTA FE INSTITUTE STUDIES IN THE SCIENCES OF COMPLEXITY-PROCEEDINGS VOLUME-, Addison-Wesley Publishing Co, pp 95–95 Wang and Leng [2008] Wang H, Leng C (2008) A note on adaptive group lasso. Computational Statistics and Data Analysis 52(12):5277–5286 Wang et al [2016] Wang J, Qingling C, Chang Q, et al (2016) Convergence analyses on sparse feedforward neural networks via group lasso regularization. Information Sciences 381. 10.1016/j.ins.2016.11.020 Wang and Cao [2022] Wang S, Cao G (2022) Robust deep neural network estimation for multi-dimensional functional data. Electronic Journal of Statistics 16(2):6461–6488 Yuan and Lin [2006] Yuan M, Lin Y (2006) Model selection and estimation in regression with grouped variables. Journal of the Royal Statistical Society: Series B (Statistical Methodology) 68(1):49–67 Zhang et al [2019] Zhang H, Wang J, Sun Z, et al (2019) Feature selection for neural networks using group lasso regularization. 
IEEE Transactions on Knowledge and Data Engineering PP:1–1. 10.1109/TKDE.2019.2893266 Zhao et al [2015] Zhao L, Hu Q, Wang W (2015) Heterogeneous feature selection with multi-modal deep neural networks and sparse group lasso. IEEE Transactions on Multimedia 17(11):1936–1948 Zou [2006] Zou H (2006) The adaptive lasso and its oracle properties. Journal of the American Statistical Association 101(476):1418–1429 Wang H, Leng C (2008) A note on adaptive group lasso. Computational Statistics and Data Analysis 52(12):5277–5286 Wang et al [2016] Wang J, Qingling C, Chang Q, et al (2016) Convergence analyses on sparse feedforward neural networks via group lasso regularization. Information Sciences 381. 10.1016/j.ins.2016.11.020 Wang and Cao [2022] Wang S, Cao G (2022) Robust deep neural network estimation for multi-dimensional functional data. Electronic Journal of Statistics 16(2):6461–6488 Yuan and Lin [2006] Yuan M, Lin Y (2006) Model selection and estimation in regression with grouped variables. Journal of the Royal Statistical Society: Series B (Statistical Methodology) 68(1):49–67 Zhang et al [2019] Zhang H, Wang J, Sun Z, et al (2019) Feature selection for neural networks using group lasso regularization. IEEE Transactions on Knowledge and Data Engineering PP:1–1. 10.1109/TKDE.2019.2893266 Zhao et al [2015] Zhao L, Hu Q, Wang W (2015) Heterogeneous feature selection with multi-modal deep neural networks and sparse group lasso. IEEE Transactions on Multimedia 17(11):1936–1948 Zou [2006] Zou H (2006) The adaptive lasso and its oracle properties. Journal of the American Statistical Association 101(476):1418–1429 Wang J, Qingling C, Chang Q, et al (2016) Convergence analyses on sparse feedforward neural networks via group lasso regularization. Information Sciences 381. 10.1016/j.ins.2016.11.020 Wang and Cao [2022] Wang S, Cao G (2022) Robust deep neural network estimation for multi-dimensional functional data. 
Electronic Journal of Statistics 16(2):6461–6488 Yuan and Lin [2006] Yuan M, Lin Y (2006) Model selection and estimation in regression with grouped variables. Journal of the Royal Statistical Society: Series B (Statistical Methodology) 68(1):49–67 Zhang et al [2019] Zhang H, Wang J, Sun Z, et al (2019) Feature selection for neural networks using group lasso regularization. IEEE Transactions on Knowledge and Data Engineering PP:1–1. 10.1109/TKDE.2019.2893266 Zhao et al [2015] Zhao L, Hu Q, Wang W (2015) Heterogeneous feature selection with multi-modal deep neural networks and sparse group lasso. IEEE Transactions on Multimedia 17(11):1936–1948 Zou [2006] Zou H (2006) The adaptive lasso and its oracle properties. Journal of the American Statistical Association 101(476):1418–1429 Wang S, Cao G (2022) Robust deep neural network estimation for multi-dimensional functional data. Electronic Journal of Statistics 16(2):6461–6488 Yuan and Lin [2006] Yuan M, Lin Y (2006) Model selection and estimation in regression with grouped variables. Journal of the Royal Statistical Society: Series B (Statistical Methodology) 68(1):49–67 Zhang et al [2019] Zhang H, Wang J, Sun Z, et al (2019) Feature selection for neural networks using group lasso regularization. IEEE Transactions on Knowledge and Data Engineering PP:1–1. 10.1109/TKDE.2019.2893266 Zhao et al [2015] Zhao L, Hu Q, Wang W (2015) Heterogeneous feature selection with multi-modal deep neural networks and sparse group lasso. IEEE Transactions on Multimedia 17(11):1936–1948 Zou [2006] Zou H (2006) The adaptive lasso and its oracle properties. Journal of the American Statistical Association 101(476):1418–1429 Yuan M, Lin Y (2006) Model selection and estimation in regression with grouped variables. Journal of the Royal Statistical Society: Series B (Statistical Methodology) 68(1):49–67 Zhang et al [2019] Zhang H, Wang J, Sun Z, et al (2019) Feature selection for neural networks using group lasso regularization. 
IEEE Transactions on Knowledge and Data Engineering PP:1–1. 10.1109/TKDE.2019.2893266 Zhao et al [2015] Zhao L, Hu Q, Wang W (2015) Heterogeneous feature selection with multi-modal deep neural networks and sparse group lasso. IEEE Transactions on Multimedia 17(11):1936–1948 Zou [2006] Zou H (2006) The adaptive lasso and its oracle properties. Journal of the American Statistical Association 101(476):1418–1429 Zhang H, Wang J, Sun Z, et al (2019) Feature selection for neural networks using group lasso regularization. IEEE Transactions on Knowledge and Data Engineering PP:1–1. 10.1109/TKDE.2019.2893266 Zhao et al [2015] Zhao L, Hu Q, Wang W (2015) Heterogeneous feature selection with multi-modal deep neural networks and sparse group lasso. IEEE Transactions on Multimedia 17(11):1936–1948 Zou [2006] Zou H (2006) The adaptive lasso and its oracle properties. Journal of the American Statistical Association 101(476):1418–1429 Zhao L, Hu Q, Wang W (2015) Heterogeneous feature selection with multi-modal deep neural networks and sparse group lasso. IEEE Transactions on Multimedia 17(11):1936–1948 Zou [2006] Zou H (2006) The adaptive lasso and its oracle properties. Journal of the American Statistical Association 101(476):1418–1429 Zou H (2006) The adaptive lasso and its oracle properties. Journal of the American Statistical Association 101(476):1418–1429
- Ochiai T, Matsuda S, Watanabe H, et al (2017) Automatic node selection for deep neural networks using group lasso regularization. In: 2017 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), IEEE, pp 5485–5489 Rumelhart et al [1986] Rumelhart DE, Hinton GE, Williams RJ (1986) Learning representations by back-propagating errors. Nature 323:533–536. 10.1038/323533a0 Scardapane et al [2017] Scardapane S, Comminiello D, Hussain A, et al (2017) Group sparse regularization for deep neural networks. Neurocomputing 241:81–89 Sohn et al [2012] Sohn S, Jung B, Jhun M (2012) Permutation tests using least distance estimator in the multivariate regression model. Computational Statistics - COMPUTATION STAT 27:1–11. 10.1007/s00180-011-0247-3 Sze et al [2017] Sze V, Chen YH, Yang TJ, et al (2017) Efficient processing of deep neural networks: A tutorial and survey. Proceedings of the IEEE 105(12):2295–2329 Szekely et al [2008] Szekely G, Rizzo M, Bakirov N (2008) Measuring and testing dependence by correlation of distances. The Annals of Statistics 35. 10.1214/009053607000000505 Tibshirani [1996] Tibshirani R (1996) Regression shrinkage and selection via the lasso. Journal of the Royal Statistical Society: Series B (Methodological) 58(1):267–288 Wahba [1992] Wahba G (1992) Multivariate function and operator estimation, based on smoothing splines and reproducing kernels. In: SANTA FE INSTITUTE STUDIES IN THE SCIENCES OF COMPLEXITY-PROCEEDINGS VOLUME-, Addison-Wesley Publishing Co, pp 95–95 Wang and Leng [2008] Wang H, Leng C (2008) A note on adaptive group lasso. Computational Statistics and Data Analysis 52(12):5277–5286 Wang et al [2016] Wang J, Qingling C, Chang Q, et al (2016) Convergence analyses on sparse feedforward neural networks via group lasso regularization. Information Sciences 381. 10.1016/j.ins.2016.11.020 Wang and Cao [2022] Wang S, Cao G (2022) Robust deep neural network estimation for multi-dimensional functional data. 
Electronic Journal of Statistics 16(2):6461–6488 Yuan and Lin [2006] Yuan M, Lin Y (2006) Model selection and estimation in regression with grouped variables. Journal of the Royal Statistical Society: Series B (Statistical Methodology) 68(1):49–67 Zhang et al [2019] Zhang H, Wang J, Sun Z, et al (2019) Feature selection for neural networks using group lasso regularization. IEEE Transactions on Knowledge and Data Engineering PP:1–1. 10.1109/TKDE.2019.2893266 Zhao et al [2015] Zhao L, Hu Q, Wang W (2015) Heterogeneous feature selection with multi-modal deep neural networks and sparse group lasso. IEEE Transactions on Multimedia 17(11):1936–1948 Zou [2006] Zou H (2006) The adaptive lasso and its oracle properties. Journal of the American Statistical Association 101(476):1418–1429 Rumelhart DE, Hinton GE, Williams RJ (1986) Learning representations by back-propagating errors. Nature 323:533–536. 10.1038/323533a0 Scardapane et al [2017] Scardapane S, Comminiello D, Hussain A, et al (2017) Group sparse regularization for deep neural networks. Neurocomputing 241:81–89 Sohn et al [2012] Sohn S, Jung B, Jhun M (2012) Permutation tests using least distance estimator in the multivariate regression model. Computational Statistics - COMPUTATION STAT 27:1–11. 10.1007/s00180-011-0247-3 Sze et al [2017] Sze V, Chen YH, Yang TJ, et al (2017) Efficient processing of deep neural networks: A tutorial and survey. Proceedings of the IEEE 105(12):2295–2329 Szekely et al [2008] Szekely G, Rizzo M, Bakirov N (2008) Measuring and testing dependence by correlation of distances. The Annals of Statistics 35. 10.1214/009053607000000505 Tibshirani [1996] Tibshirani R (1996) Regression shrinkage and selection via the lasso. Journal of the Royal Statistical Society: Series B (Methodological) 58(1):267–288 Wahba [1992] Wahba G (1992) Multivariate function and operator estimation, based on smoothing splines and reproducing kernels. 
In: SANTA FE INSTITUTE STUDIES IN THE SCIENCES OF COMPLEXITY-PROCEEDINGS VOLUME-, Addison-Wesley Publishing Co, pp 95–95 Wang and Leng [2008] Wang H, Leng C (2008) A note on adaptive group lasso. Computational Statistics and Data Analysis 52(12):5277–5286 Wang et al [2016] Wang J, Qingling C, Chang Q, et al (2016) Convergence analyses on sparse feedforward neural networks via group lasso regularization. Information Sciences 381. 10.1016/j.ins.2016.11.020 Wang and Cao [2022] Wang S, Cao G (2022) Robust deep neural network estimation for multi-dimensional functional data. Electronic Journal of Statistics 16(2):6461–6488 Yuan and Lin [2006] Yuan M, Lin Y (2006) Model selection and estimation in regression with grouped variables. Journal of the Royal Statistical Society: Series B (Statistical Methodology) 68(1):49–67 Zhang et al [2019] Zhang H, Wang J, Sun Z, et al (2019) Feature selection for neural networks using group lasso regularization. IEEE Transactions on Knowledge and Data Engineering PP:1–1. 10.1109/TKDE.2019.2893266 Zhao et al [2015] Zhao L, Hu Q, Wang W (2015) Heterogeneous feature selection with multi-modal deep neural networks and sparse group lasso. IEEE Transactions on Multimedia 17(11):1936–1948 Zou [2006] Zou H (2006) The adaptive lasso and its oracle properties. Journal of the American Statistical Association 101(476):1418–1429 Scardapane S, Comminiello D, Hussain A, et al (2017) Group sparse regularization for deep neural networks. Neurocomputing 241:81–89 Sohn et al [2012] Sohn S, Jung B, Jhun M (2012) Permutation tests using least distance estimator in the multivariate regression model. Computational Statistics - COMPUTATION STAT 27:1–11. 10.1007/s00180-011-0247-3 Sze et al [2017] Sze V, Chen YH, Yang TJ, et al (2017) Efficient processing of deep neural networks: A tutorial and survey. Proceedings of the IEEE 105(12):2295–2329 Szekely et al [2008] Szekely G, Rizzo M, Bakirov N (2008) Measuring and testing dependence by correlation of distances. 
The Annals of Statistics 35. 10.1214/009053607000000505 Tibshirani [1996] Tibshirani R (1996) Regression shrinkage and selection via the lasso. Journal of the Royal Statistical Society: Series B (Methodological) 58(1):267–288 Wahba [1992] Wahba G (1992) Multivariate function and operator estimation, based on smoothing splines and reproducing kernels. In: SANTA FE INSTITUTE STUDIES IN THE SCIENCES OF COMPLEXITY-PROCEEDINGS VOLUME-, Addison-Wesley Publishing Co, pp 95–95 Wang and Leng [2008] Wang H, Leng C (2008) A note on adaptive group lasso. Computational Statistics and Data Analysis 52(12):5277–5286 Wang et al [2016] Wang J, Qingling C, Chang Q, et al (2016) Convergence analyses on sparse feedforward neural networks via group lasso regularization. Information Sciences 381. 10.1016/j.ins.2016.11.020 Wang and Cao [2022] Wang S, Cao G (2022) Robust deep neural network estimation for multi-dimensional functional data. Electronic Journal of Statistics 16(2):6461–6488 Yuan and Lin [2006] Yuan M, Lin Y (2006) Model selection and estimation in regression with grouped variables. Journal of the Royal Statistical Society: Series B (Statistical Methodology) 68(1):49–67 Zhang et al [2019] Zhang H, Wang J, Sun Z, et al (2019) Feature selection for neural networks using group lasso regularization. IEEE Transactions on Knowledge and Data Engineering PP:1–1. 10.1109/TKDE.2019.2893266 Zhao et al [2015] Zhao L, Hu Q, Wang W (2015) Heterogeneous feature selection with multi-modal deep neural networks and sparse group lasso. IEEE Transactions on Multimedia 17(11):1936–1948 Zou [2006] Zou H (2006) The adaptive lasso and its oracle properties. Journal of the American Statistical Association 101(476):1418–1429 Sohn S, Jung B, Jhun M (2012) Permutation tests using least distance estimator in the multivariate regression model. Computational Statistics - COMPUTATION STAT 27:1–11. 
10.1007/s00180-011-0247-3 Sze et al [2017] Sze V, Chen YH, Yang TJ, et al (2017) Efficient processing of deep neural networks: A tutorial and survey. Proceedings of the IEEE 105(12):2295–2329 Szekely et al [2008] Szekely G, Rizzo M, Bakirov N (2008) Measuring and testing dependence by correlation of distances. The Annals of Statistics 35. 10.1214/009053607000000505 Tibshirani [1996] Tibshirani R (1996) Regression shrinkage and selection via the lasso. Journal of the Royal Statistical Society: Series B (Methodological) 58(1):267–288 Wahba [1992] Wahba G (1992) Multivariate function and operator estimation, based on smoothing splines and reproducing kernels. In: SANTA FE INSTITUTE STUDIES IN THE SCIENCES OF COMPLEXITY-PROCEEDINGS VOLUME-, Addison-Wesley Publishing Co, pp 95–95 Wang and Leng [2008] Wang H, Leng C (2008) A note on adaptive group lasso. Computational Statistics and Data Analysis 52(12):5277–5286 Wang et al [2016] Wang J, Qingling C, Chang Q, et al (2016) Convergence analyses on sparse feedforward neural networks via group lasso regularization. Information Sciences 381. 10.1016/j.ins.2016.11.020 Wang and Cao [2022] Wang S, Cao G (2022) Robust deep neural network estimation for multi-dimensional functional data. Electronic Journal of Statistics 16(2):6461–6488 Yuan and Lin [2006] Yuan M, Lin Y (2006) Model selection and estimation in regression with grouped variables. Journal of the Royal Statistical Society: Series B (Statistical Methodology) 68(1):49–67 Zhang et al [2019] Zhang H, Wang J, Sun Z, et al (2019) Feature selection for neural networks using group lasso regularization. IEEE Transactions on Knowledge and Data Engineering PP:1–1. 10.1109/TKDE.2019.2893266 Zhao et al [2015] Zhao L, Hu Q, Wang W (2015) Heterogeneous feature selection with multi-modal deep neural networks and sparse group lasso. IEEE Transactions on Multimedia 17(11):1936–1948 Zou [2006] Zou H (2006) The adaptive lasso and its oracle properties. 
Journal of the American Statistical Association 101(476):1418–1429 Sze V, Chen YH, Yang TJ, et al (2017) Efficient processing of deep neural networks: A tutorial and survey. Proceedings of the IEEE 105(12):2295–2329 Szekely et al [2008] Szekely G, Rizzo M, Bakirov N (2008) Measuring and testing dependence by correlation of distances. The Annals of Statistics 35. 10.1214/009053607000000505 Tibshirani [1996] Tibshirani R (1996) Regression shrinkage and selection via the lasso. Journal of the Royal Statistical Society: Series B (Methodological) 58(1):267–288 Wahba [1992] Wahba G (1992) Multivariate function and operator estimation, based on smoothing splines and reproducing kernels. In: SANTA FE INSTITUTE STUDIES IN THE SCIENCES OF COMPLEXITY-PROCEEDINGS VOLUME-, Addison-Wesley Publishing Co, pp 95–95 Wang and Leng [2008] Wang H, Leng C (2008) A note on adaptive group lasso. Computational Statistics and Data Analysis 52(12):5277–5286 Wang et al [2016] Wang J, Qingling C, Chang Q, et al (2016) Convergence analyses on sparse feedforward neural networks via group lasso regularization. Information Sciences 381. 10.1016/j.ins.2016.11.020 Wang and Cao [2022] Wang S, Cao G (2022) Robust deep neural network estimation for multi-dimensional functional data. Electronic Journal of Statistics 16(2):6461–6488 Yuan and Lin [2006] Yuan M, Lin Y (2006) Model selection and estimation in regression with grouped variables. Journal of the Royal Statistical Society: Series B (Statistical Methodology) 68(1):49–67 Zhang et al [2019] Zhang H, Wang J, Sun Z, et al (2019) Feature selection for neural networks using group lasso regularization. IEEE Transactions on Knowledge and Data Engineering PP:1–1. 10.1109/TKDE.2019.2893266 Zhao et al [2015] Zhao L, Hu Q, Wang W (2015) Heterogeneous feature selection with multi-modal deep neural networks and sparse group lasso. IEEE Transactions on Multimedia 17(11):1936–1948 Zou [2006] Zou H (2006) The adaptive lasso and its oracle properties. 
Journal of the American Statistical Association 101(476):1418–1429 Szekely G, Rizzo M, Bakirov N (2008) Measuring and testing dependence by correlation of distances. The Annals of Statistics 35. 10.1214/009053607000000505 Tibshirani [1996] Tibshirani R (1996) Regression shrinkage and selection via the lasso. Journal of the Royal Statistical Society: Series B (Methodological) 58(1):267–288 Wahba [1992] Wahba G (1992) Multivariate function and operator estimation, based on smoothing splines and reproducing kernels. In: SANTA FE INSTITUTE STUDIES IN THE SCIENCES OF COMPLEXITY-PROCEEDINGS VOLUME-, Addison-Wesley Publishing Co, pp 95–95 Wang and Leng [2008] Wang H, Leng C (2008) A note on adaptive group lasso. Computational Statistics and Data Analysis 52(12):5277–5286 Wang et al [2016] Wang J, Qingling C, Chang Q, et al (2016) Convergence analyses on sparse feedforward neural networks via group lasso regularization. Information Sciences 381. 10.1016/j.ins.2016.11.020 Wang and Cao [2022] Wang S, Cao G (2022) Robust deep neural network estimation for multi-dimensional functional data. Electronic Journal of Statistics 16(2):6461–6488 Yuan and Lin [2006] Yuan M, Lin Y (2006) Model selection and estimation in regression with grouped variables. Journal of the Royal Statistical Society: Series B (Statistical Methodology) 68(1):49–67 Zhang et al [2019] Zhang H, Wang J, Sun Z, et al (2019) Feature selection for neural networks using group lasso regularization. IEEE Transactions on Knowledge and Data Engineering PP:1–1. 10.1109/TKDE.2019.2893266 Zhao et al [2015] Zhao L, Hu Q, Wang W (2015) Heterogeneous feature selection with multi-modal deep neural networks and sparse group lasso. IEEE Transactions on Multimedia 17(11):1936–1948 Zou [2006] Zou H (2006) The adaptive lasso and its oracle properties. Journal of the American Statistical Association 101(476):1418–1429 Tibshirani R (1996) Regression shrinkage and selection via the lasso. 
Journal of the Royal Statistical Society: Series B (Methodological) 58(1):267–288 Wahba [1992] Wahba G (1992) Multivariate function and operator estimation, based on smoothing splines and reproducing kernels. In: SANTA FE INSTITUTE STUDIES IN THE SCIENCES OF COMPLEXITY-PROCEEDINGS VOLUME-, Addison-Wesley Publishing Co, pp 95–95 Wang and Leng [2008] Wang H, Leng C (2008) A note on adaptive group lasso. Computational Statistics and Data Analysis 52(12):5277–5286 Wang et al [2016] Wang J, Qingling C, Chang Q, et al (2016) Convergence analyses on sparse feedforward neural networks via group lasso regularization. Information Sciences 381. 10.1016/j.ins.2016.11.020 Wang and Cao [2022] Wang S, Cao G (2022) Robust deep neural network estimation for multi-dimensional functional data. Electronic Journal of Statistics 16(2):6461–6488 Yuan and Lin [2006] Yuan M, Lin Y (2006) Model selection and estimation in regression with grouped variables. Journal of the Royal Statistical Society: Series B (Statistical Methodology) 68(1):49–67 Zhang et al [2019] Zhang H, Wang J, Sun Z, et al (2019) Feature selection for neural networks using group lasso regularization. IEEE Transactions on Knowledge and Data Engineering PP:1–1. 10.1109/TKDE.2019.2893266 Zhao et al [2015] Zhao L, Hu Q, Wang W (2015) Heterogeneous feature selection with multi-modal deep neural networks and sparse group lasso. IEEE Transactions on Multimedia 17(11):1936–1948 Zou [2006] Zou H (2006) The adaptive lasso and its oracle properties. Journal of the American Statistical Association 101(476):1418–1429 Wahba G (1992) Multivariate function and operator estimation, based on smoothing splines and reproducing kernels. In: SANTA FE INSTITUTE STUDIES IN THE SCIENCES OF COMPLEXITY-PROCEEDINGS VOLUME-, Addison-Wesley Publishing Co, pp 95–95 Wang and Leng [2008] Wang H, Leng C (2008) A note on adaptive group lasso. 
Computational Statistics and Data Analysis 52(12):5277–5286 Wang et al [2016] Wang J, Qingling C, Chang Q, et al (2016) Convergence analyses on sparse feedforward neural networks via group lasso regularization. Information Sciences 381. 10.1016/j.ins.2016.11.020 Wang and Cao [2022] Wang S, Cao G (2022) Robust deep neural network estimation for multi-dimensional functional data. Electronic Journal of Statistics 16(2):6461–6488 Yuan and Lin [2006] Yuan M, Lin Y (2006) Model selection and estimation in regression with grouped variables. Journal of the Royal Statistical Society: Series B (Statistical Methodology) 68(1):49–67 Zhang et al [2019] Zhang H, Wang J, Sun Z, et al (2019) Feature selection for neural networks using group lasso regularization. IEEE Transactions on Knowledge and Data Engineering PP:1–1. 10.1109/TKDE.2019.2893266 Zhao et al [2015] Zhao L, Hu Q, Wang W (2015) Heterogeneous feature selection with multi-modal deep neural networks and sparse group lasso. IEEE Transactions on Multimedia 17(11):1936–1948 Zou [2006] Zou H (2006) The adaptive lasso and its oracle properties. Journal of the American Statistical Association 101(476):1418–1429 Wang H, Leng C (2008) A note on adaptive group lasso. Computational Statistics and Data Analysis 52(12):5277–5286 Wang et al [2016] Wang J, Qingling C, Chang Q, et al (2016) Convergence analyses on sparse feedforward neural networks via group lasso regularization. Information Sciences 381. 10.1016/j.ins.2016.11.020 Wang and Cao [2022] Wang S, Cao G (2022) Robust deep neural network estimation for multi-dimensional functional data. Electronic Journal of Statistics 16(2):6461–6488 Yuan and Lin [2006] Yuan M, Lin Y (2006) Model selection and estimation in regression with grouped variables. Journal of the Royal Statistical Society: Series B (Statistical Methodology) 68(1):49–67 Zhang et al [2019] Zhang H, Wang J, Sun Z, et al (2019) Feature selection for neural networks using group lasso regularization. 
IEEE Transactions on Knowledge and Data Engineering PP:1–1. 10.1109/TKDE.2019.2893266 Zhao et al [2015] Zhao L, Hu Q, Wang W (2015) Heterogeneous feature selection with multi-modal deep neural networks and sparse group lasso. IEEE Transactions on Multimedia 17(11):1936–1948 Zou [2006] Zou H (2006) The adaptive lasso and its oracle properties. Journal of the American Statistical Association 101(476):1418–1429 Wang J, Qingling C, Chang Q, et al (2016) Convergence analyses on sparse feedforward neural networks via group lasso regularization. Information Sciences 381. 10.1016/j.ins.2016.11.020 Wang and Cao [2022] Wang S, Cao G (2022) Robust deep neural network estimation for multi-dimensional functional data. Electronic Journal of Statistics 16(2):6461–6488 Yuan and Lin [2006] Yuan M, Lin Y (2006) Model selection and estimation in regression with grouped variables. Journal of the Royal Statistical Society: Series B (Statistical Methodology) 68(1):49–67 Zhang et al [2019] Zhang H, Wang J, Sun Z, et al (2019) Feature selection for neural networks using group lasso regularization. IEEE Transactions on Knowledge and Data Engineering PP:1–1. 10.1109/TKDE.2019.2893266 Zhao et al [2015] Zhao L, Hu Q, Wang W (2015) Heterogeneous feature selection with multi-modal deep neural networks and sparse group lasso. IEEE Transactions on Multimedia 17(11):1936–1948 Zou [2006] Zou H (2006) The adaptive lasso and its oracle properties. Journal of the American Statistical Association 101(476):1418–1429 Wang S, Cao G (2022) Robust deep neural network estimation for multi-dimensional functional data. Electronic Journal of Statistics 16(2):6461–6488 Yuan and Lin [2006] Yuan M, Lin Y (2006) Model selection and estimation in regression with grouped variables. Journal of the Royal Statistical Society: Series B (Statistical Methodology) 68(1):49–67 Zhang et al [2019] Zhang H, Wang J, Sun Z, et al (2019) Feature selection for neural networks using group lasso regularization. 
IEEE Transactions on Knowledge and Data Engineering PP:1–1. 10.1109/TKDE.2019.2893266 Zhao et al [2015] Zhao L, Hu Q, Wang W (2015) Heterogeneous feature selection with multi-modal deep neural networks and sparse group lasso. IEEE Transactions on Multimedia 17(11):1936–1948 Zou [2006] Zou H (2006) The adaptive lasso and its oracle properties. Journal of the American Statistical Association 101(476):1418–1429 Yuan M, Lin Y (2006) Model selection and estimation in regression with grouped variables. Journal of the Royal Statistical Society: Series B (Statistical Methodology) 68(1):49–67 Zhang et al [2019] Zhang H, Wang J, Sun Z, et al (2019) Feature selection for neural networks using group lasso regularization. IEEE Transactions on Knowledge and Data Engineering PP:1–1. 10.1109/TKDE.2019.2893266 Zhao et al [2015] Zhao L, Hu Q, Wang W (2015) Heterogeneous feature selection with multi-modal deep neural networks and sparse group lasso. IEEE Transactions on Multimedia 17(11):1936–1948 Zou [2006] Zou H (2006) The adaptive lasso and its oracle properties. Journal of the American Statistical Association 101(476):1418–1429 Zhang H, Wang J, Sun Z, et al (2019) Feature selection for neural networks using group lasso regularization. IEEE Transactions on Knowledge and Data Engineering PP:1–1. 10.1109/TKDE.2019.2893266 Zhao et al [2015] Zhao L, Hu Q, Wang W (2015) Heterogeneous feature selection with multi-modal deep neural networks and sparse group lasso. IEEE Transactions on Multimedia 17(11):1936–1948 Zou [2006] Zou H (2006) The adaptive lasso and its oracle properties. Journal of the American Statistical Association 101(476):1418–1429 Zhao L, Hu Q, Wang W (2015) Heterogeneous feature selection with multi-modal deep neural networks and sparse group lasso. IEEE Transactions on Multimedia 17(11):1936–1948 Zou [2006] Zou H (2006) The adaptive lasso and its oracle properties. 
Journal of the American Statistical Association 101(476):1418–1429 Zou H (2006) The adaptive lasso and its oracle properties. Journal of the American Statistical Association 101(476):1418–1429
- Rumelhart DE, Hinton GE, Williams RJ (1986) Learning representations by back-propagating errors. Nature 323:533–536. 10.1038/323533a0 Scardapane et al [2017] Scardapane S, Comminiello D, Hussain A, et al (2017) Group sparse regularization for deep neural networks. Neurocomputing 241:81–89 Sohn et al [2012] Sohn S, Jung B, Jhun M (2012) Permutation tests using least distance estimator in the multivariate regression model. Computational Statistics - COMPUTATION STAT 27:1–11. 10.1007/s00180-011-0247-3 Sze et al [2017] Sze V, Chen YH, Yang TJ, et al (2017) Efficient processing of deep neural networks: A tutorial and survey. Proceedings of the IEEE 105(12):2295–2329 Szekely et al [2008] Szekely G, Rizzo M, Bakirov N (2008) Measuring and testing dependence by correlation of distances. The Annals of Statistics 35. 10.1214/009053607000000505 Tibshirani [1996] Tibshirani R (1996) Regression shrinkage and selection via the lasso. Journal of the Royal Statistical Society: Series B (Methodological) 58(1):267–288 Wahba [1992] Wahba G (1992) Multivariate function and operator estimation, based on smoothing splines and reproducing kernels. In: SANTA FE INSTITUTE STUDIES IN THE SCIENCES OF COMPLEXITY-PROCEEDINGS VOLUME-, Addison-Wesley Publishing Co, pp 95–95 Wang and Leng [2008] Wang H, Leng C (2008) A note on adaptive group lasso. Computational Statistics and Data Analysis 52(12):5277–5286 Wang et al [2016] Wang J, Qingling C, Chang Q, et al (2016) Convergence analyses on sparse feedforward neural networks via group lasso regularization. Information Sciences 381. 10.1016/j.ins.2016.11.020 Wang and Cao [2022] Wang S, Cao G (2022) Robust deep neural network estimation for multi-dimensional functional data. Electronic Journal of Statistics 16(2):6461–6488 Yuan and Lin [2006] Yuan M, Lin Y (2006) Model selection and estimation in regression with grouped variables. 
Journal of the Royal Statistical Society: Series B (Statistical Methodology) 68(1):49–67 Zhang et al [2019] Zhang H, Wang J, Sun Z, et al (2019) Feature selection for neural networks using group lasso regularization. IEEE Transactions on Knowledge and Data Engineering PP:1–1. 10.1109/TKDE.2019.2893266 Zhao et al [2015] Zhao L, Hu Q, Wang W (2015) Heterogeneous feature selection with multi-modal deep neural networks and sparse group lasso. IEEE Transactions on Multimedia 17(11):1936–1948 Zou [2006] Zou H (2006) The adaptive lasso and its oracle properties. Journal of the American Statistical Association 101(476):1418–1429 Scardapane S, Comminiello D, Hussain A, et al (2017) Group sparse regularization for deep neural networks. Neurocomputing 241:81–89 Sohn et al [2012] Sohn S, Jung B, Jhun M (2012) Permutation tests using least distance estimator in the multivariate regression model. Computational Statistics - COMPUTATION STAT 27:1–11. 10.1007/s00180-011-0247-3 Sze et al [2017] Sze V, Chen YH, Yang TJ, et al (2017) Efficient processing of deep neural networks: A tutorial and survey. Proceedings of the IEEE 105(12):2295–2329 Szekely et al [2008] Szekely G, Rizzo M, Bakirov N (2008) Measuring and testing dependence by correlation of distances. The Annals of Statistics 35. 10.1214/009053607000000505 Tibshirani [1996] Tibshirani R (1996) Regression shrinkage and selection via the lasso. Journal of the Royal Statistical Society: Series B (Methodological) 58(1):267–288 Wahba [1992] Wahba G (1992) Multivariate function and operator estimation, based on smoothing splines and reproducing kernels. In: SANTA FE INSTITUTE STUDIES IN THE SCIENCES OF COMPLEXITY-PROCEEDINGS VOLUME-, Addison-Wesley Publishing Co, pp 95–95 Wang and Leng [2008] Wang H, Leng C (2008) A note on adaptive group lasso. 
Computational Statistics and Data Analysis 52(12):5277–5286 Wang et al [2016] Wang J, Qingling C, Chang Q, et al (2016) Convergence analyses on sparse feedforward neural networks via group lasso regularization. Information Sciences 381. 10.1016/j.ins.2016.11.020 Wang and Cao [2022] Wang S, Cao G (2022) Robust deep neural network estimation for multi-dimensional functional data. Electronic Journal of Statistics 16(2):6461–6488 Yuan and Lin [2006] Yuan M, Lin Y (2006) Model selection and estimation in regression with grouped variables. Journal of the Royal Statistical Society: Series B (Statistical Methodology) 68(1):49–67 Zhang et al [2019] Zhang H, Wang J, Sun Z, et al (2019) Feature selection for neural networks using group lasso regularization. IEEE Transactions on Knowledge and Data Engineering PP:1–1. 10.1109/TKDE.2019.2893266 Zhao et al [2015] Zhao L, Hu Q, Wang W (2015) Heterogeneous feature selection with multi-modal deep neural networks and sparse group lasso. IEEE Transactions on Multimedia 17(11):1936–1948 Zou [2006] Zou H (2006) The adaptive lasso and its oracle properties. Journal of the American Statistical Association 101(476):1418–1429 Sohn S, Jung B, Jhun M (2012) Permutation tests using least distance estimator in the multivariate regression model. Computational Statistics - COMPUTATION STAT 27:1–11. 10.1007/s00180-011-0247-3 Sze et al [2017] Sze V, Chen YH, Yang TJ, et al (2017) Efficient processing of deep neural networks: A tutorial and survey. Proceedings of the IEEE 105(12):2295–2329 Szekely et al [2008] Szekely G, Rizzo M, Bakirov N (2008) Measuring and testing dependence by correlation of distances. The Annals of Statistics 35. 10.1214/009053607000000505 Tibshirani [1996] Tibshirani R (1996) Regression shrinkage and selection via the lasso. 
Journal of the Royal Statistical Society: Series B (Methodological) 58(1):267–288 Wahba [1992] Wahba G (1992) Multivariate function and operator estimation, based on smoothing splines and reproducing kernels. In: SANTA FE INSTITUTE STUDIES IN THE SCIENCES OF COMPLEXITY-PROCEEDINGS VOLUME-, Addison-Wesley Publishing Co, pp 95–95 Wang and Leng [2008] Wang H, Leng C (2008) A note on adaptive group lasso. Computational Statistics and Data Analysis 52(12):5277–5286 Wang et al [2016] Wang J, Qingling C, Chang Q, et al (2016) Convergence analyses on sparse feedforward neural networks via group lasso regularization. Information Sciences 381. 10.1016/j.ins.2016.11.020 Wang and Cao [2022] Wang S, Cao G (2022) Robust deep neural network estimation for multi-dimensional functional data. Electronic Journal of Statistics 16(2):6461–6488 Yuan and Lin [2006] Yuan M, Lin Y (2006) Model selection and estimation in regression with grouped variables. Journal of the Royal Statistical Society: Series B (Statistical Methodology) 68(1):49–67 Zhang et al [2019] Zhang H, Wang J, Sun Z, et al (2019) Feature selection for neural networks using group lasso regularization. IEEE Transactions on Knowledge and Data Engineering PP:1–1. 10.1109/TKDE.2019.2893266 Zhao et al [2015] Zhao L, Hu Q, Wang W (2015) Heterogeneous feature selection with multi-modal deep neural networks and sparse group lasso. IEEE Transactions on Multimedia 17(11):1936–1948 Zou [2006] Zou H (2006) The adaptive lasso and its oracle properties. Journal of the American Statistical Association 101(476):1418–1429 Sze V, Chen YH, Yang TJ, et al (2017) Efficient processing of deep neural networks: A tutorial and survey. Proceedings of the IEEE 105(12):2295–2329 Szekely et al [2008] Szekely G, Rizzo M, Bakirov N (2008) Measuring and testing dependence by correlation of distances. The Annals of Statistics 35. 10.1214/009053607000000505 Tibshirani [1996] Tibshirani R (1996) Regression shrinkage and selection via the lasso. 
Journal of the Royal Statistical Society: Series B (Methodological) 58(1):267–288 Wahba [1992] Wahba G (1992) Multivariate function and operator estimation, based on smoothing splines and reproducing kernels. In: SANTA FE INSTITUTE STUDIES IN THE SCIENCES OF COMPLEXITY-PROCEEDINGS VOLUME-, Addison-Wesley Publishing Co, pp 95–95 Wang and Leng [2008] Wang H, Leng C (2008) A note on adaptive group lasso. Computational Statistics and Data Analysis 52(12):5277–5286 Wang et al [2016] Wang J, Qingling C, Chang Q, et al (2016) Convergence analyses on sparse feedforward neural networks via group lasso regularization. Information Sciences 381. 10.1016/j.ins.2016.11.020 Wang and Cao [2022] Wang S, Cao G (2022) Robust deep neural network estimation for multi-dimensional functional data. Electronic Journal of Statistics 16(2):6461–6488 Yuan and Lin [2006] Yuan M, Lin Y (2006) Model selection and estimation in regression with grouped variables. Journal of the Royal Statistical Society: Series B (Statistical Methodology) 68(1):49–67 Zhang et al [2019] Zhang H, Wang J, Sun Z, et al (2019) Feature selection for neural networks using group lasso regularization. IEEE Transactions on Knowledge and Data Engineering PP:1–1. 10.1109/TKDE.2019.2893266 Zhao et al [2015] Zhao L, Hu Q, Wang W (2015) Heterogeneous feature selection with multi-modal deep neural networks and sparse group lasso. IEEE Transactions on Multimedia 17(11):1936–1948 Zou [2006] Zou H (2006) The adaptive lasso and its oracle properties. Journal of the American Statistical Association 101(476):1418–1429 Szekely G, Rizzo M, Bakirov N (2008) Measuring and testing dependence by correlation of distances. The Annals of Statistics 35. 10.1214/009053607000000505 Tibshirani [1996] Tibshirani R (1996) Regression shrinkage and selection via the lasso. 
Journal of the Royal Statistical Society: Series B (Methodological) 58(1):267–288 Wahba [1992] Wahba G (1992) Multivariate function and operator estimation, based on smoothing splines and reproducing kernels. In: SANTA FE INSTITUTE STUDIES IN THE SCIENCES OF COMPLEXITY-PROCEEDINGS VOLUME-, Addison-Wesley Publishing Co, pp 95–95 Wang and Leng [2008] Wang H, Leng C (2008) A note on adaptive group lasso. Computational Statistics and Data Analysis 52(12):5277–5286 Wang et al [2016] Wang J, Qingling C, Chang Q, et al (2016) Convergence analyses on sparse feedforward neural networks via group lasso regularization. Information Sciences 381. 10.1016/j.ins.2016.11.020 Wang and Cao [2022] Wang S, Cao G (2022) Robust deep neural network estimation for multi-dimensional functional data. Electronic Journal of Statistics 16(2):6461–6488 Yuan and Lin [2006] Yuan M, Lin Y (2006) Model selection and estimation in regression with grouped variables. Journal of the Royal Statistical Society: Series B (Statistical Methodology) 68(1):49–67 Zhang et al [2019] Zhang H, Wang J, Sun Z, et al (2019) Feature selection for neural networks using group lasso regularization. IEEE Transactions on Knowledge and Data Engineering PP:1–1. 10.1109/TKDE.2019.2893266 Zhao et al [2015] Zhao L, Hu Q, Wang W (2015) Heterogeneous feature selection with multi-modal deep neural networks and sparse group lasso. IEEE Transactions on Multimedia 17(11):1936–1948 Zou [2006] Zou H (2006) The adaptive lasso and its oracle properties. Journal of the American Statistical Association 101(476):1418–1429 Tibshirani R (1996) Regression shrinkage and selection via the lasso. Journal of the Royal Statistical Society: Series B (Methodological) 58(1):267–288 Wahba [1992] Wahba G (1992) Multivariate function and operator estimation, based on smoothing splines and reproducing kernels. 
In: SANTA FE INSTITUTE STUDIES IN THE SCIENCES OF COMPLEXITY-PROCEEDINGS VOLUME-, Addison-Wesley Publishing Co, pp 95–95 Wang and Leng [2008] Wang H, Leng C (2008) A note on adaptive group lasso. Computational Statistics and Data Analysis 52(12):5277–5286 Wang et al [2016] Wang J, Qingling C, Chang Q, et al (2016) Convergence analyses on sparse feedforward neural networks via group lasso regularization. Information Sciences 381. 10.1016/j.ins.2016.11.020 Wang and Cao [2022] Wang S, Cao G (2022) Robust deep neural network estimation for multi-dimensional functional data. Electronic Journal of Statistics 16(2):6461–6488 Yuan and Lin [2006] Yuan M, Lin Y (2006) Model selection and estimation in regression with grouped variables. Journal of the Royal Statistical Society: Series B (Statistical Methodology) 68(1):49–67 Zhang et al [2019] Zhang H, Wang J, Sun Z, et al (2019) Feature selection for neural networks using group lasso regularization. IEEE Transactions on Knowledge and Data Engineering PP:1–1. 10.1109/TKDE.2019.2893266 Zhao et al [2015] Zhao L, Hu Q, Wang W (2015) Heterogeneous feature selection with multi-modal deep neural networks and sparse group lasso. IEEE Transactions on Multimedia 17(11):1936–1948 Zou [2006] Zou H (2006) The adaptive lasso and its oracle properties. Journal of the American Statistical Association 101(476):1418–1429 Wahba G (1992) Multivariate function and operator estimation, based on smoothing splines and reproducing kernels. In: SANTA FE INSTITUTE STUDIES IN THE SCIENCES OF COMPLEXITY-PROCEEDINGS VOLUME-, Addison-Wesley Publishing Co, pp 95–95 Wang and Leng [2008] Wang H, Leng C (2008) A note on adaptive group lasso. Computational Statistics and Data Analysis 52(12):5277–5286 Wang et al [2016] Wang J, Qingling C, Chang Q, et al (2016) Convergence analyses on sparse feedforward neural networks via group lasso regularization. Information Sciences 381. 
10.1016/j.ins.2016.11.020 Wang and Cao [2022] Wang S, Cao G (2022) Robust deep neural network estimation for multi-dimensional functional data. Electronic Journal of Statistics 16(2):6461–6488 Yuan and Lin [2006] Yuan M, Lin Y (2006) Model selection and estimation in regression with grouped variables. Journal of the Royal Statistical Society: Series B (Statistical Methodology) 68(1):49–67 Zhang et al [2019] Zhang H, Wang J, Sun Z, et al (2019) Feature selection for neural networks using group lasso regularization. IEEE Transactions on Knowledge and Data Engineering PP:1–1. 10.1109/TKDE.2019.2893266 Zhao et al [2015] Zhao L, Hu Q, Wang W (2015) Heterogeneous feature selection with multi-modal deep neural networks and sparse group lasso. IEEE Transactions on Multimedia 17(11):1936–1948 Zou [2006] Zou H (2006) The adaptive lasso and its oracle properties. Journal of the American Statistical Association 101(476):1418–1429
- Scardapane S, Comminiello D, Hussain A, et al (2017) Group sparse regularization for deep neural networks. Neurocomputing 241:81–89 Sohn et al [2012] Sohn S, Jung B, Jhun M (2012) Permutation tests using least distance estimator in the multivariate regression model. Computational Statistics 27:1–11. 10.1007/s00180-011-0247-3 Sze et al [2017] Sze V, Chen YH, Yang TJ, et al (2017) Efficient processing of deep neural networks: A tutorial and survey. Proceedings of the IEEE 105(12):2295–2329 Szekely et al [2008] Szekely G, Rizzo M, Bakirov N (2008) Measuring and testing dependence by correlation of distances. The Annals of Statistics 35. 10.1214/009053607000000505 Tibshirani [1996] Tibshirani R (1996) Regression shrinkage and selection via the lasso. Journal of the Royal Statistical Society: Series B (Methodological) 58(1):267–288 Wahba [1992] Wahba G (1992) Multivariate function and operator estimation, based on smoothing splines and reproducing kernels. In: Santa Fe Institute Studies in the Sciences of Complexity, Addison-Wesley Publishing Co, pp 95–95 Wang and Leng [2008] Wang H, Leng C (2008) A note on adaptive group lasso. Computational Statistics and Data Analysis 52(12):5277–5286 Wang et al [2016] Wang J, Qingling C, Chang Q, et al (2016) Convergence analyses on sparse feedforward neural networks via group lasso regularization. Information Sciences 381. 10.1016/j.ins.2016.11.020
Journal of the Royal Statistical Society: Series B (Methodological) 58(1):267–288 Wahba [1992] Wahba G (1992) Multivariate function and operator estimation, based on smoothing splines and reproducing kernels. In: SANTA FE INSTITUTE STUDIES IN THE SCIENCES OF COMPLEXITY-PROCEEDINGS VOLUME-, Addison-Wesley Publishing Co, pp 95–95 Wang and Leng [2008] Wang H, Leng C (2008) A note on adaptive group lasso. Computational Statistics and Data Analysis 52(12):5277–5286 Wang et al [2016] Wang J, Qingling C, Chang Q, et al (2016) Convergence analyses on sparse feedforward neural networks via group lasso regularization. Information Sciences 381. 10.1016/j.ins.2016.11.020 Wang and Cao [2022] Wang S, Cao G (2022) Robust deep neural network estimation for multi-dimensional functional data. Electronic Journal of Statistics 16(2):6461–6488 Yuan and Lin [2006] Yuan M, Lin Y (2006) Model selection and estimation in regression with grouped variables. Journal of the Royal Statistical Society: Series B (Statistical Methodology) 68(1):49–67 Zhang et al [2019] Zhang H, Wang J, Sun Z, et al (2019) Feature selection for neural networks using group lasso regularization. IEEE Transactions on Knowledge and Data Engineering PP:1–1. 10.1109/TKDE.2019.2893266 Zhao et al [2015] Zhao L, Hu Q, Wang W (2015) Heterogeneous feature selection with multi-modal deep neural networks and sparse group lasso. IEEE Transactions on Multimedia 17(11):1936–1948 Zou [2006] Zou H (2006) The adaptive lasso and its oracle properties. Journal of the American Statistical Association 101(476):1418–1429 Wahba G (1992) Multivariate function and operator estimation, based on smoothing splines and reproducing kernels. In: SANTA FE INSTITUTE STUDIES IN THE SCIENCES OF COMPLEXITY-PROCEEDINGS VOLUME-, Addison-Wesley Publishing Co, pp 95–95 Wang and Leng [2008] Wang H, Leng C (2008) A note on adaptive group lasso. 
Computational Statistics and Data Analysis 52(12):5277–5286 Wang et al [2016] Wang J, Qingling C, Chang Q, et al (2016) Convergence analyses on sparse feedforward neural networks via group lasso regularization. Information Sciences 381. 10.1016/j.ins.2016.11.020 Wang and Cao [2022] Wang S, Cao G (2022) Robust deep neural network estimation for multi-dimensional functional data. Electronic Journal of Statistics 16(2):6461–6488 Yuan and Lin [2006] Yuan M, Lin Y (2006) Model selection and estimation in regression with grouped variables. Journal of the Royal Statistical Society: Series B (Statistical Methodology) 68(1):49–67 Zhang et al [2019] Zhang H, Wang J, Sun Z, et al (2019) Feature selection for neural networks using group lasso regularization. IEEE Transactions on Knowledge and Data Engineering PP:1–1. 10.1109/TKDE.2019.2893266 Zhao et al [2015] Zhao L, Hu Q, Wang W (2015) Heterogeneous feature selection with multi-modal deep neural networks and sparse group lasso. IEEE Transactions on Multimedia 17(11):1936–1948 Zou [2006] Zou H (2006) The adaptive lasso and its oracle properties. Journal of the American Statistical Association 101(476):1418–1429 Wang H, Leng C (2008) A note on adaptive group lasso. Computational Statistics and Data Analysis 52(12):5277–5286 Wang et al [2016] Wang J, Qingling C, Chang Q, et al (2016) Convergence analyses on sparse feedforward neural networks via group lasso regularization. Information Sciences 381. 10.1016/j.ins.2016.11.020 Wang and Cao [2022] Wang S, Cao G (2022) Robust deep neural network estimation for multi-dimensional functional data. Electronic Journal of Statistics 16(2):6461–6488 Yuan and Lin [2006] Yuan M, Lin Y (2006) Model selection and estimation in regression with grouped variables. Journal of the Royal Statistical Society: Series B (Statistical Methodology) 68(1):49–67 Zhang et al [2019] Zhang H, Wang J, Sun Z, et al (2019) Feature selection for neural networks using group lasso regularization. 
IEEE Transactions on Knowledge and Data Engineering PP:1–1. 10.1109/TKDE.2019.2893266 Zhao et al [2015] Zhao L, Hu Q, Wang W (2015) Heterogeneous feature selection with multi-modal deep neural networks and sparse group lasso. IEEE Transactions on Multimedia 17(11):1936–1948 Zou [2006] Zou H (2006) The adaptive lasso and its oracle properties. Journal of the American Statistical Association 101(476):1418–1429 Wang J, Qingling C, Chang Q, et al (2016) Convergence analyses on sparse feedforward neural networks via group lasso regularization. Information Sciences 381. 10.1016/j.ins.2016.11.020 Wang and Cao [2022] Wang S, Cao G (2022) Robust deep neural network estimation for multi-dimensional functional data. Electronic Journal of Statistics 16(2):6461–6488 Yuan and Lin [2006] Yuan M, Lin Y (2006) Model selection and estimation in regression with grouped variables. Journal of the Royal Statistical Society: Series B (Statistical Methodology) 68(1):49–67 Zhang et al [2019] Zhang H, Wang J, Sun Z, et al (2019) Feature selection for neural networks using group lasso regularization. IEEE Transactions on Knowledge and Data Engineering PP:1–1. 10.1109/TKDE.2019.2893266 Zhao et al [2015] Zhao L, Hu Q, Wang W (2015) Heterogeneous feature selection with multi-modal deep neural networks and sparse group lasso. IEEE Transactions on Multimedia 17(11):1936–1948 Zou [2006] Zou H (2006) The adaptive lasso and its oracle properties. Journal of the American Statistical Association 101(476):1418–1429 Wang S, Cao G (2022) Robust deep neural network estimation for multi-dimensional functional data. Electronic Journal of Statistics 16(2):6461–6488 Yuan and Lin [2006] Yuan M, Lin Y (2006) Model selection and estimation in regression with grouped variables. Journal of the Royal Statistical Society: Series B (Statistical Methodology) 68(1):49–67 Zhang et al [2019] Zhang H, Wang J, Sun Z, et al (2019) Feature selection for neural networks using group lasso regularization. 
IEEE Transactions on Knowledge and Data Engineering PP:1–1. 10.1109/TKDE.2019.2893266 Zhao et al [2015] Zhao L, Hu Q, Wang W (2015) Heterogeneous feature selection with multi-modal deep neural networks and sparse group lasso. IEEE Transactions on Multimedia 17(11):1936–1948 Zou [2006] Zou H (2006) The adaptive lasso and its oracle properties. Journal of the American Statistical Association 101(476):1418–1429 Yuan M, Lin Y (2006) Model selection and estimation in regression with grouped variables. Journal of the Royal Statistical Society: Series B (Statistical Methodology) 68(1):49–67 Zhang et al [2019] Zhang H, Wang J, Sun Z, et al (2019) Feature selection for neural networks using group lasso regularization. IEEE Transactions on Knowledge and Data Engineering PP:1–1. 10.1109/TKDE.2019.2893266 Zhao et al [2015] Zhao L, Hu Q, Wang W (2015) Heterogeneous feature selection with multi-modal deep neural networks and sparse group lasso. IEEE Transactions on Multimedia 17(11):1936–1948 Zou [2006] Zou H (2006) The adaptive lasso and its oracle properties. Journal of the American Statistical Association 101(476):1418–1429 Zhang H, Wang J, Sun Z, et al (2019) Feature selection for neural networks using group lasso regularization. IEEE Transactions on Knowledge and Data Engineering PP:1–1. 10.1109/TKDE.2019.2893266 Zhao et al [2015] Zhao L, Hu Q, Wang W (2015) Heterogeneous feature selection with multi-modal deep neural networks and sparse group lasso. IEEE Transactions on Multimedia 17(11):1936–1948 Zou [2006] Zou H (2006) The adaptive lasso and its oracle properties. Journal of the American Statistical Association 101(476):1418–1429 Zhao L, Hu Q, Wang W (2015) Heterogeneous feature selection with multi-modal deep neural networks and sparse group lasso. IEEE Transactions on Multimedia 17(11):1936–1948 Zou [2006] Zou H (2006) The adaptive lasso and its oracle properties. 
Journal of the American Statistical Association 101(476):1418–1429 Zou H (2006) The adaptive lasso and its oracle properties. Journal of the American Statistical Association 101(476):1418–1429
- Szekely G, Rizzo M, Bakirov N (2008) Measuring and testing dependence by correlation of distances. The Annals of Statistics 35. 10.1214/009053607000000505 Tibshirani [1996] Tibshirani R (1996) Regression shrinkage and selection via the lasso. Journal of the Royal Statistical Society: Series B (Methodological) 58(1):267–288 Wahba [1992] Wahba G (1992) Multivariate function and operator estimation, based on smoothing splines and reproducing kernels. In: SANTA FE INSTITUTE STUDIES IN THE SCIENCES OF COMPLEXITY-PROCEEDINGS VOLUME-, Addison-Wesley Publishing Co, pp 95–95 Wang and Leng [2008] Wang H, Leng C (2008) A note on adaptive group lasso. Computational Statistics and Data Analysis 52(12):5277–5286 Wang et al [2016] Wang J, Qingling C, Chang Q, et al (2016) Convergence analyses on sparse feedforward neural networks via group lasso regularization. Information Sciences 381. 10.1016/j.ins.2016.11.020 Wang and Cao [2022] Wang S, Cao G (2022) Robust deep neural network estimation for multi-dimensional functional data. Electronic Journal of Statistics 16(2):6461–6488 Yuan and Lin [2006] Yuan M, Lin Y (2006) Model selection and estimation in regression with grouped variables. Journal of the Royal Statistical Society: Series B (Statistical Methodology) 68(1):49–67 Zhang et al [2019] Zhang H, Wang J, Sun Z, et al (2019) Feature selection for neural networks using group lasso regularization. IEEE Transactions on Knowledge and Data Engineering PP:1–1. 10.1109/TKDE.2019.2893266 Zhao et al [2015] Zhao L, Hu Q, Wang W (2015) Heterogeneous feature selection with multi-modal deep neural networks and sparse group lasso. IEEE Transactions on Multimedia 17(11):1936–1948 Zou [2006] Zou H (2006) The adaptive lasso and its oracle properties. Journal of the American Statistical Association 101(476):1418–1429 Tibshirani R (1996) Regression shrinkage and selection via the lasso. 
Journal of the Royal Statistical Society: Series B (Methodological) 58(1):267–288 Wahba [1992] Wahba G (1992) Multivariate function and operator estimation, based on smoothing splines and reproducing kernels. In: SANTA FE INSTITUTE STUDIES IN THE SCIENCES OF COMPLEXITY-PROCEEDINGS VOLUME-, Addison-Wesley Publishing Co, pp 95–95 Wang and Leng [2008] Wang H, Leng C (2008) A note on adaptive group lasso. Computational Statistics and Data Analysis 52(12):5277–5286 Wang et al [2016] Wang J, Qingling C, Chang Q, et al (2016) Convergence analyses on sparse feedforward neural networks via group lasso regularization. Information Sciences 381. 10.1016/j.ins.2016.11.020 Wang and Cao [2022] Wang S, Cao G (2022) Robust deep neural network estimation for multi-dimensional functional data. Electronic Journal of Statistics 16(2):6461–6488 Yuan and Lin [2006] Yuan M, Lin Y (2006) Model selection and estimation in regression with grouped variables. Journal of the Royal Statistical Society: Series B (Statistical Methodology) 68(1):49–67 Zhang et al [2019] Zhang H, Wang J, Sun Z, et al (2019) Feature selection for neural networks using group lasso regularization. IEEE Transactions on Knowledge and Data Engineering PP:1–1. 10.1109/TKDE.2019.2893266 Zhao et al [2015] Zhao L, Hu Q, Wang W (2015) Heterogeneous feature selection with multi-modal deep neural networks and sparse group lasso. IEEE Transactions on Multimedia 17(11):1936–1948 Zou [2006] Zou H (2006) The adaptive lasso and its oracle properties. Journal of the American Statistical Association 101(476):1418–1429 Wahba G (1992) Multivariate function and operator estimation, based on smoothing splines and reproducing kernels. In: SANTA FE INSTITUTE STUDIES IN THE SCIENCES OF COMPLEXITY-PROCEEDINGS VOLUME-, Addison-Wesley Publishing Co, pp 95–95 Wang and Leng [2008] Wang H, Leng C (2008) A note on adaptive group lasso. 
Computational Statistics and Data Analysis 52(12):5277–5286 Wang et al [2016] Wang J, Qingling C, Chang Q, et al (2016) Convergence analyses on sparse feedforward neural networks via group lasso regularization. Information Sciences 381. 10.1016/j.ins.2016.11.020 Wang and Cao [2022] Wang S, Cao G (2022) Robust deep neural network estimation for multi-dimensional functional data. Electronic Journal of Statistics 16(2):6461–6488 Yuan and Lin [2006] Yuan M, Lin Y (2006) Model selection and estimation in regression with grouped variables. Journal of the Royal Statistical Society: Series B (Statistical Methodology) 68(1):49–67 Zhang et al [2019] Zhang H, Wang J, Sun Z, et al (2019) Feature selection for neural networks using group lasso regularization. IEEE Transactions on Knowledge and Data Engineering PP:1–1. 10.1109/TKDE.2019.2893266 Zhao et al [2015] Zhao L, Hu Q, Wang W (2015) Heterogeneous feature selection with multi-modal deep neural networks and sparse group lasso. IEEE Transactions on Multimedia 17(11):1936–1948 Zou [2006] Zou H (2006) The adaptive lasso and its oracle properties. Journal of the American Statistical Association 101(476):1418–1429 Wang H, Leng C (2008) A note on adaptive group lasso. Computational Statistics and Data Analysis 52(12):5277–5286 Wang et al [2016] Wang J, Qingling C, Chang Q, et al (2016) Convergence analyses on sparse feedforward neural networks via group lasso regularization. Information Sciences 381. 10.1016/j.ins.2016.11.020 Wang and Cao [2022] Wang S, Cao G (2022) Robust deep neural network estimation for multi-dimensional functional data. Electronic Journal of Statistics 16(2):6461–6488 Yuan and Lin [2006] Yuan M, Lin Y (2006) Model selection and estimation in regression with grouped variables. Journal of the Royal Statistical Society: Series B (Statistical Methodology) 68(1):49–67 Zhang et al [2019] Zhang H, Wang J, Sun Z, et al (2019) Feature selection for neural networks using group lasso regularization. 
IEEE Transactions on Knowledge and Data Engineering PP:1–1. 10.1109/TKDE.2019.2893266 Zhao et al [2015] Zhao L, Hu Q, Wang W (2015) Heterogeneous feature selection with multi-modal deep neural networks and sparse group lasso. IEEE Transactions on Multimedia 17(11):1936–1948 Zou [2006] Zou H (2006) The adaptive lasso and its oracle properties. Journal of the American Statistical Association 101(476):1418–1429 Wang J, Qingling C, Chang Q, et al (2016) Convergence analyses on sparse feedforward neural networks via group lasso regularization. Information Sciences 381. 10.1016/j.ins.2016.11.020 Wang and Cao [2022] Wang S, Cao G (2022) Robust deep neural network estimation for multi-dimensional functional data. Electronic Journal of Statistics 16(2):6461–6488 Yuan and Lin [2006] Yuan M, Lin Y (2006) Model selection and estimation in regression with grouped variables. Journal of the Royal Statistical Society: Series B (Statistical Methodology) 68(1):49–67 Zhang et al [2019] Zhang H, Wang J, Sun Z, et al (2019) Feature selection for neural networks using group lasso regularization. IEEE Transactions on Knowledge and Data Engineering PP:1–1. 10.1109/TKDE.2019.2893266 Zhao et al [2015] Zhao L, Hu Q, Wang W (2015) Heterogeneous feature selection with multi-modal deep neural networks and sparse group lasso. IEEE Transactions on Multimedia 17(11):1936–1948 Zou [2006] Zou H (2006) The adaptive lasso and its oracle properties. Journal of the American Statistical Association 101(476):1418–1429 Wang S, Cao G (2022) Robust deep neural network estimation for multi-dimensional functional data. Electronic Journal of Statistics 16(2):6461–6488 Yuan and Lin [2006] Yuan M, Lin Y (2006) Model selection and estimation in regression with grouped variables. Journal of the Royal Statistical Society: Series B (Statistical Methodology) 68(1):49–67 Zhang et al [2019] Zhang H, Wang J, Sun Z, et al (2019) Feature selection for neural networks using group lasso regularization. 
IEEE Transactions on Knowledge and Data Engineering PP:1–1. 10.1109/TKDE.2019.2893266 Zhao et al [2015] Zhao L, Hu Q, Wang W (2015) Heterogeneous feature selection with multi-modal deep neural networks and sparse group lasso. IEEE Transactions on Multimedia 17(11):1936–1948 Zou [2006] Zou H (2006) The adaptive lasso and its oracle properties. Journal of the American Statistical Association 101(476):1418–1429 Yuan M, Lin Y (2006) Model selection and estimation in regression with grouped variables. Journal of the Royal Statistical Society: Series B (Statistical Methodology) 68(1):49–67 Zhang et al [2019] Zhang H, Wang J, Sun Z, et al (2019) Feature selection for neural networks using group lasso regularization. IEEE Transactions on Knowledge and Data Engineering PP:1–1. 10.1109/TKDE.2019.2893266 Zhao et al [2015] Zhao L, Hu Q, Wang W (2015) Heterogeneous feature selection with multi-modal deep neural networks and sparse group lasso. IEEE Transactions on Multimedia 17(11):1936–1948 Zou [2006] Zou H (2006) The adaptive lasso and its oracle properties. Journal of the American Statistical Association 101(476):1418–1429 Zhang H, Wang J, Sun Z, et al (2019) Feature selection for neural networks using group lasso regularization. IEEE Transactions on Knowledge and Data Engineering PP:1–1. 10.1109/TKDE.2019.2893266 Zhao et al [2015] Zhao L, Hu Q, Wang W (2015) Heterogeneous feature selection with multi-modal deep neural networks and sparse group lasso. IEEE Transactions on Multimedia 17(11):1936–1948 Zou [2006] Zou H (2006) The adaptive lasso and its oracle properties. Journal of the American Statistical Association 101(476):1418–1429 Zhao L, Hu Q, Wang W (2015) Heterogeneous feature selection with multi-modal deep neural networks and sparse group lasso. IEEE Transactions on Multimedia 17(11):1936–1948 Zou [2006] Zou H (2006) The adaptive lasso and its oracle properties. 
Journal of the American Statistical Association 101(476):1418–1429 Zou H (2006) The adaptive lasso and its oracle properties. Journal of the American Statistical Association 101(476):1418–1429
- Tibshirani R (1996) Regression shrinkage and selection via the lasso. Journal of the Royal Statistical Society: Series B (Methodological) 58(1):267–288 Wahba [1992] Wahba G (1992) Multivariate function and operator estimation, based on smoothing splines and reproducing kernels. In: SANTA FE INSTITUTE STUDIES IN THE SCIENCES OF COMPLEXITY-PROCEEDINGS VOLUME-, Addison-Wesley Publishing Co, pp 95–95 Wang and Leng [2008] Wang H, Leng C (2008) A note on adaptive group lasso. Computational Statistics and Data Analysis 52(12):5277–5286 Wang et al [2016] Wang J, Qingling C, Chang Q, et al (2016) Convergence analyses on sparse feedforward neural networks via group lasso regularization. Information Sciences 381. 10.1016/j.ins.2016.11.020 Wang and Cao [2022] Wang S, Cao G (2022) Robust deep neural network estimation for multi-dimensional functional data. Electronic Journal of Statistics 16(2):6461–6488 Yuan and Lin [2006] Yuan M, Lin Y (2006) Model selection and estimation in regression with grouped variables. Journal of the Royal Statistical Society: Series B (Statistical Methodology) 68(1):49–67 Zhang et al [2019] Zhang H, Wang J, Sun Z, et al (2019) Feature selection for neural networks using group lasso regularization. IEEE Transactions on Knowledge and Data Engineering PP:1–1. 10.1109/TKDE.2019.2893266 Zhao et al [2015] Zhao L, Hu Q, Wang W (2015) Heterogeneous feature selection with multi-modal deep neural networks and sparse group lasso. IEEE Transactions on Multimedia 17(11):1936–1948 Zou [2006] Zou H (2006) The adaptive lasso and its oracle properties. Journal of the American Statistical Association 101(476):1418–1429 Wahba G (1992) Multivariate function and operator estimation, based on smoothing splines and reproducing kernels. In: SANTA FE INSTITUTE STUDIES IN THE SCIENCES OF COMPLEXITY-PROCEEDINGS VOLUME-, Addison-Wesley Publishing Co, pp 95–95 Wang and Leng [2008] Wang H, Leng C (2008) A note on adaptive group lasso. 
Computational Statistics and Data Analysis 52(12):5277–5286 Wang et al [2016] Wang J, Qingling C, Chang Q, et al (2016) Convergence analyses on sparse feedforward neural networks via group lasso regularization. Information Sciences 381. 10.1016/j.ins.2016.11.020 Wang and Cao [2022] Wang S, Cao G (2022) Robust deep neural network estimation for multi-dimensional functional data. Electronic Journal of Statistics 16(2):6461–6488 Yuan and Lin [2006] Yuan M, Lin Y (2006) Model selection and estimation in regression with grouped variables. Journal of the Royal Statistical Society: Series B (Statistical Methodology) 68(1):49–67 Zhang et al [2019] Zhang H, Wang J, Sun Z, et al (2019) Feature selection for neural networks using group lasso regularization. IEEE Transactions on Knowledge and Data Engineering PP:1–1. 10.1109/TKDE.2019.2893266 Zhao et al [2015] Zhao L, Hu Q, Wang W (2015) Heterogeneous feature selection with multi-modal deep neural networks and sparse group lasso. IEEE Transactions on Multimedia 17(11):1936–1948 Zou [2006] Zou H (2006) The adaptive lasso and its oracle properties. Journal of the American Statistical Association 101(476):1418–1429 Wang H, Leng C (2008) A note on adaptive group lasso. Computational Statistics and Data Analysis 52(12):5277–5286 Wang et al [2016] Wang J, Qingling C, Chang Q, et al (2016) Convergence analyses on sparse feedforward neural networks via group lasso regularization. Information Sciences 381. 10.1016/j.ins.2016.11.020 Wang and Cao [2022] Wang S, Cao G (2022) Robust deep neural network estimation for multi-dimensional functional data. Electronic Journal of Statistics 16(2):6461–6488 Yuan and Lin [2006] Yuan M, Lin Y (2006) Model selection and estimation in regression with grouped variables. Journal of the Royal Statistical Society: Series B (Statistical Methodology) 68(1):49–67 Zhang et al [2019] Zhang H, Wang J, Sun Z, et al (2019) Feature selection for neural networks using group lasso regularization. 
IEEE Transactions on Knowledge and Data Engineering PP:1–1. 10.1109/TKDE.2019.2893266 Zhao et al [2015] Zhao L, Hu Q, Wang W (2015) Heterogeneous feature selection with multi-modal deep neural networks and sparse group lasso. IEEE Transactions on Multimedia 17(11):1936–1948 Zou [2006] Zou H (2006) The adaptive lasso and its oracle properties. Journal of the American Statistical Association 101(476):1418–1429 Wang J, Qingling C, Chang Q, et al (2016) Convergence analyses on sparse feedforward neural networks via group lasso regularization. Information Sciences 381. 10.1016/j.ins.2016.11.020 Wang and Cao [2022] Wang S, Cao G (2022) Robust deep neural network estimation for multi-dimensional functional data. Electronic Journal of Statistics 16(2):6461–6488 Yuan and Lin [2006] Yuan M, Lin Y (2006) Model selection and estimation in regression with grouped variables. Journal of the Royal Statistical Society: Series B (Statistical Methodology) 68(1):49–67 Zhang et al [2019] Zhang H, Wang J, Sun Z, et al (2019) Feature selection for neural networks using group lasso regularization. IEEE Transactions on Knowledge and Data Engineering PP:1–1. 10.1109/TKDE.2019.2893266 Zhao et al [2015] Zhao L, Hu Q, Wang W (2015) Heterogeneous feature selection with multi-modal deep neural networks and sparse group lasso. IEEE Transactions on Multimedia 17(11):1936–1948 Zou [2006] Zou H (2006) The adaptive lasso and its oracle properties. Journal of the American Statistical Association 101(476):1418–1429 Wang S, Cao G (2022) Robust deep neural network estimation for multi-dimensional functional data. Electronic Journal of Statistics 16(2):6461–6488 Yuan and Lin [2006] Yuan M, Lin Y (2006) Model selection and estimation in regression with grouped variables. Journal of the Royal Statistical Society: Series B (Statistical Methodology) 68(1):49–67 Zhang et al [2019] Zhang H, Wang J, Sun Z, et al (2019) Feature selection for neural networks using group lasso regularization. 
IEEE Transactions on Knowledge and Data Engineering PP:1–1. 10.1109/TKDE.2019.2893266 Zhao et al [2015] Zhao L, Hu Q, Wang W (2015) Heterogeneous feature selection with multi-modal deep neural networks and sparse group lasso. IEEE Transactions on Multimedia 17(11):1936–1948 Zou [2006] Zou H (2006) The adaptive lasso and its oracle properties. Journal of the American Statistical Association 101(476):1418–1429 Yuan M, Lin Y (2006) Model selection and estimation in regression with grouped variables. Journal of the Royal Statistical Society: Series B (Statistical Methodology) 68(1):49–67 Zhang et al [2019] Zhang H, Wang J, Sun Z, et al (2019) Feature selection for neural networks using group lasso regularization. IEEE Transactions on Knowledge and Data Engineering PP:1–1. 10.1109/TKDE.2019.2893266 Zhao et al [2015] Zhao L, Hu Q, Wang W (2015) Heterogeneous feature selection with multi-modal deep neural networks and sparse group lasso. IEEE Transactions on Multimedia 17(11):1936–1948 Zou [2006] Zou H (2006) The adaptive lasso and its oracle properties. Journal of the American Statistical Association 101(476):1418–1429 Zhang H, Wang J, Sun Z, et al (2019) Feature selection for neural networks using group lasso regularization. IEEE Transactions on Knowledge and Data Engineering PP:1–1. 10.1109/TKDE.2019.2893266 Zhao et al [2015] Zhao L, Hu Q, Wang W (2015) Heterogeneous feature selection with multi-modal deep neural networks and sparse group lasso. IEEE Transactions on Multimedia 17(11):1936–1948 Zou [2006] Zou H (2006) The adaptive lasso and its oracle properties. Journal of the American Statistical Association 101(476):1418–1429 Zhao L, Hu Q, Wang W (2015) Heterogeneous feature selection with multi-modal deep neural networks and sparse group lasso. IEEE Transactions on Multimedia 17(11):1936–1948 Zou [2006] Zou H (2006) The adaptive lasso and its oracle properties. 
Journal of the American Statistical Association 101(476):1418–1429 Zou H (2006) The adaptive lasso and its oracle properties. Journal of the American Statistical Association 101(476):1418–1429
- Wahba G (1992) Multivariate function and operator estimation, based on smoothing splines and reproducing kernels. In: Santa Fe Institute Studies in the Sciences of Complexity, Proceedings Volume. Addison-Wesley Publishing Co, pp 95–95
- Wang H, Leng C (2008) A note on adaptive group lasso. Computational Statistics and Data Analysis 52(12):5277–5286
- Wang J, Qingling C, Chang Q, et al (2016) Convergence analyses on sparse feedforward neural networks via group lasso regularization. Information Sciences 381. 10.1016/j.ins.2016.11.020
- Wang S, Cao G (2022) Robust deep neural network estimation for multi-dimensional functional data. Electronic Journal of Statistics 16(2):6461–6488
- Yuan M, Lin Y (2006) Model selection and estimation in regression with grouped variables. Journal of the Royal Statistical Society: Series B (Statistical Methodology) 68(1):49–67
- Zhang H, Wang J, Sun Z, et al (2019) Feature selection for neural networks using group lasso regularization. IEEE Transactions on Knowledge and Data Engineering PP:1–1. 10.1109/TKDE.2019.2893266
- Zhao L, Hu Q, Wang W (2015) Heterogeneous feature selection with multi-modal deep neural networks and sparse group lasso. IEEE Transactions on Multimedia 17(11):1936–1948
- Zou H (2006) The adaptive lasso and its oracle properties. Journal of the American Statistical Association 101(476):1418–1429