TND-NAS: Towards Non-differentiable Objectives in Progressive Differentiable NAS Framework (2111.03892v4)
Abstract: Differentiable architecture search has gradually become the mainstream research topic in the field of Neural Architecture Search (NAS), owing to its high efficiency compared with earlier NAS methods. Recent differentiable NAS methods also aim to further improve search performance and reduce GPU-memory consumption. However, these methods are no longer naturally capable of tackling non-differentiable objectives, e.g., energy, resource-constrained efficiency, and other metrics, let alone multi-objective search demands. Research in the multi-objective NAS field targets such demands, but requires vast computational resources because each candidate architecture must be optimized separately. In light of this discrepancy, we propose TND-NAS, which combines the high efficiency of the differentiable NAS framework with the compatibility with non-differentiable metrics found in multi-objective NAS. Under the differentiable NAS framework, with continuous relaxation of the search space, TND-NAS optimizes the architecture parameters in discrete space, while progressively shrinking the search space according to those parameters. Our representative experiment takes two objectives (parameters, accuracy) as an example: we achieve a series of high-performance compact architectures on the CIFAR10 (1.09M parameters/3.3% test error, 2.4M/2.95%, 9.57M/2.54%) and CIFAR100 (2.46M/18.3%, 5.46M/16.73%, 12.88M/15.20%) datasets. Favorably, compared with other multi-objective NAS methods, TND-NAS is less time-consuming (1.3 GPU-days on an NVIDIA 1080Ti, 1/6 of the cost of NSGA-Net), and can be conveniently adapted to real-world NAS scenarios (resource-constrained, platform-specialized).
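To make the search mechanism above concrete, here is a minimal sketch, assuming a PyTorch-style setup: discrete architectures are sampled from a softmax over the architecture parameters, scored with a scalarized (accuracy, parameter-count) reward, the parameters are updated with a REINFORCE-style policy gradient (Williams, 1992), and low-scoring candidate operations are progressively pruned. The names, the reward weighting `w`, the `evaluate` stub, and the shrinking schedule are all illustrative assumptions, not the authors' implementation.

```python
import torch

# Minimal sketch under stated assumptions: discrete architectures are sampled
# from a softmax over architecture parameters, scored with a scalarized
# (accuracy, parameter-count) reward, and the parameters are updated with a
# REINFORCE-style policy gradient. All names and hyperparameters below are
# illustrative, not the authors' code.

NUM_EDGES, NUM_OPS = 14, 8
alpha = torch.zeros(NUM_EDGES, NUM_OPS, requires_grad=True)  # architecture params
optimizer = torch.optim.Adam([alpha], lr=3e-3)
active = torch.ones(NUM_EDGES, NUM_OPS, dtype=torch.bool)    # ops still in the space


def evaluate(sample: torch.Tensor) -> tuple[float, float]:
    # Stand-in for the expensive step: evaluate the sampled sub-network under
    # weight sharing and measure its size. Random values keep the sketch
    # runnable; a real search returns (val_accuracy, params_in_millions).
    return torch.rand(()).item(), 1.0 + 9.0 * torch.rand(()).item()


def scalarized_reward(acc: float, params_m: float, w: float = 0.02) -> float:
    # Hypothetical scalarization trading accuracy against model size.
    return acc - w * params_m


def search_step() -> None:
    # Mask pruned ops, then sample one op per edge; the sample is discrete,
    # so the reward may use any non-differentiable objective.
    logits = alpha.masked_fill(~active, float("-inf"))
    dist = torch.distributions.Categorical(logits=logits)
    sample = dist.sample()                        # one op index per edge
    acc, params_m = evaluate(sample)
    reward = scalarized_reward(acc, params_m)
    loss = -dist.log_prob(sample).sum() * reward  # REINFORCE estimator
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()


def shrink(keep: int) -> None:
    # Progressive search-space shrinking: keep only the top-`keep` ops per
    # edge by architecture-parameter value (the schedule is an assumption).
    for e in range(NUM_EDGES):
        scores = alpha[e].detach().masked_fill(~active[e], float("-inf"))
        mask = torch.zeros(NUM_OPS, dtype=torch.bool)
        mask[scores.topk(keep).indices] = True
        active[e] = mask


for step in range(200):  # toy schedule: shrink every 50 steps
    search_step()
    if (step + 1) % 50 == 0:
        shrink(max(1, NUM_OPS - (step + 1) // 50 * 2))
```

Because the reward enters the update only as a scalar multiplying the sampled architecture's log-probability, any non-differentiable metric (latency, energy, parameter count) can be plugged into `scalarized_reward` without changing the optimization machinery.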
- Y. Guo, Y. Luo, Z. He, J. Huang, and J. Chen, “Hierarchical neural architecture search for single image super-resolution,” IEEE Signal Processing Letters, vol. 27, pp. 1255–1259, 2020.
- D. Stamoulis, R. Ding, D. Wang, D. Lymberopoulos, B. Priyantha, J. Liu, and D. Marculescu, “Single-path mobile AutoML: Efficient ConvNet design and NAS hyperparameter optimization,” IEEE Journal of Selected Topics in Signal Processing, vol. 14, no. 4, pp. 609–622, 2020.
- X. He, K. Zhao, and X. Chu, “AutoML: A survey of the state-of-the-art,” Knowledge-Based Systems, vol. 212, p. 106622, 2021.
- B. Zoph and Q. V. Le, “Neural architecture search with reinforcement learning,” in International Conference on Learning Representations, 2017, pp. 1–16.
- X. Zheng, R. Ji, L. Tang, B. Zhang, J. Liu, and Q. Tian, “Multinomial distribution learning for effective neural architecture search,” in Proceedings of the IEEE/CVF International Conference on Computer Vision, 2019, pp. 1304–1313.
- B. Zoph, V. Vasudevan, J. Shlens, and Q. V. Le, “Learning transferable architectures for scalable image recognition,” in Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 2018, pp. 8697–8710.
- H. Pham, M. Y. Guan, B. Zoph, Q. V. Le, and J. Dean, “Efficient neural architecture search via parameter sharing,” in Proceedings of the 35th International Conference on Machine Learning, 2018, pp. 4092–4101.
- H. Liu, K. Simonyan, and Y. Yang, “DARTS: Differentiable architecture search,” in International Conference on Learning Representations, 2019, pp. 4561–4574.
- A. Brock, T. Lim, J. M. Ritchie, and N. Weston, “SMASH: One-shot model architecture search through hypernetworks,” in International Conference on Learning Representations, 2018, pp. 1–22.
- G. Bender, P.-J. Kindermans, B. Zoph, V. Vasudevan, and Q. Le, “Understanding and simplifying one-shot architecture search,” in International Conference on Machine Learning. PMLR, 2018, pp. 550–559.
- B. Lyu, Y. Yang, S. Wen, T. Huang, and K. Li, “Neural architecture search for portrait parsing,” IEEE Transactions on Neural Networks and Learning Systems, 2021.
- M. Tan, B. Chen, R. Pang, V. K. Vasudevan, M. Sandler, A. Howard, and Q. V. Le, “MnasNet: Platform-aware neural architecture search for mobile,” in Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 2019, pp. 2820–2828.
- J.-D. Dong, A.-C. Cheng, D.-C. Juan, W. Wei, and M. Sun, “DPP-Net: Device-aware progressive search for Pareto-optimal neural architectures,” in Proceedings of the European Conference on Computer Vision (ECCV), 2018, pp. 540–555.
- C.-H. Hsu, S.-C. Chang, J.-H. Liang, H.-P. Chou, C.-H. Liu, S.-H. Chang, T. Pan, Y.-T. Chen, W. Wei, and D.-C. Juan, “MONAS: Multi-objective neural architecture search using reinforcement learning,” arXiv preprint arXiv:1806.10332, 2018.
- T. Elsken, J. H. Metzen, and F. Hutter, “Multi-objective architecture search for CNNs,” arXiv preprint arXiv:1804.09081, 2018.
- B. Lyu, S. Wen, K. Shi, and T. Huang, “Multiobjective reinforcement learning-based neural architecture search for efficient portrait parsing,” IEEE Transactions on Cybernetics, 2021.
- Z. Lu, I. Whalen, V. Boddeti, Y. Dhebar, K. Deb, E. Goodman, and W. Banzhaf, “NSGA-Net: Neural architecture search using multi-objective genetic algorithm,” in Proceedings of the Genetic and Evolutionary Computation Conference, 2019, pp. 419–427.
- X. Chen, L. Xie, J. Wu, and Q. Tian, “Progressive differentiable architecture search: Bridging the depth gap between search and evaluation,” in Proceedings of the IEEE/CVF International Conference on Computer Vision, 2019, pp. 1294–1303.
- P. J. Angeline, G. M. Saunders, and J. B. Pollack, “An evolutionary algorithm that constructs recurrent neural networks,” IEEE Transactions on Neural Networks, vol. 5, no. 1, pp. 54–65, 1994.
- K. O. Stanley and R. Miikkulainen, “Evolving neural networks through augmenting topologies,” Evolutionary Computation, vol. 10, no. 2, pp. 99–127, 2002.
- Y. Sun, B. Xue, M. Zhang, and G. G. Yen, “Evolving deep convolutional neural networks for image classification,” IEEE Transactions on Evolutionary Computation, vol. 24, no. 2, pp. 394–407, 2019.
- Y. Sun, B. Xue, M. Zhang, G. G. Yen, and J. Lv, “Automatically designing CNN architectures using the genetic algorithm for image classification,” IEEE Transactions on Cybernetics, vol. 50, no. 9, pp. 3840–3854, 2020.
- E. Real, A. Aggarwal, Y. Huang, and Q. V. Le, “Regularized evolution for image classifier architecture search,” in Proceedings of the AAAI Conference on Artificial Intelligence, vol. 33, no. 01, 2019, pp. 4780–4789.
- H. Liu, K. Simonyan, O. Vinyals, C. Fernando, and K. Kavukcuoglu, “Hierarchical representations for efficient architecture search,” in International Conference on Learning Representations, 2018, pp. 1–13.
- Y. Xu, L. Xie, W. Dai, X. Zhang, X. Chen, G.-J. Qi, H. Xiong, and Q. Tian, “Partially-connected neural architecture search for reduced computational redundancy,” IEEE Transactions on Pattern Analysis and Machine Intelligence, 2021.
- L. Xie, X. Chen, K. Bi, L. Wei, Y. Xu, L. Wang, Z. Chen, A. Xiao, J. Chang, X. Zhang et al., “Weight-sharing neural architecture search: A battle to shrink the optimization gap,” ACM Computing Surveys (CSUR), vol. 54, no. 9, pp. 1–37, 2021.
- H. Cai, L. Zhu, and S. Han, “ProxylessNAS: Direct neural architecture search on target task and hardware,” in International Conference on Learning Representations, 2019, pp. 1–13.
- A. Cheng, J. Dong, C. Hsu, S. Chang, M. Sun, S. Chang, J. Pan, Y. Chen, W. Wei, and D. Juan, “Searching toward Pareto-optimal device-aware neural architectures,” in Proceedings of the International Conference on Computer-Aided Design (ICCAD), 2018, p. 136.
- T. Elsken, J. H. Metzen, and F. Hutter, “Efficient multi-objective neural architecture search via Lamarckian evolution,” in International Conference on Learning Representations, 2019.
- B. Lyu, H. Yuan, L. Lu, and Y. Zhang, “Resource-constrained neural architecture search on edge devices,” IEEE Transactions on Network Science and Engineering, 2021.
- L. Lu and B. Lyu, “Reducing energy consumption of neural architecture search: An inference latency prediction framework,” Sustainable Cities and Society, vol. 67, p. 102747, 2021.
- Z. Guo, X. Zhang, H. Mu, W. Heng, Z. Liu, Y. Wei, and J. Sun, “Single path one-shot neural architecture search with uniform sampling,” in Proceedings of the European Conference on Computer Vision (ECCV), 2020, pp. 544–560.
- R. J. Williams, “Simple statistical gradient-following algorithms for connectionist reinforcement learning,” Machine Learning, vol. 8, no. 3, pp. 229–256, 1992.
- Z. Gábor, Z. Kalmár, and C. Szepesvári, “Multi-criteria reinforcement learning,” in Proceedings of the 15th International Conference on Machine Learning, 1998, pp. 197–205.
- S. Mannor and N. Shimkin, “A geometric approach to multi-criterion reinforcement learning,” Journal of Machine Learning Research, vol. 5, pp. 325–360, 2004.
- K. Van Moffaert, M. M. Drugan, and A. Nowé, “Scalarized multi-objective reinforcement learning: Novel design techniques,” in IEEE Symposium on Adaptive Dynamic Programming and Reinforcement Learning (ADPRL), 2013, pp. 191–199.
- M. Sandler, A. Howard, M. Zhu, A. Zhmoginov, and L.-C. Chen, “MobileNetV2: Inverted residuals and linear bottlenecks,” in Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 2018, pp. 4510–4520.
- B. Wu, X. Dai, P. Zhang, Y. Wang, F. Sun, Y. Wu, Y. Tian, P. Vajda, Y. Jia, and K. Keutzer, “FBNet: Hardware-aware efficient ConvNet design via differentiable neural architecture search,” in Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2019, pp. 10734–10742.
- A. Krizhevsky, G. Hinton et al., “Learning multiple layers of features from tiny images,” 2009.
- G. Huang, Z. Liu, L. Van Der Maaten, and K. Q. Weinberger, “Densely connected convolutional networks,” in Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 2017, pp. 4700–4708.
- C. Liu, B. Zoph, M. Neumann, J. Shlens, W. Hua, L.-J. Li, L. Fei-Fei, A. Yuille, J. Huang, and K. Murphy, “Progressive neural architecture search,” in Proceedings of the European conference on computer vision (ECCV), 2018, pp. 19–34.
- S. Xie, H. Zheng, C. Liu, and L. Lin, “SNAS: Stochastic neural architecture search,” arXiv preprint arXiv:1812.09926, 2018.
- Bo Lyu
- Shiping Wen