
DNA Family: Boosting Weight-Sharing NAS with Block-Wise Supervisions (2403.01326v1)

Published 2 Mar 2024 in cs.CV

Abstract: Neural Architecture Search (NAS), aiming at automatically designing neural architectures by machines, has been considered a key step toward automatic machine learning. One notable NAS branch is the weight-sharing NAS, which significantly improves search efficiency and allows NAS algorithms to run on ordinary computers. Despite receiving high expectations, this category of methods suffers from low search effectiveness. By employing a generalization boundedness tool, we demonstrate that the devil behind this drawback is the untrustworthy architecture rating with the oversized search space of the possible architectures. Addressing this problem, we modularize a large search space into blocks with small search spaces and develop a family of models with the distilling neural architecture (DNA) techniques. These proposed models, namely a DNA family, are capable of resolving multiple dilemmas of the weight-sharing NAS, such as scalability, efficiency, and multi-modal compatibility. Our proposed DNA models can rate all architecture candidates, as opposed to previous works that can only access a subsearch space using heuristic algorithms. Moreover, under a certain computational complexity constraint, our method can seek architectures with different depths and widths. Extensive experimental evaluations show that our models achieve state-of-the-art top-1 accuracy of 78.9% and 83.6% on ImageNet for a mobile convolutional network and a small vision transformer, respectively. Additionally, we provide in-depth empirical analysis and insights into neural architecture ratings. Codes available: \url{https://github.com/changlin31/DNA}.

DNA Family: A New Perspective for Enhancing Weight-Sharing NAS with Block-Wise Supervision

Introduction

Neural Architecture Search (NAS) methodologies have been pivotal in automating the design of neural network architectures, a core step toward fully machine-driven model design. Among the many approaches to NAS, weight-sharing NAS has emerged as a promising direction because of its efficiency: it sharply reduces computational requirements, making architecture search feasible on ordinary hardware. However, the approach suffers from a key drawback: search effectiveness is compromised by unreliable architecture rating, a direct consequence of the overwhelmingly large search space. To tackle this, the DNA family, comprising the Distilling Neural Architecture (DNA) model and its variants DNA+ and DNA++, enhances the weight-sharing NAS framework with block-wise supervision. Through a generalization boundedness analysis, this work underscores that modularizing the search space into smaller blocks improves both the efficiency and the reliability of the search.

The Drawback of Weight-Sharing NAS

The core challenge in weight-sharing NAS is inaccurate architecture rating within a vast search space. A generalization boundedness analysis shows that as the search space broadens, the supernet's ability to generalize diminishes, making its architecture assessments unreliable. This identifies the oversized search space as the primary cause of ineffective search in weight-sharing NAS, and it motivates the paper's central remedy: modularizing the search space into blocks.
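
To make the scale concrete, a back-of-the-envelope Python calculation (the layer and operation counts below are hypothetical, not figures from the paper) shows how block-wise modularization collapses the space each supernet must represent:

    # Hypothetical sizes, chosen for illustration only.
    ops_per_layer = 7      # candidate operations per layer
    num_layers = 18        # depth of the searched network
    num_blocks = 6         # modularize into 6 blocks of 3 layers each

    monolithic_space = ops_per_layer ** num_layers
    per_block_space = ops_per_layer ** (num_layers // num_blocks)

    print(f"monolithic search space: {monolithic_space:.2e} architectures")  # ~1.63e+15
    print(f"search space per block:  {per_block_space} sub-architectures")   # 343

A single supernet asked to rate roughly 10^15 architectures generalizes poorly, whereas six block-level supernets each cover only 343 sub-architectures; crucially, their block-wise ratings can still be composed to score every full architecture in the original space.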

The DNA Family: Addressing NAS Challenges

The proposed solution applies distilling neural architecture techniques block by block, shrinking the search space each supernet must represent. The DNA family consists of three models, each targeting a specific dilemma of weight-sharing NAS:

  1. DNA: Utilizes traditional supervised learning with distillation techniques to efficiently train multiple student supernets simultaneously.
  2. DNA+: Incorporates a progressive learning scheme that keeps updating the teacher network, offering better scalability by iteratively adapting the supervision to the students' growing capacity.
  3. DNA++: Embraces self-supervised learning to optimize the teacher network and student supernets jointly, broadening compatibility across modalities and architectural designs.

The modularization of the search space permits these models to evaluate all candidate architectures, overcoming the limitations of previous methods that only explored sub-search spaces.
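
The core training signal behind DNA is feature-level knowledge distillation applied per block. As a minimal PyTorch sketch (the modules, shapes, and hyperparameters below are illustrative stand-ins, not the paper's actual implementation), each student supernet block learns to reproduce the output feature map of the corresponding frozen teacher block:

    import torch
    import torch.nn as nn

    # Stand-ins for one teacher block and one sampled student path;
    # in DNA the teacher is a pretrained network and the student is
    # a weight-sharing supernet block.
    teacher_block = nn.Sequential(nn.Conv2d(32, 64, 3, padding=1), nn.ReLU())
    student_block = nn.Sequential(nn.Conv2d(32, 64, 5, padding=2), nn.ReLU())
    teacher_block.requires_grad_(False)

    optimizer = torch.optim.SGD(student_block.parameters(), lr=0.1)
    mse = nn.MSELoss()

    for step in range(100):                     # toy training loop
        x = torch.randn(8, 32, 16, 16)          # stand-in for the feature map
                                                # entering this block
        with torch.no_grad():
            target = teacher_block(x)           # frozen teacher supervision
        loss = mse(student_block(x), target)    # block-wise distillation loss
        optimizer.zero_grad()
        loss.backward()
        optimizer.step()

Because each block is supervised independently, the distillation loss of a candidate sub-architecture also serves directly as its rating within that block.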

Empirical Analysis and Insights

The DNA family has been extensively evaluated against state-of-the-art benchmarks, achieving top-1 accuracies on ImageNet of 78.9% for a mobile convolutional network and 83.6% for a small vision transformer. Moreover, the work presents an in-depth empirical analysis of neural architecture ratings, pinpointing the cause of inefficiencies in conventional weight-sharing NAS approaches.
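
Rating reliability in such analyses is commonly measured with a ranking correlation such as Kendall's tau between supernet-predicted ratings and stand-alone accuracies. A hypothetical illustration (the scores below are invented for the example):

    from scipy.stats import kendalltau

    # Invented scores for five candidate architectures.
    supernet_rating = [0.62, 0.58, 0.71, 0.55, 0.66]    # predicted by the supernet
    standalone_acc  = [73.1, 72.4, 74.8, 71.9, 73.5]    # after full training

    tau, p_value = kendalltau(supernet_rating, standalone_acc)
    print(f"Kendall tau = {tau:.2f}")  # 1.00 here: a perfectly consistent rating

A tau near 1 means the supernet orders candidates exactly as full training would; the paper's diagnosis is that this consistency degrades as the shared search space grows.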

Future Directions in AI and NAS

The introduction of the DNA family not only marks a significant stride in NAS research but also opens avenues for future developments in AI. By addressing the scalability and effectiveness of the architecture search process, this method paves the way for more nuanced and efficient machine learning models across various domains. The synthesis of block-wise supervision with distillation techniques presents a robust framework for enhancing model discovery without compromising on computational efficiency.

In essence, the DNA family embodies a strategic breakthrough in the continual quest for automated machine learning, promising a new era of innovation and efficiency in model design.

Authors

Guangrun Wang, Changlin Li, Liuchun Yuan, Jiefeng Peng, Xiaoyu Xian, Xiaodan Liang, Xiaojun Chang, Liang Lin