Neural Architecture Search using Particle Swarm and Ant Colony Optimization (2403.03781v1)
Abstract: Neural network models have a number of hyperparameters that must be chosen along with their architecture. Choosing an architecture and assigning values to these hyperparameters can be a heavy burden for a novice user, so in most cases default hyperparameters and architectures are used. Significant improvements in model accuracy can be achieved by evaluating multiple architectures, and a process known as Neural Architecture Search (NAS) can automate the evaluation of a large number of such architectures. As part of this research, a system integrating open source tools for Neural Architecture Search (OpenNAS) for image classification has been developed. OpenNAS takes any dataset of grayscale or RGB images and generates Convolutional Neural Network (CNN) architectures based on a range of metaheuristics, using an AutoKeras, transfer learning, or Swarm Intelligence (SI) approach. Particle Swarm Optimization (PSO) and Ant Colony Optimization (ACO) are used as the SI algorithms. Furthermore, models developed through such metaheuristics may be combined using stacking ensembles. In this paper, we focus on training and optimizing CNNs using the SI components of OpenNAS. The two major types of SI algorithm, PSO and ACO, are compared to determine which generates higher model accuracies. With our experimental design, PSO is shown to outperform ACO, and its performance advantage is most notable on the more complex dataset. As a baseline, the performance of fine-tuned pre-trained models is also evaluated.
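To make the PSO side of the comparison concrete, the following is a minimal sketch of particle swarm optimization over a CNN hyperparameter encoding. The four-dimensional encoding, the bounds, and the stand-in `fitness()` function are illustrative assumptions, not OpenNAS internals; in an actual NAS run, `fitness()` would train a candidate CNN on the target dataset and return its validation accuracy.

```python
# Minimal PSO sketch for CNN hyperparameter search (illustrative only).
# The encoding, bounds, and fitness function are assumptions, not OpenNAS
# internals; in a real NAS run, fitness() would train a CNN and return
# validation accuracy instead of this analytic stand-in.
import random

# Each particle position encodes (log2 filters in conv block 1,
# log2 filters in conv block 2, dense units / 64, log10 learning rate).
BOUNDS = [(4, 8), (4, 9), (1, 8), (-4, -1)]

def fitness(pos):
    """Stand-in for training a CNN and returning validation accuracy.
    Rewards mid-sized layers and a learning rate near 1e-3."""
    f1, f2, d, lr = pos
    return -((f1 - 6) ** 2 + (f2 - 7) ** 2 + (d - 4) ** 2 + (lr + 3) ** 2)

def pso(n_particles=10, iters=30, w=0.7, c1=1.5, c2=1.5):
    dim = len(BOUNDS)
    pos = [[random.uniform(lo, hi) for lo, hi in BOUNDS] for _ in range(n_particles)]
    vel = [[0.0] * dim for _ in range(n_particles)]
    pbest = [p[:] for p in pos]              # each particle's personal best
    pbest_fit = [fitness(p) for p in pos]
    g = max(range(n_particles), key=lambda i: pbest_fit[i])
    gbest_pos, gbest_fit = pbest[g][:], pbest_fit[g]   # swarm's global best
    for _ in range(iters):
        for i in range(n_particles):
            for d in range(dim):
                r1, r2 = random.random(), random.random()
                # Velocity is pulled toward personal and global bests.
                vel[i][d] = (w * vel[i][d]
                             + c1 * r1 * (pbest[i][d] - pos[i][d])
                             + c2 * r2 * (gbest_pos[d] - pos[i][d]))
                lo, hi = BOUNDS[d]
                pos[i][d] = min(max(pos[i][d] + vel[i][d], lo), hi)
            f = fitness(pos[i])
            if f > pbest_fit[i]:
                pbest[i], pbest_fit[i] = pos[i][:], f
                if f > gbest_fit:
                    gbest_pos, gbest_fit = pos[i][:], f
    return gbest_pos, gbest_fit

if __name__ == "__main__":
    best, fit = pso()
    print("best encoded hyperparameters:", [round(x, 2) for x in best])
```

The pull toward both the personal best and the swarm's global best is what lets PSO concentrate evaluations in promising regions of the hyperparameter space, which is expensive to do by grid search when each evaluation means training a CNN.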
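For the ACO side, here is an equally minimal Ant System-style sketch in which ants assemble an architecture by choosing one option per layer slot, with pheromone deposited in proportion to fitness. The search space (filter counts for three convolutional slots), the evaporation rate, and the stand-in `fitness()` are again assumptions made for illustration; ACO-based NAS systems search over a much richer graph of layer types and parameters.

```python
# Minimal ACO sketch for layer-wise CNN architecture choices (illustrative
# only). The search space and pheromone update follow a generic Ant System
# scheme, not OpenNAS internals; fitness() again stands in for training a CNN.
import random

OPTIONS = [16, 32, 64, 128]   # candidate filter counts per conv layer slot
N_SLOTS = 3                   # number of layer slots an ant fills

def fitness(arch):
    """Stand-in for validation accuracy; prefers wider later layers."""
    return sum(f * (i + 1) for i, f in enumerate(arch)) / (128 * 6)

def aco(n_ants=8, iters=25, evaporation=0.3, q=1.0):
    # pher[slot][option]: pheromone on each choice, initially uniform.
    pher = [[1.0] * len(OPTIONS) for _ in range(N_SLOTS)]
    best_arch, best_fit = None, float("-inf")
    for _ in range(iters):
        solutions = []
        for _ in range(n_ants):
            # Each ant samples one option per slot, biased by pheromone.
            arch = [OPTIONS[random.choices(range(len(OPTIONS)),
                                           weights=pher[slot])[0]]
                    for slot in range(N_SLOTS)]
            f = fitness(arch)
            solutions.append((arch, f))
            if f > best_fit:
                best_arch, best_fit = arch, f
        # Evaporate old pheromone, then deposit in proportion to fitness.
        for slot in range(N_SLOTS):
            for o in range(len(OPTIONS)):
                pher[slot][o] *= (1 - evaporation)
        for arch, f in solutions:
            for slot, filt in enumerate(arch):
                pher[slot][OPTIONS.index(filt)] += q * f
    return best_arch, best_fit

if __name__ == "__main__":
    arch, fit = aco()
    print("best architecture (filters per slot):", arch)
```

Comparing the two sketches highlights the structural difference the paper's experiments probe: PSO moves continuous-valued particles through a bounded encoding, while ACO builds discrete architectures choice by choice under pheromone guidance.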