
A Genetic Programming Approach to Designing Convolutional Neural Network Architectures

Published 3 Apr 2017 in cs.NE | (1704.00764v2)

Abstract: The convolutional neural network (CNN), which is one of the deep learning models, has seen much success in a variety of computer vision tasks. However, designing CNN architectures still requires expert knowledge and a lot of trial and error. In this paper, we attempt to automatically construct CNN architectures for an image classification task based on Cartesian genetic programming (CGP). In our method, we adopt highly functional modules, such as convolutional blocks and tensor concatenation, as the node functions in CGP. The CNN structure and connectivity represented by the CGP encoding method are optimized to maximize the validation accuracy. To evaluate the proposed method, we constructed a CNN architecture for the image classification task with the CIFAR-10 dataset. The experimental result shows that the proposed method can be used to automatically find the competitive CNN architecture compared with state-of-the-art models.

Citations (578)

Summary

  • The paper introduces a genetic programming method that automates CNN architecture design by leveraging Cartesian Genetic Programming to optimize validation accuracy.
  • The method employs functional modules like ConvBlock and ResBlock to reduce the search space and achieve competitive error rates on the CIFAR-10 dataset.
  • In a small-data scenario on CIFAR-10, the evolved architectures outperform VGG and ResNet, which suggests the method can adapt an architecture to the amount of available training data.


This paper introduces a method for automatically constructing Convolutional Neural Network (CNN) architectures using Cartesian Genetic Programming (CGP). It targets the problem that designing a CNN architecture by hand requires considerable expert knowledge and repeated trial and error.

Methodology

The authors use the CGP encoding to represent a CNN architecture as a directed acyclic graph. Each node is a highly functional module, such as a convolutional block or a tensor concatenation, rather than a low-level operation. Searching over these larger building blocks keeps the search space much smaller while still allowing a wide range of architectures to be explored. An evolutionary algorithm then optimizes the structure and connectivity of the graph to maximize validation accuracy.
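To make the encoding concrete, here is a minimal sketch of how a CGP-style genotype could be decoded into the set of nodes that actually form the network. It assumes the standard CGP convention that every gene stores a function plus a fixed number of connection genes, and that connections unused by a function are ignored. The `Gene` class, the `ARITY` table, and the helper names are hypothetical and are not taken from the paper's code.

```python
from dataclasses import dataclass
from typing import List, Set

# Number of inputs each node function consumes.
# Function names follow the summary; the arities are an assumption of this sketch.
ARITY = {"ConvBlock": 1, "ResBlock": 1, "MaxPool": 1, "AvgPool": 1,
         "Concat": 2, "Sum": 2}


@dataclass
class Gene:
    function: str       # key of ARITY
    inputs: List[int]   # indices of earlier nodes; index 0 is the input image


def is_feed_forward(genotype: List[Gene]) -> bool:
    """Check that every node only reads from nodes with a smaller index (a DAG)."""
    return all(src < idx
               for idx, gene in enumerate(genotype, start=1)
               for src in gene.inputs)


def active_nodes(genotype: List[Gene], output: int) -> List[int]:
    """Return the indices of the nodes reachable backward from the output node."""
    seen: Set[int] = set()
    stack = [output]
    while stack:
        idx = stack.pop()
        if idx == 0 or idx in seen:   # 0 is the input image, not a gene
            continue
        seen.add(idx)
        gene = genotype[idx - 1]      # node i is stored at position i - 1
        # Only the first ARITY[...] connection genes are used by the function.
        stack.extend(gene.inputs[:ARITY[gene.function]])
    return sorted(seen)


# Hypothetical four-node genotype; node 3 is never reached from the output.
genotype = [
    Gene("ConvBlock", [0, 0]),  # node 1
    Gene("ResBlock", [1, 0]),   # node 2
    Gene("MaxPool", [0, 1]),    # node 3 (inactive)
    Gene("Sum", [1, 2]),        # node 4 (output)
]
print(active_nodes(genotype, output=4))
```

Nodes that are not reachable from the output stay in the genotype but do not appear in the decoded network, so a mutation can change them without changing the architecture.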

Key elements of the CGP approach include:

  • ConvBlock and ResBlock Modules: Both are built from convolution followed by batch normalization and ReLU; the ResBlock additionally includes a shortcut connection in the style of ResNet. Using them as nodes lets the encoding represent both deep and wide structures.
  • Node Functions: Include ConvBlock, ResBlock, pooling operations, and tensor operations like concatenation and summation.
  • Fitness Evaluation: Validation accuracy on a dataset serves as the fitness measure, guiding the evolutionary process to optimize architectures.
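The fitness-driven search described above can be sketched as a small evolutionary loop. The summary does not name the evolutionary algorithm, so this sketch assumes a simple (1+λ) strategy, a common choice for CGP. The genotype is simplified to a flat list of integers, and `toy_fitness` is a made-up stand-in for the expensive step of training a decoded CNN and measuring its validation accuracy. All names and parameter values are hypothetical.

```python
import random
from typing import Callable, List

Genotype = List[int]


def mutate(parent: Genotype, rate: float, n_values: int,
           rng: random.Random) -> Genotype:
    """Resample each gene uniformly with probability `rate`."""
    return [rng.randrange(n_values) if rng.random() < rate else gene
            for gene in parent]


def evolve(fitness: Callable[[Genotype], float], length: int, n_values: int,
           generations: int, lam: int = 2, rate: float = 0.1,
           seed: int = 0) -> Genotype:
    """(1 + lambda) strategy: keep one parent, replace it only if a child is
    at least as fit."""
    rng = random.Random(seed)
    parent = [rng.randrange(n_values) for _ in range(length)]
    parent_fit = fitness(parent)
    for _ in range(generations):
        children = [mutate(parent, rate, n_values, rng) for _ in range(lam)]
        best_fit, best = max(((fitness(c), c) for c in children),
                             key=lambda pair: pair[0])
        if best_fit >= parent_fit:   # ties are accepted, allowing neutral drift
            parent, parent_fit = best, best_fit
    return parent


def toy_fitness(genotype: Genotype) -> float:
    """Stand-in for validation accuracy: fraction of genes matching a target."""
    target = [i % 4 for i in range(len(genotype))]
    return sum(g == t for g, t in zip(genotype, target)) / len(genotype)


best = evolve(toy_fitness, length=12, n_values=4, generations=200)
print(best, toy_fitness(best))
```

In the real setting each fitness call means training a network, so the number of generations and children per generation dominates the computational cost of the search.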

Experimental Evaluation

The method was tested on the CIFAR-10 image classification dataset under two scenarios: a default scenario with the full dataset and a small-data scenario with limited data. The constructed architectures were then compared against state-of-the-art methods and manually designed architectures.

Results:

  • Default Scenario: The proposed method achieved error rates competitive with well-known models such as VGG and ResNet. CGP-CNN (ResSet), the variant whose node functions include ResBlock, produced an architecture that outperformed several hand-crafted models while keeping a favorable trade-off between accuracy and parameter count.
  • Small-Data Scenario: The CGP-CNN models outperformed VGG and ResNet, indicating that the search can tune an architecture to a smaller training set.

Implications and Future Directions

The proposed genetic programming approach is a step toward automating CNN architecture design, potentially reducing the dependence on human expertise while exploring a large architectural space. Combining highly functional modules with the CGP encoding gives a framework for evolving architectures that can adapt to different datasets and tasks.

Future developments could focus on reducing computational costs associated with the optimization process. Techniques like progressive data loading or hybrid optimization strategies could be considered. There is also potential for extending this approach to other datasets and application domains, broadening its utility in the AI field.

Overall, the work contributes a practical tool for CNN design in settings that call for efficient, adaptable architectures, and its results point to further work on automated neural network design.
