Path-Level Network Transformation for Efficient Architecture Search

Published 7 Jun 2018 in cs.LG, cs.AI, and stat.ML | (1806.02639v1)

Abstract: We introduce a new function-preserving transformation for efficient neural architecture search. This network transformation allows reusing previously trained networks and existing successful architectures that improves sample efficiency. We aim to address the limitation of current network transformation operations that can only perform layer-level architecture modifications, such as adding (pruning) filters or inserting (removing) a layer, which fails to change the topology of connection paths. Our proposed path-level transformation operations enable the meta-controller to modify the path topology of the given network while keeping the merits of reusing weights, and thus allow efficiently designing effective structures with complex path topologies like Inception models. We further propose a bidirectional tree-structured reinforcement learning meta-controller to explore a simple yet highly expressive tree-structured architecture space that can be viewed as a generalization of multi-branch architectures. We experimented on the image classification datasets with limited computational resources (about 200 GPU-hours), where we observed improved parameter efficiency and better test results (97.70% test accuracy on CIFAR-10 with 14.3M parameters and 74.6% top-1 accuracy on ImageNet in the mobile setting), demonstrating the effectiveness and transferability of our designed architectures.


Summary

  • The paper introduces path-level transformations that preserve pre-trained weights while modifying network topologies for efficient architecture search.
  • It integrates these transformations into a bidirectional tree-structured RL framework to explore a multi-branch architecture space and achieve competitive performance on CIFAR-10 and ImageNet.
  • The approach demonstrates parameter efficiency and transferability, requiring only about 200 GPU-hours compared with the tens of thousands used by earlier NAS methods.

The paper presents a novel approach to neural architecture search (NAS) by proposing a method termed Path-Level Network Transformation. This innovation specifically targets the limitations of traditional layer-level transformations by focusing on path-level topological modifications, which enables more efficient architecture search while maintaining the ability to reuse pre-trained network weights. The study integrates this method into a reinforcement learning framework to explore a tree-structured architecture space effectively.

Methodology Summary

Path-Level Network Transformation: The central innovation of the paper lies in the function-preserving network transformation at the path level. These operations extend the scope beyond layer-wise modifications, allowing the transformation of network path topology while preserving pre-trained weights. This is crucial for complex architectures like Inception models where multi-path connections are prevalent.
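
To make the idea concrete, below is a minimal, hypothetical PyTorch sketch (not the authors' code) of one function-preserving path-level operation: replacing a single layer with several identical branches whose outputs are averaged. Because each branch starts as an exact copy of the original layer, the averaged output equals the original output at the moment of transformation, so pre-trained weights are reused without any loss in accuracy; the branches are then free to diverge as the search mutates them. The class name `ReplicationBranch` is invented for illustration.

```python
import copy
import torch
import torch.nn as nn

class ReplicationBranch(nn.Module):
    """Replace a single layer with N identical branches whose outputs are
    averaged. Each branch is a deep copy of the original layer, so the
    averaged output equals the original output: the transformation is
    function-preserving, and the branches can later be mutated independently."""
    def __init__(self, layer: nn.Module, num_branches: int = 2):
        super().__init__()
        self.branches = nn.ModuleList(
            copy.deepcopy(layer) for _ in range(num_branches)
        )

    def forward(self, x):
        outputs = [branch(x) for branch in self.branches]
        return torch.stack(outputs, dim=0).mean(dim=0)

# Sanity check: the transformed module computes the same function as the original layer.
conv = nn.Conv2d(16, 32, kernel_size=3, padding=1)
transformed = ReplicationBranch(conv, num_branches=2)
x = torch.randn(1, 16, 8, 8)
assert torch.allclose(conv(x), transformed(x), atol=1e-6)
```

Analogous constructions cover the paper's other allocation and merge schemes (for example, channel-wise split with concatenation) wherever they preserve the layer's function.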

Reinforcement Learning Framework: The transformation operations are integrated with a bidirectional tree-structured reinforcement learning (RL) meta-controller. This setup exploits a tree-structured architecture space, providing a generalized view of multi-branch structures. The RL meta-controller is responsible for exploring this search space, dynamically sampling architectures, and evaluating their performance. The use of tree-structured LSTMs facilitates the encoding of input architectures in a manner that naturally corresponds to the hierarchical nature of network topologies.
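
As a rough illustration of how such a controller might encode an architecture, the sketch below is a simplified toy, not the paper's implementation; `TreeNode` and `BidirectionalTreeEncoder` are hypothetical names. It runs a bottom-up pass that summarizes each subtree and a top-down pass that propagates global context back to every node; per-node states of this kind would then feed the policy that decides which transformation to apply at each node.

```python
import torch
import torch.nn as nn

class TreeNode:
    """A node in the tree-structured architecture representation."""
    def __init__(self, embedding, children=()):
        self.embedding = embedding          # e.g. an embedding of the node's operation
        self.children = list(children)
        self.h_bottom_up = None
        self.h_top_down = None

class BidirectionalTreeEncoder(nn.Module):
    """Toy bidirectional tree encoder: a bottom-up pass summarizes each subtree,
    and a top-down pass gives every node access to context from the whole tree."""
    def __init__(self, dim):
        super().__init__()
        self.bottom_up = nn.LSTMCell(dim, dim)  # consumes children states in order
        self.top_down = nn.LSTMCell(dim, dim)   # consumes the parent's state

    def encode_up(self, node):
        h = torch.zeros(1, self.bottom_up.hidden_size)
        c = torch.zeros(1, self.bottom_up.hidden_size)
        for child in node.children:
            h, c = self.bottom_up(self.encode_up(child), (h, c))
        # combine the subtree summary with the node's own embedding
        node.h_bottom_up = h + node.embedding
        return node.h_bottom_up

    def encode_down(self, node, parent_state=None):
        if parent_state is None:
            parent_state = (torch.zeros_like(node.h_bottom_up),
                            torch.zeros_like(node.h_bottom_up))
        h, c = self.top_down(node.h_bottom_up, parent_state)
        node.h_top_down = h
        for child in node.children:
            self.encode_down(child, (h, c))

# Usage: encode a small two-level tree in both directions.
dim = 32
root = TreeNode(torch.randn(1, dim),
                children=[TreeNode(torch.randn(1, dim)),
                          TreeNode(torch.randn(1, dim))])
encoder = BidirectionalTreeEncoder(dim)
encoder.encode_up(root)
encoder.encode_down(root)
```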

Experimental Results

The paper reports empirical evaluations primarily conducted on the CIFAR-10 and ImageNet datasets, showcasing significant improvements in architecture search efficiency and model performance. With restricted computational resources (approximately 200 GPU-hours), the architectures discovered by the proposed method reached 97.70% test accuracy on CIFAR-10 with 14.3M parameters and 74.6% top-1 accuracy on ImageNet in the mobile setting. Notably, these results are achieved with a fraction of the computational resources required by other NAS approaches, such as those reported by Zoph et al., which utilized 48,000 GPU-hours.

Implications and Future Directions

Parameter Efficiency and Transferability: The capability to discover architectures with high parameter efficiency was demonstrated by improvements over existing DenseNets and PyramidNets. The discovered cells also transferred well from CIFAR-10 to ImageNet, underscoring the generality of the path-level transformations.

Theoretical Implications: Theoretical implications include the broadening of architecture search spaces to include diverse path topologies. This empowers the NAS framework to explore beyond traditional chain-structured networks, which could lead to discovering novel architectural insights.

Future Developments: The fusion of the proposed transformation framework with network compression techniques holds potential for further advancements. Future work could explore reducing model complexity without sacrificing performance, which is beneficial for deploying NAS-derived models in resource-constrained environments.

In conclusion, the paper provides an exciting advancement in the development of NAS techniques, specifically highlighting the significance of path-level transformations. By leveraging a tree-structured representation and bidirectional RL controllers, the proposed approach enhances both the efficiency and quality of neural architecture design, setting a promising foundation for future research in automated model development.
