Discovering Neural Wirings (1906.00586v5)

Published 3 Jun 2019 in cs.LG, cs.CV, and cs.NE

Abstract: The success of neural networks has driven a shift in focus from feature engineering to architecture engineering. However, successful networks today are constructed using a small and manually defined set of building blocks. Even in methods of neural architecture search (NAS) the network connectivity patterns are largely constrained. In this work we propose a method for discovering neural wirings. We relax the typical notion of layers and instead enable channels to form connections independent of each other. This allows for a much larger space of possible networks. The wiring of our network is not fixed during training -- as we learn the network parameters we also learn the structure itself. Our experiments demonstrate that our learned connectivity outperforms hand engineered and randomly wired networks. By learning the connectivity of MobileNetV1 we boost the ImageNet accuracy by 10% at ~41M FLOPs. Moreover, we show that our method generalizes to recurrent and continuous time networks. Our work may also be regarded as unifying core aspects of the neural architecture search problem with sparse neural network learning. As NAS becomes more fine grained, finding a good architecture is akin to finding a sparse subnetwork of the complete graph. Accordingly, DNW provides an effective mechanism for discovering sparse subnetworks of predefined architectures in a single training run. Though we only ever use a small percentage of the weights during the forward pass, we still play the so-called initialization lottery with a combinatorial number of subnetworks. Code and pretrained models are available at https://github.com/allenai/dnw while additional visualizations may be found at https://mitchellnw.github.io/blog/2019/dnw/.

Citations (115)

Summary

  • The paper introduces a dynamic algorithm that simultaneously learns both network weights and structure by evolving channel-level connections during training.
  • It improves the ImageNet accuracy of MobileNetV1 by 10% at roughly 41M FLOPs, demonstrating better accuracy at a comparable compute budget.
  • The approach expands traditional NAS by exploring a vast connection space, offering insights into sparse, high-performing subnetworks.

Discovering Neural Wirings: A Novel Approach for Neural Architecture Optimization

The paper "Discovering Neural Wirings" addresses a significant challenge in the field of neural networks: the design of network architectures. With the rise of deep learning, there has been a shift from traditional feature engineering to learning features through complex networks. Yet, the architecture of these networks is often constructed using a predefined set of building blocks, which presents limitations in optimizing network performance. This work proposes a method that aims to overcome these limitations by discovering neural wirings, allowing for a broader exploration of potential neural network configurations.

In traditional neural architecture search (NAS), the space of network topologies is constrained, typically to connections between predefined layers or blocks. The proposed method breaks from these constraints by permitting connections at the granularity of individual channels, which dramatically expands the space of reachable architectures. Both the network's parameters and its structure are learned simultaneously during training: the connections, or 'wires', are not fixed but evolve as training proceeds, so the method effectively searches over a combinatorial number of candidate subnetworks.
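
To make the relaxed notion of a layer concrete, here is a minimal sketch (not the authors' code; `num_channels`, `edge_budget`, and `candidate_w` are illustrative names) that treats the network as a weighted graph over channel nodes: every earlier channel is a candidate input to every later channel, and only a small budget of the strongest candidate edges is active in the forward pass.

```python
# Illustrative sketch: the network as a weighted graph over channel nodes.
# Assumptions: nodes are topologically ordered, each node is one channel,
# and per-node convolutions/nonlinearities are reduced to a ReLU for brevity.
import torch

num_channels = 8      # channel nodes, in topological order
edge_budget = 12      # how many edges the discovered wiring may use

# One candidate weight for every (earlier channel -> later channel) pair.
candidate_w = torch.randn(num_channels, num_channels) * 0.1
dag_mask = torch.tril(torch.ones(num_channels, num_channels), diagonal=-1)

# Keep only the edge_budget strongest candidate edges; the rest are absent.
scores = (candidate_w.abs() * dag_mask).flatten()
keep = torch.zeros_like(scores)
keep[scores.topk(edge_budget).indices] = 1.0
active_w = candidate_w * keep.view_as(candidate_w) * dag_mask

# Forward pass: each channel aggregates whichever earlier channels it happens
# to be wired to, rather than a fixed preceding layer.
state = torch.zeros(num_channels)
state[0] = 1.0  # input node
for v in range(1, num_channels):
    state[v] = torch.relu(active_w[v, :v] @ state[:v])
```

The edge budget here is not tied to any layer boundary, so training is free to concentrate connections wherever they prove most useful.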

The authors introduce an algorithm, termed Discovering Neural Wirings (DNW), that efficiently searches the space of all possible wirings. Training proceeds by standard backpropagation, with one key distinction: gradients are allowed to influence the structural configuration of the network. Only the edges with the largest weight magnitudes are treated as 'real' and used in the forward pass, but during the backward pass gradients also update the candidate connections that were not used, so an unused ('hallucinated') edge can grow strong enough to swap into the active wiring.
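
This mechanism can be sketched as a straight-through-style autograd function: the forward pass zeroes out all but the k strongest edges, while the backward pass returns gradients for every candidate edge, so an inactive edge is still updated and can eventually displace a weaker active one. The sketch below is a minimal rendering of that idea, not the reference implementation; `TopKEdges`, `WiredLinear`, and `k_edges` are names invented here.

```python
# Minimal PyTorch sketch of the edge-selection trick (not the official DNW code).
import torch
import torch.nn as nn


class TopKEdges(torch.autograd.Function):
    """Forward: keep only the k largest-magnitude edge weights.
    Backward: pass gradients through to *all* candidate edges, so edges that
    were unused in the forward pass are still updated and can swap in later."""

    @staticmethod
    def forward(ctx, weights, k):
        flat = weights.abs().flatten()
        threshold = flat.kthvalue(flat.numel() - k + 1).values  # k-th largest
        mask = (weights.abs() >= threshold).to(weights.dtype)
        return weights * mask

    @staticmethod
    def backward(ctx, grad_output):
        # Straight-through: the gradient is not masked.
        return grad_output, None


class WiredLinear(nn.Module):
    """A dense candidate graph between input and output channels, of which
    only k_edges participate in any given forward pass."""

    def __init__(self, in_channels, out_channels, k_edges):
        super().__init__()
        self.weight = nn.Parameter(0.1 * torch.randn(out_channels, in_channels))
        self.k_edges = k_edges

    def forward(self, x):                       # x: (batch, in_channels)
        sparse_w = TopKEdges.apply(self.weight, self.k_edges)
        return x @ sparse_w.t()
```

Training `WiredLinear(64, 32, k_edges=256)` with an ordinary optimizer updates all 2048 candidate weights while the output only ever depends on the 256 active edges. Note that the multiplication here is still dense for simplicity, so the sketch illustrates how gradients reach unused edges rather than the FLOP savings of a realized sparse wiring.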

In empirical evaluations, DNW improves the performance of existing architectures. Applied to MobileNetV1, it yields a 10% improvement in ImageNet accuracy at roughly 41M FLOPs. The method is also tested on other network types, including recurrent and continuous-time networks, demonstrating that it generalizes beyond static convolutional models.

The implications of this research are twofold. Practically, it offers a path to more efficient networks by identifying and exploiting good wiring patterns, improving accuracy without increasing computational cost. Theoretically, it frames fine-grained architecture search as finding a sparse subnetwork of a much larger complete graph. This perspective connects to recent work on sparse neural networks, such as the Lottery Ticket Hypothesis, which suggests that dense networks contain sparse subnetworks that can be trained to similar performance.

Future work could further refine the method, adapt it to other architectures, or explore applications beyond image classification. Moreover, because DNW discovers the wiring and the weights in a single training run rather than requiring a separate search-then-retrain cycle, it may serve as a foundation for faster and more efficient neural architecture search methods.

Overall, "Discovering Neural Wirings" provides a compelling framework for reconsidering how neural networks are constructed, offering promising avenues for both theoretical exploration and substantial practical improvements in neural network performance.