Weight Agnostic Neural Networks (1906.04358v2)

Published 11 Jun 2019 in cs.LG, cs.NE, and stat.ML

Abstract: Not all neural network architectures are created equal, some perform much better than others for certain tasks. But how important are the weight parameters of a neural network compared to its architecture? In this work, we question to what extent neural network architectures alone, without learning any weight parameters, can encode solutions for a given task. We propose a search method for neural network architectures that can already perform a task without any explicit weight training. To evaluate these networks, we populate the connections with a single shared weight parameter sampled from a uniform random distribution, and measure the expected performance. We demonstrate that our method can find minimal neural network architectures that can perform several reinforcement learning tasks without weight training. On a supervised learning domain, we find network architectures that achieve much higher than chance accuracy on MNIST using random weights. Interactive version of this paper at https://weightagnostic.github.io/

Citations (230)

Summary

  • The paper presents a novel neural network design that prioritizes architecture over weight tuning.
  • It employs neuroevolution to search for minimal network structures using a single, shared weight parameter.
  • Empirical results show strong performance on reinforcement learning tasks and MNIST, reducing training complexity.

An Overview of Weight Agnostic Neural Networks

The paper "Weight Agnostic Neural Networks," authored by Adam Gaier and David Ha, introduces a novel approach to the design and evaluation of neural network architectures that prioritize structure over parameter optimization. The approach centers on the concept of Weight Agnostic Neural Networks (WANNs), which are designed to perform specific tasks using a single shared weight parameter, rather than requiring extensive weight training.

Core Contribution

The primary contribution of the work is a search method capable of identifying minimal neural network architectures that inherently perform well on a variety of tasks without conventional weight optimization. The approach leverages a form of neuroevolution: candidate topologies are evolved under a selection pressure that rewards performance while penalizing reliance on particular weight values and unnecessary complexity.
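A high-level sketch of such a search loop follows, under stated assumptions: `mutate(net)`, `evaluate(net, w)` (the return obtained with shared weight w), and `net.num_connections()` are hypothetical interfaces, and the single scalar score stands in for the paper's multi-objective ranking over mean performance, peak performance, and network size.

```python
import random

# Illustrative search loop for weight-agnostic topologies. The interfaces
# used here (mutate, evaluate, num_connections) are assumed, not the
# authors' API.

WEIGHT_VALUES = [-2.0, -1.0, -0.5, 0.5, 1.0, 2.0]  # shared weights to test

def search(population, mutate, evaluate, generations=100):
    best = None
    for _ in range(generations):
        scored = []
        for net in population:
            # A topology only scores well if it performs across MANY
            # shared-weight settings, not just one lucky value.
            mean_r = sum(evaluate(net, w) for w in WEIGHT_VALUES) / len(WEIGHT_VALUES)
            score = mean_r - 0.01 * net.num_connections()  # favor simplicity
            scored.append((score, net))
        scored.sort(key=lambda s: s[0], reverse=True)
        best = scored[0][1]
        elites = [n for _, n in scored[: max(1, len(scored) // 4)]]
        population = elites + [mutate(random.choice(elites))
                               for _ in range(len(population) - len(elites))]
    return best
```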

Methodology

This research introduces a form of architecture search that focuses on topology rather than the optimization of weights. Each candidate network is evaluated by assigning every connection a single shared weight sampled from a random distribution and averaging performance over several such samples; a topology therefore scores well only if it works across many weight settings, highlighting the innate capability of the architecture itself. By focusing on structural aspects, the paper challenges the traditional emphasis on weight customization in neural networks.
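A sketch of this evaluation protocol, assuming a hypothetical `run_episode(net, w)` rollout function that returns an episode's total reward:

```python
import numpy as np

# Sketch of the evaluation described in the abstract: draw the one shared
# weight from a uniform distribution and average returns, so the score
# reflects the architecture rather than any particular weight value.
# run_episode(net, w) is a hypothetical environment rollout.

def expected_performance(net, run_episode, n_samples=20,
                         w_low=-2.0, w_high=2.0, seed=0):
    rng = np.random.default_rng(seed)
    returns = [run_episode(net, rng.uniform(w_low, w_high))
               for _ in range(n_samples)]
    return float(np.mean(returns))
```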

Results

In empirical experiments, WANNs demonstrated notable performance across several reinforcement learning tasks, including Bipedal Walker, Car Racing, and CartPole Swing-Up. On the supervised side, WANNs achieved approximately 92% accuracy on MNIST without explicit weight training, underscoring their potential in domains traditionally dominated by gradient-based training.
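The paper also explores treating a single WANN as an ensemble: the same architecture is instantiated with several shared-weight values and their class predictions are combined, still without training any weight. A hedged sketch, assuming a hypothetical `wann_logits(image, w)` forward pass that returns ten class scores:

```python
import numpy as np

# Hedged sketch of the ensemble-over-weights idea: average the class scores
# of the SAME architecture instantiated with several shared weight values.
# wann_logits(image, w) is a hypothetical forward pass returning 10 scores.

def ensemble_predict(wann_logits, image,
                     weights=(-2.0, -1.0, -0.5, 0.5, 1.0, 2.0)):
    scores = np.mean([wann_logits(image, w) for w in weights], axis=0)
    return int(np.argmax(scores))
```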

Theoretical Implications

The findings shift focus from weight optimization to architecture design, encouraging further examination of network topologies that possess strong inductive biases. This aligns with theoretical perspectives on network evolution and minimal description length, and calls for exploration of new architectural building blocks that stay simple while generalizing across domains.

Practical Implications

The practical implications are significant, especially for efficiency and computational resource allocation. By removing the need for extensive weight training, WANNs could cut the compute and time that conventional training demands. They could also accelerate settings such as few-shot and continual learning, where rapid adaptation with minimal weight adjustment is crucial.

Future Directions

Given the preliminary success of WANNs, the paper lays groundwork for future research directions, advocating for the development of more innate and structurally adept network architectures. Such advancements could extend to configurations that integrate algorithmic information theory, Bayesian neural networks, and insights from biological neural evolution. The exploration of ensemble techniques as a further optimization strategy is also proposed.

In conclusion, this research offers a compelling reevaluation of the role of network architecture in task performance, presenting Weight Agnostic Neural Networks as a promising step towards more efficient and inherently capable neural learning systems. As this field advances, the principles established in this paper will likely inspire ongoing innovations at the intersection of network architecture search and functional efficiency.
