- The paper demonstrates the derivation and analytic computation of infinite-width neural network kernels, including NNGP and NTK.
- The paper presents Monte Carlo methods to approximate kernel computations where analytic solutions are impractical.
- The paper details gradient descent dynamics and performance comparisons showing infinite networks can excel in data-limited scenarios.
Neural Tangents: An Overview
The paper introduces Neural Tangents, a software library designed to facilitate research on infinite-width neural networks. Built on JAX, the library provides a high-level API for specifying architectures and computing the corresponding infinite-width kernels, enabling researchers to study how neural networks behave as their width grows to infinity. This makes it practical to explore properties of neural networks that are difficult to study with conventional finite-width tooling.
Key Contributions
Neural Tangents offers several features that distinguish it from existing libraries:
- Analytic Kernels: The library enables exact computation of the infinite-width Neural Network Gaussian Process (NNGP) and Neural Tangent Kernel (NTK) for a given architecture. These kernels are central to understanding the theoretical behavior of neural networks in the infinite-width regime (a usage sketch covering analytic, Monte Carlo, and batched kernel computation follows this list).
- Monte Carlo Approximations: For architectures where analytic computation is impractical, Neural Tangents can approximate the kernels by Monte Carlo sampling over random finite-width networks. Because this estimator only needs a network's initialization and forward-pass functions, it applies to a broad range of architectures.
- Gradient Descent Dynamics: The library includes functionalities to model the training dynamics of infinite networks using gradient descent, providing insights into their behavior over time.
- CPU, GPU, and TPU Compatibility: Neural Tangents is optimized for performance across various hardware setups. It supports automatic distribution of computations across multiple devices.
- Extensible Architecture: The library allows users to define custom layers and architectures, promoting experimentation with novel network designs.
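The following is a minimal sketch of the workflow described above, using the library's stax-style API (layer constructors such as stax.Dense together with nt.monte_carlo_kernel_fn and nt.batch; exact argument names may differ between library versions, and all shapes and hyperparameters here are illustrative):

```python
# Minimal sketch: analytic NNGP/NTK kernels, a Monte Carlo estimate,
# and batching across devices. Shapes and hyperparameters are illustrative.
import jax.random as random
import neural_tangents as nt
from neural_tangents import stax

# stax.serial returns the usual (init_fn, apply_fn) pair plus an
# analytic kernel_fn for the corresponding infinite-width network.
init_fn, apply_fn, kernel_fn = stax.serial(
    stax.Dense(512), stax.Relu(),
    stax.Dense(512), stax.Relu(),
    stax.Dense(1),
)

key = random.PRNGKey(0)
x1 = random.normal(key, (10, 32))  # 10 inputs of dimension 32
x2 = random.normal(key, (20, 32))

# Closed-form infinite-width kernels between the two input sets.
nngp = kernel_fn(x1, x2, 'nngp')   # NNGP kernel, shape (10, 20)
ntk = kernel_fn(x1, x2, 'ntk')     # NTK, shape (10, 20)

# Monte Carlo estimate of the same kernel from random finite-width networks,
# useful when no closed form is available.
mc_kernel_fn = nt.monte_carlo_kernel_fn(init_fn, apply_fn, key, n_samples=128)
ntk_mc = mc_kernel_fn(x1, x2, 'ntk')

# Compute the kernel in batches (and across devices when several are available).
batched_kernel_fn = nt.batch(kernel_fn, batch_size=5)
nngp_batched = batched_kernel_fn(x1, x2, 'nngp')
```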
Numerical Results and Implications
The study demonstrates that infinite-width networks can match or surpass finite-width networks in certain scenarios, particularly when training data is limited. This is shown through experiments on synthetic data and on real-world datasets such as CIFAR-10, where fully-connected, convolutional, and WideResNet architectures were evaluated.
The paper also reports near-linear scaling of kernel computations when they are distributed over multiple accelerators. Additionally, the accuracy and training dynamics predicted by the infinite-width models are validated against ensembles of finite-width networks; a prediction sketch follows below.
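As a hedged illustration of the gradient-descent-dynamics functionality, the sketch below uses nt.predict.gradient_descent_mse_ensemble to compute the mean and covariance of the predictions of an infinite ensemble of infinite-width networks trained with gradient descent on mean squared error (argument names follow the library's predict module but may vary across versions; the data here is random and purely illustrative):

```python
# Sketch: closed-form training dynamics of an infinite ensemble of
# infinite-width networks under gradient descent with MSE loss.
import jax.numpy as jnp
import jax.random as random
import neural_tangents as nt
from neural_tangents import stax

_, _, kernel_fn = stax.serial(stax.Dense(512), stax.Relu(), stax.Dense(1))

key = random.PRNGKey(1)
x_train = random.normal(key, (20, 32))
y_train = random.normal(key, (20, 1))
x_test = random.normal(key, (5, 32))

# diag_reg adds a small ridge term for numerical stability.
predict_fn = nt.predict.gradient_descent_mse_ensemble(
    kernel_fn, x_train, y_train, diag_reg=1e-4)

# Ensemble mean and covariance after training time t;
# t=None corresponds to training to convergence.
mean, cov = predict_fn(t=None, x_test=x_test, get='ntk', compute_cov=True)
std = jnp.sqrt(jnp.diag(cov))  # per-test-point predictive uncertainty
```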
Theoretical and Practical Implications
Neural Tangents makes it easier to explore theoretical questions in deep learning by providing an accessible framework for analyzing infinite-width networks. Theoretically, the library supports investigations into Bayesian inference (through the NNGP) and gradient descent training dynamics (through the NTK).
Practically, the implications of this work are substantial for model selection and analysis in machine learning. Infinite-width networks offer an analytic way to predict a network's behavior without exhaustive training experiments, saving computational resources and time; see the sketch below.
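For example (a generic sketch, not a specific Neural Tangents API), once the NNGP kernel of an architecture is available, test predictions reduce to closed-form Gaussian-process regression with no training loop:

```python
# Generic illustration: NNGP posterior mean computed directly from kernel
# matrices, i.e. K_*x (K_xx + sigma^2 I)^{-1} y. No network is trained.
import jax.numpy as jnp
import jax.random as random
from neural_tangents import stax

_, _, kernel_fn = stax.serial(stax.Dense(256), stax.Relu(), stax.Dense(1))

key = random.PRNGKey(2)
x_train = random.normal(key, (50, 16))
y_train = random.normal(key, (50, 1))
x_test = random.normal(key, (10, 16))

k_train_train = kernel_fn(x_train, x_train, 'nngp')  # (50, 50)
k_test_train = kernel_fn(x_test, x_train, 'nngp')    # (10, 50)

sigma2 = 1e-4  # small ridge / observation-noise term (illustrative value)
mean = k_test_train @ jnp.linalg.solve(
    k_train_train + sigma2 * jnp.eye(x_train.shape[0]), y_train)
```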
Future Developments
The paper outlines future directions for Neural Tangents, including support for additional layer types and further improvements to computational performance. These enhancements aim to enable broader and more complex experiments on increasingly challenging problems.
Furthermore, the library’s extensibility invites the research community to contribute new layers and functionalities, fostering collaboration and innovation in the field.
Conclusion
Neural Tangents provides a robust toolset for the study of infinite-width neural networks, presenting substantial opportunities for both theoretical exploration and practical application. Its design simplifies the integration of infinite networks into research pipelines, potentially bringing new insights and efficiencies to the study of neural network models. As the research community continues to engage with and expand this library, its impact on the field of AI is expected to grow, enabling more sophisticated and nuanced studies of deep learning phenomena.