TASI Lectures on Physics for Machine Learning (2408.00082v1)

Published 31 Jul 2024 in hep-th, cs.LG, and hep-ph

Abstract: These notes are based on lectures I gave at TASI 2024 on Physics for Machine Learning. The focus is on neural network theory, organized according to network expressivity, statistics, and dynamics. I present classic results such as the universal approximation theorem and neural network / Gaussian process correspondence, and also more recent results such as the neural tangent kernel, feature learning with the maximal update parameterization, and Kolmogorov-Arnold networks. The exposition on neural network theory emphasizes a field theoretic perspective familiar to theoretical physicists. I elaborate on connections between the two, including a neural network approach to field theory.

Summary

  • The paper frames ensembles of neural networks as field theories, establishing an NN-FT correspondence that bridges machine learning and physics.
  • It reviews results such as the Universal Approximation Theorem, Kolmogorov-Arnold networks, the neural network / Gaussian process (NNGP) correspondence, and the neural tangent kernel (NTK) to analyze network expressivity, statistics, and dynamics.
  • The lectures discuss parameterizations, notably the maximal update parameterization, that preserve feature learning at large width.

Overview of "TASI Lectures on Physics for Machine Learning"

The lecture notes "TASI Lectures on Physics for Machine Learning" by Jim Halverson address the intersection of neural networks and theoretical physics. The notes are structured around three core themes: the expressivity, statistics, and dynamics of neural networks, all examined through a field-theoretic lens. The lectures aim not only to analyze neural networks from a theoretical physics perspective but also to show how these concepts can inform the understanding of field theory.

Expressivity of Neural Networks

The discussion of expressivity centers on the capacity of neural networks to approximate arbitrary functions. The Universal Approximation Theorem (UAT) is presented as a cornerstone result: a network with a single hidden layer can approximate any continuous function on a compact domain to arbitrary accuracy, although the theorem gives no guarantee that such an approximator can be found easily in practice.
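
As a toy numerical illustration (a sketch, not code from the lectures; the target function, width, and activation below are arbitrary choices), the UAT ansatz f(x) ≈ Σ_i a_i σ(w_i x + b_i) can be probed by fixing random hidden-layer weights and fitting only the output weights:

```python
import numpy as np

# Minimal sketch of the UAT ansatz f(x) ~ sum_i a_i * tanh(w_i * x + b_i):
# random hidden weights, output weights fit by least squares.
rng = np.random.default_rng(0)
N = 200                                  # hidden width (assumption for the demo)
x = np.linspace(-np.pi, np.pi, 500)      # compact domain
target = np.sin(x)                       # a continuous target function

w = rng.normal(0.0, 2.0, size=N)         # hidden weights
b = rng.uniform(-np.pi, np.pi, size=N)   # hidden biases
phi = np.tanh(np.outer(x, w) + b)        # hidden-layer features, shape (500, N)

a, *_ = np.linalg.lstsq(phi, target, rcond=None)  # solve for output weights
print("max |f_N(x) - sin(x)| =", np.max(np.abs(phi @ a - target)))
```

Increasing N typically drives the error down, in line with the theorem's guarantee of existence, though not of trainability.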

The notes further introduce the Kolmogorov-Arnold representation theorem as an alternative perspective on function representation, one that inspired the development of Kolmogorov-Arnold networks (KANs). These offer a new architectural approach that places learnable activation functions on network edges rather than fixed activations on nodes, showing promise for symbolic representation and enhanced interpretability.
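
For reference, the Kolmogorov-Arnold representation theorem is usually stated as follows (standard notation, which may differ from the conventions in the notes):

```latex
f(x_1,\dots,x_n) \;=\; \sum_{q=0}^{2n} \Phi_q\!\left(\sum_{p=1}^{n} \phi_{q,p}(x_p)\right),
```

where the Φ_q and φ_{q,p} are continuous functions of a single variable. KANs turn this structure into a trainable, deep architecture by stacking layers of learnable univariate functions (typically splines) placed on edges.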

Statistics of Neural Networks

In examining ensembles of neural networks, the lectures explain how, in the infinite-width limit, networks at initialization become Gaussian processes, the neural network / Gaussian process (NNGP) correspondence, as a consequence of the Central Limit Theorem (CLT). Non-Gaussianities therefore arise from finite-width corrections or from statistical dependence among parameters, and they play the role of interactions in field theory.
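
A quick Monte Carlo check of this statement (a sketch under assumed architecture and scalings, not code from the notes): sample an ensemble of random single-hidden-layer networks with 1/√N output scaling and watch a non-Gaussian statistic, the excess kurtosis of f(x), shrink as the width N grows.

```python
import numpy as np

# NNGP sketch: the output of a random one-hidden-layer network at a fixed
# input approaches a Gaussian as the width N grows (CLT); the excess
# kurtosis, a connected 4-point statistic, should shrink roughly like 1/N.
rng = np.random.default_rng(1)
x = np.array([0.7, -0.3])          # a fixed input (arbitrary)
samples = 100_000                  # networks per ensemble

for N in (2, 8, 32, 128):
    W = rng.normal(size=(samples, N, x.size))                  # hidden weights
    a = rng.normal(scale=1.0 / np.sqrt(N), size=(samples, N))  # 1/sqrt(N) output scaling
    f = np.sum(a * np.tanh(W @ x), axis=1)                     # ensemble of outputs f(x)
    excess_kurtosis = np.mean(f**4) / np.mean(f**2) ** 2 - 3.0
    print(f"N = {N:4d}   excess kurtosis = {excess_kurtosis:+.3f}")
```

At the largest widths the estimate is dominated by Monte Carlo noise, which is the point: it is consistent with zero.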

The notes also explore imposing symmetries on neural networks, showing that suitable choices of architecture and parameter distribution yield statistically invariant ensembles, akin to global symmetries in field theory. Concrete examples, such as Euclidean-invariant networks, demonstrate practical implementations of these ideas.
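
As a sketch of the idea (the architecture and numbers here are assumptions, not taken from the notes): with i.i.d. Gaussian first-layer weights, the induced distribution over functions depends on inputs only through rotation invariants, so the empirical two-point function should be unchanged when all inputs are rotated together.

```python
import numpy as np

# Invariance sketch: for i.i.d. Gaussian first-layer weights, the ensemble
# two-point function E[f(x1) f(x2)] depends only on O(d)-invariants of the
# inputs, so rotating both inputs should leave it unchanged (up to MC noise).
rng = np.random.default_rng(2)
d, N, samples = 2, 64, 50_000

def two_point(x1, x2):
    W = rng.normal(size=(samples, N, d))
    a = rng.normal(scale=1.0 / np.sqrt(N), size=(samples, N))
    f1 = np.sum(a * np.tanh(W @ x1), axis=1)
    f2 = np.sum(a * np.tanh(W @ x2), axis=1)
    return np.mean(f1 * f2)

theta = 0.9
R = np.array([[np.cos(theta), -np.sin(theta)],
              [np.sin(theta),  np.cos(theta)]])   # a rotation in O(2)
x1, x2 = np.array([1.0, 0.0]), np.array([0.3, 0.8])
print(two_point(x1, x2), two_point(R @ x1, R @ x2))  # should agree up to noise
```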

Dynamics of Neural Networks

The dynamics section gives prominence to the Neural Tangent Kernel (NTK), which governs how a network's outputs evolve under gradient descent. In the infinite-width limit the kernel freezes at its value at initialization, so training reduces to linear, analytically solvable dynamics. While this "lazy training" regime explains learning in some scenarios, it also highlights a limitation: hidden-layer features do not evolve, so there is no meaningful feature learning.
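
In commonly used notation (conventions may differ slightly from the notes), the NTK and the associated gradient-flow dynamics for a mean-squared-error loss on training data (x_i, y_i) read:

```latex
\Theta_t(x, x') \;=\; \sum_{\mu} \frac{\partial f_{\theta_t}(x)}{\partial \theta_\mu}\,
                                 \frac{\partial f_{\theta_t}(x')}{\partial \theta_\mu},
\qquad
\frac{d f_{\theta_t}(x)}{dt} \;=\; -\,\eta \sum_{i} \Theta_t(x, x_i)\,\bigl(f_{\theta_t}(x_i) - y_i\bigr).
```

In the infinite-width, NTK-parameterized limit, Θ_t remains fixed at its initialization value Θ_0, so training is equivalent to kernel regression with a kernel that never learns from data.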

To address this, the lecture notes discuss feature learning via a scaling analysis that ensures network features evolve nontrivially during training. The maximal update parameterization (μP) exemplifies a parameterization designed to retain feature learning at large width, providing a route beyond the NTK's frozen-kernel dynamics.
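
One common way to organize such scaling analyses is the abc-parameterization of Yang and Hu, quoted here as a reference point (the notes' conventions may differ): layer weights and the learning rate carry explicit powers of the width n,

```latex
W^{\ell} = n^{-a_\ell}\, w^{\ell}, \qquad
w^{\ell}_{ij} \sim \mathcal{N}\!\bigl(0,\; n^{-2 b_\ell}\bigr), \qquad
\eta = \eta_0\, n^{-c},
```

and different choices of (a_ℓ, b_ℓ, c) select different infinite-width limits: the NTK parameterization gives frozen-kernel (lazy) dynamics, while μP is the maximal choice for which preactivations stay O(1) and hidden-layer features move at O(1) as n → ∞.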

Neural Networks and Field Theory

The culmination of these lectures is the NN-FT correspondence, which treats an ensemble of neural networks as a field theory: the network defines the field, and the distribution over parameters defines the statistical ensemble, with correlation functions computed as parameter-space expectations. The notes propose that, viewed through this lens, neural networks can be used to engineer new field theories, including ones with non-standard interactions, potentially offering new angles on quantum field theory.
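
In schematic form (the notation here is illustrative of the general NN-FT setup rather than copied from the notes), the theory is specified by the network f_θ and the parameter distribution P(θ), with correlation functions and generating functional given by expectations over parameters:

```latex
G^{(n)}(x_1, \dots, x_n) \;=\; \mathbb{E}_{\theta \sim P(\theta)}\bigl[\, f_\theta(x_1) \cdots f_\theta(x_n) \,\bigr],
\qquad
Z[J] \;=\; \mathbb{E}_{\theta \sim P(\theta)}\Bigl[\, e^{\int d^d x\, J(x)\, f_\theta(x)} \,\Bigr].
```

In the infinite-width NNGP limit this defines a free (Gaussian) theory, while finite width or correlated parameters generate interactions.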

Implications and Future Directions

The lectures suggest that an improved theoretical understanding of neural networks could drive developments reminiscent of the personal computer revolution, in which large models are reduced to manageable scales without sacrificing capability. Furthermore, integrating principles of expressivity, statistics, dynamics, and architectural design may yield better learning algorithms.

The ongoing challenge is to construct theoretical frameworks that bridge the divide between rigorous mathematical results and empirical success in machine learning. Such a union promises advances not only in artificial intelligence but also in the foundational understanding of field theory, benefiting both domains. Open directions include symmetries, computational shortcuts, and the foundational principles of learned representations.
