Papers
Topics
Authors
Recent
Gemini 2.5 Flash
Gemini 2.5 Flash
97 tokens/sec
GPT-4o
53 tokens/sec
Gemini 2.5 Pro Pro
43 tokens/sec
o3 Pro
4 tokens/sec
GPT-4.1 Pro
47 tokens/sec
DeepSeek R1 via Azure Pro
28 tokens/sec
2000 character limit reached

Mechanism of feature learning in deep fully connected networks and kernel machines that recursively learn features (2212.13881v3)

Published 28 Dec 2022 in cs.LG, cs.AI, and stat.ML

Abstract: In recent years neural networks have achieved impressive results on many technological and scientific tasks. Yet, the mechanism through which these models automatically select features, or patterns in data, for prediction remains unclear. Identifying such a mechanism is key to advancing performance and interpretability of neural networks and promoting reliable adoption of these models in scientific applications. In this paper, we identify and characterize the mechanism through which deep fully connected neural networks learn features. We posit the Deep Neural Feature Ansatz, which states that neural feature learning occurs by implementing the average gradient outer product to up-weight features strongly related to model output. Our ansatz sheds light on various deep learning phenomena including emergence of spurious features and simplicity biases and how pruning networks can increase performance, the "lottery ticket hypothesis." Moreover, the mechanism identified in our work leads to a backpropagation-free method for feature learning with any machine learning model. To demonstrate the effectiveness of this feature learning mechanism, we use it to enable feature learning in classical, non-feature learning models known as kernel machines and show that the resulting models, which we refer to as Recursive Feature Machines, achieve state-of-the-art performance on tabular data.

User Edit Pencil Streamline Icon: https://streamlinehq.com
Authors (4)
  1. Adityanarayanan Radhakrishnan (22 papers)
  2. Daniel Beaglehole (10 papers)
  3. Parthe Pandit (25 papers)
  4. Mikhail Belkin (76 papers)
Citations (9)

Summary

We haven't generated a summary for this paper yet.