Dynamical Behaviors of the Gradient Flows for In-Context Learning (2412.16683v1)
Published 21 Dec 2024 in math.DS
Abstract: We derive the system of differential equations for the gradient flow characterizing the training process of linear in-context learning in full generality. Next, we explore the geometric structure of the gradient flows in two instances, including identifying their invariants, optimum, and saddle points. This understanding allows us to quantify the behavior of the two gradient flows under the full generality of parameters and data.