In-Context Learning of Linear Dynamical Systems with Transformers: Error Bounds and Depth-Separation (2502.08136v2)

Published 12 Feb 2025 in cs.LG and stat.ML

Abstract: This paper investigates approximation-theoretic aspects of the in-context learning capability of transformers in representing a family of noisy linear dynamical systems. Our first theoretical result establishes an upper bound on the approximation error of multi-layer transformers with respect to an $L^2$-testing loss uniformly defined across tasks. This result demonstrates that transformers with logarithmic depth can achieve error bounds comparable to those of the least-squares estimator. In contrast, our second result establishes a non-diminishing lower bound on the approximation error for a class of single-layer linear transformers, which suggests a depth-separation phenomenon for transformers in the in-context learning of dynamical systems. Moreover, this second result uncovers a critical distinction in the approximation power of single-layer linear transformers when learning from IID versus non-IID data.
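
As a rough illustration of the least-squares baseline mentioned in the abstract, the sketch below simulates a scalar noisy linear dynamical system and estimates its coefficient from one trajectory. The scalar model $x_{t+1} = a x_t + \text{noise}$, the Gaussian noise, and all names (`simulate_trajectory`, `least_squares_estimate`, `true_a`) are assumptions made for this example; the paper's actual setting and transformer construction are not reproduced here.

```python
import random


def simulate_trajectory(a, length, noise_std, rng):
    """Simulate x_{t+1} = a * x_t + noise (assumed scalar model, Gaussian noise)."""
    xs = [rng.gauss(0.0, 1.0)]  # random initial state
    for _ in range(length):
        xs.append(a * xs[-1] + rng.gauss(0.0, noise_std))
    return xs


def least_squares_estimate(xs):
    """Least-squares fit of a in x_{t+1} ~ a * x_t along a single trajectory."""
    num = sum(x * y for x, y in zip(xs[:-1], xs[1:]))
    den = sum(x * x for x in xs[:-1])
    return num / den


rng = random.Random(0)  # fixed seed so the run is reproducible
true_a = 0.5            # hypothetical system coefficient
xs = simulate_trajectory(true_a, 2000, 0.1, rng)
a_hat = least_squares_estimate(xs)

# One-step-ahead prediction from the last observed state.
prediction = a_hat * xs[-1]
```

Consecutive states on a trajectory depend on one another, so the regression pairs $(x_t, x_{t+1})$ are not IID samples; this is the kind of data the abstract contrasts with the IID case.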

Authors (4)