A Comparative Study of Learning Paradigms in Large Language Models via Intrinsic Dimension (2412.06245v2)

Published 9 Dec 2024 in cs.CL

Abstract: The performance of LLMs on natural language tasks can be improved through both supervised fine-tuning (SFT) and in-context learning (ICL), which operate via distinct mechanisms. Supervised fine-tuning updates the model's weights by minimizing loss on training data, whereas in-context learning leverages task demonstrations embedded in the prompt, without changing the model's parameters. This study investigates the effects of these learning paradigms on the hidden representations of LLMs using Intrinsic Dimension (ID). We use ID to estimate the number of degrees of freedom of representations extracted from LLMs as they perform specific natural language tasks. We first explore how the ID of LLM representations evolves during SFT and how it varies with the number of demonstrations in ICL. We then compare the IDs induced by SFT and ICL and find that ICL consistently induces a higher ID compared to SFT, suggesting that representations generated during ICL reside in higher dimensional manifolds in the embedding space.
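The abstract's central quantity, the Intrinsic Dimension of a cloud of hidden representations, is commonly estimated with the TwoNN estimator of Facco et al. (2017). The paper's exact estimator is not stated in the text above, so the sketch below is an assumption: a minimal TwoNN implementation applied to a matrix of hidden states, with all function and variable names (two_nn_id, hidden_states) purely illustrative.

    # Hedged sketch: TwoNN intrinsic-dimension estimate on LLM hidden states.
    # Not the paper's confirmed method; shown as one standard way to compute ID.
    import numpy as np
    from sklearn.neighbors import NearestNeighbors

    def two_nn_id(X: np.ndarray) -> float:
        """Estimate the intrinsic dimension of points X with shape (n_samples, n_features)."""
        # Distances to the two nearest neighbors (column 0 is the point itself).
        nn = NearestNeighbors(n_neighbors=3).fit(X)
        dists, _ = nn.kneighbors(X)
        r1, r2 = dists[:, 1], dists[:, 2]
        mu = r2 / r1                            # ratio of 2nd to 1st neighbor distance
        mu = mu[np.isfinite(mu) & (mu > 1.0)]   # drop duplicates / degenerate points
        # Under the TwoNN model, mu follows a Pareto(d) law, so the MLE is N / sum(log mu).
        return len(mu) / float(np.sum(np.log(mu)))

    # Usage: estimate the ID of last-layer representations (random stand-in data here).
    hidden_states = np.random.randn(2000, 4096).astype(np.float32)
    print(f"Estimated intrinsic dimension: {two_nn_id(hidden_states):.1f}")

In this setup, comparing the ID of representations collected under SFT versus ICL amounts to running such an estimator on the two sets of hidden states and comparing the resulting values, which is how the abstract's claim of a consistently higher ID under ICL would be quantified.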
