Efficient Training of Sparse Autoencoders for Large Language Models via Layer Groups

Published 28 Oct 2024 in cs.CL and cs.AI (arXiv:2410.21508v1)

Abstract: Sparse Autoencoders (SAEs) have recently been employed as an unsupervised approach for understanding the inner workings of LLMs. They reconstruct the model's activations with a sparse linear combination of interpretable features. However, training SAEs is computationally intensive, especially as models grow in size and complexity. To address this challenge, we propose a novel training strategy that reduces the number of trained SAEs from one per layer to one per group of contiguous layers. Our experimental results on Pythia 160M show a speedup of up to 6x without compromising reconstruction quality or performance on downstream tasks. Layer clustering therefore offers an efficient approach to training SAEs in modern LLMs.
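
The core idea in the abstract, training one SAE per group of contiguous layers instead of one per layer, can be sketched as follows. This is a minimal illustration under assumptions, not the authors' implementation: the class name GroupSAE, the hidden width, the L1 coefficient, the example layer grouping, and the training loop are all illustrative choices, and the paper's actual grouping criterion and loss details may differ.

    # Minimal sketch (not the paper's code): one sparse autoencoder shared
    # across a contiguous group of layers, trained on their pooled activations.
    import torch
    import torch.nn as nn

    class GroupSAE(nn.Module):
        """SAE reconstructing activations from several contiguous layers."""
        def __init__(self, d_model: int, d_hidden: int):
            super().__init__()
            self.encoder = nn.Linear(d_model, d_hidden)
            self.decoder = nn.Linear(d_hidden, d_model)

        def forward(self, x: torch.Tensor):
            # Non-negative feature activations; sparsity is encouraged by the loss.
            f = torch.relu(self.encoder(x))
            return self.decoder(f), f

    def sae_loss(x, x_hat, f, l1_coef: float = 1e-3):
        # Reconstruction error plus an L1 penalty that pushes features toward zero.
        return ((x - x_hat) ** 2).mean() + l1_coef * f.abs().mean()

    # Instead of 12 per-layer SAEs (e.g. for a 12-layer model such as Pythia 160M),
    # train 4 group SAEs; the grouping below is an illustrative assumption.
    d_model, d_hidden = 768, 768 * 8
    layer_groups = [[0, 1, 2], [3, 4, 5], [6, 7, 8], [9, 10, 11]]
    saes = [GroupSAE(d_model, d_hidden) for _ in layer_groups]
    opts = [torch.optim.Adam(sae.parameters(), lr=1e-4) for sae in saes]

    def train_step(group_idx: int, acts_by_layer: dict[int, torch.Tensor]) -> float:
        """One update for the SAE of a group, on activations from all its layers."""
        sae, opt = saes[group_idx], opts[group_idx]
        # Pool activations from every layer in the group into one batch,
        # so the single SAE learns features shared across those layers.
        batch = torch.cat([acts_by_layer[l] for l in layer_groups[group_idx]], dim=0)
        x_hat, f = sae(batch)
        loss = sae_loss(batch, x_hat, f)
        opt.zero_grad()
        loss.backward()
        opt.step()
        return loss.item()

The speedup comes from training 4 SAEs instead of 12 while each sees roughly the same total volume of activations; whether features remain as interpretable as per-layer ones is exactly what the paper's reconstruction and downstream evaluations test.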
