Designing randomization baselines for transcoders and crosscoders

Develop appropriate randomization baselines for transcoders and crosscoders that account for their distinct training objectives, enabling rigorous evaluation analogous to frozen-component baselines used for Sparse Autoencoders.

Background

Korznikov et al. introduce frozen-component baselines for SAEs to test whether these models learn meaningful features beyond what random initialization already provides. However, the study covers only standard SAE architectures and does not include related methods such as transcoders or crosscoders, which have different training objectives.

The authors explicitly note that creating comparable randomization baselines for these alternative approaches is currently unresolved, indicating a need for method-specific baseline design to support rigorous, cross-method evaluation.
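
To make "different training objectives" concrete, here is a minimal NumPy sketch of the three losses as they are commonly formulated. It is not taken from the paper: the function names, the ReLU encoder, the plain L1 penalty, and the `l1` coefficient are all illustrative assumptions, and the crosscoder penalty is simplified (published variants typically weight it by decoder norms).

```python
import numpy as np

def relu(z):
    return np.maximum(z, 0.0)

def l1_penalty(f):
    # Mean over the batch of the L1 norm of the latent code.
    return np.mean(np.sum(np.abs(f), axis=-1))

def mse(target, pred):
    # Mean over the batch of the squared reconstruction error.
    return np.mean(np.sum((target - pred) ** 2, axis=-1))

def sae_loss(x, W_enc, b_enc, W_dec, b_dec, l1=1e-3):
    # SAE: reads activations x and reconstructs the same x.
    f = relu(x @ W_enc + b_enc)
    return mse(x, f @ W_dec + b_dec) + l1 * l1_penalty(f)

def transcoder_loss(x_in, y_out, W_enc, b_enc, W_dec, b_dec, l1=1e-3):
    # Transcoder: reads a component's input and predicts that component's output.
    f = relu(x_in @ W_enc + b_enc)
    return mse(y_out, f @ W_dec + b_dec) + l1 * l1_penalty(f)

def crosscoder_loss(xs, W_encs, b_enc, W_decs, b_decs, l1=1e-3):
    # Crosscoder: one shared latent code reads from and reconstructs several
    # activation sources (layers or models), each with its own weights.
    # Simplification (assumption): unweighted L1 penalty on the shared code.
    f = relu(sum(x @ W for x, W in zip(xs, W_encs)) + b_enc)
    recon = sum(mse(x, f @ Wd + bd) for x, Wd, bd in zip(xs, W_decs, b_decs))
    return recon + l1 * l1_penalty(f)
```

In this sketch the three losses coincide when the transcoder target equals its input and the crosscoder has a single source. They diverge only in what is being reconstructed, which is the property that makes baseline design method-specific.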

References

Second, we focus on standard SAE architectures and do not evaluate related approaches such as transcoders or crosscoders; designing appropriate randomization baselines for these methods, given their different training objectives, remains an open challenge.

— Sanity Checks for Sparse Autoencoders: Do SAEs Beat Random Baselines? (2602.14111 - Korznikov et al., 15 Feb 2026) in Section 6 (Limitations)
