Targeted SAE clustering for sentiment and emotion labels
Identify a combination of natural-language query terms and a top-k latent selection parameter under which sparse autoencoder (SAE)-based targeted clustering aligns with ground-truth sentiment and emotion labels on Twitter datasets (SemEval-2017 Task 4 for sentiment, CARER for emotion). This would establish whether SAE embeddings can recover these label structures under appropriate filtering.
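The search described above can be pictured as a grid over (query, k) pairs, each scored against the ground-truth labels. The following is only a minimal sketch under stated assumptions, not the implementation from the cited paper: the per-query latent relevance scores, the argmax cluster assignment (a simple stand-in for whatever clustering algorithm is actually used), the purity metric, and the toy data are all hypothetical and chosen for illustration.

```python
from collections import Counter, defaultdict


def top_k_latents(relevance, k):
    """Return indices of the k latents with the highest relevance score."""
    order = sorted(range(len(relevance)), key=lambda i: relevance[i], reverse=True)
    return order[:k]


def cluster_by_argmax(activations, latent_ids):
    """Stand-in clustering: assign each text to its strongest selected latent."""
    clusters = []
    for row in activations:
        best = max(latent_ids, key=lambda j: row[j])
        # -1 collects texts on which none of the selected latents fires.
        clusters.append(best if row[best] > 0 else -1)
    return clusters


def purity(clusters, labels):
    """Fraction of texts that carry the majority label of their cluster."""
    by_cluster = defaultdict(list)
    for c, y in zip(clusters, labels):
        by_cluster[c].append(y)
    majority = sum(Counter(ys).most_common(1)[0][1] for ys in by_cluster.values())
    return majority / len(labels)


def grid_search(activations, labels, query_relevance, ks):
    """Score every (query, k) pair by cluster purity against the labels."""
    results = {}
    for query, relevance in query_relevance.items():
        for k in ks:
            ids = top_k_latents(relevance, k)
            results[(query, k)] = purity(cluster_by_argmax(activations, ids), labels)
    return results


# Toy data (invented): 4 texts x 4 SAE latents, with binary sentiment labels.
activations = [
    [0.9, 0.0, 0.1, 0.0],
    [0.8, 0.1, 0.0, 0.0],
    [0.0, 0.7, 0.0, 0.2],
    [0.1, 0.9, 0.0, 0.0],
]
labels = ["positive", "positive", "negative", "negative"]

# Hypothetical relevance of each latent to each natural-language query.
query_relevance = {
    "sentiment": [0.9, 0.8, 0.1, 0.2],
    "topic": [0.1, 0.2, 0.9, 0.8],
}

scores = grid_search(activations, labels, query_relevance, ks=[1, 2])
best = max(scores, key=scores.get)
print(best, scores[best])
```

Purity is used here only because it is easy to compute by hand; it tends to rise as the number of clusters grows, so a chance-corrected measure such as the adjusted Rand index would likely be a better choice when comparing different values of k.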
References
For our SAE method, we were unable to find a good combination of queries and k.
— Interpretable Embeddings with Sparse Autoencoders: A Data Analysis Toolkit
(2512.10092 - Jiang et al., 10 Dec 2025) in Appendix: Additional Results—Clustering, subsection “Failure to recover ground truth labels for sentiment and emotion clustering”