Cross-relation generalization in TCH-instantiated natural language training
Determine whether the linear truth encoding, and the behavioral effects associated with it, that a transformer language model learns when trained on paired CounterFact examples from a single relation (e.g., WorksIn, SpeaksLanguage, BornIn) generalize to relations not seen during training, and characterize the extent and robustness of any such cross-relation transfer.
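One way to make the question concrete is a train-on-one-relation, test-on-another probing grid. The sketch below is illustrative only and is not the paper's protocol: it assumes hidden representations have already been extracted per relation, it swaps in a simple difference-of-means linear probe, and the helper names (fit_mean_difference_probe, probe_accuracy, transfer_matrix, synthetic_relation) are hypothetical. The synthetic generator produces placeholder vectors, not model activations, and exists only so the sketch runs end to end.

```python
import random


def fit_mean_difference_probe(X, y):
    """Fit a linear probe: w = mean(true) - mean(false), boundary at the midpoint."""
    dim = len(X[0])
    pos = [x for x, label in zip(X, y) if label]
    neg = [x for x, label in zip(X, y) if not label]
    mu_pos = [sum(x[i] for x in pos) / len(pos) for i in range(dim)]
    mu_neg = [sum(x[i] for x in neg) / len(neg) for i in range(dim)]
    w = [p - n for p, n in zip(mu_pos, mu_neg)]
    mid = [(p + n) / 2 for p, n in zip(mu_pos, mu_neg)]
    b = -sum(wi * mi for wi, mi in zip(w, mid))
    return w, b


def probe_accuracy(w, b, X, y):
    """Fraction of examples whose sign(w.x + b) matches the true/false label."""
    correct = 0
    for x, label in zip(X, y):
        score = sum(wi * xi for wi, xi in zip(w, x)) + b
        correct += int((score > 0) == bool(label))
    return correct / len(X)


def transfer_matrix(reps_by_relation):
    """Map (train_relation, test_relation) -> probe accuracy.

    reps_by_relation: {relation_name: (X, y)}. Diagonal entries are
    in-sample fits; off-diagonal entries measure cross-relation transfer.
    """
    results = {}
    for train_rel, (X_tr, y_tr) in reps_by_relation.items():
        w, b = fit_mean_difference_probe(X_tr, y_tr)
        for test_rel, (X_te, y_te) in reps_by_relation.items():
            results[(train_rel, test_rel)] = probe_accuracy(w, b, X_te, y_te)
    return results


def synthetic_relation(rng, direction, n=200, scale=3.0):
    """ASSUMPTION: placeholder data, not real activations.

    True examples are shifted by +scale*direction, false ones by
    -scale*direction, with unit Gaussian noise on every coordinate.
    """
    X, y = [], []
    for k in range(n):
        label = k % 2 == 0
        sign = 1.0 if label else -1.0
        X.append([sign * scale * d + rng.gauss(0.0, 1.0) for d in direction])
        y.append(label)
    return X, y


if __name__ == "__main__":
    rng = random.Random(0)
    shared = [1.0] + [0.0] * 7  # hypothetical truth direction shared by all relations
    data = {name: synthetic_relation(rng, shared)
            for name in ("WorksIn", "SpeaksLanguage", "BornIn")}
    for pair, acc in sorted(transfer_matrix(data).items()):
        print(pair, round(acc, 3))
```

With real representations, high off-diagonal accuracy would suggest a truth direction shared across relations, while a strong diagonal with weak off-diagonals would suggest relation-specific encodings. A probe grid of this kind only addresses the linear-encoding half of the question; the behavioral effects would need a separate evaluation.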
References
We leave the question of generalization between relations to a future work.
— Emergence of Linear Truth Encodings in Language Models
(2510.15804 - Ravfogel et al., 17 Oct 2025) in Section 5.2, Subsubsection "Instantiating the TCH in Natural Language" (Setup)