Validity of the content-subspace projection assumption
Ascertain whether the residual component in the per-example steering vector $d_k$ lies within the content subspace. Here $d_k$ is the mean hidden state at behavior-boundary paragraphs minus the mean hidden state at execution paragraphs, and the content subspace is estimated via SVD from question-only hidden states at the same target layer used to construct the steering vector.
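One way to probe this empirically is to measure how much of $d_k$ is captured by the estimated subspace. The following is a minimal NumPy sketch on synthetic data, not the paper's implementation. The function names, the mean-centering before the SVD, the subspace rank, and the synthetic hidden states are all assumptions made for illustration. The residual is not separately observable, so the captured fraction of $d_k$ as a whole is only an indirect diagnostic.

```python
import numpy as np

def content_basis(Q, rank):
    """Top-`rank` right singular vectors of question-only states Q (n, dim).
    Mean-centering is an assumption of this sketch."""
    Qc = Q - Q.mean(axis=0, keepdims=True)
    _, _, Vt = np.linalg.svd(Qc, full_matrices=False)
    return Vt[:rank]  # (rank, dim), rows are orthonormal

def steering_vector(boundary, execution):
    """d_k: mean boundary hidden state minus mean execution hidden state."""
    return boundary.mean(axis=0) - execution.mean(axis=0)

def fraction_in_subspace(d, basis):
    """Share of the squared norm of d captured by the subspace, in [0, 1]."""
    coeffs = basis @ d
    return float(coeffs @ coeffs / (d @ d))

def project_out(d, basis):
    """Remove the component of d that lies in the subspace."""
    return d - basis.T @ (basis @ d)

# Synthetic stand-ins for hidden states (hypothetical data, not model activations).
rng = np.random.default_rng(0)
dim, rank = 64, 4
content_dirs = rng.normal(size=(rank, dim))
Q = rng.normal(size=(200, rank)) @ content_dirs + 0.01 * rng.normal(size=(200, dim))
behavior = rng.normal(size=dim)
# Boundary states carry a behavior direction plus a shifted content component.
boundary = behavior + (rng.normal(size=(30, rank)) + 1.0) @ content_dirs
execution = rng.normal(size=(30, rank)) @ content_dirs

B = content_basis(Q, rank)
d_k = steering_vector(boundary, execution)
print("fraction of d_k in content subspace:", fraction_in_subspace(d_k, B))
print("fraction after projection:", fraction_in_subspace(project_out(d_k, B), B))
```

A fraction near zero after projection confirms only that the projection removed what the subspace spans. Whether the true residual lies entirely inside that span is the open question, and it would need ground-truth content directions to settle.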
References
This is a heuristic: we do not know that the residual in $d_k$ falls exactly in this subspace, but question-only representations provide a reasonable proxy for the directions we want to suppress.
— Reliable Control-Point Selection for Steering Reasoning in Large Language Models
(2604.02113 - Zhuang et al., 2 Apr 2026) in Section 3 (Method), Content-Subspace Projection