Mitigating teacher-forcing mismatch in rank-aware token-level training beyond the first token
Develop methods to mitigate the mismatch between rank-aware training under teacher forcing and inference-time generation at timesteps $t>1$, which arises when the SToICaL loss for autoregressive ranking uses prefix-tree–based token-level target distributions.
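The toy sketch below is not taken from the paper. It illustrates one possible reading of the problem, assuming the token-level target at step $t$ is the relevance mass of all documents whose identifier shares the current prefix, normalized over the next token. The function name `prefix_targets`, the token sequences, and the relevance weights are all hypothetical.

```python
from collections import defaultdict


def prefix_targets(docs, prefix):
    """Next-token target distribution given a prefix (illustrative assumption).

    docs   : dict mapping a docid token tuple to a non-negative relevance weight
    prefix : tuple of tokens already consumed
    Returns a dict token -> probability, or {} if no document extends the prefix.
    """
    prefix = tuple(prefix)
    t = len(prefix)
    mass = defaultdict(float)
    for tokens, weight in docs.items():
        # Only documents in the prefix-tree branch selected by `prefix` contribute.
        if len(tokens) > t and tokens[:t] == prefix:
            mass[tokens[t]] += weight
    total = sum(mass.values())
    if total <= 0:
        return {}
    return {tok: m / total for tok, m in mass.items()}


# Hypothetical corpus: three docids with made-up relevance weights.
docs = {
    ("a", "x"): 3.0,
    ("a", "y"): 1.0,
    ("b", "z"): 2.0,
}

first_step = prefix_targets(docs, ())        # t = 1: every document contributes
gold_branch = prefix_targets(docs, ("a",))   # t = 2 under a teacher-forced prefix "a"
other_branch = prefix_targets(docs, ("b",))  # t = 2 if the model had emitted "b"
```

Under these assumptions, the target at $t=1$ reflects the whole ranking, while training with the teacher-forced prefix `("a",)` only supervises the branch beneath `a`. A model that emits `b` at inference conditions on a prefix whose $t=2$ target was never seen during training, which is one way the mismatch for $t>1$ can show up.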
References
We leave the question of mitigating this rank-aware training-inference mismatch for $t>1$ to future work.
— Autoregressive Ranking: Bridging the Gap Between Dual and Cross Encoders
(2601.05588 - Rozonoyer et al., 9 Jan 2026) in Section 4.3 (Prefix Tree for Rank-Aware Token-Level Target Distributions), footnote