Learning low logit rank models using only conditional sampling
Determine whether approximately low-logit-rank language models can be learned to total variation error ε, in time and query complexity polynomial in T, d, |Σ|, α, 1/δ, and 1/ε, using only a conditional sampling oracle that returns y_{t+1} sampled from M(· | y_{1:t}), i.e., without logit query access.
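To make the two access models concrete, here is a minimal Python sketch (all names, such as ToyLowLogitRankModel and sample_next, are hypothetical and not from the paper). It builds a toy autoregressive model whose logit vectors all lie in a d-dimensional subspace, so its logit matrix has rank at most d, and exposes both a logit-query oracle and the weaker conditional-sampling oracle to which the open problem restricts the learner.

```python
import numpy as np

class ToyLowLogitRankModel:
    """Hypothetical toy model over alphabet Sigma = {0, ..., S-1}.

    Every logit vector equals U @ (V * e(prefix)) for a d-dimensional
    prefix embedding e, so all logit vectors lie in the column span of U
    and the logit matrix over prefixes has rank at most d.
    """

    def __init__(self, S=8, d=3, T=16, seed=0):
        rng = np.random.default_rng(seed)
        self.S, self.d, self.T = S, d, T
        self.U = rng.normal(size=(S, d))  # per-token factors
        self.V = rng.normal(size=(d,))    # mixing weights

    def _prefix_embedding(self, prefix):
        # Hypothetical d-dimensional summary of the prefix y_{1:t}.
        e = np.zeros(self.d)
        for t, y in enumerate(prefix):
            e += self.U[y] / (t + 1)
        return e

    def logits(self, prefix):
        # Logit-query access: the full logit vector of M(. | y_{1:t}).
        # The open problem asks whether learning succeeds WITHOUT this oracle.
        return self.U @ (self.V * self._prefix_embedding(prefix))

    def sample_next(self, prefix, rng):
        # Conditional-sampling access: one token y_{t+1} ~ M(. | y_{1:t}).
        z = self.logits(prefix)
        p = np.exp(z - z.max())
        p /= p.sum()
        return int(rng.choice(self.S, p=p))

if __name__ == "__main__":
    rng = np.random.default_rng(1)
    M = ToyLowLogitRankModel()
    prefix = []
    for _ in range(M.T):
        prefix.append(M.sample_next(prefix, rng))
    print("sampled sequence:", prefix)
```

A learner restricted to sample_next sees only sampled tokens and can estimate conditional probabilities only empirically; roughly, a probability that is exponentially small in the logit magnitudes takes exponentially many samples to resolve, which is the obstruction behind the exponential dependence mentioned in the quote below.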
References
We ask if it is possible to obtain our results only under this weaker access (without suffering exponential dependence on the value of the logits, as in the paper's remark on conditional sampling): Can we learn (approximately) low-logit rank models to error $\varepsilon$ in $\mathrm{poly}(T, d, |\Sigma|, \alpha, 1/\delta, 1/\varepsilon)$ time using only a conditional sampling oracle?
— Provably Learning from Modern Language Models via Low Logit Rank
(arXiv:2512.09892, Golowich et al., 10 Dec 2025), in Conclusions and Future Directions, "Learning from conditional samples" paragraph