Ordering of burn-in times for domain randomization versus robust control and certainty equivalence

Prove that, in the linear quadratic regulator learning setting analyzed in this paper, the burn-in time (the minimum number of experiments required for the sample-efficiency bounds to hold) for the domain randomization controller lies between the burn-in times of robust control and certainty equivalence when domain randomization uses the data-informed sampling distributions defined in this work.

Background

The paper compares sample-efficiency bounds for certainty equivalence (CE), robust control (RC), and domain randomization (DR) for learning the linear quadratic regulator (LQR) from many trajectories. CE and DR achieve the optimal asymptotic 1/N rate with a trace-norm dependence, while RC achieves a 1/N rate with an operator-norm dependence and often a smaller burn-in. Based on theory and experiments, the authors conjecture that DR's burn-in time lies between those of RC and CE, and state that a formal proof is left to future work.

Establishing this ordering would clarify how soon DR begins to provide finite, meaningful guarantees relative to CE and RC, complementing the asymptotic optimality results for DR and its observed empirical advantages in low-data regimes.
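The conjectured ordering can be illustrated with a toy model (not from the paper; all constants below are hypothetical). Suppose each method's bound holds once the parameter-estimation error, decaying like c/sqrt(N), falls below a method-specific tolerance; a more robust method tolerates a larger error and thus has an earlier burn-in:

```python
import math

# Hypothetical constant governing estimation-error decay eps(N) = C / sqrt(N).
C = 10.0

# Hypothetical tolerances: RC tolerates the largest estimation error,
# CE the smallest, and DR sits in between -- the conjectured ordering.
tolerance = {"RC": 0.50, "DR": 0.25, "CE": 0.10}

def burn_in(tol: float, c: float = C) -> int:
    """Smallest N with c / sqrt(N) <= tol, i.e. N >= (c / tol)^2."""
    return math.ceil((c / tol) ** 2)

burn_ins = {method: burn_in(tol) for method, tol in tolerance.items()}
print(burn_ins)  # {'RC': 400, 'DR': 1600, 'CE': 10000}

# Under these toy constants, the conjectured ordering holds:
assert burn_ins["RC"] <= burn_ins["DR"] <= burn_ins["CE"]
```

This is only a sketch of what "burn-in lies between RC and CE" means quantitatively; the open problem asks for a proof of this ordering under the paper's actual bounds and data-informed sampling distributions.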

References

We further conjecture that the burn-in time for DR lies between that of RC and CE. We leave proving this to future work, but verify this conjecture numerically.

Domain Randomization is Sample Efficient for Linear Quadratic Control (2502.12310 - Fujinami et al., 17 Feb 2025) in Subsection Contributions (Section 1)