Consistency of informative source detection via cross-validation for Transfer MNI
Establish the consistency of the K-fold cross-validation-based procedure for detecting informative sources that is used by the Informative-Weighted Transfer Minimum-ℓ2-Norm Interpolator (WTM) in high-dimensional linear regression with benign overfitting. Specifically, prove that with high probability the estimated informative set \widehat{I} = { q ∈ [Q] : \widehat{L}(\hat{β}_{TM}^{(q)}) − \widehat{L}(\hat{β}_{M}^{(0)}) ≤ D^{(0)} } equals the oracle informative set I = { q ∈ [Q] : R(\hat{β}_{TM}^{(q)}) − R(\hat{β}_{M}^{(0)}) < 0 }. Here \hat{β}_{TM}^{(q)} is the Transfer MNI (pre-trained on source q and fine-tuned on the target), \hat{β}_{M}^{(0)} is the target-only MNI, \widehat{L}(·) denotes the K-fold cross-validation loss on the target, D^{(0)} is the threshold in the detection rule, and R(·) denotes excess risk on the target distribution.
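The detection rule defining \widehat{I} can be illustrated with a minimal NumPy sketch. It is not the authors' implementation and rests on several assumptions that the statement above does not fix: the Transfer MNI is taken to be the source MNI plus the minimum-norm correction that interpolates the target data, the cross-validation loss is mean squared prediction error, and the threshold D^{(0)} is passed in by the caller as `threshold`. All function names are hypothetical.

```python
import numpy as np

def mni(X, y):
    # Minimum-l2-norm interpolator: the least-norm solution of X b = y.
    return np.linalg.pinv(X) @ y

def transfer_mni(X_src, y_src, X_tgt, y_tgt):
    # ASSUMED form of the Transfer MNI: pre-train with the source MNI, then
    # add the smallest-norm correction that interpolates the target data.
    beta_src = mni(X_src, y_src)
    return beta_src + np.linalg.pinv(X_tgt) @ (y_tgt - X_tgt @ beta_src)

def kfold_indices(n, n_folds, seed=0):
    # Random partition of {0, ..., n-1} into n_folds held-out index sets.
    rng = np.random.default_rng(seed)
    return np.array_split(rng.permutation(n), n_folds)

def cv_loss(fit, X, y, folds):
    # K-fold CV loss on the target (squared error is an assumption here).
    losses = []
    for held_out in folds:
        train = np.ones(len(y), dtype=bool)
        train[held_out] = False
        beta = fit(X[train], y[train])
        losses.append(np.mean((y[~train] - X[~train] @ beta) ** 2))
    return float(np.mean(losses))

def detect_informative(sources, X_tgt, y_tgt, threshold, n_folds=5, seed=0):
    # Keep source q when CV loss of Transfer MNI minus CV loss of the
    # target-only MNI is at most `threshold` (standing in for D^{(0)}).
    folds = kfold_indices(len(y_tgt), n_folds, seed)
    baseline = cv_loss(mni, X_tgt, y_tgt, folds)
    selected = []
    for q, (X_q, y_q) in enumerate(sources, start=1):
        def fit_q(X_train, y_train, X_q=X_q, y_q=y_q):
            return transfer_mni(X_q, y_q, X_train, y_train)
        if cv_loss(fit_q, X_tgt, y_tgt, folds) - baseline <= threshold:
            selected.append(q)
    return selected
```

The same folds are reused for every source so that the loss differences are compared on identical held-out data. The consistency question is whether the list returned by `detect_informative` matches the oracle set I with high probability, which this sketch does not address.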
References
Specifically, we aim to establish the consistency of informative source detection via cross-validation in Algorithm~\ref{alg:CV_WTM} by proving that the event $\cI = \widehat\cI$ holds with high probability, where $\cI$ is the oracle set in~\eqref{eq:true_infosource} and $\widehat\cI$ is its CV-driven estimate in~\eqref{eq:est_infosource}. Establishing consistency of transferability detection in this regime remains open and would meaningfully advance the relevant literature.