Cross-scale transfer of T2L adapters within the same architecture class

Determine whether Text-to-LoRA (T2L) hypernetworks trained on smaller base models can transfer effectively to larger models within the same architecture class, and characterize the conditions under which such transfer achieves robust downstream performance without retraining the hypernetwork from scratch.

Background

Text-to-LoRA (T2L) is a hypernetwork that generates LoRA adapters from natural-language task descriptions to adapt LLMs. The paper demonstrates T2L’s effectiveness across several base models (Mistral, Llama, Gemma) at fixed sizes and shows zero-shot task-specific adaptation.
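As a rough illustration of this setup, the sketch below shows a minimal hypernetwork that maps a task-description embedding to the low-rank LoRA factors for a single target weight matrix. This is an assumed, simplified reconstruction rather than the paper's implementation; all names and dimensions (LoRAHyperNet, task_emb_dim, rank, d_in, d_out) are illustrative.

```python
# Minimal sketch (not the paper's implementation): a hypernetwork that maps a
# task-description embedding to the low-rank LoRA factors (A, B) for one
# target weight matrix. Sizes below are illustrative assumptions.
import torch
import torch.nn as nn

class LoRAHyperNet(nn.Module):
    def __init__(self, task_emb_dim=768, hidden=512, rank=8, d_in=4096, d_out=4096):
        super().__init__()
        self.rank, self.d_in, self.d_out = rank, d_in, d_out
        self.backbone = nn.Sequential(
            nn.Linear(task_emb_dim, hidden), nn.ReLU(),
            nn.Linear(hidden, hidden), nn.ReLU(),
        )
        # Separate heads emit the flattened low-rank factors.
        self.head_A = nn.Linear(hidden, rank * d_in)
        self.head_B = nn.Linear(hidden, d_out * rank)

    def forward(self, task_emb):                             # task_emb: (batch, task_emb_dim)
        h = self.backbone(task_emb)
        A = self.head_A(h).view(-1, self.rank, self.d_in)    # (batch, r, d_in)
        B = self.head_B(h).view(-1, self.d_out, self.rank)   # (batch, d_out, r)
        return A, B

# Usage: encode a natural-language task description (e.g., with a frozen text
# encoder), then generate an adapter update delta_W ≈ B @ A for the target layer.
hypernet = LoRAHyperNet()
task_emb = torch.randn(1, 768)        # stand-in for an encoded task description
A, B = hypernet(task_emb)
delta_W = B @ A                       # (1, d_out, d_in) low-rank weight update
```

In practice, one such pair of factors would be generated per adapted module of the base model, which is what ties the hypernetwork's output heads to that model's layer dimensions.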

A key unresolved question is whether a T2L hypernetwork trained on a smaller base model can be reused or transferred to larger models within the same architecture family (e.g., scaling from smaller to larger Llama or Mistral variants) while maintaining or improving performance. This question matters for practical deployment, where training a hypernetwork against a large base model is costly and effective transfer from a hypernetwork trained on a smaller model would be highly beneficial.
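The sketch below illustrates, under assumed hidden sizes, the basic shape obstacle that makes naive cross-scale reuse nontrivial: the factors a hypernetwork emits are sized to the smaller model's width, so they do not fit a wider model's weight matrices without some mapping or retraining. The zero-padding shown is a purely hypothetical baseline, not a method from the paper.

```python
# Illustrative sketch of the shape obstacle (assumed sizes, not from the paper):
# LoRA factors generated for a smaller base model are sized to that model's
# hidden width and cannot be applied verbatim to a wider model in the same family.
import torch

rank = 8
d_small, d_large = 4096, 8192        # assumed hidden sizes of a smaller vs. larger variant

# Factors as generated for the smaller model.
A_small = torch.randn(rank, d_small)
B_small = torch.randn(d_small, rank)

W_large = torch.randn(d_large, d_large)   # a target weight matrix in the larger model

try:
    W_large + (B_small @ A_small)          # shape mismatch: (4096, 4096) vs. (8192, 8192)
except RuntimeError as e:
    print("direct transfer fails:", e)

# One naive workaround (purely illustrative): zero-pad the factors to the larger
# width. Whether any such mapping preserves downstream performance is exactly
# the open question posed above.
A_pad = torch.zeros(rank, d_large); A_pad[:, :d_small] = A_small
B_pad = torch.zeros(d_large, rank); B_pad[:d_small, :] = B_small
W_adapted = W_large + B_pad @ A_pad        # now shape-compatible with the larger model
```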

References

Finally, the potential for T2L trained on a smaller base model to transfer effectively to larger models within the same architecture class remains an open area for exploration.

Text-to-LoRA: Instant Transformer Adaption (Charakorn et al., arXiv:2506.06105, 6 Jun 2025), Discussion and Limitations.