Generalization of LLaMA Pro block expansion beyond coding and mathematics

Ascertain whether the LLaMA Pro post-pretraining block expansion method remains effective in complex and open-ended domains outside coding and mathematics.

Background

LLaMA Pro expands a pretrained Transformer by adding new blocks after pretraining and finetunes only those expanded blocks while freezing the inherited ones, showing gains in coding and mathematics.
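As a rough illustration of the mechanism, the sketch below shows one way such block expansion could look in PyTorch: copies of existing decoder layers are interleaved after every fixed-size group of inherited layers, their output projections are zero-initialized so each copy starts as an identity mapping, and only the copies are left trainable. This is a minimal sketch, not the LLaMA Pro code; the sub-module names (`self_attn.o_proj`, `mlp.down_proj`) follow the LLaMA layer layout and are assumptions for other architectures.

```python
# Minimal sketch of post-pretraining block expansion in the spirit of LLaMA Pro.
# Assumes each decoder layer exposes `self_attn.o_proj` and `mlp.down_proj`
# linear sub-modules (LLaMA-style naming); adjust for other architectures.
import copy
import torch.nn as nn

def expand_blocks(layers: nn.ModuleList, group_size: int) -> nn.ModuleList:
    """Insert one new block after every `group_size` inherited blocks.

    Each new block is a copy of the preceding block whose output projections
    are zeroed, so it acts as an identity mapping at initialization and the
    expanded model reproduces the original model's outputs before finetuning.
    """
    expanded = []
    for i, layer in enumerate(layers):
        layer.requires_grad_(False)            # freeze inherited blocks
        expanded.append(layer)
        if (i + 1) % group_size == 0:
            new_layer = copy.deepcopy(layer)
            nn.init.zeros_(new_layer.self_attn.o_proj.weight)  # identity at init
            nn.init.zeros_(new_layer.mlp.down_proj.weight)
            new_layer.requires_grad_(True)     # only expanded blocks are trained
            expanded.append(new_layer)
    return nn.ModuleList(expanded)
```

Zero-initializing the new blocks' output projections means the expanded model initially behaves like the base model, so finetuning only the added blocks can inject new domain knowledge while the frozen, inherited blocks retain the original capabilities.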

The paper cautions that these results are domain-specific and that broader effectiveness is unverified.

References

Therefore, the effectiveness of the block expansion method in more complex and open-ended domains is yet to be verified.

Towards Incremental Learning in Large Language Models: A Critical Review (Jovanovic et al., arXiv:2404.18311, 28 Apr 2024), Section 2.1 (Continual Learning), LLAMA PRO.