Scaling VQDNA Without Losing Efficiency and Performance
Design and validate scaling strategies that increase the capacity or parameter count of VQDNA while preserving its demonstrated merits (e.g., parameter efficiency and performance), and that overcome the computational constraints that limited the model's scale in the original work.
References
There are several limitations in this work: (1) The superiority of VQDNA stems from its genome vocabulary learning, which is an additional training stage with extra costs compared to other models. Thus, there is still room for reducing its computational overhead to boost its applicability. (2) Due to the computational constraints, the model scale of VQDNA has not reached its maximum. How to scale up VQDNA while maintaining the gained merits is worth exploring. (3) As the HRQ vocabulary has shown great biological significance in SARS-CoV-2 mutations, broader applications in genomics with VQDNA, such as generation tasks, deserve to be studied. Overall, all these avenues remain open for our future research.