Fast and Low-Cost Genomic Foundation Models via Outlier Removal

Published 1 May 2025 in cs.LG and cs.AI (arXiv:2505.00598v2)

Abstract: To address the challenge of scarce computational resources in genomic modeling, we introduce GERM, a genomic foundation model with strong compression performance and fast adaptability. GERM improves upon models like DNABERT-2 by eliminating outliers that hinder low-rank adaptation and post-training quantization, enhancing both efficiency and robustness. We replace the vanilla attention layer with an outlier-free mechanism inspired by associative memory models. By removing outliers during both pre-training and fine-tuning, this approach accelerates adaptation, reduces computational costs, and enhances quantization robustness within acceptable loss margins. Additionally, we propose GERM-T, a strategy that employs small-step continual learning within the outlier-free framework, leveraging original checkpoints to avoid retraining from scratch. Empirically, GERM improves fine-tuning performance by 37.98% and quantization by 64.34% over the baseline model. It also reduces average kurtosis by 92.14% and maximum infinity norm by 82.77%. Compared to leading methods, GERM consistently delivers superior performance, offering a practical solution for genomic modeling in resource-constrained settings. Code is available at https://github.com/MAGICS-LAB/GERM.

Summary

An Examination of GERM: An Enhanced Genomic Foundation Model with Outlier Removal

The paper introduces GERM, a genomic foundation model designed to address inefficiencies in genomic modeling caused by computational constraints and outlier values in transformer attention mechanisms. GERM distinguishes itself by incorporating a novel attention modification inspired by modern associative memory models, aiming to improve the efficiency and adaptability of DNA-based genomic models such as DNABERT-2. The authors focus on enhancing both fine-tuning and quantization through outlier mitigation, reporting notably superior performance compared to leading genomic modeling methods.
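Why outlier mitigation helps quantization can be illustrated with a minimal NumPy sketch (an illustration of the general phenomenon, not the paper's implementation): under symmetric per-tensor int8 quantization, a single outlier activation stretches the quantization scale and coarsens resolution for every normal-range value.

```python
import numpy as np

def int8_quantize(x):
    # Symmetric per-tensor int8 quantization: one scale covers the whole
    # tensor, so a single outlier widens the range for every value.
    scale = np.abs(x).max() / 127.0
    return np.round(x / scale).clip(-127, 127) * scale

rng = np.random.default_rng(0)
x = rng.normal(size=10_000)                    # well-behaved activations
err_clean = np.abs(int8_quantize(x) - x).mean()

x_outlier = x.copy()
x_outlier[0] = 100.0                           # one outlier activation
err_outlier = np.abs(int8_quantize(x_outlier) - x_outlier).mean()
# err_outlier is far larger than err_clean: the outlier inflates the
# scale and degrades precision for all the normal-range values.
```

This is the failure mode that outlier removal is meant to prevent: keeping the activation range tight so the quantizer's fixed bit budget is spent on typical values.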

Technical Innovations and Model Design

GERM replaces the standard transformer attention layer with an outlier-free mechanism that removes outliers during both the pre-training and fine-tuning stages. This modification is intended to accelerate adaptation, reduce computational costs, and bolster quantization robustness, all critical for deploying genomic models on resource-limited platforms such as mobile devices and edge computing systems.
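One outlier-free attention variant in the associative-memory line of work replaces softmax with a "softmax-plus-one" that adds a unit term to the denominator, letting a head assign near-zero total attention instead of being forced to spike on a few tokens. The sketch below is an illustrative NumPy version of that idea, not necessarily GERM's exact layer:

```python
import numpy as np

def softmax1(x, axis=-1):
    # Numerically stable "softmax + 1": exp(x_i) / (1 + sum_j exp(x_j)).
    # Row weights sum to less than 1, so a head can attend to "nothing"
    # rather than concentrating mass on a few tokens -- a known driver
    # of activation outliers.
    m = np.maximum(x.max(axis=axis, keepdims=True), 0.0)
    e = np.exp(x - m)
    return e / (np.exp(-m) + e.sum(axis=axis, keepdims=True))

def outlier_free_attention(Q, K, V):
    # Scaled dot-product attention with softmax1 in place of softmax.
    scores = Q @ K.swapaxes(-1, -2) / np.sqrt(Q.shape[-1])
    return softmax1(scores) @ V
```

For large positive logits this behaves like ordinary softmax, so expressivity is largely preserved; the difference shows up exactly where standard softmax would otherwise force extreme attention weights.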

Additionally, the authors propose GERM-T, which applies small-step continual learning after initial training to further optimize performance while avoiding the computational cost of retraining from scratch. The strategy starts from the original model checkpoints, and the authors report that it preserves accuracy while delivering substantial efficiency gains.
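The small-step idea can be sketched on a toy linear model (purely illustrative; GERM-T operates on transformer checkpoints, not least squares): initialize from existing "checkpoint" weights and spend a short gradient-step budget rather than retraining from random initialization.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy stand-in for small-step continual learning: reuse checkpoint
# weights and adapt with a short gradient budget.
X = rng.normal(size=(64, 8))
w_true = rng.normal(size=8)
y = X @ w_true

w = w_true + 0.1                   # hypothetical checkpoint: close, not adapted

def loss(w):
    return float(np.mean((X @ w - y) ** 2))

loss_before = loss(w)
lr, small_steps = 0.1, 50          # far below a from-scratch training budget
for _ in range(small_steps):
    grad = 2 * X.T @ (X @ w - y) / len(X)   # gradient of mean squared error
    w -= lr * grad
loss_after = loss(w)
```

Because the checkpoint already sits near a good solution, a few dozen steps recover most of the remaining performance, which is the economic argument for continual learning over full retraining.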

Empirical Results

Empirical analysis demonstrates GERM's improvements across a range of genomic prediction tasks. Specifically, GERM achieves a 37.98% improvement in fine-tuning performance and a 64.34% improvement under quantization compared to DNABERT-2. Moreover, GERM reduces average kurtosis by 92.14% and the maximum infinity norm by 82.77%, underscoring its effectiveness at suppressing outliers in genomic models. These results position GERM as a practical solution for genomic modeling in computationally constrained environments.
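The two outlier diagnostics reported above are straightforward to compute; a NumPy sketch (assuming simple per-tensor statistics, which may differ from the paper's exact measurement protocol):

```python
import numpy as np

def kurtosis(a):
    # Fourth standardized moment: ~3 for Gaussian activations, much
    # larger for heavy-tailed (outlier-prone) distributions.
    a = np.asarray(a, dtype=float).ravel()
    z = (a - a.mean()) / a.std()
    return float(np.mean(z ** 4))

def max_inf_norm(tensors):
    # Largest absolute entry across a collection of activation tensors;
    # this is the range a symmetric quantizer must cover.
    return max(float(np.abs(t).max()) for t in tensors)

rng = np.random.default_rng(0)
gaussian = rng.normal(size=100_000)          # well-behaved activations
heavy = rng.standard_t(df=3, size=100_000)   # heavy-tailed stand-in
```

Lower kurtosis and a smaller infinity norm both indicate activations that are cheaper to quantize and easier to adapt with low-rank methods, which is why the paper tracks them.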

Broader Implications

Practically, GERM provides a robust framework for deploying genomic models in settings where computational power and access to advanced hardware are limited. Theoretically, the work highlights the influence of attention design and outlier removal on performance across classes of foundation models. The authors suggest that the outlier removal strategy may transfer to other domains, such as NLP models, where attention-induced outliers create similar bottlenecks.

The paper further speculates on future developments in AI, emphasizing the balance between computational efficiency and performance. It posits that as genomic databases grow and computational constraints persist, techniques like those employed in GERM will become increasingly relevant, benefiting not only genomics but other data-intensive disciplines as well.

In conclusion, GERM is a substantive advance in genomic modeling, responding to the challenges posed by computational limits and inefficiencies in existing model architectures. Its use of associative memory models to refine attention mechanisms sets a precedent for future innovations in foundation models across diverse scientific fields. The work invites further exploration of outlier mitigation techniques and their broader applicability, with potential to improve AI-driven genomic research and beyond.
