HyenaDNA: Efficient Genomic Foundation Model
- HyenaDNA is a genomic foundation model that uses implicit convolution and gating to achieve long-range sequence modeling at single-nucleotide resolution.
- It employs fast Fourier transform-based convolutions to attain $O(N \log N)$ runtime and favorable scaling for processing entire genomes and RNA sequences.
- Adaptable via plug-in adapters, HyenaDNA enhances downstream tasks like rare disease gene discovery and regulatory element classification with reduced computational cost.
HyenaDNA is a genomic foundation model designed for long-range sequence modeling at single-nucleotide resolution, leveraging implicit-convolution architectures to reach context lengths orders of magnitude beyond attention-based transformers. It provides a universal, compute- and parameter-efficient backbone for DNA and, via adapters, RNA modeling. HyenaDNA and related approaches underpin advances in downstream prediction, rare disease gene discovery, and general retrieval-augmented inference across genomic modalities.
1. Architectural Principles of HyenaDNA
HyenaDNA replaces the quadratic-cost self-attention mechanism in standard transformers with the Hyena "implicit convolution + gating" operator. Each (order-2) Hyena block computes

$$y = \mathbf{D}_{x_2}\, \mathbf{S}_h\, \mathbf{D}_{x_1}\, v,$$

where $\mathbf{D}_{x_1}$ and $\mathbf{D}_{x_2}$ are diagonal matrices from learned projections $x_1, x_2$ of the input, $v$ is a learned value projection, and $\mathbf{S}_h$ is a Toeplitz matrix parameterizing a global 1D convolution with filter $h$. The convolution filter $h$ is generated via a small MLP (as a neural field) rather than being directly learned, which decouples parameter count from window size. Each block includes a pointwise nonlinearity (GELU), layer normalization, and a feed-forward network, with residual connections throughout.
A defining feature is the ability to process single-nucleotide tokens, eschewing fixed k-mer tokenization to preserve maximal nucleotide resolution, which is crucial for tasks involving SNPs or rare mutations. Architectural efficiency stems from the use of fast Fourier transform-based convolution ($O(N \log N)$ time and $O(N)$ space per layer), making HyenaDNA orders of magnitude more efficient in both parameter count and runtime for long genomic inputs (Nguyen et al., 2023, Du et al., 6 Aug 2025).
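The data flow of this operator can be made concrete in a short PyTorch sketch. This is an illustrative simplification (single global filter, no causal masking, bias terms, or windowing), not the reference HyenaDNA implementation, and all names are hypothetical:

```python
import torch
import torch.nn as nn

def fft_conv(u, h):
    """Global 1D convolution via FFT: O(N log N) rather than O(N^2)."""
    n = 2 * u.shape[-1]                        # zero-pad to avoid circular wrap-around
    y = torch.fft.irfft(torch.fft.rfft(u, n=n) * torch.fft.rfft(h, n=n), n=n)
    return y[..., : u.shape[-1]]

class HyenaOrder2(nn.Module):
    """y = x2 * (h conv (x1 * v)): diagonal gating plus implicit long convolution."""
    def __init__(self, d, seq_len, filter_hidden=64):
        super().__init__()
        self.proj = nn.Linear(d, 3 * d)        # produces projections x1, x2, v
        self.register_buffer("t", torch.linspace(0, 1, seq_len).unsqueeze(-1))
        self.filter_mlp = nn.Sequential(       # neural field: position -> filter tap
            nn.Linear(1, filter_hidden), nn.GELU(), nn.Linear(filter_hidden, d))

    def forward(self, u):                      # u: (batch, seq_len, d)
        x1, x2, v = self.proj(u).chunk(3, dim=-1)
        h = self.filter_mlp(self.t).T          # (d, seq_len): filter size decoupled from params
        y = fft_conv((x1 * v).transpose(1, 2), h).transpose(1, 2)
        return x2 * y                          # elementwise gate = diagonal matrix action
```

Because the filter is emitted by a small MLP over normalized positions, the parameter count stays fixed while the effective receptive field spans the full sequence.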
2. Computational Complexity and Scaling
Traditional transformer models exhibit $O(N^2)$ time and memory complexity, limiting input lengths (typically $512$–$4{,}096$ bases) to a small fraction of the human genome. In contrast, each HyenaDNA layer, via implicit convolution and diagonal gating, achieves $O(dN \log N)$ runtime and $O(dN)$ memory, where $N$ is the sequence length and $d$ the embedding dimension. Empirical benchmarks demonstrate up to $160\times$ speedup relative to transformers with FlashAttention at $10^6$ tokens, with practical context windows up to $10^6$ nucleotides and efficient scaling to deeper layer stacks, allowing true whole-gene or ultra-long-range context (Nguyen et al., 2023, Datta et al., 6 Aug 2025).
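A back-of-the-envelope ratio makes the gap concrete; at $N = 10^6$, the per-layer operation counts differ by roughly

$$\frac{N^2}{N \log_2 N} = \frac{10^{12}}{10^6 \cdot \log_2 10^6} \approx \frac{10^{12}}{2 \times 10^7} \approx 5 \times 10^4,$$

so the observed $160\times$ wall-clock speedup is well within (and, due to constant factors and hardware effects, far below) the asymptotic headroom.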
A comparison of key operational metrics is summarized below:
| Model | Max Context (nt) | Params (M) | Time Complexity |
|---|---|---|---|
| HyenaDNA | $10^6$ | 0.4–6.5 | $O(N \log N)$ |
| DNABERT/Enformer | 512–196k | 20–80 | $O(N^2)$ |
| NucleotideTransf. | ~6k | 500–2,500 | $O(N^2)$ |
3. Training Regime and Embedding Strategies
HyenaDNA is pre-trained as an autoregressive language model on the full human reference genome (GRCh38/hg38), using contexts of up to $10^6$ bases. The objective is standard cross-entropy over next-nucleotide prediction:

$$\mathcal{L}(\theta) = -\sum_{t=1}^{N} \log p_\theta(x_t \mid x_{<t}),$$

where $x_{<t}$ denotes the nucleotides preceding position $t$. The model learns nucleotide representations at high resolution.
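In code, this objective is the usual shifted cross-entropy over per-position logits (a sketch; tensor names and shapes are illustrative):

```python
import torch.nn.functional as F

def next_token_loss(logits, ids):
    """logits: (batch, N, V) over the nucleotide vocabulary; ids: (batch, N)."""
    return F.cross_entropy(
        logits[:, :-1].transpose(1, 2),   # predictions at positions 0..N-2
        ids[:, 1:],                       # each position predicts its successor
    )
```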
For downstream inference, HyenaDNA serves as a frozen feature extractor in a retrieval-augmented pipeline. For an input sequence $x$ of length $N$, the last-layer output $H \in \mathbb{R}^{N \times d}$ is mean-aggregated over positions and L2-normalized to yield a sequence embedding $e$, which is suitable for k-NN retrieval or as input to lightweight classifiers (Datta et al., 6 Aug 2025).
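A minimal sketch of this embedding step, assuming a backbone that returns last-layer hidden states of shape (batch, N, d) (the attribute name `last_hidden_state` is an assumption, not the paper's exact interface):

```python
import torch
import torch.nn.functional as F

@torch.no_grad()
def sequence_embedding(model, input_ids):
    """Mean-pool last-layer states over positions, then L2-normalize,
    so cosine similarity reduces to a dot product for k-NN retrieval."""
    H = model(input_ids).last_hidden_state    # (batch, N, d)
    e = H.mean(dim=1)                         # mean aggregation over positions
    return F.normalize(e, p=2, dim=-1)        # unit-norm sequence embedding
```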
Enhancer classification with z-Curve features, which capture the 3D geometry of cumulative nucleotide composition, demonstrates that HyenaDNA embeddings combined with z-Curve features systematically improve accuracy over embeddings alone, while achieving over $8\times$ faster inference and $8\times$ lower CO₂ emissions than fine-tuned models (Datta et al., 6 Aug 2025).
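For reference, one standard formulation of the z-Curve maps a sequence to three cumulative coordinates; a short sketch follows (how the features are fused with the embeddings, e.g., by concatenation, is left as an assumption here):

```python
import numpy as np

def z_curve(seq: str) -> np.ndarray:
    """Cumulative z-Curve coordinates of a DNA string, shape (len(seq), 3)."""
    s = np.frombuffer(seq.upper().encode("ascii"), dtype=np.uint8)
    a, c, g, t = (np.cumsum(s == ord(b)) for b in "ACGT")
    return np.stack([
        (a + g) - (c + t),   # x: purine vs. pyrimidine
        (a + c) - (g + t),   # y: amino vs. keto
        (a + t) - (g + c),   # z: weak vs. strong hydrogen bonding
    ], axis=-1)
```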
4. Adapter-Based Modalities: CodonMoE and RNA Analysis
HyenaDNA forms the basis for plug-in adapters enabling DNA-trained models to function on RNA-centric tasks. The CodonMoE adapter applies a codon-level mixture-of-experts (MoE) on HyenaDNA outputs:
- Every three nucleotides are averaged to form codon embeddings $c_j$.
- A gating network assigns weights $g_k(c_j)$ over $K$ expert MLPs $E_1, \dots, E_K$.
- Output:

$$\tilde{c}_j = \sum_{k=1}^{K} g_k(c_j)\, E_k(c_j), \qquad g(c_j) = \mathrm{softmax}(W_g\, c_j).$$

- Codon features are tiled back to nucleotide resolution, residual connections and normalization are applied, and a lightweight head yields the property prediction (see the sketch after this list).
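A compact PyTorch sketch of this adapter under the description above; the expert count, hidden width, and exact residual placement are illustrative guesses rather than the published configuration:

```python
import torch
import torch.nn as nn

class CodonMoE(nn.Module):
    """Codon-level mixture-of-experts over nucleotide-resolution states."""
    def __init__(self, d, num_experts=4, hidden=128):
        super().__init__()
        self.gate = nn.Linear(d, num_experts)
        self.experts = nn.ModuleList(
            nn.Sequential(nn.Linear(d, hidden), nn.GELU(), nn.Linear(hidden, d))
            for _ in range(num_experts))
        self.norm = nn.LayerNorm(d)

    def forward(self, h):                      # h: (batch, 3*L, d), codon-aligned
        B, N, d = h.shape
        c = h.view(B, N // 3, 3, d).mean(dim=2)          # codon embeddings c_j
        w = torch.softmax(self.gate(c), dim=-1)          # gating weights g_k(c_j)
        e = torch.stack([E(c) for E in self.experts], dim=-1)   # (B, L, d, K)
        c_tilde = (e * w.unsqueeze(2)).sum(dim=-1)       # sum_k g_k(c_j) E_k(c_j)
        out = torch.repeat_interleave(c_tilde, 3, dim=1) # tile back to nucleotides
        return self.norm(h + out)                        # residual + normalization
```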
The CodonMoE architecture is a universal approximator for codon-to-RNA mappings, with formal guarantees. Both the standard configuration (a small bank of expert MLPs totaling a few million parameters) and the "pro" variant, which adds 1D convolutions over codon neighborhoods, maintain sub-quadratic complexity (Du et al., 6 Aug 2025).
Benchmarks in mRNA expression and stability demonstrate state-of-the-art rank correlations (Spearman's $\rho$), with HyenaDNA+CodonMoE matching or exceeding specialized RNA models (CodonBERT, SpliceBERT) at a fraction of their parameter count while delivering 5–10$\times$ faster inference (Du et al., 6 Aug 2025).
5. Empirical Results and Downstream Applications
HyenaDNA sets new top-1 accuracy on regulatory element classification, enhancer detection, chromatin profile prediction, and species assignment, often outperforming baseline CNNs, DNABERT, GPT-style transformers, and Nucleotide Transformer models. On GenomicBenchmarks, it improves on the prior state of the art by several accuracy points across multiple tasks (Nguyen et al., 2023).
Embedding-extraction pipelines using HyenaDNA maintain strong predictive performance across data splits with shifted distributions, indicating better generalization to unseen genomic contexts than models reliant on full fine-tuning. Carbon accounting confirms substantially lower emissions for the retrieval-augmented approach.
In rare disease genomics (Saadat et al., 2024):
- HyenaDNA is used to generate sample- and variant-aware gene embeddings by processing full gene sequences personalized by individual pathogenic variants.
- Embeddings at variant positions are averaged to yield a dynamic gene embedding that is sensitive to deleterious changes (see the sketch after this list).
- Embeddings across genes are used as features in a protein-protein interaction (PPI) graph neural network and further refined by a genetic algorithm to extract functionally-coherent diagnosis subnetworks.
- The workflow re-identifies known disease genes (e.g., IFIH1) and pathways (e.g., interferon signaling), validating the capacity for interpretable target discovery.
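A hypothetical sketch of the variant-aware embedding step; the function, variant encoding, and tokenizer behavior are all assumptions for illustration, and it presumes character-level tokenization with no special tokens so that sequence and hidden-state indices align:

```python
import torch
import torch.nn.functional as F

@torch.no_grad()
def variant_gene_embedding(model, tokenizer, ref_seq, variants):
    """Personalize a gene sequence with SNVs, then average last-layer
    states at the variant positions into one gene-level embedding."""
    seq = list(ref_seq)
    for pos, alt in variants:                  # variants: [(0-based pos, alt base)]
        seq[pos] = alt                         # apply the individual's variant
    ids = tokenizer("".join(seq), return_tensors="pt").input_ids
    H = model(ids).last_hidden_state[0]        # (N, d) nucleotide-level states
    e = torch.stack([H[pos] for pos, _ in variants]).mean(dim=0)
    return F.normalize(e, dim=-1)              # variant-sensitive gene embedding
```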
6. Broader Implications and Future Directions
The HyenaDNA framework demonstrates that single nucleotide–resolution models with sub-quadratic scaling enable both parameter-efficient and computation-efficient solutions for broad genomic inference tasks. CodonMoE and similar adapters "RNA-ize" pretrained DNA models without full RNA pretraining, illustrating a unifying template for multi-modality in genomics. The plug-and-play MoE principle is extensible: analogous adapters could exploit amino-acid context for protein tasks or be designed for locus-level tasks in chromatin modeling (Du et al., 6 Aug 2025).
Practically, HyenaDNA’s architecture and embedding strategies favor scalable, low-footprint, and robust solutions, making them suitable for resource-constrained high-throughput genomics and for systematic interrogation of genome function in rare or common diseases. The variant-sensitivity of embeddings highlights the model’s utility for personalized genomics, offering precise and explainable representations optimized for downstream machine learning integration.
A plausible implication is that continued refinement of such architectures and adapter-based multi-modal pipelines may further reduce the computational and carbon cost of universal genomic modeling while enabling rapid, interpretable hypothesis generation across increasingly complex biological tasks.