Align Your Structures: Generating Trajectories with Structure Pretraining for Molecular Dynamics

Published 5 Apr 2026 in cs.LG and q-bio.QM | (2604.03911v1)

Abstract: Generating molecular dynamics (MD) trajectories using deep generative models has attracted increasing attention, yet remains inherently challenging due to the limited availability of MD data and the complexities involved in modeling high-dimensional MD distributions. To overcome these challenges, we propose a novel framework that leverages structure pretraining for MD trajectory generation. Specifically, we first train a diffusion-based structure generation model on a large-scale conformer dataset, on top of which we introduce an interpolator module trained on MD trajectory data, designed to enforce temporal consistency among generated structures. Our approach effectively harnesses abundant structural data to mitigate the scarcity of MD trajectory data and effectively decomposes the intricate MD modeling task into two manageable subproblems: structural generation and temporal alignment. We comprehensively evaluate our method on the QM9 and DRUGS small-molecule datasets across unconditional generation, forward simulation, and interpolation tasks, and further extend our framework and analysis to tetrapeptide and protein monomer systems. Experimental results confirm that our approach excels in generating chemically realistic MD trajectories, as evidenced by remarkable improvements of accuracy in geometric, dynamical, and energetic measurements.

Summary

  • The paper introduces a two-stage framework that decouples spatial structure pretraining from temporal dynamics interpolation for MD trajectory generation.
  • It leverages an SE(3)-equivariant diffusion model pre-trained on large conformer datasets to accurately encode diverse molecular geometries.
  • Experimental results show reduced Jensen-Shannon divergences and improved energy profile fidelity, outperforming baseline models in precision and generalizability.

Structure Pretraining for Molecular Dynamics: EGINTERPOLATOR

Introduction and Problem Setting

Molecular dynamics (MD) simulations play a central role in computational chemistry, biology, and materials science by enabling the study of atomistic trajectories on physically realistic timescales. However, high-fidelity MD simulations remain computationally demanding, especially for long timescales and chemically diverse systems, which leads to severe data scarcity for ML-based trajectory modeling. Conventional generative approaches, including geometric diffusion models, have advanced static structural generation but do not address temporal dependencies necessary for MD. Direct generative modeling of MD trajectories—especially for arbitrary, out-of-distribution molecules—faces two core bottlenecks: the lack of large-scale, diverse MD datasets and the complexity of high-dimensional temporal distributions.

EGINTERPOLATOR presents a principled two-stage solution: first, a structure (conformer) diffusion model is pretrained on large, diverse conformer datasets, encoding molecular structure distributions in an SE(3)-equivariant fashion. Second, a temporal interpolator module is introduced and finetuned on scarce MD trajectory data, learning temporal dependencies while leveraging the pretrained structure module as an anchor. The MD generation task is thereby decomposed into spatial (structure) and temporal (dynamics) subproblems, each addressed by specialized inductive biases.

Methodology

Structure Pretraining

The first stage trains a diffusion-based model for conformer generation using large-scale static datasets (e.g., GEOM-QM9, GEOM-Drugs), parameterized by an SE(3)-equivariant architecture (EGCL). This yields a model that robustly generates chemically consistent molecular geometries across diverse chemical spaces.
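As a rough illustration of the noising step that a diffusion-based conformer model is trained to invert, the sketch below applies a standard DDPM-style forward step to atomic coordinates after removing the center of mass, a common way of handling translations in geometric diffusion. The function names, the zero-center-of-mass convention, and the noise-schedule value are assumptions made for illustration; the summary does not state the paper's exact parameterization.

```python
import math
import random

def remove_center_of_mass(coords):
    """Shift coordinates so their mean position is the origin."""
    n = len(coords)
    com = [sum(atom[d] for atom in coords) / n for d in range(3)]
    return [[atom[d] - com[d] for d in range(3)] for atom in coords]

def forward_noise(x0, alpha_bar, rng):
    """Standard DDPM forward step: x_t = sqrt(alpha_bar) * x0 + sqrt(1 - alpha_bar) * eps."""
    eps = [[rng.gauss(0.0, 1.0) for _ in range(3)] for _ in x0]
    eps = remove_center_of_mass(eps)  # assumption: noise is kept in the zero-COM subspace
    a, b = math.sqrt(alpha_bar), math.sqrt(1.0 - alpha_bar)
    x_t = [[a * x + b * e for x, e in zip(xa, ea)] for xa, ea in zip(x0, eps)]
    return x_t, eps

# Toy three-atom "molecule" (arbitrary units, purely illustrative).
x0 = remove_center_of_mass([[0.0, 0.0, 0.0], [1.0, 0.0, 0.0], [0.0, 1.0, 0.0]])
x_t, eps = forward_noise(x0, alpha_bar=0.5, rng=random.Random(0))
```

A denoiser would then be trained to predict `eps` (or `x0`) from `x_t`; with `alpha_bar=1.0` the step returns the clean structure unchanged.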

Temporal Interpolator

The core innovation is the temporal interpolator, which enforces temporal consistency between frames by interpolating between the pretrained structure denoiser and a new, learnable temporal module built from equivariant temporal attention. The interpolation coefficient is learned: it is the sigmoid of a trainable parameter, so it stays in (0, 1) and sets the relative contributions of the spatial and temporal modules. This decoupling offers several advantages:

  • Reduces optimization complexity by anchoring temporal learning to a robust structural prior.
  • Allows flexible inference, covering both i.i.d. conformer sampling and temporally consistent trajectory generation.
  • Preserves SE(3)-equivariance, critical for data efficiency and generalizability.
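The interpolation described above can be sketched as a convex combination of two outputs. This is a minimal sketch, not the paper's implementation: the names `interpolate`, `structure_out`, and `temporal_out` are hypothetical, and which branch receives the weight `alpha` versus `1 - alpha` is an assumption.

```python
import math

def sigmoid(w):
    return 1.0 / (1.0 + math.exp(-w))

def interpolate(structure_out, temporal_out, w):
    """Blend per-atom outputs of the structure and temporal branches.

    w is the trainable scalar; alpha = sigmoid(w) lies in (0, 1).
    Assumption: alpha weights the temporal branch, 1 - alpha the structure branch.
    """
    alpha = sigmoid(w)
    return [
        [(1.0 - alpha) * s + alpha * t for s, t in zip(s_atom, t_atom)]
        for s_atom, t_atom in zip(structure_out, temporal_out)
    ]

# Two atoms, three coordinates each (toy numbers).
structure_out = [[1.0, 0.0, 0.0], [0.0, 2.0, 0.0]]
temporal_out = [[0.0, 0.0, 0.0], [0.0, 0.0, 4.0]]

blended = interpolate(structure_out, temporal_out, w=0.0)          # alpha = 0.5
structure_only = interpolate(structure_out, temporal_out, w=-50.0)  # alpha close to 0
```

Under this convention, driving the parameter strongly negative makes the output depend almost entirely on the structure branch, which corresponds to the i.i.d. conformer sampling mode mentioned in the list above. A convex combination of two equivariant outputs is itself equivariant, which is consistent with the equivariance claim.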

A cascaded variant, EGINTERPOLATOR-CASC, applies interpolation at every block in the network to promote denser entanglement of spatial and temporal features, resulting in better overall modeling capacity.
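A toy sketch of the cascaded idea follows, assuming one trainable coefficient per block. The two layer functions are deliberately trivial, hypothetical stand-ins (a per-frame scaling and a mean over frames), not the equivariant layers used in the paper; only the control flow is meant to be illustrative.

```python
import math

def sigmoid(w):
    return 1.0 / (1.0 + math.exp(-w))

def structure_layer(frames):
    # Stand-in for a pretrained per-frame layer: acts on each frame independently.
    return [[0.5 * x for x in frame] for frame in frames]

def temporal_layer(frames):
    # Stand-in for temporal attention: every frame receives the mean over all frames.
    n = len(frames)
    mean = [sum(col) / n for col in zip(*frames)]
    return [list(mean) for _ in frames]

def cascaded_forward(frames, block_weights):
    """Apply the structure/temporal interpolation once per block."""
    h = frames
    for w in block_weights:
        alpha = sigmoid(w)  # one coefficient per block (assumption)
        s, t = structure_layer(h), temporal_layer(h)
        h = [[(1.0 - alpha) * a + alpha * b for a, b in zip(fs, ft)]
             for fs, ft in zip(s, t)]
    return h

frames = [[0.0, 2.0], [4.0, 6.0]]  # two frames, two features each
out = cascaded_forward(frames, block_weights=[0.0, 0.0])
```

Because the blended features of one block feed both branches of the next, temporal information reaches the structure branch early in the network instead of only at the output.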

Training and Evaluation

The interpolator is trained on available MD trajectories (QM9, GEOM-Drugs, tetrapeptides, and protein datasets), and evaluation spans unconditional generation, forward simulation, and interpolation tasks. The first two settings probe generative fidelity with no conditioning and with initial-frame conditioning, respectively, while interpolation (transition path sampling) asks the model to generate trajectories consistent with known endpoints. The reported metrics include Jensen-Shannon divergence (JSD) over geometric and dynamical features, potential-energy fidelity, and Markov state model (MSM) based assessments of dynamical regimes.
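For readers unfamiliar with the JSD metric, the sketch below compares histograms of a scalar feature (for example a bond length) from reference and generated trajectories. The bin range, bin count, log base, and sample values are illustrative assumptions; the summary does not state the paper's binning choices.

```python
import math

def histogram(samples, lo, hi, bins):
    """Normalized histogram of samples on [lo, hi] with equal-width bins."""
    counts = [0] * bins
    width = (hi - lo) / bins
    for x in samples:
        if lo <= x <= hi:
            counts[min(int((x - lo) / width), bins - 1)] += 1
    total = sum(counts)
    if total == 0:
        raise ValueError("no samples fall inside [lo, hi]")
    return [c / total for c in counts]

def kl_divergence(p, q):
    # Terms with p_i = 0 contribute nothing; q_i > 0 wherever p_i > 0 when q is the mixture.
    return sum(pi * math.log2(pi / qi) for pi, qi in zip(p, q) if pi > 0)

def jsd(p, q):
    """Jensen-Shannon divergence in bits; 0 for identical, 1 for disjoint distributions."""
    m = [(pi + qi) / 2 for pi, qi in zip(p, q)]
    return 0.5 * kl_divergence(p, m) + 0.5 * kl_divergence(q, m)

# Made-up bond lengths, for illustration only.
reference = [1.08, 1.09, 1.10, 1.10, 1.11, 1.12]
generated = [1.07, 1.09, 1.10, 1.11, 1.13, 1.14]
p = histogram(reference, lo=1.0, hi=1.2, bins=10)
q = histogram(generated, lo=1.0, hi=1.2, bins=10)
score = jsd(p, q)
```

Lower values indicate that the generated feature distribution is closer to the reference one.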

Experimental Results

EGINTERPOLATOR achieves consistently lower JSDs for bond lengths, bond angles, and torsions, indicating that generated trajectories match ground truth MD much more closely than baselines such as GeoTDM, AR+EGNN, and AR+Equivariant Transformer. Notable empirical findings include:

  • Conformer Pretraining: The structure module matches SOTA (GeoDiff, GeoTDM) in coverage and matching while excelling in precision-relevant metrics for MD pretraining.
  • Unconditional Generation: Trajectories for both small organic and drug-like molecules are nearly indistinguishable from ground truth MD, with the cascaded interpolator yielding further improvement.
  • Forward Simulation: On larger molecules (Drugs dataset), EGINTERPOLATOR closely tracks ground-truth geometric and dynamical observables, especially for time-lagged independent components and autocorrelation of torsion angles.
  • Interpolation: When generating transition paths between metastable states, the model yields lower JSDs and higher rates of high-probability valid paths than MD oracle baselines of equal or greater length. Generated trajectories traverse key metastable states efficiently.
  • Metric Superiority: On tetrapeptides, EGINTERPOLATOR sharply decreases JSDs for both backbone and side-chain torsions, with superior MSM and dynamical metrics relative to baselines, indicating enhanced dynamical consistency and physical plausibility of generated ensembles.
  • Energy Profile Fidelity: The Wasserstein-1 distance between generated vs. reference energy profiles is reduced by an order of magnitude relative to GeoTDM, demonstrating improved energetic realism.
  • Ablations: Structure pretraining and the interpolator design are essential; removing either degrades all fidelity and dynamics measures. Freezing the structure encoder after pretraining does not reduce performance, indicating successful modular decomposition.
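The Wasserstein-1 distance used for the energy-profile comparison above has a simple closed form for one-dimensional samples of equal size: sort both samples and average the absolute differences. The sketch below uses made-up energies, and the equal-sample-count restriction is an assumption of this sketch, not of the paper.

```python
def wasserstein_1(a, b):
    """W1 between two empirical 1D distributions with the same number of samples."""
    if len(a) != len(b) or not a:
        raise ValueError("this sketch assumes equal, non-zero sample counts")
    return sum(abs(x - y) for x, y in zip(sorted(a), sorted(b))) / len(a)

# Made-up potential energies (arbitrary units), for illustration only.
reference_energies = [-3.0, -2.5, -2.0, -1.0]
generated_energies = [-2.0, -2.5, -0.5, -3.0]
distance = wasserstein_1(reference_energies, generated_energies)
```

The result is expressed in the same units as the energies, so an order-of-magnitude reduction corresponds directly to generated energy distributions lying that much closer to the reference.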

Theoretical and Practical Implications

EGINTERPOLATOR demonstrates that structure pretraining is an effective inductive bias for generative MD models, especially under data-limited regimes. By decoupling spatial and temporal learning, the approach enables:

  • Improved transfer to unseen chemical spaces, a critical property for real-world drug discovery and biomolecular simulation where new compounds lack extensive MD data.
  • Stable and efficient optimization due to the intermediate distribution induced by the interpolator, reducing the risk of mode collapse or degenerate dynamics.
  • Inference-time control over the tradeoff between spatial (structure) and temporal (dynamics) fidelity by tuning the interpolation coefficient.

This framework generalizes to larger biomolecules (peptides, proteins) and naturally accommodates block-wise or sliding window strategies for long trajectories, allowing for simulation over previously inaccessible timescales.
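One way to read the block-wise strategy is as repeated forward simulation, where each new block is conditioned on the last frames of the trajectory so far. The sketch below shows only that control flow: `generate_block` is a hypothetical, deterministic stand-in for a trained sampler, and frames are reduced to single floats for brevity.

```python
def generate_block(context, block_len):
    # Hypothetical stand-in for a forward-simulation sampler conditioned on
    # the context frames; a real model would return sampled structures.
    last = context[-1]
    return [last + 0.1 * (i + 1) for i in range(block_len)]

def rollout(first_frame, n_blocks, block_len, context_len=1):
    """Extend a trajectory block by block, conditioning on the most recent frames."""
    traj = [first_frame]
    for _ in range(n_blocks):
        context = traj[-context_len:]
        traj.extend(generate_block(context, block_len))
    return traj

trajectory = rollout(first_frame=0.0, n_blocks=3, block_len=4)
```

The total length grows linearly with the number of blocks, while each sampler call only ever handles `context_len + block_len` frames.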

Limitations and Future Directions

While the performance of EGINTERPOLATOR surpasses current ML-based MD generative models, there remains a gap relative to ground truth MD in fine-grained dynamical and energetic observables, particularly at long timescales. Further advances could arise from:

  • Incorporating force or energy-based guidance, possibly via score-based Boltzmann generator conditioning, to enhance physical fidelity.
  • Integrating physics-based regularization into the loss function, aligning generated ensembles more closely with Boltzmann distributions.
  • Bridging the framework to other N-body applications beyond chemistry, leveraging multitask or multimodal learning schemes for universal trajectory generation.

Conclusion

EGINTERPOLATOR advances the state of the art in generative modeling of molecular dynamics by combining structure pretraining with a flexible temporal interpolator. The method achieves strong performance on spatial, energetic, and dynamical metrics across chemically diverse MD benchmarks. The two-stage approach offers a scalable, data-efficient blueprint for trajectory modeling in molecular science, with implications for computational discovery and biomolecular simulation.

Reference: "Align Your Structures: Generating Trajectories with Structure Pretraining for Molecular Dynamics" (2604.03911)
