Crystalite: A Lightweight Transformer for Efficient Crystal Modeling
Abstract: Generative models for crystalline materials often rely on equivariant graph neural networks, which capture geometric structure well but are costly to train and slow to sample. We present Crystalite, a lightweight diffusion Transformer for crystal modeling built around two simple inductive biases. The first is Subatomic Tokenization, a compact chemically structured atom representation that replaces high-dimensional one-hot encodings and is better suited to continuous diffusion. The second is the Geometry Enhancement Module (GEM), which injects periodic minimum-image pair geometry directly into attention through additive geometric biases. Together, these components preserve the simplicity and efficiency of a standard Transformer while making it better matched to the structure of crystalline materials. Crystalite achieves state-of-the-art results on crystal structure prediction benchmarks and in de novo generation, attaining the best S.U.N. discovery score among the evaluated baselines while sampling substantially faster than geometry-heavy alternatives.
Explain it Like I'm 14
What is this paper about?
This paper introduces Crystalite, a new computer model that helps scientists design and predict crystal structures. Crystals are materials where atoms are arranged in repeating patterns, like tiles on a floor. Finding new useful crystals (for batteries, chips, solar cells, etc.) is hard because there are countless possible arrangements, and checking each one with detailed physics calculations is slow. Crystalite uses machine learning to quickly suggest likely, stable crystal structures and even invent new ones.
What questions are the researchers asking?
- Can a simpler, faster AI model match or beat more complex, heavy models at predicting crystal structures?
- How can we give a standard Transformer (the kind of AI behind large language models, or LLMs) just enough knowledge about chemistry and geometry so it handles crystals well without slowing down?
- Can such a model both predict the correct structure for a known recipe of atoms (crystal structure prediction) and invent brand-new, stable, unique crystals (de novo generation)?
How does Crystalite work? (Methods explained simply)
Think of crystal design like reconstructing a picture that’s been covered with TV static: you start from noise and gradually remove it to reveal the image. Crystalite uses a “diffusion” process built on this idea. During training, noise is added to real crystals and the model learns to remove it step by step; to generate a crystal, it starts from pure noise and denoises its way to a realistic structure.
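The noise-then-denoise idea can be made concrete with a toy one-dimensional example. The sketch below is not Crystalite's sampler. It assumes the "clean data" follow a zero-mean Gaussian with standard deviation `DATA_STD`, so the ideal denoiser is known in closed form and stands in for a trained network, and it uses an EDM-style Heun update on a Karras noise schedule (both terms appear in the glossary). All names and constants are illustrative.

```python
import math

DATA_STD = 0.5  # assumed spread of the toy "clean data" distribution


def karras_sigmas(n_steps, sigma_min=0.002, sigma_max=80.0, rho=7.0):
    """Noise levels from sigma_max down to sigma_min, followed by a final 0."""
    hi, lo = sigma_max ** (1 / rho), sigma_min ** (1 / rho)
    sigmas = [(hi + i / (n_steps - 1) * (lo - hi)) ** rho for i in range(n_steps)]
    return sigmas + [0.0]


def denoise(x, sigma):
    """Ideal denoiser E[clean | noisy] for zero-mean Gaussian data."""
    return x * DATA_STD**2 / (DATA_STD**2 + sigma**2)


def heun_sample(x, sigmas):
    """Integrate dx/dsigma = (x - denoise(x, sigma)) / sigma from high noise to zero."""
    for sigma, sigma_next in zip(sigmas[:-1], sigmas[1:]):
        slope = (x - denoise(x, sigma)) / sigma
        x_euler = x + (sigma_next - sigma) * slope  # predictor (Euler) step
        if sigma_next > 0:
            # Corrector: average the slopes at both ends of the step.
            slope_next = (x_euler - denoise(x_euler, sigma_next)) / sigma_next
            x = x + (sigma_next - sigma) * 0.5 * (slope + slope_next)
        else:
            x = x_euler  # the last step to sigma = 0 uses Euler only
    return x


sigmas = karras_sigmas(200)
x_start = 1.3 * sigmas[0]  # a sample of pure noise at the highest noise level
x_final = heun_sample(x_start, sigmas)
print(x_start, "->", x_final)
```

In this toy setting the starting value, which is on the scale of the largest noise level, is pulled down to the scale of the clean data. A real model replaces `denoise` with a learned network and applies the same kind of update jointly to atom tokens, coordinates, and the lattice.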
To make this work well and fast, Crystalite adds two simple but smart ideas to a standard Transformer:
- Subatomic Tokenization:
- Instead of labeling each element (like O, Ti, Li) with a big, clumsy ID tag, Crystalite gives each element a short, meaningful “fingerprint.” This fingerprint encodes things like the element’s row and column on the periodic table and how many electrons are in its outer shell.
- Analogy: Instead of saying “this person is #57,” you describe them by features like age, height, and hair color. That makes it easier to notice who is similar to whom.
- Why it helps: The model sees chemical similarities (like sodium and potassium) and can “slide” smoothly between nearby elements when learning, which suits the diffusion process and reduces the chance of memorizing common compositions.
- Geometry Enhancement Module (GEM):
- Atoms in crystals repeat in all directions, like a video game map that wraps around when you go off one edge. This is called periodic boundary conditions.
- GEM calculates how close atoms really are in this wrap-around world and gently nudges the model’s attention to focus more on atoms that are likely to interact.
- Analogy: If you’re giving directions in a wrap-around city, GEM tells you who your true neighbors are, even when they look far apart on the map but are actually next door because of the wrap-around.
- Why it helps: It gives the Transformer a sense of geometry without using heavy, slow math, speeding up generation while keeping structures realistic.
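The wrap-around distance and the "gentle nudge" on attention can be sketched in a few lines. This is a simplified illustration, not the paper's GEM: the fixed bias `-(d / length_scale)**2` is an assumption standing in for whatever learned geometric features the module uses, the function names are hypothetical, and searching only the 27 neighbouring cells assumes a reduced cell that is not strongly skewed.

```python
import itertools
import math


def frac_to_cart(frac, lattice):
    """Cartesian vector = fractional coordinates times the lattice rows."""
    return [sum(frac[k] * lattice[k][c] for k in range(3)) for c in range(3)]


def min_image_distance(frac_i, frac_j, lattice):
    """Shortest distance between atom i and any periodic image of atom j."""
    # Wrap each fractional difference into roughly [-0.5, 0.5].
    delta = [(b - a) - round(b - a) for a, b in zip(frac_i, frac_j)]
    best = math.inf
    for shift in itertools.product((-1, 0, 1), repeat=3):  # 27 neighbouring cells
        cart = frac_to_cart([d + s for d, s in zip(delta, shift)], lattice)
        best = min(best, math.sqrt(sum(c * c for c in cart)))
    return best


def biased_attention(logits, distances, length_scale=2.0):
    """Row-wise softmax of logits plus an additive bias that decays with distance."""
    weights = []
    for row, dist_row in zip(logits, distances):
        scores = [s - (d / length_scale) ** 2 for s, d in zip(row, dist_row)]
        top = max(scores)
        exps = [math.exp(s - top) for s in scores]
        total = sum(exps)
        weights.append([e / total for e in exps])
    return weights


# Cubic cell with a 4-unit edge; atoms 0 and 1 sit on opposite sides of the cell.
lattice = [[4.0, 0.0, 0.0], [0.0, 4.0, 0.0], [0.0, 0.0, 4.0]]
fracs = [[0.05, 0.5, 0.5], [0.95, 0.5, 0.5], [0.5, 0.5, 0.5]]
distances = [[min_image_distance(a, b, lattice) for b in fracs] for a in fracs]
attn = biased_attention([[0.0] * 3 for _ in fracs], distances)
print(distances[0])  # atom 1 is the true nearest neighbour of atom 0
print(attn[0])
```

Without the wrap-around, atoms 0 and 1 would appear 3.6 units apart; with it they are 0.4 units apart, so atom 0 attends to atom 1 more strongly than to the atom in the middle of the cell.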
Crystalite is trained for two tasks:
- Crystal Structure Prediction (CSP): Given the list of atoms, predict the most likely arrangement and cell shape.
- De Novo Generation (DNG): Invent both the list of atoms and their arrangement from scratch.
It learns from real crystal databases and is evaluated on whether its results are valid, accurate, stable (won’t fall apart), unique, and truly new.
What did the researchers find?
- For crystal structure prediction, Crystalite reaches state-of-the-art performance on several benchmarks. It more accurately recovers the correct shapes and positions of atoms (lower geometric error) and matches known structures more often than previous methods.
- For generating new crystals, Crystalite achieves the best SUN score among compared models. SUN stands for Stable, Unique, and Novel—three key qualities you want in new materials:
- Stable: can exist without falling apart.
- Unique: not just a duplicate of what you’ve already made.
- Novel: not something already known in the database.
- It’s fast. Crystalite samples (creates) crystals much faster than geometry-heavy models. In head-to-head timing tests, it generates batches of crystals several times quicker while keeping quality high.
- There’s a trade-off between diversity and stability:
- As the model gets better at matching the training data, stability usually improves—but uniqueness and novelty can drop because it starts repeating familiar patterns.
- The authors manage this by down-weighting the part of the training loss that rewards predicting exact element types, which keeps diversity higher for longer.
- GEM makes structures more precise and stable by improving how the model handles distances and neighbors in the repeating crystal grid.
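As a rough illustration of how a SUN rate could be counted, the sketch below treats each generated crystal as a pair of an identifier and a stability flag. The identifier is a hypothetical canonical fingerprint under which equivalent structures compare equal; real pipelines use structure matching and a relaxation step to decide these things, which this example does not attempt.

```python
def sun_rate(samples, known_ids):
    """Fraction of samples that are stable, unique among the samples, and novel.

    samples: list of (structure_id, is_stable) pairs (hypothetical format).
    known_ids: set of identifiers already present in the reference database.
    """
    seen = set()
    hits = 0
    for structure_id, is_stable in samples:
        first_occurrence = structure_id not in seen
        seen.add(structure_id)
        if is_stable and first_occurrence and structure_id not in known_ids:
            hits += 1
    return hits / len(samples)


generated = [("A", True), ("A", True), ("B", True), ("C", False), ("D", True)]
database = {"B"}
rate = sun_rate(generated, database)
print(rate)
```

Here only the first "A" and "D" count: the second "A" is a duplicate, "B" is already known, and "C" is unstable. Because the rate is divided by the total number of samples, every duplicate lowers it.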
Why does this matter?
- Faster discovery: Crystalite can quickly propose promising crystal structures, helping scientists explore the huge space of possibilities more efficiently before running expensive physics checks.
- Practical design: Because it’s simpler and faster than many earlier models, Crystalite can be scaled up to screen many more candidates, speeding up research into better batteries, semiconductors, magnets, and more.
- Smarter simplicity: The work shows you don’t always need very complex geometry-heavy AI to do well. With the right chemical fingerprints and a clever geometry nudge, a streamlined Transformer can perform strongly.
In short, Crystalite combines smart chemical and geometric hints with a fast, simple Transformer to predict and invent crystals accurately and quickly. This could help accelerate the search for new materials that power future technologies.
Knowledge Gaps
Knowledge gaps, limitations, and open questions
Below is a consolidated list of concrete gaps and open questions that the paper leaves unresolved, prioritized to guide future research.
- Geometry inductive bias scope:
- The Geometry Enhancement Module (GEM) injects only pairwise minimum-image distances and displacements as additive attention biases; it does not encode angular, dihedral, or multi-body geometric information. Would incorporating higher-order geometric features further improve structural fidelity and stability?
- GEM operates at O(N^2) pairwise complexity and has no neighbor cutoffs or sparsification; how does this scale to larger cells (e.g., >100 atoms) or dense structures, and can neighbor lists or locality-aware attention preserve speed while retaining accuracy?
- GEM uses minimum-image metrics but ignores symmetry-group information; can explicit space-group or Wyckoff-position awareness be integrated without sacrificing transformer simplicity?
- Periodicity and loss formulation:
- Training on coordinates uses a componentwise wrapped residual on the torus, while GEM’s geometry uses minimum-image displacements under the lattice metric; the mismatch between training loss and attention bias could induce suboptimal gradients. Would a metric-aware toroidal loss (geodesic/minimum-image in lattice metric) improve convergence and accuracy?
- Noise on fractional coordinates is added via a Gaussian in a centered Euclidean cube and then wrapped, which is only an approximation to diffusion on a torus. How do more principled torus diffusion schemes (e.g., SDEs on Lie groups) compare in practice?
- Lattice representation and invariances:
- The lower-triangular lattice parameterization is basis-dependent; the model relies on Niggli reduction and does not augment over equivalent bases. How sensitive is performance to cell-choice ambiguities and Niggli-reduction edge cases, and would basis-augmentation or basis-invariant objectives reduce this sensitivity?
- The single global lattice token may be too coarse to capture long-range lattice-structure couplings; would structured lattice encodings (e.g., multiple tokens or hierarchical representations) improve generation and CSP accuracy?
- Subatomic Tokenization limits:
- The tokenization uses period, group, block, and valence-shell occupancies (ground-state, gas-phase properties) but omits key chemistry signals such as electronegativity, ionic/covalent radii, oxidation states, typical coordination preferences, and spin states. Does enriching tokens with such properties improve compositional and structural realism?
- Nearest-token decoding restricts generation to a fixed element set and may induce discontinuities at element boundaries during sampling. How robust is decoding when tokens drift between chemically similar elements, and can continuous-to-discrete mappings be made smoother or learned end-to-end?
- Generalization to unseen elements (beyond the 89 used) or to isotopes/allotropes is not evaluated; what modifications are needed for element-extrapolative generation?
- Composition modeling and novelty:
- The number of atoms N is sampled from the empirical training distribution rather than modeled; can N be learned or controlled (e.g., via an explicit count prior or autoregressive head) to improve controllability and novelty?
- The model exhibits composition memorization pressures (necessitating heavy downweighting of atom-type loss). Are there principled mechanisms (e.g., mutual-information regularizers, novelty constraints, or explicit compositional priors) to balance stability vs. diversity beyond heuristic loss weights?
- Charge neutrality, oxidation-state consistency, and stoichiometric validity are not enforced by construction; can these constraints be integrated into training or decoding to reduce invalid compositions while maintaining novelty?
- Stability evaluation and benchmarking:
- Stability and SUN rely on MLIP-based (NequIP) relaxations rather than DFT; the extent to which MLIP relaxations correlate with DFT ground-truth remains uncertain. How do rankings and SUN change under DFT validation for a representative subset?
- Sensitivity to the choice of MLIP (architecture, training data, and uncertainty) is not characterized. Would ensembles or uncertainty-aware relaxations yield more reliable stability assessments and checkpoint selection?
- Sample-extensive metrics (e.g., UN) decrease with sample count, but large-scale (e.g., ≥10^6) generation is shown only for ADiT vs. Crystalite. What are the asymptotics across more baselines, and how should standardized budgets be set for fair comparisons?
- Generalization and domain coverage:
- De novo generation is evaluated primarily on MP-20; generalization to larger cells (e.g., MPTS-52-level sizes), complex frameworks (e.g., MOFs), low-dimensional crystals (2D materials), and systems with large pores or long-range order is not reported.
- The method does not address disorder, partial occupancies, defects, dopants/solid solutions, vacancies, or interstitials. What adaptations are required to handle these prevalent real-world crystal phenomena?
- Magnetic ordering, charge states, and electron count/spin degrees of freedom are not modeled; can conditioning or auxiliary channels capture these effects for materials where magnetism or redox chemistry is essential?
- Sampling and training heuristics:
- The channel-wise anti-annealing is a heuristic time-warp without theoretical guarantees; its stability, generality across datasets, and interaction with different noise schedules or samplers (beyond EDM Heun) are not fully explored. Can principled adaptive samplers yield similar gains?
- Speed–quality trade-offs (e.g., number of diffusion steps, FlashAttention variants) are not systematically ablated; what is the Pareto frontier of sampling cost vs. SUN/accuracy, and how does it compare to equivariant baselines across varying budgets?
- Symmetry and crystallographic fidelity:
- Space-group prediction accuracy, Wyckoff-site recoveries, and symmetry-consistency of generated structures are not reported. Does GEM improve symmetry fidelity, and can explicit symmetry-aware heads or losses further reduce spurious symmetry breaking?
- Duplicate-equivalent cells (symmetry-equivalent or supercell/primitive-cell variants) can confound uniqueness metrics; how robust are the diversity metrics to cell choice, and should canonicalization beyond Niggli (e.g., symmetry-aware canonical cells) be introduced during evaluation?
- Robustness and uncertainty:
- No uncertainty estimates or run-to-run variability/error bars are reported for key metrics (CSP MR/RMSE, SUN). How stable are results across seeds and training repeats?
- The model has not been stress-tested for out-of-distribution compositions/timeframes beyond MPTS-52’s temporal shift (e.g., unseen chemistry families or extreme stoichiometries). What are failure modes under strong distribution shifts?
- Conditioning and inverse design:
- Property-conditioned generation and inverse design (e.g., stability targets, bandgap, ionic conductivity) are not explored; how can Crystalite be extended with property predictors or differentiable controllers for guided discovery?
- Controllability over space group, lattice type, or prototype (e.g., perovskite, spinel) is not provided; can discrete/continuous conditioning interfaces be added without degrading speed?
- Interpretability and analysis:
- Attention patterns with GEM are not analyzed; how do distance-based biases alter head specialization (local vs. long-range), and which geometric features are most used across noise levels?
- The contribution of each inductive bias (Subatomic Tokenization vs. GEM vs. anti-annealing) is only partially ablated; finer-grained ablations (e.g., GEM without distance term, different RBFs/Fourier encodings, or token chemistry variants) would clarify causal impacts.
- Practical deployment and synthesis relevance:
- Beyond energetic stability, practical synthesizability (e.g., kinetic accessibility, precursor availability, toxicity, or environmental constraints) is not evaluated. Can these downstream constraints be integrated into training objectives or post-selection filters?
- No evaluation of generated materials’ properties (mechanical, electronic, ionic) beyond stability is provided; does Crystalite produce candidates with desirable functional-property distributions?
- Implementation and reproducibility:
- Hyperparameter sensitivity (loss weights, PCA dimension d_H, number of heads/layers) and data preprocessing choices (e.g., Niggli reduction variants) are not systematically studied; robust default settings and sensitivity analyses would aid adoption.
- The fixed PCA basis for tokens is not learned jointly; would end-to-end learned chemical embeddings (initialized with periodic-table priors) outperform fixed compressed descriptors?
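The periodicity items above refer to two operations that are easy to state in code: wrapping noisy fractional coordinates back into the cell, and taking a componentwise wrapped residual. The following is a minimal sketch under simplifying assumptions (coordinates kept in [0, 1) rather than a centered cube, hypothetical function names); it is meant to show what "wrapped" means, not to reproduce the paper's training code.

```python
import random


def wrap_unit(x):
    """Map a coordinate into [0, 1)."""
    y = x % 1.0
    return 0.0 if y == 1.0 else y  # guard against float rounding up to 1.0


def wrapped_residual(pred, target):
    """Componentwise difference pred - target mapped into [-0.5, 0.5)."""
    return [wrap_unit(p - t + 0.5) - 0.5 for p, t in zip(pred, target)]


def add_wrapped_noise(frac, sigma, rng):
    """Add Gaussian noise to fractional coordinates, then wrap back into the cell."""
    return [wrap_unit(f + rng.gauss(0.0, sigma)) for f in frac]


rng = random.Random(0)
noisy = add_wrapped_noise([0.01, 0.5, 0.99], sigma=0.1, rng=rng)
near_edge = wrapped_residual([0.98], [0.02])  # naive difference would be 0.96
print(noisy, near_edge)
```

The residual near the cell edge comes out as about -0.04 instead of 0.96, because 0.98 and 0.02 are neighbours across the boundary. Note that this residual is measured per fractional axis, whereas a minimum-image distance under the lattice metric also depends on the cell shape, which is the mismatch the first item describes.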
Practical Applications
Overview
Below are practical, real-world applications that follow directly from the paper’s findings and innovations (Crystalite’s lightweight diffusion Transformer, Subatomic Tokenization, and the Geometry Enhancement Module). Applications are grouped by deployment horizon and annotated with sectors, concrete tools/workflows that could emerge, and key assumptions/dependencies that affect feasibility.
Immediate Applications
- Crystal structure prediction (CSP) as a service (industry, academia; software)
- What: Deploy Crystalite for fast, accurate prediction of lattice and atomic positions given composition (SOTA MR and RMSE across MP-20, MPTS-52, Alex-MP-20).
- Tool/workflow: “CSP-lite” API for R&D groups to upload compositions and retrieve predicted structures; batch jobs integrated into materials informatics pipelines.
- Assumptions/dependencies: Composition is known and within training distribution (inorganic, small-to-moderate unit cells). Predictions still benefit from subsequent relaxation (MLIP/DFT) for final validation.
- High-throughput pre-screening to cut DFT queues (industry, academia; energy, semiconductors, catalysis; software/HPC)
- What: Use Crystalite to generate candidate structures rapidly (10k structures in minutes) and filter with a fast MLIP-relaxation step before committing expensive DFT.
- Tool/workflow: “Crystalite → MLIP relax → DFT shortlist” triage pipeline to reduce total compute and turnaround time.
- Assumptions/dependencies: Stability estimates via MLIP (e.g., NequIP) are proxies; final ranking needs DFT/experiment. Data and MLIP quality strongly influence recall/precision.
- Rapid de novo proposal generation for materials discovery campaigns (industry, academia; energy, electronics; software)
- What: Generate diverse, plausible crystal candidates optimized for SUN rate (stable–unique–novel) and tuned via loss-balancing and checkpoint selection.
- Tool/workflow: Weekly “proposal drop” into corporate/consortia material funnels, feeding domain-specific property screens (band gap, conductivity, elasticity).
- Assumptions/dependencies: Diversity/stability trade-offs must be managed; novelty can decline at large sampling scales unless training/sampling are tuned.
- Geometry-aware attention in atomistic Transformers (software, academia)
- What: Port GEM’s periodic minimum-image geometric biases as a plug-in to other Transformer backbones for atomistic tasks (e.g., interatomic potential learning, defect modeling).
- Tool/workflow: “GEM-attention” module library for PyTorch/JAX Transformers with periodic boundary condition support.
- Assumptions/dependencies: Benefits are strongest when periodic geometry matters; requires correct lattice handling and minimum-image calculations.
- Subatomic Tokenization as a reusable representation (software, academia; education)
- What: Replace one-hot element encodings with compact, chemically structured tokens to improve learning efficiency and interpolation in chemical space.
- Tool/workflow: Token feature library for property predictors, generative models, and dataset explorers; tutorials for students to visualize token neighborhoods.
- Assumptions/dependencies: Token design (period, group, block, valence occupancy) captures relevant chemistry for in-domain tasks; careful normalization/PCA alignment required.
- Interactive crystal ideation on a single GPU (software; education, SMEs)
- What: Exploit Crystalite’s fast sampling (seconds per 1k structures with optimized inference) to power an interactive “sketch-and-generate” UI for exploring compositions and structures.
- Tool/workflow: Web app where users input formula ranges or size constraints and get candidate structures with quick MLIP sanity checks.
- Assumptions/dependencies: Hardware availability (single modern GPU), MLIP in the loop for filtering, and guardrails against trivial memorization.
- Benchmarking and evaluation standardization (academia, policy; software)
- What: Adopt SUN and related metrics with explicit reporting of sample budgets; re-run baselines in unified pipelines (e.g., LeMat-GenBench).
- Tool/workflow: CI-ready evaluation scripts with fixed relaxation protocol; leaderboard submissions referencing sample-extensive vs. intensive metrics.
- Assumptions/dependencies: Community agreement on pipelines; consistent MLIP/DFT settings across studies.
- Curriculum and training modules in data-driven crystallography (academia; education)
- What: Use the open-source code to teach diffusion Transformers, periodic geometry handling, and evaluation trade-offs (stability vs. diversity).
- Tool/workflow: Lab exercises where students train small Crystalite models on subsets and analyze the novelty–stability frontier.
- Assumptions/dependencies: Classroom GPU access; curated subsets of MP-like datasets with permissive licenses.
- Cost and carbon footprint reduction in compute-heavy screening (industry, policy; sustainability)
- What: Replace a portion of brute-force DFT exploration with Crystalite+MLIP pre-filtering to lower compute cost and emissions.
- Tool/workflow: “Green-screening” policy in R&D roadmaps that mandates ML pre-screening prior to DFT.
- Assumptions/dependencies: Validated correlation between ML-screened ranks and DFT outcomes in target domains.
- Faster CSP for experimental interpretation (academia, industry; materials characterization)
- What: Provide candidate structures consistent with known composition to guide interpretation of diffraction or microscopy data during structure solution.
- Tool/workflow: “Suggest candidates” module in structure-solution suites to narrow search and reduce manual effort.
- Assumptions/dependencies: Not a replacement for full pattern fitting/refinement; additional conditioning (e.g., cell parameters) may be needed for tight experimental alignment.
Long-Term Applications
- Closed-loop autonomous materials discovery (industry, academia; robotics, lab automation)
- What: Integrate Crystalite into self-driving labs that generate candidates, simulate (MLIP/DFT), select, synthesize, and characterize in cycles.
- Tool/workflow: “Crystalite-in-the-loop” orchestrator with active learning for MLIP/DFT and feedback from experiments to retrain generation policies.
- Assumptions/dependencies: Robust sample management, synthesizability predictors, safe exploration policies, and automated characterization pipelines.
- Property-conditional and multi-objective inverse design (industry; energy, electronics, catalysis)
- What: Extend Crystalite to condition on target properties (e.g., band gap, ionic conductivity, CO2 adsorption) and constraints (abundance, toxicity).
- Tool/workflow: Reinforcement learning/conditional diffusion wrappers with property predictor surrogates and Pareto-front exploration.
- Assumptions/dependencies: Accurate, differentiable property models; curated labels; methods for constraint satisfaction and uncertainty calibration.
- Scaling to larger, more complex materials classes (academia, industry; MOFs, alloys, defects, surfaces)
- What: Adapt architecture and training to handle larger unit cells, disorder, defects, and non-stoichiometric systems; extend to MOFs and layered materials.
- Tool/workflow: Hierarchical tokenization, mixed representations (Wyckoff/site graphs), and multi-scale GEM for long-range periodicity.
- Assumptions/dependencies: Availability of large, high-quality datasets; handling of symmetry/disorder; memory-efficient training.
- Polymorph and phase map exploration (industry; pharma, mining, electronics)
- What: Systematically generate polymorphs of a given composition to map metastable phases and operating-condition ranges.
- Tool/workflow: “Polymorph explorer” with temperature/pressure-aware scoring and kinetic accessibility heuristics.
- Assumptions/dependencies: Thermodynamics/kinetics models beyond 0 K approximations; domain-specific validation (especially for organics/pharma).
- Natural-language-guided crystal design with LLMs (software, academia; cross-sector)
- What: Combine LLMs that capture domain heuristics with Crystalite as a structured geometric generator for controllable design prompts.
- Tool/workflow: “Chat-to-crystal” agent that translates design intents (e.g., “sulfide fast-ion conductor”) into conditional generation and screening workflows.
- Assumptions/dependencies: Reliable grounding of LLMs, robust interfaces for constraints, and safeguards against hallucinations.
- Supply-chain and criticality-aware materials exploration (policy, industry; sustainability, security)
- What: Embed criticality/cost constraints into generation to prioritize earth-abundant, non-toxic compositions for strategic sectors (batteries, magnets, PV).
- Tool/workflow: Criticality-weighted objectives and filters coupled to public databases (USGS, EC criticality lists).
- Assumptions/dependencies: Up-to-date criticality data; methods to encode scarcity/cost into model objectives without crippling diversity.
- On-device or edge inference for lab instruments (industry; instrumentation)
- What: Deploy pruned/quantized versions of Crystalite to run near real-time candidate generation on instrument-adjacent hardware (e.g., during beam time).
- Tool/workflow: Lightweight inference runtimes with GEM kernels and FlashAttention on small GPUs/NPUs.
- Assumptions/dependencies: Efficient quantization without loss of geometric fidelity; model distillation strategies.
- Safety and governance frameworks for AI-generated materials (policy; standards)
- What: Develop standards for reporting sample budgets, stability proxies, and verification tiers (MLIP, DFT, experiment) in AI-driven discovery claims.
- Tool/workflow: Certification checklists and audit trails for AI-assisted materials nominations in regulated domains.
- Assumptions/dependencies: Community consensus and coordination with journals, funders, and standards bodies.
- Cross-modal integration with experimental constraints (academia, industry; characterization)
- What: Condition generation on partial experimental signals (e.g., lattice constants, space group, partial XRD peaks) for faster structure solution.
- Tool/workflow: Constraint-aware diffusion (project-and-denoise) with space-group and unit-cell priors.
- Assumptions/dependencies: Robust conditioning mechanisms; curated paired datasets (signals ↔ structures).
- Active learning of force fields during generation (academia; software)
- What: Co-train Crystalite with MLIPs by selectively labeling uncertain candidates with DFT, improving both generative realism and stability scoring.
- Tool/workflow: Uncertainty-driven sampler (e.g., ensemble disagreement) orchestrating DFT calls and retraining schedules.
- Assumptions/dependencies: Reliable uncertainty quantification; compute budget for periodic DFT updates.
- Sector-targeted discovery programs (industry; batteries, power electronics, catalysts)
- What: Launch focused campaigns (e.g., solid electrolytes, wide-bandgap oxides/nitrides, oxidation-resistant coatings) using Crystalite-led proposal streams.
- Tool/workflow: Domain-tuned training (data curation, loss weights), property filters, and synthesis playbooks tied to each sector.
- Assumptions/dependencies: Sufficient in-domain training data; validated property models and feasible synthesis routes.
Cross-cutting assumptions and dependencies
- Domain of validity: Demonstrated on inorganic crystalline datasets with up to tens of atoms per cell; generalization to organics/MOFs/disordered systems requires further work.
- Stability evaluation: MLIP-based stability proxies are helpful but not substitutes for DFT/experiment; downstream validation remains essential.
- Data quality and bias: Training data coverage shapes model behavior; novelty may decline as sampling scales without countermeasures (loss balancing, checkpointing).
- Hardware/software: While sampling is lightweight, training remains non-trivial; optimized inference (FlashAttention, mixed precision) improves throughput.
- Experimental translation: Synthesizability, kinetics, and scale-up considerations are not modeled directly and must be incorporated in application workflows.
Glossary
- Adaptive Layer Normalization (AdaLN): A normalization layer whose scale/shift are conditioned on an external signal (here, the noise level) to modulate each block’s activations. Example: "adaptive layer normalization (AdaLN)"
- Anti-annealing (channel-wise): A sampling heuristic that accelerates denoising for specific channels by scaling the reverse-time update more aggressively than standard schedules. Example: "channel-wise anti-annealing"
- Cartesian coordinates: 3D positions in Euclidean space obtained by multiplying fractional coordinates by the lattice matrix. Example: "The corresponding Cartesian coordinates are given by ."
- Cosine-similarity decoding: Mapping a continuous token to a discrete class by choosing the prototype with maximum cosine similarity. Example: "which is equivalent to cosine-similarity decoding because all token vectors are normalized."
- Crystal Structure Prediction (CSP): The task of predicting a crystal’s lattice and atomic positions given its composition. Example: "We evaluate Crystalite in two settings: de novo generation (DNG) and crystal structure prediction (CSP)."
- De novo generation (DNG): Generating full crystal structures (composition, coordinates, lattice) from noise without conditioning on known compositions. Example: "de novo generation (DNG)"
- Density Functional Theory (DFT): An ab initio electronic-structure method used to evaluate stability and properties of materials. Example: "density functional theory (DFT)"
- EDM (Elucidated Diffusion Models): A diffusion modeling framework with specific noise schedules and preconditioning used for training and sampling. Example: "As in EDM, the noisy inputs and raw network outputs are combined"
- Equivariant Graph Neural Networks (GNNs): Neural architectures that preserve symmetry under geometric transformations, commonly used for atomistic systems. Example: "equivariant graph neural networks (GNNs)"
- Exponential Moving Average (EMA): A running average of model parameters that emphasizes recent updates for more stable inference. Example: "maintain an exponential moving average (EMA) of the parameters"
- Fractional coordinates: Atom positions expressed relative to the unit cell, wrapped to the [0,1) interval along each axis. Example: "fractional coordinates "
- Fourier features: Sinusoidal feature mappings that encode periodic structure for downstream neural processing. Example: "via Fourier features"
- Geometry Enhancement Module (GEM): An attention-biasing mechanism that injects periodic minimum‑image pair geometry directly into Transformer attention. Example: "Geometry Enhancement Module (GEM)"
- Heun-style update: A second-order numerical integration step (predictor-corrector) used here to improve diffusion sampling accuracy. Example: "standard Heun-style EDM update"
- Karras schedule: A noise scheduling strategy for diffusion processes that controls step sizes across sampling time. Example: "derived from an auxiliary Karras schedule"
- Latent lattice vector: A 6D unconstrained parameterization that reconstructs a lower-triangular lattice matrix with positive diagonals. Example: "The latent lattice vector "
- Lattice metric: The metric induced by the lattice matrix, used to compute distances under periodicity. Example: "the lattice metric "
- Minimum-image (convention): Selecting the nearest periodic image of an atom pair to compute physically meaningful displacements/distances. Example: "minimum-image fractional displacement"
- MLIP (Machine-Learning Interatomic Potential): Learned surrogate potential used to estimate stability and relax structures efficiently. Example: "MLIP-based stability estimates"
- NequIP: An equivariant neural interatomic potential used for structure relaxation in the evaluation pipeline. Example: "NequIP-based relaxation"
- Niggli-reduced cell: A canonical reduced representation of a crystal lattice that removes basis ambiguity. Example: "Niggli-reduced cell"
- Periodic boundary conditions (PBC): Modeling assumption that the simulation cell repeats infinitely in all directions, enforcing periodicity. Example: "periodic boundary conditions (PBC)"
- Principal Component Analysis (PCA): A linear dimensionality reduction technique used here to compress element descriptors. Example: "using PCA"
- Radial Basis Function (RBF) kernel: A distance-based feature mapping used to encode pairwise distances for the attention bias. Example: "Radial Basis Function (RBF) kernel"
- Riemannian flow matching: A generative modeling approach that defines flows on curved manifolds (e.g., periodic spaces). Example: "extends Riemannian flow matching"
- Subatomic Tokenization: A chemically informed, low-dimensional continuous representation of atom types replacing one‑hot encodings. Example: "Subatomic Tokenization"
- Torus: The manifold representing periodic coordinate spaces (e.g., fractional coordinates modulo 1). Example: "Lie group structure of the torus."
- Unit cell: The fundamental repeating cell of a crystal from which the full lattice is generated by periodic tiling. Example: "unit-cell description of a crystal"
- Wasserstein-based distribution metrics: Measures of distributional alignment using Wasserstein (earth mover’s) distances. Example: "Wasserstein-based distribution metrics"
- Wrapped residual: A difference computed modulo 1 to respect periodicity in fractional coordinate space. Example: "componentwise wrapped residual"
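To connect the glossary entries on Subatomic Tokenization and cosine-similarity decoding, here is a small sketch of decoding a continuous token back to an element. The three-number descriptor (period, group, valence electrons) and the six-element table are assumptions chosen for brevity; the paper's tokens are described as PCA-compressed descriptors that also include block and shell occupancies, which this example does not reproduce.

```python
import math

# Hypothetical raw descriptors: (period, group, valence electrons).
RAW = {
    "Li": (2, 1, 1),
    "Na": (3, 1, 1),
    "K": (4, 1, 1),
    "O": (2, 16, 6),
    "S": (3, 16, 6),
    "Ti": (4, 4, 4),
}


def normalize(v):
    """Scale a vector to unit length."""
    norm = math.sqrt(sum(c * c for c in v))
    return tuple(c / norm for c in v)


TOKENS = {element: normalize(v) for element, v in RAW.items()}


def decode(token):
    """Return the element whose unit-length token has the highest cosine similarity."""
    unit = normalize(token)
    return max(TOKENS, key=lambda el: sum(a * b for a, b in zip(unit, TOKENS[el])))


# A token that has drifted slightly away from sodium still decodes to sodium.
drifted = (0.914534, 0.291511, 0.301511)
print(decode(drifted))
```

Because every stored token has unit length, picking the highest cosine similarity is the same as picking the nearest token, and in this toy table the closest neighbours of sodium are the other alkali metals.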
