Patterson Maps: Crystallography & Deep Learning
- Patterson maps are 3D real-space functions encoding all interatomic vectors via the inverse Fourier transform of the squared structure-factor amplitudes, making them vital for structural inference.
- They are computed directly from measured diffraction intensities and thus sidestep the phase problem, but they are centrosymmetric and translation invariant, properties that must be handled when they are used for structural inference.
- Integration with deep learning architectures such as CrysFormer and RecCrysFormer significantly improves phase error metrics and correlation, advancing macromolecular structure determination.
A Patterson map is a three-dimensional real-space function fundamental to X-ray crystallography. It encodes information on all interatomic vectors in a crystal using only the measured diffraction intensities, thereby sidestepping the need for the phase information lost in measurement (the phase problem). The computational construction of Patterson maps, their use for structural inference, and their integration into contemporary deep learning frameworks such as CrysFormer and RecCrysFormer form a cornerstone of recent advances in the direct utilization of experimental data for macromolecular structure determination (Pan et al., 13 Nov 2025, Dun et al., 2023, Pan et al., 28 Feb 2025, Hurwitz, 2020).
1. Mathematical Definition and Foundational Properties
Given a crystal of unit-cell volume $V$ and structure factors $F_{hkl}$ (where $h, k, l$ are Miller indices), only the squared amplitudes $|F_{hkl}|^2$, derived from measured intensities $I_{hkl}$, are experimentally accessible. The Patterson function, $P(u, v, w)$, is defined as

$$P(u, v, w) = \frac{1}{V} \sum_{h,k,l} |F_{hkl}|^2 \cos\!\left[2\pi (hu + kv + lw)\right],$$

where $(u, v, w)$ labels points in fractional coordinates.
Equivalently, in Fourier space notation, the Patterson map is the inverse Fourier transform of the squared structure factor amplitudes, i.e., $P(\mathbf{u}) = \mathcal{F}^{-1}\!\left[|F_{hkl}|^2\right]$. In real space, the Patterson map is the autocorrelation of the electron density, $P(\mathbf{u}) = \int_V \rho(\mathbf{r})\,\rho(\mathbf{r} + \mathbf{u})\,d\mathbf{r}$. Each peak in $P(\mathbf{u})$ corresponds to a vector between two atoms, and the number of peaks scales as $N^2$ (for $N$ atoms). The absence of phase information causes ambiguity in reconstructing absolute atomic positions, and the map is centrosymmetric and translation invariant (Pan et al., 13 Nov 2025, Dun et al., 2023, Hurwitz, 2020).
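These two constructions coincide numerically. The following is a minimal NumPy sketch verifying the autocorrelation identity and the centrosymmetry of $P(\mathbf{u})$ on a toy density; the grid size and Gaussian "atom" placements are illustrative assumptions, not taken from the cited works.

```python
import numpy as np

# Toy periodic "electron density": three Gaussian blobs on a small grid.
n = 16
idx = np.indices((n, n, n))
rho = np.zeros((n, n, n))
for center in [(3, 4, 5), (9, 6, 4), (7, 12, 10)]:
    d2 = sum((i - c) ** 2 for i, c in zip(idx, center))
    rho += np.exp(-d2 / 2.0)

# Patterson map as the inverse Fourier transform of |F|^2 (phases discarded).
F = np.fft.fftn(rho)
patterson = np.fft.ifftn(np.abs(F) ** 2).real

# Check against the direct autocorrelation  P(u) = sum_r rho(r) * rho(r + u)
# for a few offsets (with periodic wrap-around).
for u in [(0, 0, 0), (6, 2, 1), (4, 8, 5)]:
    shifted = np.roll(rho, shift=tuple(-c for c in u), axis=(0, 1, 2))
    assert np.isclose(patterson[u], np.sum(rho * shifted))

# Centrosymmetry: P(u) == P(-u) on the periodic grid.
inverted = np.roll(patterson[::-1, ::-1, ::-1], shift=(1, 1, 1), axis=(0, 1, 2))
assert np.allclose(patterson, inverted)
```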
2. Physical Significance and Implications for the Phase Problem
The Patterson map leverages the fact that all interatomic vectors manifest as peaks in $P(\mathbf{u})$, so it retains complete information about pairwise atomic separations. However, lacking phase information, $P(\mathbf{u})$ loses any knowledge of the absolute atomic positions, is invariant under global translation, and is centrosymmetric. The direct interpretability of Patterson maps is limited for macromolecular crystals: peak overlap and the dominance of signals from heavier atoms hinder the resolution of individual interatomic vectors. However, these maps are directly computable from experimental data, making them attractive for direct integration with computational and machine learning workflows (Pan et al., 13 Nov 2025, Hurwitz, 2020).
3. Computation and Representation of Patterson Maps
Experimental computation begins with structure factor amplitudes calculated as the square root of measured intensities, $|F_{hkl}| = \sqrt{I_{hkl}}$. Applying an inverse discrete Fourier transform to $|F_{hkl}|^2$ with the inclusion of crystallographic symmetry (e.g., Friedel's law), the Patterson map is generated, typically using dedicated crystallographic FFT utilities such as CCP4's fft. The map is discretized on a Cartesian grid, usually with isotropic voxel spacing near 0.5 Å, and normalized to a standardized range for machine learning compatibility. Recent approaches center the input and partial structure maps at the cell's center of mass, eliminating translational ambiguity (Pan et al., 13 Nov 2025, Pan et al., 28 Feb 2025, Dun et al., 2023).
Table 1: Summary of Patterson Map Construction Steps
| Input Data | Processing Procedure | Output Format |
|---|---|---|
| Diffraction intensities $I_{hkl}$ | Amplitudes: $\lvert F_{hkl}\rvert = \sqrt{I_{hkl}}$ | Squared amplitudes grid |
| Squared amplitudes $\lvert F_{hkl}\rvert^2$ | Inverse DFT (e.g., CCP4 fft), normalized by cell volume $V$ | 3D Patterson map grid |
| Partial structure templates | Structure factor calculation, FFT to density | 3D partial density grid |
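The steps in Table 1 reduce to a few array operations once the intensities have been placed on a full reciprocal-space grid with Friedel mates filled in. The following is a minimal NumPy sketch under that assumption; it is not a substitute for crystallographic tools such as CCP4's fft, and the function name and gridding convention are hypothetical.

```python
import numpy as np

def patterson_from_intensities(intensity_grid, cell_volume):
    """Compute a normalized 3D Patterson map from gridded intensities.

    intensity_grid: 3D array of measured intensities I_hkl on a full
        reciprocal-space grid (Friedel-related reflections filled in,
        unmeasured reflections set to zero).
    cell_volume: unit-cell volume V used for the 1/V normalization.
    """
    # Amplitudes |F_hkl| = sqrt(I_hkl); the Patterson map only needs |F|^2,
    # so intensities could be used directly, but we mirror the tabulated steps.
    amplitudes = np.sqrt(np.clip(intensity_grid, 0.0, None))
    squared = amplitudes ** 2

    # Inverse DFT of |F|^2 gives the Patterson function, scaled by 1/V.
    patterson = np.fft.ifftn(squared).real / cell_volume

    # Min-max normalization to a standard range for machine-learning input.
    pmin, pmax = patterson.min(), patterson.max()
    return (patterson - pmin) / (pmax - pmin + 1e-12)
```

In practice the grid would be chosen to give roughly 0.5 Å voxels, and the resulting map would then be centered as described above.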
4. Symmetry, Ambiguity, and Conflict Removal in Machine Learning
Patterson maps possess inherent symmetries: translation invariance, centrosymmetry, and ambiguity as to which atom of a pair serves as the origin of each interatomic vector. For effective machine learning, training data must pair each map with a unique output, avoiding conflicts:
- Translation invariance is addressed by centering coordinates.
- Centrosymmetric inversion is managed by simultaneously presenting both original and inverted atomic densities as the output.
- Vector-origin ambiguity (especially for cubic grids) is handled by spatially confining outputs: all atoms are restricted to a centered sub-box with interatomic distances less than half the grid size, ensuring assignment of peaks to vector origins is unique (Hurwitz, 2020).
Failure to enforce these uniqueness constraints leads to poor convergence or generalization in neural networks trained from Patterson maps (Hurwitz, 2020).
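As an illustration, these constraints might be enforced on training coordinates roughly as follows. This is a minimal sketch in fractional coordinates; the function name and the exact sub-box test are assumptions for illustration, not code from Hurwitz (2020).

```python
import numpy as np

def prepare_unique_target(frac_coords):
    """Enforce uniqueness constraints on atomic positions before rendering targets.

    frac_coords: (N, 3) array of atomic positions in fractional coordinates.
    Returns centered coordinates and their centrosymmetric inversion, both
    intended to be rendered into density maps as paired training targets.
    """
    coords = np.asarray(frac_coords, dtype=float)

    # 1. Remove translational ambiguity: shift the center of mass to the
    #    middle of the cell, (0.5, 0.5, 0.5).
    centered = coords - coords.mean(axis=0) + 0.5

    # 2. Vector-origin ambiguity: require the structure to fit in a centered
    #    sub-box so that all interatomic distances are below half the cell edge.
    span = centered.max(axis=0) - centered.min(axis=0)
    if np.any(span >= 0.5):
        raise ValueError("structure exceeds the centered sub-box; "
                         "peak-to-origin assignment would be ambiguous")

    # 3. Centrosymmetry: also provide the inverted copy, so the target is
    #    well defined up to the inherent +/- ambiguity.
    inverted = 1.0 - centered  # inversion through the cell center

    return centered, inverted
```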
5. Integration into Deep Learning Architectures
The CrysFormer and RecCrysFormer frameworks are representative state-of-the-art models that use Patterson maps as primary input channels. Patterson maps and partial structure templates (e.g., incomplete AlphaFold2 models or standardized residue densities) are encoded as 3D tensors. A 3D CNN "stem" processes these tensors and feeds embedded patches into a transformer or cross-modal attention core. Predictions comprise volumetric electron density maps, which can be crystallographically refined into atomic models (Pan et al., 13 Nov 2025, Dun et al., 2023, Pan et al., 28 Feb 2025).
Key architectural features include:
- Patch-based tokenization of map volumes.
- One-way cross-modal attention allowing Patterson tokens to attend to partial-structure tokens, but not vice versa (see the sketch after this list).
- CNN-based reconstruction heads for dense output generation.
- Loss functions that combine mean squared error and (optionally) negative Pearson correlation in real or Fourier space.
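Below is a minimal PyTorch sketch of the one-way cross-modal attention and the combined loss listed above; layer sizes, token layouts, and names are illustrative assumptions, not the published CrysFormer implementation.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class OneWayCrossAttention(nn.Module):
    """Patterson tokens attend to partial-structure tokens, not vice versa."""

    def __init__(self, dim: int, num_heads: int = 8):
        super().__init__()
        self.attn = nn.MultiheadAttention(dim, num_heads, batch_first=True)
        self.norm = nn.LayerNorm(dim)

    def forward(self, patterson_tokens, partial_tokens):
        # Queries come from the Patterson stream; keys/values come from the
        # partial-structure stream, which is left unchanged.
        attended, _ = self.attn(query=patterson_tokens,
                                key=partial_tokens,
                                value=partial_tokens)
        return self.norm(patterson_tokens + attended)

def density_loss(pred, target, pearson_weight: float = 1.0):
    """Mean squared error plus weighted negative Pearson correlation (real space)."""
    mse = F.mse_loss(pred, target)
    p = pred.flatten(1) - pred.flatten(1).mean(dim=1, keepdim=True)
    t = target.flatten(1) - target.flatten(1).mean(dim=1, keepdim=True)
    corr = (p * t).sum(dim=1) / (p.norm(dim=1) * t.norm(dim=1) + 1e-8)
    return mse - pearson_weight * corr.mean()
```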
A novel training regimen, "recycling", incorporates output predictions or refined maps as templates for subsequent training iterations, improving phase recovery and robustness (Pan et al., 28 Feb 2025).
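Schematically, recycling can be expressed as a loop that appends the previous prediction to the template stack before the next forward pass. The sketch below assumes a hypothetical model taking (Patterson, templates) tensors; the shapes and number of recycles are illustrative.

```python
import torch

def predict_with_recycling(model, patterson, templates, num_recycles: int = 2):
    """Run the model repeatedly, feeding each prediction back as an extra template.

    patterson: (B, 1, D, H, W) Patterson map tensor.
    templates: (B, T, D, H, W) stack of partial-structure density templates.
    """
    prediction = None
    for _ in range(num_recycles + 1):
        inputs = templates if prediction is None else torch.cat(
            [templates, prediction.detach()], dim=1)  # recycle previous output
        prediction = model(patterson, inputs)          # predicted density volume
    return prediction
```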
6. Empirical Performance and Impact
Quantitative evaluation demonstrates that deep learning models utilizing Patterson maps surpass traditional refinement pipelines on test data:
- CrysFormer achieves an unweighted average phase error of 39.4° (σ=10.3) vs. 48.7° (σ=10.5) for classical sigmaA PARTIAL post-processing. FOM-weighted phase error is reduced to 30.2° (σ=8.4).
- Pearson correlation of predicted structure factors with ground truth reaches 0.867, versus 0.816 for baseline. On challenging cases, CrysFormer improves both phase error and correlation even when partial templates are substantially inaccurate (Pan et al., 13 Nov 2025).
- RecCrysFormer, with recycling, yields test-set Pearson correlations up to ≈0.93, and mean phase error decreases from ~64° initially to ~20°. The fraction of test cases that are successfully refined increases from 79% to 93% after recycling (Pan et al., 28 Feb 2025).
- Ablation and resolution-shell analysis confirm the critical role of Patterson maps: omitting them has a large adverse effect, particularly at higher resolution shells. One-way attention from partial structures further improves density recovery accuracy (Dun et al., 2023).
- Application of standard model building to predicted maps, e.g., PHENIX AutoBuild and SHELXE, results in successful atomic models (R-factor < 0.38) in the overwhelming majority of cases (Dun et al., 2023).
7. Limitations, Scalability, and Future Directions
Despite substantial progress, current methodologies are subject to several constraints:
- Traditional use of Patterson maps is limited by map crowding and overlap in large macromolecular structures.
- Uniqueness constraints, essential for learning, become more complex under full crystallographic symmetry.
- The integration of real-space electron densities with symmetry-adapted Patterson maps will require further architectural and algorithmic advances as models scale to larger and more complex structures (Pan et al., 13 Nov 2025, Hurwitz, 2020).
Ongoing directions include hybridization with established phasing techniques, incorporation of space-group symmetry directly into neural architectures, and continued improvement in robustness to noise and other non-ideal experimental conditions.
Patterson maps have been transformed from a classic crystallographic construct into a high-fidelity conduit between experimental data and predictive machine learning models, enabling direct experimental constraints to drive atomic structure recovery. Their continued development as an interface for both conventional and AI-driven pipelines remains a crucial area in structural biology (Pan et al., 13 Nov 2025, Pan et al., 28 Feb 2025, Dun et al., 2023, Hurwitz, 2020).