Peripheral Surface Information Entropy
- Peripheral Surface Information (PSI) entropy is a thermoinformatic descriptor that measures the statistical variability of a protein’s non-interacting surface to assess binding specificity.
- It employs molecular docking and molecular dynamics to generate conformational ensembles and computes a normalized Shannon entropy reflecting peripheral residue patterns.
- The metric distinguishes specific from non-specific interactions and offers insights for enhancing machine-learning models and designing targeted mutations.
Peripheral Surface Information (PSI) entropy is a thermoinformatic descriptor designed to quantify the statistical variability of a protein’s non-interacting surface (NIS) in relation to binding specificity. PSI entropy captures how the ensemble-level diversity of apolar and charged residue exposure on the periphery of a receptor protein correlates with favorable, focused protein-peptide interactions. By leveraging conformational ensembles generated via molecular docking and molecular dynamics (MD), this framework reveals emergent, low-entropy surface patterns that distinguish cognate from non-cognate binders, and proposes a bridge between peripheral surface architecture, energetic specificity, and evolutionary selection (Grear et al., 31 Jan 2026).
1. Non-Interacting Surface (NIS): Definition and Chemical Classification
In a protein–peptide complex, the non-interacting surface (NIS) is defined as the set of receptor residues that are both solvent-exposed (relative solvent accessibility, RSA, above a fixed cutoff) and located at least 5 Å (heavy-atom distance) from any peptide atom. Residues comprising the interface (buried or within 5 Å) are excluded from the NIS analysis. Each NIS residue is further classified by side-chain type:
| Class | Residue Set | Label |
|---|---|---|
| Apolar | A, V, I, L, M, F, W, Y | A |
| Charged | D, E, K, R | C |
| Polar | All other uncharged, polar residues | P |
The NIS of each microstate (conformation) is described by the count tuple $(n_A, n_C, n_P)$, where $n_A$ is the number of apolar residues, $n_C$ the number of charged residues, and $n_P$ the number of polar residues. Because the total number of NIS residues per complex is fixed, any two of these counts determine the third, and the tuple uniquely indexes a macrostate of peripheral chemical composition.
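The classification and counting step can be sketched in a few lines of Python. This is a minimal illustration, not the authors' code: the function names `classify_residue` and `nis_macrostate` are hypothetical, residues are assumed to be given as one-letter codes, and the tuple order (apolar, charged, polar) follows the table above.

```python
from collections import Counter

# Residue sets taken from the classification table above.
APOLAR = set("AVILMFWY")
CHARGED = set("DEKR")


def classify_residue(aa: str) -> str:
    """Map a one-letter residue code to its class label: A, C, or P."""
    aa = aa.upper()
    if aa in APOLAR:
        return "A"
    if aa in CHARGED:
        return "C"
    # Everything else falls into the polar class, as in the table.
    return "P"


def nis_macrostate(nis_residues) -> tuple:
    """Return the (apolar, charged, polar) count tuple for one microstate.

    `nis_residues` is any iterable of one-letter codes for the residues
    already identified as belonging to the NIS (hypothetical input format).
    """
    counts = Counter(classify_residue(r) for r in nis_residues)
    return (counts["A"], counts["C"], counts["P"])
```

For example, `nis_macrostate("AVKDST")` counts two apolar (A, V), two charged (K, D), and two polar (S, T) residues.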
2. Mathematical Formulation of PSI Entropy
Given a conformational ensemble of $N$ microstates, the distribution of NIS macrostates is quantified by counting how often each tuple occurs:
- Empirical probability: $p_i = N_i / N$, where $N_i$ is the number of microstates in macrostate $i$, with $\sum_i p_i = 1$.
A base diversity metric is the Shannon entropy over $\{p_i\}$:
$$H = -\sum_i p_i \log p_i$$
To adjust for energetic specificity, $H$ is normalized by an interface-focused factor. For all inter-chain residue–residue contacts:
- Assign unnormalized “mass” proportional to contact probability.
- Apply a chemical pair weighting to each contact (see Supplement Table S2 in Grear et al., 31 Jan 2026).
- Compute the total contact mass and the total number of distinct contact pairs.
The interface factor is defined from these two quantities, and the normalized PSI entropy is the Shannon entropy rescaled by this factor.
This normalization ensures that PSI entropy reflects the diversity of peripheral chemical surface patterns per unit of favorable interface contact mass. Low values indicate strong, recurring NIS patterns accompanying energetically focused interfaces; high values indicate diffuse or polyspecific interactions.
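The unnormalized part of this calculation, the Shannon entropy over macrostate tuples, can be sketched as follows. The function names and the `base` parameter are illustrative assumptions. The logarithm base used in the original work is not stated here, so it is left configurable. The interface rescaling is deliberately not implemented, because its explicit form and the pair weights are not reproduced in this text.

```python
import math
from collections import Counter


def macrostate_probabilities(macrostates):
    """Empirical probability of each macrostate tuple in an ensemble."""
    counts = Counter(macrostates)
    total = sum(counts.values())
    if total == 0:
        raise ValueError("ensemble contains no microstates")
    return {state: n / total for state, n in counts.items()}


def shannon_entropy(macrostates, base=math.e):
    """Shannon entropy of the macrostate distribution (unnormalized).

    `base` is an assumption of this sketch: natural log by default,
    pass base=2 for bits.
    """
    probs = macrostate_probabilities(macrostates)
    return -sum(p * math.log(p, base) for p in probs.values())


# Usage: four microstates split evenly across two macrostates.
ensemble = [(1, 1, 1), (1, 1, 1), (2, 0, 1), (2, 0, 1)]
entropy_bits = shannon_entropy(ensemble, base=2)
```

An ensemble that collapses onto a single macrostate has zero entropy, while one spread evenly across many macrostates approaches the logarithm of the macrostate count.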
3. Computational Workflow
The PSI entropy workflow is as follows:
- Docking and MD Ensemble Generation:
  - Rigid-body docking with HADDOCK3 (~3,000 poses) is followed by semi-flexible refinement and explicit-solvent MD, yielding hundreds of microstates per complex.
- NIS Macrostate Assignment:
  - For each microstate:
    - Compute RSA per residue.
    - Identify NIS residues (RSA above the exposure cutoff, at least 5 Å from any peptide atom).
    - Classify each as A, C, or P, and count the residues in each class to obtain $(n_A, n_C, n_P)$.
    - Assign the macrostate indexed by this tuple.
- Probability Estimation and Contact Statistics:
  - Tabulate the macrostate frequencies and compute the Shannon entropy.
  - For each ensemble, determine the contact probabilities, apply the chemical pair weights, sum them to obtain the total contact mass, and count the distinct contact pairs.
- Entropy Calculation:
  - Insert all values into the normalized formula for the PSI entropy.
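The NIS identification step of the workflow can be sketched as below, assuming RSA values and atomic coordinates have already been computed by an external tool. Everything named here is hypothetical: the `Residue` container, the `select_nis` function, and the `rsa_cutoff` argument. The RSA threshold value is not given in this text, so it must be supplied by the caller; the 5 Å distance default comes from the NIS definition, and the strict inequality on RSA is an assumption.

```python
import math
from typing import NamedTuple, Sequence, Tuple

Coord = Tuple[float, float, float]


class Residue(NamedTuple):
    """Hypothetical per-residue record for one microstate."""
    name: str               # one-letter residue code
    rsa: float              # relative solvent accessibility (precomputed)
    atoms: Sequence[Coord]  # heavy-atom coordinates in angstroms


def min_distance(atoms: Sequence[Coord], peptide_atoms: Sequence[Coord]) -> float:
    """Smallest heavy-atom distance between a residue and the peptide."""
    return min(math.dist(a, b) for a in atoms for b in peptide_atoms)


def select_nis(residues, peptide_atoms, rsa_cutoff, dist_cutoff=5.0):
    """Return the codes of receptor residues on the non-interacting surface.

    A residue qualifies if it is solvent-exposed (rsa > rsa_cutoff, an
    assumed comparison) and at least `dist_cutoff` angstroms from every
    peptide heavy atom.
    """
    return [
        r.name
        for r in residues
        if r.rsa > rsa_cutoff
        and min_distance(r.atoms, peptide_atoms) >= dist_cutoff
    ]
```

The returned residue codes can then be classified as A, C, or P and counted to give the macrostate tuple for that microstate.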
4. Patterns of PSI Entropy in Cognate and Non-Cognate Complexes
Examination of numerous protein–peptide systems demonstrates the discriminative power of PSI entropy:
- WW–Smad7 (PDB 2LTW): 227 microstates collapse to a smaller set of macrostates, with a dominant mode that indicates peripheral compositional focusing.
- Robustness to Parameters: Across alternate initial peptide conformations (NMR, AF2, AF3), sublinear scaling of macrostate numbers is accompanied by a persistent mode, indicating insensitivity to sampling details.
- Cognate vs. Random Decoys: For the PPxY WW-domain system, cognate binders yield lower normalized PSI entropy than random decoys, even when the unnormalized diversity is similar, underscoring the effect of the interface rescaling.
- Cross-System Specificity: Docking cognate peptides to MDM2 (4HFZ) and PDZ (1ZUB) gives PSI entropy reductions of 21.6% and 42.8%, respectively, relative to non-cognate binders. Lower PSI entropy is consistently associated with favorable binding and dense contact maps, exemplifying “Regime I”.
5. Cross-System and Experimental Meta-Ensemble Analysis
The robustness and biological relevance of PSI entropy are supported by both cross-system in silico and experimental data:
- System Generality: Across MDM2 and PDZ domains, the descriptor consistently distinguishes cognate from non-cognate complexes under uniform workflow parameters.
- Experimental Validation: An aggregate of 36 high-resolution WW-domain structures (34 NMR, 2 X-ray), pooled as microstates, reveals a dominant NIS fingerprint, with 234 discrete macrostates (103 singletons, maximal occupancy 20). The effective macrostate fraction indicates that about 25% of peripheral patterns dominate, qualitatively supporting an evolutionary preference for select NIS organizations.
6. Applications, Limitations, and Future Directions
PSI entropy serves as an ensemble-level readout of macromolecular recognition distinct from interface-centric metrics:
- Applications:
  - Integration of PSI entropy into machine-learning scoring functions to enhance binder/discriminator models.
  - Time-resolved monitoring of NIS-state trajectories, for instance to study allosteric coupling or responses to environmental perturbations (e.g., osmotic stress).
  - “Anti-directed” mutation design strategies that increase PSI entropy to destabilize unwanted or aberrant complexes by suppressing dominant NIS modes.
- Limitations:
  - Results depend on docking/MD sampling depth, the choice of RSA and distance cutoffs, and the tripartite residue coarsening (A, C, P).
- Prospects:
  - Refinement through finer chemical partitioning or use of continuous properties (e.g., electrostatic potential).
  - Extension to protein–protein and protein–nucleic acid complexes.
  - Dynamical analysis of PSI entropy to probe transition pathways in NIS space.
PSI entropy thus offers a quantitative and thermoinformatic perspective on the interplay between peripheral surface organization, interface energetics, and evolutionary selection, supplementing conventional metrics of biomolecular recognition (Grear et al., 31 Jan 2026).