PLD-Tree: Persistent Laplacian Decision Tree for Protein-Protein Binding Free Energy Prediction (2412.18541v1)
Abstract: Recent advances in topology-based modeling have accelerated progress in physical modeling and molecular studies, including applications to protein-ligand binding affinity. In this work, we introduce the Persistent Laplacian Decision Tree (PLD-Tree), a novel method designed to address the challenging task of predicting protein-protein interaction (PPI) affinities. PLD-Tree focuses on protein chains at binding interfaces and employs the persistent Laplacian to capture topological invariants reflecting critical inter-protein interactions. These topological descriptors, derived from persistent homology, are further enhanced by incorporating evolutionary scale modeling (ESM) from a LLM to integrate sequence-based information. We validate PLD-Tree on two benchmark datasets-PDBbind V2020 and SKEMPI v2 demonstrating a correlation coefficient ($R_p$) of 0.83 under the sophisticated leave-out-protein-out cross-validation. Notably, our approach outperforms all reported state-of-the-art methods on these datasets. These results underscore the power of integrating machine learning techniques with topology-based descriptors for molecular docking and virtual screening, providing a robust and accurate framework for predicting protein-protein binding affinities.