LC-JT-VAE: Ligase-Conditioned Molecular Glue Design
- The paper introduces a ligase-conditioned JT-VAE that integrates protein sequence embeddings with torsion-aware graph features to generate chemically valid molecular glues.
- It employs a dual-encoder/decoder architecture that fuses molecular junction tree and atom-level representations with transformer-based protein embeddings for precise latent-space conditioning.
- Performance evaluations demonstrate enhanced docking specificity, novelty, and ADMET compliance compared to baseline models, highlighting its potential in targeted drug discovery.
The Ligase-Conditioned Junction Tree Variational Autoencoder (LC-JT-VAE) is a generative deep learning model designed to produce small-molecule molecular glues targeting specific E3 ligases, with particular application to targeted protein degradation. LC-JT-VAE extends the original Junction Tree Variational Autoencoder framework by incorporating explicit conditioning on E3 ligase protein sequence embeddings and augmenting the molecular graph representation with torsional (dihedral) angle features. This architecture enables generation of ligase-specific, chemically valid, and novel compounds, facilitating drug discovery for challenging targets such as intracellular amyloid beta-42 (Aβ42) in neurodegenerative disease settings (Islam et al., 26 Jan 2026).
1. Model Architecture
The LC-JT-VAE employs a dual-encoder, dual-decoder design, incorporating protein sequence information and torsion-aware graph features.
- Encoder–Decoder Overview:
The model uses two parallel molecular encoders: a Junction-Tree Message Passing Network (JTMPN) that encodes molecular scaffolds as trees of chemically meaningful cliques, and a Message Passing Network (MPN) that encodes the full atom-level molecular graph, embedding bond-level torsional angles. The protein-conditioning branch encodes E3 ligase binding-site amino-acid sequences via a transformer-based embedding (ProtBERT) followed by a 2-layer bidirectional LSTM (BiLSTM), projecting to a 512-dimensional vector. The molecular and protein embeddings are fused, either by concatenation plus linear projection or by cross-attention, to form a fused latent vector.
- Latent-Space Fusion and Decoding:
The fused latent vector integrates molecular (z_tree_mol ∈ ℝ⁵⁶, z_graph_mol ∈ ℝ⁵⁶) and protein (z_seq ∈ ℝ⁵¹²) features, projected into a common 512-dimensional latent space. Decoding occurs in two steps: (i) junction tree scaffold reconstruction (JTNNDecoder), and (ii) atomic graph reconstruction conditioned on the fused latent vector.
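The concatenation-plus-projection fusion path can be sketched shape-wise as follows. The dimensions (56 + 56 + 512 → 624 → 512) come from the text; the weight matrix below is randomly initialized for illustration only, not the trained model:

```python
import numpy as np

rng = np.random.default_rng(0)

# Molecular latents (56-d tree, 56-d graph) and the 512-d protein embedding.
z_tree_mol = rng.standard_normal(56)
z_graph_mol = rng.standard_normal(56)
z_seq = rng.standard_normal(512)

# Concatenate to a 624-d vector, then project to the shared 512-d latent space.
z_cat = np.concatenate([z_tree_mol, z_graph_mol, z_seq])    # shape (624,)
W = rng.standard_normal((512, 624)) / np.sqrt(624)          # stand-in weights
z_fused = np.maximum(W @ z_cat, 0.0)                        # projection + ReLU

print(z_cat.shape, z_fused.shape)
```

The cross-attention variant mentioned in the text would replace the single matrix multiply with attention between the molecular and protein embeddings, but the output dimensionality would be the same.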
- VAE Objective and Loss:
The loss function is defined as

ℒ = ℒ_tree + ℒ_graph + β · D_KL(q(z | x, c) ‖ 𝒩(0, I)),

where ℒ_tree and ℒ_graph are negative log-likelihoods for junction tree and graph reconstruction, D_KL is the KL divergence between the approximate posterior and the isotropic Gaussian prior, and β is linearly annealed from 0 to 1 during training.
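Combining the objective with the annealing schedule described later (β incremented by 0.002 per batch, capped at 1), the total loss can be sketched as a scalar computation; the reconstruction and KL terms here are placeholder numbers:

```python
def beta_at_batch(step: int, increment: float = 0.002) -> float:
    """Linearly annealed KL weight, clipped to [0, 1]."""
    return min(1.0, step * increment)

def total_loss(nll_tree: float, nll_graph: float, kl: float, step: int) -> float:
    """L = L_tree + L_graph + beta * D_KL, with beta annealed per batch."""
    return nll_tree + nll_graph + beta_at_batch(step) * kl

# beta ramps from 0 to 1 over the first 500 batches
print(beta_at_batch(0), beta_at_batch(250), beta_at_batch(500))  # 0.0 0.5 1.0
print(total_loss(1.2, 0.8, 0.5, step=500))                       # 2.5
```

Annealing β prevents early posterior collapse: reconstruction dominates at first, and the KL regularizer is phased in gradually.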
2. Torsional Angle–Aware Graph Construction
- Computation and Representation of Torsions:
Rotatable bonds are identified using RDKit routines. For each rotatable bond with central atoms j–k and terminal neighbors i and l, the dihedral angle τ(i, j, k, l) is computed in degrees and appended as an edge feature.
- MPN Modifications for Torsion Enhancement:
The torsion angle is represented via a sinusoidal embedding and concatenated to the corresponding bond feature vector. The message passing in the MPN incorporates these enhanced edge features, producing atom-level representations that capture conformational flexibility critical for accurate docking and ligand design.
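A minimal sketch of the sinusoidal torsion embedding follows. The number of frequencies is an assumption (the paper does not specify it), and in practice the angle would be measured on a 3D conformer, e.g. via RDKit's `rdMolTransforms.GetDihedralDeg`:

```python
import math

def torsion_embedding(angle_deg: float, n_freq: int = 4) -> list[float]:
    """Encode a dihedral angle as [sin(k*theta), cos(k*theta)] pairs.

    The periodic encoding keeps -180 and +180 degrees adjacent,
    which a raw scalar feature would not.
    """
    theta = math.radians(angle_deg)
    feats: list[float] = []
    for k in range(1, n_freq + 1):
        feats.append(math.sin(k * theta))
        feats.append(math.cos(k * theta))
    return feats

emb = torsion_embedding(180.0)
print(len(emb))  # 8 features for 4 frequencies
```

Each resulting feature vector would be concatenated onto the corresponding bond's existing edge features before message passing.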
3. Training Pipeline
- Dataset Construction:
Training employs a dataset of 65,998 ADMET-filtered small molecules (Vitas & ChEMBL source), each docked against Aβ42 complexes with three E3 ligases (CRBN, VHL, MDM2). Stratified splits are used: 80% training, 10% validation, 10% test, balanced by ligase and docking affinity.
- Hyperparameters and Optimization:
The Adam optimizer is used (initial learning rate 1×10⁻³, exponential decay γ = 0.9). Batch size is 50; gradient clipping (max norm = 50) and Xavier normal weight initialization are applied. Early stopping monitors total validation loss with a patience of 50 epochs; model checkpoints are saved every 10 epochs. Latent dimensions are 56 each for the tree and graph molecular encodings and 512 for the protein embedding; the fused latent is 624-dimensional before linear projection to 512.
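The exponential learning-rate decay reduces to a simple closed-form schedule. The sketch below assumes decay is applied once per epoch, which the text does not specify:

```python
def lr_at_epoch(epoch: int, lr0: float = 1e-3, gamma: float = 0.9) -> float:
    """Exponentially decayed learning rate: lr = lr0 * gamma**epoch."""
    return lr0 * gamma ** epoch

print(lr_at_epoch(0))             # 0.001
print(round(lr_at_epoch(10), 8))  # 0.001 * 0.9**10 ~ 0.00034868
```

This is the schedule PyTorch's `ExponentialLR` implements when stepped per epoch.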
- Training Protocol:
The β weight in the VAE loss is annealed from 0 to 1 in increments of 0.002 per batch, supporting stable variational regularization. Training runs for up to 500 epochs or until convergence.
4. Performance Evaluation
- Generative Metrics:
For each ligase (CRBN, VHL, MDM2), 1,000 molecules are sampled from the model. Evaluation metrics include chemical validity (RDKit-parseable), uniqueness (fraction of unique molecules), novelty (not present in training set), QED, and Lipinski’s Rule-of-Five adherence (≤1 violation).
| Ligase | Validity | Uniqueness | Novelty | Avg QED | Lipinski |
|---|---|---|---|---|---|
| VHL | 96.3% | 85.2% | 81.5% | 0.42 | 83% |
| MDM2 | 97.9% | 97.9% | 93.6% | 0.40 | 70% |
| CRBN | 100.0% | 82.0% | 80.0% | 0.35 | 60% |
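Uniqueness and novelty, as defined above, reduce to set operations over canonical SMILES strings. The toy strings below are placeholders; real use would first canonicalize each molecule with RDKit (and compute validity, QED, and Lipinski compliance there as well):

```python
def uniqueness(sampled: list[str]) -> float:
    """Fraction of sampled molecules that are distinct."""
    return len(set(sampled)) / len(sampled)

def novelty(sampled: list[str], training_set: set[str]) -> float:
    """Fraction of distinct sampled molecules absent from the training set."""
    distinct = set(sampled)
    return len(distinct - training_set) / len(distinct)

samples = ["CCO", "CCO", "c1ccccc1", "CCN"]
train = {"CCO"}
print(uniqueness(samples))      # 3 distinct / 4 sampled = 0.75
print(novelty(samples, train))  # 2 novel / 3 distinct ~ 0.667
```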
- Target-Specific Docking and Baseline Comparison:
Generated compounds exhibit favorable docking to their intended ligase vs. off-targets (target mean ≈ –5.8 kcal/mol; off-targets ≈ –4.1 kcal/mol). The unconditional JT-VAE baseline (no ligase conditioning or torsions) yields inferior novelty (~60%) and reduced selectivity (docking ~–4.5 kcal/mol).
- Generated Molecular Glue Examples:
- VHL_Cmpd_4: Substituted aromatic ring + amide linker + cyclic amine; docking –5.98 kcal/mol to VHL/Aβ42; H-bonds with HIS13 (Aβ42) and HIS110 (VHL).
- CRBN_Cmpd_3: Phthalimide core + ether linker; docking –5.80 kcal/mol; H-bonds to HIS355, TRP402.
- MDM2_Cmpd_5: Biphenyl scaffold + polar side chain; docking –5.78 kcal/mol; π–π stacking with PHE19, H-bond to GLU23.
5. Algorithmic Workflow
The LC-JT-VAE workflow covers both training and sampling phases, as follows:
- Training Phase:
- Preprocess molecule and ligase: RDKit decomposition (junction tree, graph, torsions), ProtBERT + BiLSTM ligase encoding, projection to ℝ⁵¹².
- Encode molecule: JTMPN for T, MPN for G with torsions.
- Fuse molecular and ligase embeddings; apply ReLU and linear projection.
- Derive mean and variance for latent z; sample from the latent space.
- Decode via JTNNDecoder and GraphDecoder; reconstruct molecule.
- Compute total loss; backpropagate and update parameters.
- Sampling Phase:
- Select target ligase; obtain sequence embedding.
- Sample latent vectors for tree and graph from standard normal.
- Fuse with the ligase embedding; apply decoder to generate molecular structure.
- Convert generated molecular graph to SMILES via RDKit.
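The sampling phase above can be sketched shape-wise with stand-in weights. The projection matrix here is a random placeholder, not the trained model, and the decoder step is indicated only in comments:

```python
import numpy as np

rng = np.random.default_rng(1)

# 1. Ligase embedding (would come from ProtBERT + BiLSTM in practice).
e_seq = rng.standard_normal(512)

# 2. Sample tree and graph latents from a standard normal prior.
z_tree = rng.standard_normal(56)
z_graph = rng.standard_normal(56)

# 3. Fuse with the ligase embedding and project to the decoder input space.
W = rng.standard_normal((512, 624)) / np.sqrt(624)  # stand-in projection
z = W @ np.concatenate([z_tree, z_graph, e_seq])

# 4. A real decoder would now reconstruct the junction-tree scaffold, then
#    the atom-level graph, and RDKit would serialize the result to SMILES.
print(z.shape)
```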
6. Context and Significance
LC-JT-VAE, introduced by Islam and Caulfield (Islam et al., 26 Jan 2026), represents an advance in molecular generative modeling for conditional drug design, integrating explicit protein-target information and conformational chemical features. This approach addresses limitations in traditional ligand generation by enabling direct conditioning on E3 ligase binding-site sequences, relevant for molecular glue discovery and UPS-targeted therapies. Results on ADMET-screened compound libraries and docking assessments underscore the framework’s ability to generate synthesizable, ligase-selective small molecules. A plausible implication is the potential for extension to additional protein targets and chemical tasks requiring fine-grained control over molecule-protein interactions.