PepTune: Multi-Objective Peptide Optimization
- PepTune is a framework for de novo generation of therapeutic peptide SMILES using a Masked Discrete Language Model combined with discrete diffusion to capture sequence features.
- It employs a bond-dependent masking schedule and an invalid SMILES loss to enforce chemical validity and ensure correct peptide backbone assembly.
- PepTune leverages classifier-guided Monte Carlo Tree Guidance to optimize multiple properties simultaneously, achieving 100% validity and Pareto-optimal peptide candidates.
PepTune is a framework for the de novo generation and multi-objective optimization of therapeutic peptide SMILES sequences. It integrates a masked discrete language model (MDLM) with a Monte Carlo Tree Guidance (MCTG) inference procedure, implementing a discrete diffusion process tailored to chemical validity and multi-property optimization. Peptide candidates generated via PepTune are jointly optimized for properties including target binding affinity, membrane permeability, solubility, hemolysis, and non-fouling, with the system leveraging separate classifier-based scorers and a growing Pareto front to drive multi-objective selection. PepTune enforces sequence validity, supports chemically modified peptides, and produces a diverse set of non-dominated, therapeutically promising peptides in a single unified workflow (Tang et al., 2024).
1. Masked Discrete Language Model (MDLM) Foundation
PepTune employs a masked discrete language model operating on peptide SMILES, modeling peptide sequences as tokens over a vocabulary (including a special [MASK] token). The core generative process is a continuous-time diffusion in discrete sequence space: tokens are progressively masked during the forward process, and a denoising model is trained for the reverse 'unmasking'. For each token $x^i$ at time $t \in [0, 1]$, the forward transition is

$$q(z_t^i \mid x^i) = \mathrm{Cat}\!\big(z_t^i;\; \alpha_t^{(i)} x^i + (1 - \alpha_t^{(i)})\,\mathbf{m}\big),$$

where $\alpha_t^{(i)}$ is the position- and token-type-sensitive masking schedule and $\mathbf{m}$ is the one-hot [MASK] vector.
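As a minimal sketch of the forward process, assuming a toy character-level vocabulary and a linear schedule $\alpha_t = 1 - t$ (both illustrative assumptions, not the paper's exact setup):

```python
import random

MASK = "[MASK]"

def forward_mask(tokens, t, alpha=lambda t: 1.0 - t, rng=random):
    """Forward diffusion step: each token is independently replaced by
    [MASK] with probability 1 - alpha(t); otherwise it keeps its
    original identity (the interpolating-to-mask transition)."""
    return [tok if rng.random() < alpha(t) else MASK for tok in tokens]

tokens = list("CC(=O)N")           # toy SMILES fragment, one token per character
z_t = forward_mask(tokens, t=0.7)  # ~70% of tokens masked in expectation
```

At $t = 0$ no tokens are masked, and at $t = 1$ every token is masked, matching the schedule's endpoints.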
The denoising model, a RoFormer backbone, predicts the original tokens at each unmasking step. The reverse transition enforces the substitution (SUBS) parameterization constraints: once a token is unmasked, it remains so; tokens cannot revert to [MASK]. This is implemented with a MaskedSoftmax whose masking logic reflects these constraints, ensuring that predictions preserve already-unmasked tokens and maintain sequence integrity.
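The carry-over behavior of the SUBS constraints can be illustrated with a simplified reverse step. The confidence-threshold fill rule here is an assumption for illustration; the actual model samples from the reverse transition distribution:

```python
def reverse_step(z_t, model_probs, threshold=0.9):
    """One reverse (unmasking) step under the SUBS constraints:
    - unmasked tokens are carried over unchanged (never revert to [MASK]);
    - only [MASK] positions may be filled, here when the denoiser's top
      probability clears a confidence threshold (a simplification)."""
    z_s = []
    for tok, probs in zip(z_t, model_probs):
        if tok != "[MASK]":
            z_s.append(tok)  # carry-over: stays unmasked forever
        else:
            best, p = max(probs.items(), key=lambda kv: kv[1])
            z_s.append(best if p >= threshold else "[MASK]")
    return z_s
```

Running it on a partially masked sequence only ever fills [MASK] positions, never overwrites committed ones.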
The model minimizes a continuous-time negative evidence lower bound (NELBO):

$$\mathcal{L}_{\text{NELBO}} = \mathbb{E}_q \int_0^1 \frac{\alpha_t'}{1-\alpha_t} \sum_{i=1}^{L} w_i \log \big\langle x_\theta^i(z_t),\, x^i \big\rangle \, dt,$$

with position-wise weights $w_i$ that emphasize critical peptide-bond tokens.
2. Bond-Dependent Masking Schedule
PepTune introduces a bond-type-dependent masking rate to enforce correct peptide backbone assembly. The masking schedule differentiates between peptide-bond and non-bond tokens:

$$\alpha_t^{(i)} = \begin{cases} \alpha_t^{\text{bond}} & \text{if } i \text{ is a peptide-bond token},\\ \alpha_t^{\text{seq}} & \text{otherwise}, \end{cases}$$

with the bond schedule decaying more slowly (its rate constant is fixed in practice). This ensures that peptide-bond tokens are masked at a much slower rate during the forward process, so the reverse process reconstructs the backbone first, forcing the model to prioritize backbone reconstruction (and hence chemical validity) when learning to 'denoise'. The model is further pushed to restore backbone topology first by an increased token loss weight at bond positions, while other tokens are weighted $1/t$.
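A sketch of the bond-dependent schedule and loss weighting, assuming a linear base schedule and an illustrative slowdown factor `lam` (the paper's exact functional form and constants may differ):

```python
def masking_prob(t, is_bond_token, lam=3.0):
    """Probability that a token is masked by time t in the forward process.
    Non-bond tokens follow a linear schedule; peptide-bond tokens are
    masked lam times more slowly (illustrative form, not the paper's
    exact parameterization)."""
    base = min(max(t, 0.0), 1.0)
    return base / lam if is_bond_token else base

def loss_weight(t, is_bond_token, bond_weight=5.0):
    """Position-wise loss weight: bond positions get an increased constant
    weight (bond_weight is an assumed value); others get the 1/t
    weighting mentioned above."""
    return bond_weight if is_bond_token else 1.0 / max(t, 1e-6)
```

At any fixed $t$, a bond token is strictly less likely to be masked than a non-bond token, which is what pushes backbone reconstruction to the front of the reverse process.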
3. Enforcement of Chemical Validity and Invalid SMILES Loss
To further ensure outputs are syntactically valid peptide SMILES, PepTune incorporates a global invalid-SMILES loss. During greedy sampling, if the decoded string $\hat{x}$ is not a syntactically valid peptide SMILES, a penalty is added:

$$\mathcal{L}_{\text{invalid}} = \mathbb{1}\big[\hat{x} \text{ invalid}\big] \sum_{i=1}^{L} \pi_i(\hat{x}^i),$$

where $\mathbb{1}[\cdot]$ indicates invalidity and $\pi_i$ is the softmax output at position $i$. The gradient is backpropagated through the softmax, bypassing the non-differentiable argmax. This loss down-weights logits yielding invalid structures and up-weights alternatives, directly penalizing chemical invalidity at each output step.
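A sketch of this penalty, with the validity check supplied as a caller-provided callable (in practice this would be a chemistry parse such as RDKit's; probabilities here are plain floats rather than differentiable tensors):

```python
def invalid_smiles_loss(softmax_probs, decoded_tokens, is_valid):
    """Penalty term for an invalid greedy decode: when the decoded string
    fails the validity check, sum the softmax probabilities assigned to
    the chosen tokens, so that gradients (taken through the softmax,
    bypassing the argmax) push those probabilities down.
    `is_valid` is a caller-supplied predicate on the decoded string."""
    smiles = "".join(decoded_tokens)
    if is_valid(smiles):
        return 0.0
    return sum(probs[tok] for probs, tok in zip(softmax_probs, decoded_tokens))
```

Valid decodes contribute zero loss; invalid ones are penalized in proportion to how confidently the model chose the offending tokens.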
4. Multi-Objective Monte Carlo Tree Guidance (MCTG) for Sequence Optimization
At inference, PepTune applies Monte Carlo Tree Guidance (MCTG), an adaptation of Monte Carlo Tree Search (MCTS), to steer generation toward Pareto-optimal, property-optimized peptides. This procedure proceeds as follows:
a) Multi-Objective Reward Definition
For $K$ properties, each with predictor $s_k$, a growing Pareto set $\Pi^*$ is maintained. For each new peptide $x$, the reward vector $\mathbf{r}(x) = (r_1(x), \ldots, r_K(x))$ is

$$r_k(x) = \frac{1}{|\Pi^*|} \sum_{x' \in \Pi^*} \mathbb{1}\big[s_k(x) \ge s_k(x')\big],$$

providing a normalized count of the property-wise dominance of $x$ over the Pareto set.
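Assuming higher scores are better for every property, the dominance-count reward can be computed as:

```python
def reward_vector(scores, pareto_scores):
    """Reward r_k(x): fraction of current Pareto-set members that x
    matches or beats on property k (higher scores assumed better).
    With an empty Pareto set, the reward defaults to 1 per objective."""
    if not pareto_scores:
        return [1.0] * len(scores)
    n = len(pareto_scores)
    return [
        sum(member[k] <= scores[k] for member in pareto_scores) / n
        for k in range(len(scores))
    ]
```

Each component lies in $[0, 1]$, so conflicting objectives stay on a common scale without any weighting.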
b) Tree Traversal and Expansion
- Selection: From the root $z_T$, child nodes $c$ are chosen by normalized reward $U[c] = W[c]/N[c]$. Only non-dominated children (in the reward-vector sense) are considered; ties are resolved randomly.
- Expansion: At each leaf node $z_s$, $M$ possible one-step unmaskings are sampled using batched Gumbel-Max on the model's transition probabilities.
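The Gumbel-Max trick used for expansion can be sketched stdlib-only as follows (a batched tensor version would be used in practice):

```python
import math
import random

def gumbel_max_sample(log_probs, rng=random):
    """Draw one index from a categorical distribution via the Gumbel-Max
    trick: add i.i.d. Gumbel(0,1) noise to the log-probabilities and take
    the argmax. Calling this M times yields M stochastic child
    unmaskings from the model's transition probabilities."""
    noisy = [
        lp - math.log(-math.log(max(rng.random(), 1e-12)))
        for lp in log_probs
    ]
    return max(range(len(noisy)), key=noisy.__getitem__)
```

Because the noise is independent per call, repeated draws explore low-probability unmaskings occasionally while still favoring high-probability ones.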
c) Rollout, Evaluation, and Backpropagation
Each expanded child is greedily unmasked to obtain a full peptide SMILES $x_j$, property scores $s_k(x_j)$, and rewards $r_k(x_j)$. The Pareto set $\Pi^*$ is updated to contain all non-dominated peptides. Accrued rewards are backpropagated along the traversal path, updating visit counts $N(\cdot)$ and reward sums $W(\cdot)$.
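The Pareto-set update (keep only non-dominated score vectors, assuming higher is better on every objective) can be sketched as:

```python
def dominates(a, b):
    """a dominates b if a is >= b on every objective and > on at least one."""
    return all(x >= y for x, y in zip(a, b)) and any(x > y for x, y in zip(a, b))

def update_pareto(pareto, candidate):
    """Insert a candidate score vector into the Pareto set: skip it if an
    existing member dominates it, otherwise add it and drop any members
    it dominates."""
    if any(dominates(member, candidate) for member in pareto):
        return pareto
    return [m for m in pareto if not dominates(candidate, m)] + [candidate]
```

The set therefore only ever contains mutually non-dominated vectors, which is exactly the invariant the reward normalization above relies on.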
d) Summary Table: MCTG Components
| Component | Description | Mathematical Formulation / Definition |
|---|---|---|
| Reward Vector | Dominance over Pareto front | $r_k(x) = \frac{1}{|\Pi^*|} \sum_{x' \in \Pi^*} \mathbb{1}[s_k(x) \ge s_k(x')]$ |
| Node Value | Reward normalization in tree | $U[c] = W[c]/N[c]$ |
| Expansion Mechanism | $M$ child unmaskings by Gumbel-Max | Sample $\arg\max_j \big(\log p_\theta(\cdot \mid z_t) + g_j\big)$, $g_j \sim \mathrm{Gumbel}(0,1)$ |
5. Classifier-Based, Gradient-Free Objective Integration
Unlike differentiable guidance schemes, PepTune employs classifiers solely for scoring fully unmasked peptide sequences during MCTS rollout; no gradients with respect to model input are computed. Only discrete, vector-valued rewards steer the optimization, facilitating handling of even conflicting objectives without conflating classifier gradients or requiring continuous relaxation of the underlying discrete space. This modularity enables property plug-in for arbitrary sets of independent sequence properties.
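To illustrate this plug-in modularity: property scorers can be arbitrary black-box callables on decoded SMILES strings, evaluated with no gradients taken. The scorers below are toy stand-ins, not the paper's trained classifiers:

```python
# Each property is a black-box function from a fully unmasked SMILES
# string to a scalar; swapping objectives in or out never touches the
# generative model. Both scorers here are illustrative toys.
scorers = {
    "solubility_proxy": lambda smiles: smiles.count("O") / max(len(smiles), 1),
    "brevity":          lambda smiles: 1.0 / max(len(smiles), 1),
}

def score_peptide(smiles, scorers):
    """Evaluate every plugged-in property scorer on a decoded sequence,
    returning the raw score vector consumed by the reward computation."""
    return [fn(smiles) for fn in scorers.values()]
```

Because only these scalar outputs feed the reward vector, any mix of differentiable or non-differentiable predictors can be combined freely.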
6. Experimental Results and Performance Characteristics
PepTune demonstrates marked improvements on multiple generation quality and optimization metrics:
- Unconditional validity: The base MDLM achieves 45% valid peptides at sequence length 100; following MCTS-guided inference, validity rises to 100%.
- Generative quality (Moses-style statistics): Output uniqueness (1.00), diversity (0.68), and low SNN (0.49), closely mirroring training data.
- Multi-Target Optimization: For dual-target (TfR + GLAST) peptides, top docking scores reach –10.5 kcal/mol (TfR) and –9.2 kcal/mol (GLAST) versus –8.4 kcal/mol for a single-target T7 benchmark.
- Case Studies: PepTune generates sub-30-mer peptides binding GLP-1R at –7.4 kcal/mol (compared to –5.7/–5.1 for semaglutide/liraglutide), and cyclic/non-natural GFAP binders at –8.5 kcal/mol.
- Pareto Front Construction: In approximately 20 inference iterations, a diverse front of ~100 non-dominated peptides optimized along five objectives is produced.
A plausible implication is that PepTune's combined discrete diffusion and MCTG inference enables efficient exploration and exploitation of the combinatorial peptide space for multi-property design, while always enforcing strict chemical validity.
7. Workflow Summary and Algorithmic Overview
The entire process of PepTune is encapsulated in its MCTS-guided discrete diffusion pseudocode:
Inputs: MDLM denoiser p_θ(z_{t–1} | z_t),
K property scorers s₁,…,s_K,
time steps T, iterations I, children M.
Initialize: root node z_T = [MASK]^L;
Pareto set Π* = ∅;
visit counts N(·)=0, reward sums W(·)=0.
for i in 1…I:
# 1) SELECTION
node ← root
while node has MCTS-children:
for c in children(node):
      U[c] = W[c]/N[c]   # normalized vector reward
node ← UniformRandom(non-dominated U[c])
# 2) EXPANSION
  Expand node by sampling M children {z_{s,j}} via Gumbel-Max on p_θ(·|node).
# 3) ROLLOUT & EVALUATION
total_r = 0 (vector in ℝ^K)
for each child z_s,j:
    x_j ← GreedyUnmaskChain(z_{s,j})   # greedily unmask until t = 0
s_vec = [s₁(x_j),…,s_K(x_j)]
r_vec = ComputeReward(s_vec, Π*) # eq. (7)
UpdatePareto(Π*, x_j, s_vec)
total_r += r_vec
# 4) BACKPROPAGATION
anc = node
while anc ≠ None:
W[anc] += total_r
N(anc) += 1
anc ← parent(anc)
return Π*  # Pareto-optimal peptides
Taken together, PepTune unifies masked discrete diffusion, chemically aware masking and invalidity handling, and MCTG-based multi-objective optimization to deliver a modular solution for the generation of valid, Pareto-optimal peptide therapeutics (Tang et al., 2024).