PepTune: Multi-Objective Peptide Optimization
- PepTune is a framework for de novo generation of therapeutic peptide SMILES using a Masked Discrete Language Model combined with discrete diffusion to capture sequence features.
- It employs a bond-dependent masking schedule and an invalid SMILES loss to enforce chemical validity and ensure correct peptide backbone assembly.
- PepTune leverages classifier-guided Monte Carlo Tree Guidance to optimize multiple properties simultaneously, achieving 100% validity and Pareto-optimal peptide candidates.
PepTune is a framework for the de novo generation and multi-objective optimization of therapeutic peptide SMILES sequences. It integrates a masked discrete language model (MDLM) with a Monte Carlo Tree Guidance (MCTG) inference procedure, implementing a discrete diffusion process tailored to chemical validity and multi-property optimization. Peptide candidates generated via PepTune are jointly optimized for properties including target binding affinity, membrane permeability, solubility, hemolysis, and non-fouling, with the system leveraging separate classifier-based scorers and a growing Pareto front to drive multi-objective selection. PepTune enforces sequence validity, supports chemically modified peptides, and produces a diverse set of non-dominated, therapeutically promising peptides in a single unified workflow (Tang et al., 2024).
1. Masked Discrete Language Model (MDLM) Foundation
PepTune employs a masked discrete language model operating on peptide SMILES, modeling peptide sequences as tokens over a vocabulary (including a special [MASK] token). The core generative process is a continuous-time diffusion in discrete sequence space: tokens are progressively masked during the forward process, and a denoising model is trained for the reverse 'unmasking'. For each token $x^i$ at time $t \in [0, 1]$, the forward transition is

$$q(z_t^i \mid x^i) = \mathrm{Cat}\!\big(z_t^i;\; \alpha_t^{(i)} x^i + (1 - \alpha_t^{(i)})\,\mathbf{m}\big),$$

where $\alpha_t^{(i)}$ is the position- and token-type-sensitive masking schedule and $\mathbf{m}$ is the one-hot [MASK] vector.
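As a minimal sketch of the forward process, assuming a toy character-level vocabulary and a linear schedule $\alpha_t = 1 - t$ (both illustrative assumptions, not the paper's exact setup):

```python
import random

MASK = "[MASK]"

def forward_mask(tokens, t, alpha=lambda t: 1.0 - t, rng=random):
    """Forward diffusion step: each token is independently replaced by
    [MASK] with probability 1 - alpha(t); otherwise it keeps its
    original identity (the interpolating-to-mask transition)."""
    return [tok if rng.random() < alpha(t) else MASK for tok in tokens]

tokens = list("CC(=O)N")           # toy SMILES fragment, one token per character
z_t = forward_mask(tokens, t=0.7)  # ~70% of tokens masked in expectation
```

At $t = 0$ no tokens are masked, and at $t = 1$ every token is masked, matching the schedule's endpoints.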
The denoising model, a RoFormer backbone, predicts the original tokens at each unmasking step. The reverse transition enforces the substitution (SUBS) parameterization constraints: once a token is unmasked, it remains so; tokens cannot revert to [MASK]. This is implemented with a MaskedSoftmax whose masking logic reflects these constraints, ensuring that predictions preserve already-unmasked tokens and maintain sequence integrity.
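The carry-over behavior of the SUBS constraints can be illustrated with a simplified reverse step. The confidence-threshold fill rule here is an assumption for illustration; the actual model samples from the reverse transition distribution:

```python
def reverse_step(z_t, model_probs, threshold=0.9):
    """One reverse (unmasking) step under the SUBS constraints:
    - unmasked tokens are carried over unchanged (never revert to [MASK]);
    - only [MASK] positions may be filled, here when the denoiser's top
      probability clears a confidence threshold (a simplification)."""
    z_s = []
    for tok, probs in zip(z_t, model_probs):
        if tok != "[MASK]":
            z_s.append(tok)  # carry-over: stays unmasked forever
        else:
            best, p = max(probs.items(), key=lambda kv: kv[1])
            z_s.append(best if p >= threshold else "[MASK]")
    return z_s
```

Running it on a partially masked sequence only ever fills [MASK] positions, never overwrites committed ones.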
The model minimizes a continuous-time negative evidence lower bound (NELBO):

$$\mathcal{L}_{\text{NELBO}} = \mathbb{E}_q \int_0^1 \frac{\alpha_t'}{1-\alpha_t} \sum_{i=1}^{L} w_i \log \big\langle x_\theta^i(z_t),\, x^i \big\rangle \, dt,$$

with position-wise weights $w_i$ that emphasize critical peptide-bond tokens.
2. Bond-Dependent Masking Schedule
PepTune introduces a bond-type-dependent masking rate to enforce correct peptide backbone assembly. The masking schedule differentiates between peptide-bond and non-bond tokens:

$$\alpha_t^{(i)} = \begin{cases} \alpha_t^{\text{bond}} & \text{if } i \text{ is a peptide-bond token},\\ \alpha_t^{\text{seq}} & \text{otherwise}, \end{cases}$$

with the bond schedule decaying more slowly (its rate constant is fixed in practice). This ensures that peptide-bond tokens are masked at a much slower rate during the forward process, so the reverse process reconstructs the backbone first, forcing the model to prioritize backbone reconstruction (and hence chemical validity) when learning to 'denoise'. The model is further pushed to restore backbone topology first by an increased token loss weight at bond positions, while other tokens are weighted $1/t$.
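A sketch of the bond-dependent schedule and loss weighting, assuming a linear base schedule and an illustrative slowdown factor `lam` (the paper's exact functional form and constants may differ):

```python
def masking_prob(t, is_bond_token, lam=3.0):
    """Probability that a token is masked by time t in the forward process.
    Non-bond tokens follow a linear schedule; peptide-bond tokens are
    masked lam times more slowly (illustrative form, not the paper's
    exact parameterization)."""
    base = min(max(t, 0.0), 1.0)
    return base / lam if is_bond_token else base

def loss_weight(t, is_bond_token, bond_weight=5.0):
    """Position-wise loss weight: bond positions get an increased constant
    weight (bond_weight is an assumed value); others get the 1/t
    weighting mentioned above."""
    return bond_weight if is_bond_token else 1.0 / max(t, 1e-6)
```

At any fixed $t$, a bond token is strictly less likely to be masked than a non-bond token, which is what pushes backbone reconstruction to the front of the reverse process.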
3. Enforcement of Chemical Validity and Invalid SMILES Loss
To further ensure outputs are syntactically valid peptide SMILES, PepTune incorporates a global invalid-SMILES loss. During greedy sampling, if the decoded string $\hat{x}$ is not a syntactically valid peptide SMILES, a penalty is added:

$$\mathcal{L}_{\text{invalid}} = \mathbb{1}\big[\hat{x} \text{ invalid}\big] \sum_{i=1}^{L} \pi_i(\hat{x}^i),$$

where $\mathbb{1}[\cdot]$ indicates invalidity and $\pi_i$ is the softmax output at position $i$. The gradient is backpropagated through the softmax, bypassing the non-differentiable argmax. This loss down-weights logits yielding invalid structures and up-weights alternatives, directly penalizing chemical invalidity at each output step.
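A sketch of this penalty, with the validity check supplied as a caller-provided callable (in practice this would be a chemistry parse such as RDKit's; probabilities here are plain floats rather than differentiable tensors):

```python
def invalid_smiles_loss(softmax_probs, decoded_tokens, is_valid):
    """Penalty term for an invalid greedy decode: when the decoded string
    fails the validity check, sum the softmax probabilities assigned to
    the chosen tokens, so that gradients (taken through the softmax,
    bypassing the argmax) push those probabilities down.
    `is_valid` is a caller-supplied predicate on the decoded string."""
    smiles = "".join(decoded_tokens)
    if is_valid(smiles):
        return 0.0
    return sum(probs[tok] for probs, tok in zip(softmax_probs, decoded_tokens))
```

Valid decodes contribute zero loss; invalid ones are penalized in proportion to how confidently the model chose the offending tokens.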
4. Multi-Objective Monte Carlo Tree Guidance (MCTG) for Sequence Optimization
At inference, PepTune applies Monte Carlo Tree Guidance (MCTG), an adaptation of Monte Carlo Tree Search (MCTS), to steer generation toward Pareto-optimal, property-optimized peptides. This procedure proceeds as follows:
a) Multi-Objective Reward Definition
For $K$ properties, each with predictor $s_k$, a growing Pareto set $\Pi^*$ is maintained. For each new peptide $x$, the reward vector $\mathbf{r}(x) = (r_1(x), \ldots, r_K(x))$ is

$$r_k(x) = \frac{1}{|\Pi^*|} \sum_{x' \in \Pi^*} \mathbb{1}\big[s_k(x) \ge s_k(x')\big],$$

providing a normalized count of the property-wise dominance of $x$ over the Pareto set.
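Assuming higher scores are better for every property, the dominance-count reward can be computed as:

```python
def reward_vector(scores, pareto_scores):
    """Reward r_k(x): fraction of current Pareto-set members that x
    matches or beats on property k (higher scores assumed better).
    With an empty Pareto set, the reward defaults to 1 per objective."""
    if not pareto_scores:
        return [1.0] * len(scores)
    n = len(pareto_scores)
    return [
        sum(member[k] <= scores[k] for member in pareto_scores) / n
        for k in range(len(scores))
    ]
```

Each component lies in $[0, 1]$, so conflicting objectives stay on a common scale without any weighting.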
b) Tree Traversal and Expansion
- Selection: From the root $z_T$, child nodes $c$ are chosen by normalized reward $U[c] = W[c]/N[c]$. Only non-dominated children (in the reward-vector sense) are considered; ties are resolved randomly.
- Expansion: At each leaf node $z_s$, $M$ possible one-step unmaskings are sampled using batched Gumbel-Max on the model's transition probabilities.
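The Gumbel-Max trick used for expansion can be sketched stdlib-only as follows (a batched tensor version would be used in practice):

```python
import math
import random

def gumbel_max_sample(log_probs, rng=random):
    """Draw one index from a categorical distribution via the Gumbel-Max
    trick: add i.i.d. Gumbel(0,1) noise to the log-probabilities and take
    the argmax. Calling this M times yields M stochastic child
    unmaskings from the model's transition probabilities."""
    noisy = [
        lp - math.log(-math.log(max(rng.random(), 1e-12)))
        for lp in log_probs
    ]
    return max(range(len(noisy)), key=noisy.__getitem__)
```

Because the noise is independent per call, repeated draws explore low-probability unmaskings occasionally while still favoring high-probability ones.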
c) Rollout, Evaluation, and Backpropagation
Each expanded child is greedily unmasked to obtain a full peptide SMILES $x_j$, property scores $s_k(x_j)$, and rewards $r_k(x_j)$. The Pareto set $\Pi^*$ is updated to contain all non-dominated peptides. Accrued rewards are backpropagated along the traversal path, updating visit counts $N(\cdot)$ and reward sums $W(\cdot)$.
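The Pareto-set update (keep only non-dominated score vectors, assuming higher is better on every objective) can be sketched as:

```python
def dominates(a, b):
    """a dominates b if a is >= b on every objective and > on at least one."""
    return all(x >= y for x, y in zip(a, b)) and any(x > y for x, y in zip(a, b))

def update_pareto(pareto, candidate):
    """Insert a candidate score vector into the Pareto set: skip it if an
    existing member dominates it, otherwise add it and drop any members
    it dominates."""
    if any(dominates(member, candidate) for member in pareto):
        return pareto
    return [m for m in pareto if not dominates(candidate, m)] + [candidate]
```

The set therefore only ever contains mutually non-dominated vectors, which is exactly the invariant the reward normalization above relies on.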
d) Summary Table: MCTG Components
| Component | Description | Mathematical Formulation / Definition |
|---|---|---|
| Reward Vector | Dominance over Pareto front | $r_k(x) = \frac{1}{|\Pi^*|} \sum_{x' \in \Pi^*} \mathbb{1}[s_k(x) \ge s_k(x')]$ |
| Node Value | Reward normalization in tree | $U[c] = W[c]/N[c]$ |
| Expansion Mechanism | $M$ child unmaskings by Gumbel-Max | Sample $\arg\max_j \big(\log p_\theta(\cdot \mid z_t) + g_j\big)$, $g_j \sim \mathrm{Gumbel}(0,1)$ |
5. Classifier-Based, Gradient-Free Objective Integration
Unlike differentiable guidance schemes, PepTune employs classifiers solely for scoring fully unmasked peptide sequences during MCTS rollout; no gradients with respect to model input are computed. Only discrete, vector-valued rewards steer the optimization, facilitating handling of even conflicting objectives without conflating classifier gradients or requiring continuous relaxation of the underlying discrete space. This modularity enables property plug-in for arbitrary sets of independent sequence properties.
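To illustrate this plug-in modularity: property scorers can be arbitrary black-box callables on decoded SMILES strings, evaluated with no gradients taken. The scorers below are toy stand-ins, not the paper's trained classifiers:

```python
# Each property is a black-box function from a fully unmasked SMILES
# string to a scalar; swapping objectives in or out never touches the
# generative model. Both scorers here are illustrative toys.
scorers = {
    "solubility_proxy": lambda smiles: smiles.count("O") / max(len(smiles), 1),
    "brevity":          lambda smiles: 1.0 / max(len(smiles), 1),
}

def score_peptide(smiles, scorers):
    """Evaluate every plugged-in property scorer on a decoded sequence,
    returning the raw score vector consumed by the reward computation."""
    return [fn(smiles) for fn in scorers.values()]
```

Because only these scalar outputs feed the reward vector, any mix of differentiable or non-differentiable predictors can be combined freely.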
6. Experimental Results and Performance Characteristics
PepTune demonstrates marked improvements on multiple generation quality and optimization metrics:
- Unconditional validity: The base MDLM achieves 45% valid peptides at sequence length 100; following MCTS-guided inference, validity rises to 100%.
- Generative quality (Moses-style statistics): Output uniqueness (1.00), diversity (0.68), and low SNN (0.49), closely mirroring training data.
- Multi-Target Optimization: For dual-target (TfR + GLAST) peptides, top docking scores reach –10.5 kcal/mol (TfR) and –9.2 kcal/mol (GLAST) versus –8.4 kcal/mol for a single-target T7 benchmark.
- Case Studies: PepTune generates sub-30-mer peptides binding GLP-1R at –7.4 kcal/mol (compared to –5.7/–5.1 for semaglutide/liraglutide), and cyclic/non-natural GFAP binders at –8.5 kcal/mol.
- Pareto Front Construction: In approximately 20 inference iterations, a diverse front of ~100 non-dominated peptides optimized along five objectives is produced.
A plausible implication is that PepTune's combined discrete diffusion and MCTG inference enables efficient exploration and exploitation of the combinatorial peptide space for multi-property design, while always enforcing strict chemical validity.
7. Workflow Summary and Algorithmic Overview
The entire process of PepTune is encapsulated in its MCTS-guided discrete diffusion pseudocode:
Inputs: MDLM denoiser p_θ(z_{t–1} | z_t),
K property scorers s₁,…,s_K,
time steps T, iterations I, children M.
Initialize: root node z_T = [MASK]^L;
Pareto set Π* = ∅;
visit counts N(·)=0, reward sums W(·)=0.
for i in 1…I:
# 1) SELECTION
node ← root
while node has MCTS-children:
for c in children(node):
      U[c] = W[c]/N[c]   # normalized vector reward
node ← UniformRandom(non-dominated U[c])
# 2) EXPANSION
  Expand node by sampling M children {z_{s,j}} via Gumbel-Max on p_θ(·|node).
# 3) ROLLOUT & EVALUATION
total_r = 0 (vector in ℝ^K)
for each child z_s,j:
    x_j ← GreedyUnmaskChain(z_{s,j})   # greedily unmask until t = 0
s_vec = [s₁(x_j),…,s_K(x_j)]
r_vec = ComputeReward(s_vec, Π*) # eq. (7)
UpdatePareto(Π*, x_j, s_vec)
total_r += r_vec
# 4) BACKPROPAGATION
anc = node
while anc ≠ None:
W[anc] += total_r
N(anc) += 1
anc ← parent(anc)
return Π*  # Pareto-optimal peptides
Taken together, PepTune unifies masked discrete diffusion, chemically aware masking and invalidity handling, and MCTG-based multi-objective optimization to deliver a modular solution for the generation of valid, Pareto-optimal peptide therapeutics (Tang et al., 2024).