Residue-Atom Bi-scale Attention (RBA)
- RBA is a bipartite attention mechanism that captures both intra-residue coherence and residue–atom interactions for precise enzyme pocket design.
- RBA leverages SE(3)-aware feature representations and RBF distance embedding to update protein features and 3D coordinates in a physically realistic manner.
- Empirical results show that incorporating both attention streams leads to significant improvements in protein structure accuracy and binding specificity.
Residue-Atom Bi-scale Attention (RBA) is a bipartite attention mechanism introduced in EnzyPGM for substrate-specific enzyme design, targeting the joint generation and refinement of enzyme binding pockets and their interaction with substrate molecules. RBA is designed to capture both the local intra-residue dependencies critical for pocket coherence and the fine-grained residue–substrate atom interactions necessary for precise catalytic function. The module operates over SE(3)-aware feature and coordinate representations, systematically updating both the high-level protein context and atomic-scale geometry, thus unifying structure and function at multiple spatial scales (Lin et al., 27 Jan 2026).
1. Conceptual Foundation and Goals
RBA is motivated by the dual requirements of functional protein pocket design: (1) maintaining physically coherent, spatially organized residue networks, and (2) enabling atomically precise recognition of diverse substrate molecules via their constituent atoms. Classical protein generative models generally fail to account for these two scales simultaneously; they either focus on sequence/structure at the residue or coarse backbone level, or model ligand–protein interaction in a post hoc or separate fashion. RBA directly addresses this with two interlinked streams of attention:
- Intra-residue attention: Propagates information within local networks of pocket residues, utilizing distance-aware message passing to ensure geometric and environmental consistency.
- Residue–atom (cross-modal) attention: Facilitates detailed, directional communication between pocket residues and substrate atoms within a spatial neighborhood, supporting context-sensitive binding and catalytic specificity.
This bi-scale architecture underpins EnzyPGM's capability to learn functionally and structurally accurate pockets in a differentiable and physically realistic manner (Lin et al., 27 Jan 2026).
2. Input Representations and Neighborhood Construction
RBA operates on two parallel sets of representations:
- Enzyme pocket residues: each residue $i$ is described by a feature vector $h_i \in \mathbb{R}^{d_h}$ and a Cα coordinate $x_i \in \mathbb{R}^3$.
- Substrate atoms: each atom $a$ carries a feature vector $h_a \in \mathbb{R}^{d_h}$ and a coordinate $x_a \in \mathbb{R}^3$.
Neighborhoods are defined using either nearest-neighbor search (k-NN) or a fixed distance threshold (a cutoff in Å):
- $\mathcal{N}_r(i)$: residues spatially proximal to residue $i$.
- $\mathcal{N}_a(i)$: substrate atoms within the threshold distance of residue $i$.
This spatial bias in graph construction ensures computational focus and physical relevance, as only physically plausible interactions are considered at each step.
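The neighborhood construction described above can be sketched with brute-force distance computations; the function names, the example coordinates, and the specific cutoff are illustrative, not taken from the paper:

```python
import numpy as np

def distance_neighbors(coords_a, coords_b, cutoff):
    """Indices of points in coords_b within `cutoff` of each point in coords_a."""
    d = np.linalg.norm(coords_a[:, None, :] - coords_b[None, :, :], axis=-1)
    return [np.nonzero(d[i] <= cutoff)[0] for i in range(len(coords_a))]

def knn_neighbors(coords, k):
    """Indices of the k nearest neighbors of each point (excluding itself)."""
    d = np.linalg.norm(coords[:, None, :] - coords[None, :, :], axis=-1)
    np.fill_diagonal(d, np.inf)  # a point is not its own neighbor
    return np.argsort(d, axis=-1)[:, :k]

# Toy Cα coordinates for three residues and two substrate atoms (Å)
residues = np.array([[0.0, 0.0, 0.0], [1.5, 0.0, 0.0], [10.0, 0.0, 0.0]])
atoms = np.array([[0.5, 0.0, 0.0], [9.5, 0.0, 0.0]])
print(distance_neighbors(residues, atoms, cutoff=2.0))
print(knn_neighbors(residues, k=1))
```

For large systems a spatial index (e.g. a k-d tree) would replace the dense pairwise matrix, but the resulting neighbor sets are the same.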
3. Mathematical Mechanics of Bi-scale Attention
3.1. Radial Basis Function (RBF) Distance Embedding
Distances $d_{ij}$ (residue–residue) and $d_{ia}$ (residue–atom) are embedded into $\mathbb{R}^K$ using fixed or learnable Gaussian basis functions:

$$\mathrm{RBF}_k(d) = \exp\!\left(-\frac{(d - \mu_k)^2}{2\sigma_k^2}\right), \qquad k = 1, \dots, K,$$

where the centers $\mu_k$ and widths $\sigma_k$ may be fixed on a grid or learned.
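A minimal numpy sketch of the Gaussian RBF expansion; the basis count, distance range, and width heuristic (widths tied to the grid spacing) are illustrative choices, not values from the paper:

```python
import numpy as np

def rbf_embed(d, num_bases=16, d_min=0.0, d_max=10.0, gamma=None):
    """Expand scalar distances into Gaussian radial basis features.

    Centers are spaced evenly on [d_min, d_max]; by default the width
    parameter gamma is set from the grid spacing.
    """
    centers = np.linspace(d_min, d_max, num_bases)
    if gamma is None:
        gamma = 1.0 / (centers[1] - centers[0]) ** 2
    d = np.asarray(d, dtype=float)[..., None]        # (..., 1)
    return np.exp(-gamma * (d - centers) ** 2)       # (..., num_bases)

feats = rbf_embed(np.array([1.2, 3.7]), num_bases=8)
print(feats.shape)  # (2, 8)
```

The expansion turns a hard-to-learn scalar distance into a smooth, localized feature vector that downstream linear layers can project into attention biases.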
3.2. Intra-residue Attention
Queries, keys, and values for residues:

$$q_i = W_Q h_i, \quad k_j = W_K h_j, \quad v_j = W_V h_j, \qquad j \in \mathcal{N}_r(i).$$

The attention score is distance-biased:

$$s_{ij} = \frac{q_i^\top k_j}{\sqrt{d_k}} + w_r^\top \mathrm{RBF}(d_{ij}).$$

Normalize within $\mathcal{N}_r(i)$:

$$\alpha_{ij} = \frac{\exp(s_{ij})}{\sum_{j' \in \mathcal{N}_r(i)} \exp(s_{ij'})}.$$

Residue feature and coordinate updates:

$$h_i \leftarrow h_i + \sum_{j \in \mathcal{N}_r(i)} \alpha_{ij}\, v_j, \qquad x_i \leftarrow x_i + \sum_{j \in \mathcal{N}_r(i)} \alpha_{ij}\, \phi_x(v_j)\,(x_i - x_j),$$

where $\phi_x$ is a learned scalar gate on each message; because the coordinate update is a weighted sum of relative vectors, it remains SE(3)-equivariant.
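A dense, single-head numpy sketch of distance-biased attention with an equivariant coordinate update; the weight names (`Wq`, `Wk`, `Wv`, `w_rbf`), the fixed step scale on the coordinate update, and the use of all pairs rather than k-NN neighborhoods are simplifications for illustration:

```python
import numpy as np

def softmax(s):
    s = s - s.max(axis=-1, keepdims=True)
    e = np.exp(s)
    return e / e.sum(axis=-1, keepdims=True)

def intra_residue_attention(h, x, Wq, Wk, Wv, w_rbf, rbf_fn):
    """One distance-biased attention update over all residue pairs.

    h: (N, d) residue features; x: (N, 3) Ca coordinates.
    rbf_fn maps an (N, N) distance matrix to (N, N, K) basis features.
    """
    q, k, v = h @ Wq, h @ Wk, h @ Wv
    d = np.linalg.norm(x[:, None] - x[None, :], axis=-1)   # (N, N) distances
    bias = rbf_fn(d) @ w_rbf                               # (N, N) distance bias
    scores = q @ k.T / np.sqrt(q.shape[-1]) + bias
    alpha = softmax(scores)                                # rows sum to 1
    h_new = h + alpha @ v                                  # residual feature update
    # Equivariant coordinate update: attention-weighted relative vectors,
    # scaled by an illustrative fixed step size of 0.1.
    rel = x[:, None] - x[None, :]                          # (N, N, 3)
    x_new = x + 0.1 * np.einsum('ij,ijk->ik', alpha, rel)
    return h_new, x_new
```

Because only relative vectors enter the coordinate update, translating every input coordinate by a constant shifts the output coordinates by the same constant and leaves the features unchanged.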
3.3. Residue–Atom Cross-modal Attention
Atom-side projections:

$$q_i = W_Q^{c} h_i, \quad k_a = W_K^{c} h_a, \quad v_a = W_V^{c} h_a, \qquad a \in \mathcal{N}_a(i).$$

Attention and normalization follow the same distance-biased form as above, but over residue-to-atom and atom-to-residue pairs:

$$s_{ia} = \frac{q_i^\top k_a}{\sqrt{d_k}} + w_c^\top \mathrm{RBF}(d_{ia}), \qquad \beta_{ia} = \frac{\exp(s_{ia})}{\sum_{a' \in \mathcal{N}_a(i)} \exp(s_{ia'})}.$$

Message and coordinate updates are analogous to the intra-residue case:

$$h_i \leftarrow h_i + \sum_{a \in \mathcal{N}_a(i)} \beta_{ia}\, v_a, \qquad x_i \leftarrow x_i + \sum_{a \in \mathcal{N}_a(i)} \beta_{ia}\, \phi_x^{c}(v_a)\,(x_i - x_a),$$

with mirrored updates applied to the substrate atoms.
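The bidirectional residue–atom exchange can be sketched in simplified form; here a Gaussian distance kernel with a hard cutoff stands in for the full gated cross-attention, and `W_res`/`W_atom` are hypothetical message projections:

```python
import numpy as np

def cross_modal_messages(h_res, x_res, h_atom, x_atom, W_res, W_atom, cutoff=6.0):
    """Symmetric residue<->atom message passing within a distance cutoff.

    Messages are distance-gated linear projections -- a simplified
    stand-in for the distance-biased cross-attention stream.
    """
    d = np.linalg.norm(x_res[:, None] - x_atom[None, :], axis=-1)  # (N, M)
    gate = np.exp(-(d / cutoff) ** 2) * (d <= cutoff)              # soft + hard gate
    m_res = gate @ (h_atom @ W_atom)    # atom -> residue messages
    m_atom = gate.T @ (h_res @ W_res)   # residue -> atom messages
    return h_res + m_res, h_atom + m_atom
```

Residues outside the cutoff receive exactly zero message, mirroring the neighborhood restriction of Section 2.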
3.4. Feature Aggregation and Coordinate Update
For residues in the set of masked, non-conserved pocket residues $\mathcal{M}$, the intra-residue and cross-modal messages are aggregated into a single feature and coordinate update; for all substrate atoms, the reciprocal update is applied unconditionally. All aggregation operations are two-layer position-wise MLPs with a nonlinear activation and residual connections.
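A minimal sketch of the two-layer position-wise MLP with residual connection; ReLU is assumed here for illustration (the activation used by the paper is not specified in this summary):

```python
import numpy as np

def ffn_residual(h, W1, b1, W2, b2):
    """Two-layer position-wise MLP with a residual connection.

    h: (N, d) features; applied independently at each position.
    ReLU is an illustrative activation choice.
    """
    return h + (np.maximum(h @ W1 + b1, 0.0) @ W2 + b2)
```

With the hidden layer typically wider than `d`, this is the standard transformer-style feed-forward block applied per residue or per atom.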
4. Parameterization and Implementation
RBA contains the following learnable parameters:
- Linear projection matrices for queries, keys, and values in both the residue and atom attention streams.
- Learnable projections applied to the RBF distance embeddings.
- GLU gating weights for each attention stream.
- Weights and biases of the position-wise MLPs.
No explicit weight sharing occurs between the intra-residue and cross-modal modules. Standard PyTorch initialization (e.g., Xavier/Kaiming) is used for all layers unless pretrained weights are loaded.
5. Procedural Workflow
RBA is called within each “stage” of EnzyPGM as follows:
- For each residue , intra-residue messages and geometric updates are computed from spatial neighbors .
- Cross-modal messages from neighboring substrate atoms and residues are exchanged.
- Gated, RBF-distance-weighted sums produce both feature and (Cα) coordinate updates.
- Only pocket residues (masked, non-conserved) are updated to maintain focus on the catalytic environment.
- Substrate atom representations are always updated, reflecting reciprocal influence on pose and chemistry.
The overall forward computation sequence is provided in full explicit pseudocode in the original source and is differentiable end-to-end (Lin et al., 27 Jan 2026).
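The masked-update discipline of the workflow above can be sketched as a thin wrapper; `update_fn` is a hypothetical stand-in for the combined intra-residue and cross-modal attention streams, and the function name is illustrative:

```python
import numpy as np

def rba_stage(h_res, x_res, h_atom, x_atom, pocket_mask, update_fn):
    """One RBA stage: compute candidate updates for every residue and atom,
    then apply residue updates only where pocket_mask is True (masked,
    non-conserved pocket residues); substrate atoms always update."""
    h_res_new, x_res_new, h_atom_new, x_atom_new = update_fn(h_res, x_res, h_atom, x_atom)
    m = pocket_mask[:, None]                      # broadcast over feature/coord dims
    h_res = np.where(m, h_res_new, h_res)         # conserved residues stay fixed
    x_res = np.where(m, x_res_new, x_res)
    return h_res, x_res, h_atom_new, x_atom_new
```

Keeping conserved residues frozen while always updating substrate atoms matches the workflow's focus on the catalytic environment and the reciprocal influence on substrate pose.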
6. Empirical Significance and Ablation Studies
RBA is empirically critical for EnzyPGM’s performance on protein design and pocket modeling. Ablation experiments demonstrate:
- w/o intra-residue attention: Protein structure accuracy (pLDDT) drops by 1.24, and pocket RMSD increases by 1.44 Å.
- w/o residue–atom attention: pLDDT drop of 2.39, pocket RMSD increase of 1.45 Å.
These results demonstrate that both scales are essential: residue-residue attention ensures geometric stability and local structure; residue–atom attention captures binding specificity (Lin et al., 27 Jan 2026).
7. Advantages, Limitations, and Prospects
RBA’s main advantages are:
- Explicitly couples local (intra-pocket) network formation and precise ligand recognition.
- Performs simultaneous updating of both features and 3D coordinates in an SE(3)-equivariant context.
- Modular focus on masked pocket residues and all substrate atoms enables scalability.
- Measurable improvements in binding affinity (Δ binding energy of 0.47 kcal/mol relative to EnzyGen), foldability (pLDDT), and 3D RMSD.
Limitations:
- Depends on fixed spatial thresholds to define “pocket” and local neighborhoods.
- Operates only on the Cα backbone; side-chain conformations and explicit solvent/entropy effects are not modeled.
- Computational cost increases due to dual-stream attention and RBF basis expansions.
A plausible implication is that future RBA variants may incorporate side-chain atoms or adaptive neighborhood selection to further improve catalytic accuracy and designability.
Reference:
EnzyPGM: Pocket-conditioned Generative Model for Substrate-specific Enzyme Design (Lin et al., 27 Jan 2026)