Timed Graph Relationformer (TGR) Layer
- The Timed Graph Relationformer (TGR) layer is a neural architecture that processes time-indexed, feature-annotated graphs by integrating local topological context with global set-level and relational information.
- It combines multi-head graph attention, DeepSets, and Relation Net outputs via a learned gating mechanism and incorporates Time2Vec temporal encoding to produce permutation-invariant representations.
- The TGR layer has been effectively applied to reinforcement learning scenarios like interactive swarm leader identification, demonstrating superior robustness and generalization over baseline GNN approaches.
The Timed Graph Relationformer (TGR) layer is a neural architecture for processing time-indexed, feature-annotated graphs. It is designed to generate informative, permutation-invariant global representations suitable for reinforcement learning with graph-structured observations. The TGR layer was introduced in the context of interactive Swarm Leader Identification (iSLI), where an agent must probe a robotic swarm to infer its leader, but its construction and data flow highlight a general approach to temporal graph representation learning (Bachoumas et al., 20 Dec 2025).
1. Data Flow and Architectural Modules
At each discrete time step $t$, the TGR layer processes an observation encoded as a directed graph $G_t = (X_t, A_t, E_t, t)$, where $X_t$ is the node feature matrix (for the $N$ swarm agents plus the prober), $A_t$ and $E_t$ are adjacency masks, and $t$ is the current timestep.
The TGR layer consists of the following modules, applied with a specific data flow:
- Multi-Head Graph Attention Transformer (GAT): Processes node features and adjacency information to produce updated node embeddings that integrate local topological context and edge weighting, outputting $H_t = [h_1, \dots, h_{N+1}]$.
- DeepSets (DS) Readout: Computes a permutation-invariant, set-level summary of node features by aggregating transformed node embeddings.
- Relation Net (RN) Readout: Aggregates all pairwise node interactions, incorporating both node features and edge attributes for a relational summary.
- Gating Fusion: Combines DS and RN outputs via an element-wise, learned gating mechanism.
- Time2Vec (T2V) Temporal Encoding: Encodes the absolute timestep as a high-dimensional periodic/linear feature.
The outputs of the Gating Fusion and T2V components are concatenated to produce the final TGR global representation $z^{\mathrm{TGR}}_t$.
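Of these modules, the multi-head graph attention stage has the most involved data flow. The following minimal PyTorch sketch shows one way an edge-weighted, masked attention block of this kind can be realized; the class name `EdgeWeightedGAT`, the scaled dot-product parameterization, the way edge weights enter the scores, and all dimensions are illustrative assumptions rather than the reference implementation.

```python
import torch
import torch.nn as nn

class EdgeWeightedGAT(nn.Module):
    """Minimal multi-head, edge-weighted, masked graph attention block (illustrative)."""

    def __init__(self, d_in: int, d_head: int, n_heads: int, slope: float = 0.2):
        super().__init__()
        self.n_heads, self.d_head = n_heads, d_head
        self.q = nn.Linear(d_in, n_heads * d_head)
        self.k = nn.Linear(d_in, n_heads * d_head)
        self.v = nn.Linear(d_in, n_heads * d_head)
        self.leaky = nn.LeakyReLU(slope)

    def forward(self, x, adj, w):
        # x: (N+1, d_in) node features; adj: (N+1, N+1) {0,1} mask; w: (N+1, N+1) edge weights.
        n = x.size(0)
        q = self.q(x).view(n, self.n_heads, self.d_head).transpose(0, 1)
        k = self.k(x).view(n, self.n_heads, self.d_head).transpose(0, 1)
        v = self.v(x).view(n, self.n_heads, self.d_head).transpose(0, 1)
        # Scaled dot-product scores, LeakyReLU, then modulation by edge weights.
        scores = self.leaky(q @ k.transpose(-2, -1) / self.d_head ** 0.5)
        scores = scores * w.unsqueeze(0)
        # Mask out non-edges before the per-neighbour softmax.
        scores = scores.masked_fill(adj.unsqueeze(0) == 0, float("-inf"))
        alpha = torch.softmax(scores, dim=-1)
        # Aggregate values per head and concatenate heads: (N+1, n_heads * d_head).
        return (alpha @ v).transpose(0, 1).reshape(n, -1)

# Example: 8 swarm agents + 1 prober, 16-dim raw features, 4 heads of width 32.
x = torch.randn(9, 16)
adj = (torch.rand(9, 9) > 0.5).float()
adj.fill_diagonal_(1.0)              # keep self-edges so no softmax row is empty
w = torch.rand(9, 9)
h = EdgeWeightedGAT(d_in=16, d_head=32, n_heads=4)(x, adj, w)
print(h.shape)                       # torch.Size([9, 128])
```

Self-edges are retained in the example so that every row of the adjacency mask has at least one valid neighbour before the softmax.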
2. Forward Pass and Mathematical Formulation
The TGR layer's forward pass is specified by the following sequence of operations:
- Graph Attention Transformer (GAT):
For each node and attention head, query, key, and value projections are computed; attention coefficients are obtained from a masked, LeakyReLU-activated softmax that incorporates edge weights, and the per-head outputs are concatenated into the updated node embeddings $h_i$ of $H_t$.
- DeepSets Global Read-Out:
$$s_t = \rho\Big(\sum_{i=1}^{N+1} \phi(h_i)\Big),$$
where $\rho$ and $\phi$ are MLPs.
- Relation Net Global Read-Out:
$$r_t = \sum_{i \neq j} g_\theta\big([\,h_i \,\Vert\, h_j \,\Vert\, e_{ij}\,]\big),$$
where $g_\theta$ is an MLP and the edge feature $e_{ij}$ captures information such as interaction counts.
- Learned Gating Fusion:
$$z_t = \sigma\big(W_g\, r_t + b_g\big) \odot s_t,$$
with $\sigma$ the elementwise sigmoid and $\odot$ elementwise multiplication.
- Time2Vec Temporal Encoding:
$$\tau(t)[k] = \begin{cases} \omega_k t + \varphi_k, & k = 0,\\ \sin(\omega_k t + \varphi_k), & 1 \le k \le d_{\mathrm{T2V}} - 1,\end{cases}$$
with learnable frequencies $\omega_k$ and phases $\varphi_k$.
- Final Output:
$$z^{\mathrm{TGR}}_t = \big[\, z_t \,\Vert\, \tau(t)\,\big],$$
yielding a vector in $\mathbb{R}^{d_z + d_{\mathrm{T2V}}}$.
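A compact PyTorch sketch of the read-out and temporal-encoding steps is given below. Hidden sizes follow the hyperparameters reported in Section 5, while the class names, the one-dimensional edge feature, and the exact MLP layout are illustrative assumptions.

```python
import torch
import torch.nn as nn

def mlp(d_in: int, d_hidden: int, d_out: int) -> nn.Sequential:
    # Two-hidden-layer MLP with LeakyReLU(0.2), matching the sizes in Section 5.
    return nn.Sequential(
        nn.Linear(d_in, d_hidden), nn.LeakyReLU(0.2),
        nn.Linear(d_hidden, d_hidden), nn.LeakyReLU(0.2),
        nn.Linear(d_hidden, d_out),
    )

class DeepSetsReadout(nn.Module):
    """s_t = rho(sum_i phi(h_i)) -- permutation-invariant set summary."""
    def __init__(self, d_node: int, d_out: int, d_hidden: int = 256):
        super().__init__()
        self.phi = mlp(d_node, d_hidden, d_hidden)
        self.rho = mlp(d_hidden, d_hidden, d_out)

    def forward(self, h: torch.Tensor) -> torch.Tensor:  # h: (N+1, d_node)
        return self.rho(self.phi(h).sum(dim=0))

class RelationNetReadout(nn.Module):
    """r_t = sum_{i != j} g([h_i || h_j || e_ij]) -- pairwise relational summary."""
    def __init__(self, d_node: int, d_edge: int, d_out: int, d_hidden: int = 256):
        super().__init__()
        self.g = mlp(2 * d_node + d_edge, d_hidden, d_out)

    def forward(self, h: torch.Tensor, e: torch.Tensor) -> torch.Tensor:
        # h: (N+1, d_node); e: (N+1, N+1, d_edge), e.g. pairwise interaction counts.
        n = h.size(0)
        hi = h.unsqueeze(1).expand(n, n, -1)            # h_i broadcast over j
        hj = h.unsqueeze(0).expand(n, n, -1)            # h_j broadcast over i
        pair = torch.cat([hi, hj, e], dim=-1)
        mask = 1.0 - torch.eye(n, device=h.device)      # drop i == j pairs
        return (self.g(pair) * mask.unsqueeze(-1)).sum(dim=(0, 1))

class Time2Vec(nn.Module):
    """tau(t)[0] = w_0 t + p_0; tau(t)[k] = sin(w_k t + p_k) for k >= 1."""
    def __init__(self, d_t2v: int = 64):
        super().__init__()
        self.w = nn.Parameter(torch.randn(d_t2v))
        self.p = nn.Parameter(torch.randn(d_t2v))

    def forward(self, t: torch.Tensor) -> torch.Tensor:  # t: scalar tensor
        linear = self.w[:1] * t + self.p[:1]
        periodic = torch.sin(self.w[1:] * t + self.p[1:])
        return torch.cat([linear, periodic])

# Illustrative shapes: 8 agents + 1 prober, 128-dim GAT embeddings, scalar edge feature.
h = torch.randn(9, 128)
e = torch.rand(9, 9, 1)
s_t = DeepSetsReadout(128, 256)(h)
r_t = RelationNetReadout(128, 1, 256)(h, e)
tau = Time2Vec(64)(torch.tensor(12.0))
```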
3. Gating Mechanism for Relational Fusion
The distinctive aspect of the TGR architecture is its gating fusion, which allows dynamic modulation between coarse set-level information (DS) and fine relational cues (RN) at each timestep. Each coordinate of the DS output $s_t$ is multiplied by a learned sigmoid gate computed from the RN output $r_t$. This enables the RN to selectively amplify or suppress set-based features in response to relational context, such as the concentration of prober-swarm interactions. The gating mechanism is critical for integrating aggregate and relational information adaptively as the probing policy interacts with the swarm (Bachoumas et al., 20 Dec 2025).
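A minimal sketch of this fusion, under the assumption that the gate is a learned linear map of the RN read-out passed through a sigmoid (the precise gate parameterization is not spelled out here), looks as follows:

```python
import torch
import torch.nn as nn

class GatingFusion(nn.Module):
    """Elementwise gate on the DS read-out, driven by the RN read-out (illustrative)."""
    def __init__(self, d: int):
        super().__init__()
        self.gate = nn.Linear(d, d)          # assumed: linear map of r_t, then sigmoid

    def forward(self, s_t: torch.Tensor, r_t: torch.Tensor) -> torch.Tensor:
        g = torch.sigmoid(self.gate(r_t))    # per-coordinate gate in (0, 1)
        return g * s_t                       # amplify or suppress set-level features

# Illustrative 256-dim read-outs and a 64-dim Time2Vec encoding.
s_t, r_t = torch.randn(256), torch.randn(256)
tau = torch.randn(64)
z_t = GatingFusion(256)(s_t, r_t)
z_tgr = torch.cat([z_t, tau])                # final TGR global representation
```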
4. Integration with Downstream Sequence Modeling and PPO
The sequence of TGR outputs $z^{\mathrm{TGR}}_{1:t}$ is linearly projected and provided as the input token sequence to an S5 encoder, a structured state-space model. The S5 encoder applies layer normalization, structured state-space updates, and residual connections internally; its recurrent hidden state summarizes past TGR-derived tokens. Two MLP heads, a policy (actor) head and a value (critic) head, map the S5 encoding to a categorical policy over base velocities and to value estimates.
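A minimal sketch of this downstream pipeline is shown below; `nn.GRU` is used purely as a stand-in for the S5 structured state-space encoder, and the projection width, head sizes, and action count are illustrative assumptions.

```python
import torch
import torch.nn as nn

class ActorCriticOverTGRTokens(nn.Module):
    """Token projection, sequence encoding, and policy/value heads (illustrative).

    nn.GRU stands in for the S5 structured state-space encoder; the projection
    width, head sizes, and the number of discrete actions are assumptions.
    """
    def __init__(self, d_tgr: int, d_model: int = 256, n_actions: int = 9):
        super().__init__()
        self.proj = nn.Linear(d_tgr, d_model)          # linear token projection
        self.encoder = nn.GRU(d_model, d_model, batch_first=True)
        self.actor = nn.Sequential(nn.Linear(d_model, 256), nn.LeakyReLU(0.2),
                                   nn.Linear(256, n_actions))
        self.critic = nn.Sequential(nn.Linear(d_model, 256), nn.LeakyReLU(0.2),
                                    nn.Linear(256, 1))

    def forward(self, tgr_tokens: torch.Tensor):       # (batch, T, d_tgr)
        enc, _ = self.encoder(self.proj(tgr_tokens))   # recurrent summary of past tokens
        last = enc[:, -1]                              # encoding at the current step
        return self.actor(last), self.critic(last).squeeze(-1)

# Example: batch of 2 episodes, 16 timesteps, 320-dim TGR tokens (256 fused + 64 T2V).
logits, value = ActorCriticOverTGRTokens(d_tgr=320)(torch.randn(2, 16, 320))
```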
Gradients from the PPO objective (policy loss, value loss, and entropy bonus) flow through the actor and critic heads, through the S5 encoder, and into the TGR layer. All components, including the GAT, DS, RN, and T2V modules, are trained end-to-end to maximize the expected clipped surrogate objective (Bachoumas et al., 20 Dec 2025).
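The corresponding training objective is the standard PPO clipped surrogate; the sketch below uses the clip and entropy coefficients reported in Section 5, while the value-loss coefficient is an assumption.

```python
import torch
import torch.nn.functional as F

def ppo_loss(logp_new, logp_old, adv, values, returns, entropy,
             clip_eps=0.2, value_coef=0.5, entropy_coef=0.01):
    """Clipped PPO objective: policy loss + value loss - entropy bonus (to minimize)."""
    ratio = torch.exp(logp_new - logp_old)                       # pi_new / pi_old
    unclipped = ratio * adv
    clipped = torch.clamp(ratio, 1.0 - clip_eps, 1.0 + clip_eps) * adv
    policy_loss = -torch.min(unclipped, clipped).mean()          # clipped surrogate
    value_loss = F.mse_loss(values, returns)                     # critic regression
    return policy_loss + value_coef * value_loss - entropy_coef * entropy.mean()
```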
5. Implementation Details and Hyperparameters
The TGR layer's implementation was found to be robust across a range of graph sizes and swarm speeds. The following hyperparameter settings were used to reproduce results:
| Module | Specification | Key Parameters |
|---|---|---|
| GAT | Multi-head, edge-weighted attention | multiple heads, fixed dimension per head |
| DS (MLPs) | Coarse aggregation | 2 hidden layers, 256 units each |
| RN (MLPs) | Pairwise relational reasoning | 2 hidden layers, 256 units each |
| T2V | Temporal encoding | 64-dimensional (1 linear + 63 sinusoid) |
| Output dim | Global, permutation-invariant | concatenation of gated-fusion and T2V outputs |
| S5 encoder | State-space sequence | 4 layers, 256 hidden units |
| PPO | RL optimization | clip=0.2, entropy=0.01, lr=3e-4, batch=64, GAE advantage estimation |
Node- and edge-level features are supplied as raw input. Unless otherwise specified, MLPs use Xavier initialization and LeakyReLU activations (slope 0.2). Simulation operates at 20 Hz; the on-robot controller runs at 5 Hz.
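For reference, the stated settings can be collected into a small configuration sketch; only values given above are included, and the zero bias initialization is an assumption alongside the reported Xavier scheme.

```python
import torch.nn as nn

# Settings stated above (GAT head count/width and GAE parameters are not listed here).
TGR_CONFIG = {
    "deepsets_mlp": {"hidden_layers": 2, "hidden_units": 256},
    "relation_net_mlp": {"hidden_layers": 2, "hidden_units": 256},
    "time2vec_dim": 64,                       # 1 linear + 63 sinusoidal components
    "s5_encoder": {"layers": 4, "hidden_units": 256},
    "ppo": {"clip": 0.2, "entropy_coef": 0.01, "lr": 3e-4, "batch_size": 64},
    "control_rate_hz": {"simulation": 20, "on_robot": 5},
}

def init_linear(module: nn.Module) -> None:
    """Xavier weight initialization for linear layers; zero bias is an assumption."""
    if isinstance(module, nn.Linear):
        nn.init.xavier_uniform_(module.weight)
        nn.init.zeros_(module.bias)

# Usage: model.apply(init_linear); activations elsewhere use nn.LeakyReLU(0.2).
```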
6. Application to Swarm Leader Identification
The TGR layer serves as the core graph representation mechanism in the iSLI problem, enabling the learning of adversarial probing policies for leader detection under partially observable and dynamic conditions. It outperforms baseline GNN approaches by fusing topological, interactional, and temporal structure, generalizing across swarm sizes and dynamics, and supporting robust sim-to-real transfer. The architecture is particularly well-suited for reinforcement learning settings where relational and set-aggregate information must be adaptively balanced to support sequential decision making (Bachoumas et al., 20 Dec 2025).