Federated Temporal Graph Clustering

Updated 13 April 2026
  • Federated Temporal Graph Clustering (FTGC) is a decentralized method that clusters dynamic graph data by combining temporal aggregation with federated learning while preserving data privacy.
  • FTGC employs graph neural networks for spatial feature extraction and a temporal window mechanism to capture evolving graph structures in client data.
  • The framework utilizes federated averaging with parameter sparsification and quantization to ensure efficient communication and high clustering performance, as evidenced on datasets like DBLP and School.

Federated Temporal Graph Clustering (FTGC) is a decentralized approach for clustering dynamic graph data distributed across multiple clients, each holding a private sequence of temporal graph snapshots. FTGC addresses the challenges of temporal graph clustering under data privacy constraints, enabling collaborative discovery of evolving graph structures without centralizing raw data. The framework uses graph neural networks (GNNs) with a specialized temporal aggregation mechanism and a federated learning protocol, balancing clustering fidelity, temporal smoothness, communication efficiency, and privacy preservation (Zhou et al., 2024).

1. Formalization and Clustering Objective

FTGC operates over $K$ clients, each storing a temporal graph sequence:

$$\Gamma_k = \bigl\{\,G_t^{(k)}=(V_t^{(k)},E_t^{(k)},X_t^{(k)})\bigr\}_{t=1}^T,$$

with $V_t^{(k)}$ the node set, $E_t^{(k)}$ the edge set, and $X_t^{(k)} \in \mathbb{R}^{|V_t^{(k)}|\times d}$ the node features at time $t$. The clustering task is to compute, for each $t$, a soft assignment $F_t^{(k)} \in \mathbb{R}^{|V_t^{(k)}|\times C}$ into $C$ clusters, co-clustering nodes with strong temporal and spatial connectivity and ensuring cluster assignment smoothness over time.

The global objective aggregates local clustering losses, subject to model consistency enforced via federated aggregation:

$$\min_{\{\theta_k\}_{k=1}^K} \frac{1}{K}\sum_{k=1}^K \mathcal{L}_k(\theta_k; \Gamma_k),$$

where $\theta_k$ are client-local parameters. The core clustering subproblem per client $k$ is

$$\min_{\{F_t^{(k)}\}_{t=1}^T}\; \sum_{t=1}^T \operatorname{Tr}\bigl(F_t^{(k)\top} L_t^{(k)} F_t^{(k)}\bigr) + \lambda \sum_{t=2}^T \bigl\|F_t^{(k)} - F_{t-1}^{(k)}\bigr\|_F^2,$$

where $L_t^{(k)}$ is the graph Laplacian and $\lambda$ regulates temporal smoothness.

2. Temporal Aggregation and Embedding Construction

Each client computes node embeddings $Z_t^{(k)}$ by integrating spatial and temporal information. Spatial aggregation uses a graph convolutional approach:

$$H_t^{(k)} = \sigma\bigl(\hat{A}_t^{(k)} X_t^{(k)} W\bigr),$$

where $\hat{A}_t^{(k)} = \tilde{D}^{-1/2}\bigl(A_t^{(k)} + I\bigr)\tilde{D}^{-1/2}$ is the normalized adjacency matrix with self-loops ($\tilde{D}$ being the degree matrix of $A_t^{(k)} + I$) and $\sigma$ is a nonlinear activation. Temporal aggregation leverages a temporal window of size $w$:

$$\tilde{H}_t^{(k)} = \sum_{\tau=0}^{w-1} \alpha_\tau\, H_{t-\tau}^{(k)} W_\tau,$$

with learnable attention weights $\alpha_\tau$ (softmax-normalized) and per-offset matrices $W_\tau$. The final temporal-spatial node embedding $Z_t^{(k)}$ is computed from $H_t^{(k)}$ and $\tilde{H}_t^{(k)}$.

This mechanism captures both local graph structure and its temporal evolution, enabling the model to learn temporally coherent cluster representations.
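The two aggregation stages can be sketched in NumPy as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: it assumes a single graph-convolution layer with ReLU, a node set that stays fixed across snapshots, and a window that is truncated at the start of the sequence with the attention weights renormalized. All function and variable names are hypothetical.

```python
import numpy as np

def normalized_adjacency(A):
    """Symmetric normalization D^{-1/2} (A + I) D^{-1/2} (self-loops added)."""
    A_tilde = A + np.eye(A.shape[0])
    d_inv_sqrt = 1.0 / np.sqrt(A_tilde.sum(axis=1))
    return A_tilde * d_inv_sqrt[:, None] * d_inv_sqrt[None, :]

def spatial_embedding(A, X, W):
    """One graph-convolution layer with ReLU (assumed activation)."""
    return np.maximum(normalized_adjacency(A) @ X @ W, 0.0)

def softmax(v):
    e = np.exp(v - v.max())
    return e / e.sum()

def temporal_aggregate(H_seq, t, attn_logits, W_offsets):
    """Attention-weighted sum over the window ending at snapshot t.

    H_seq:       list of (n, h) spatial embeddings, one per snapshot
    attn_logits: (w,) unnormalized attention scores, one per offset
    W_offsets:   (w, h, h) per-offset projection matrices
    Offsets that reach before snapshot 0 are skipped (an assumption).
    """
    valid = [tau for tau in range(len(attn_logits)) if t - tau >= 0]
    alpha = softmax(np.asarray(attn_logits)[valid])
    return sum(a * H_seq[t - tau] @ W_offsets[tau] for a, tau in zip(alpha, valid))

# Toy usage: 4 snapshots of a 5-node graph, window size w = 2.
rng = np.random.default_rng(0)
n, d, h, T, w = 5, 4, 3, 4, 2
W = rng.normal(size=(d, h))
H_seq = []
for _ in range(T):
    upper = np.triu(rng.integers(0, 2, size=(n, n)), k=1).astype(float)
    A = upper + upper.T                      # symmetric, no self-loops
    H_seq.append(spatial_embedding(A, rng.normal(size=(n, d)), W))
H_tilde = temporal_aggregate(H_seq, T - 1, rng.normal(size=w), rng.normal(size=(w, h, h)))
```

Because the attention weights sum to one, feeding identical snapshots and identity projection matrices returns the spatial embedding unchanged, which is a convenient sanity check.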

3. Federated Optimization and Training Process

FTGC employs a federated averaging (FedAvg) protocol augmented with model update compression for scalable and communication-efficient training. The training proceeds over $R$ rounds; in each round $r$:

  1. Server broadcasts global model $\theta^{(r)}$.
  2. Each client (in parallel):
    • Receives $\theta^{(r)}$, initializes $\theta_k \leftarrow \theta^{(r)}$.
    • Performs $E$ local epochs: computes temporal embeddings $Z_t^{(k)}$ for all $t = 1,\dots,T$, evaluates and optimizes local loss $\mathcal{L}_k$.
    • Computes update $\Delta_k = \theta_k - \theta^{(r)}$, sparsifies to top $p\%$ entries ($\mathcal{S}(\cdot)$), quantizes ($\mathcal{Q}(\cdot)$), and uploads $\hat{\Delta}_k = \mathcal{Q}(\mathcal{S}(\Delta_k))$.
  3. Server aggregates updates:

$$\theta^{(r+1)} = \theta^{(r)} + \frac{1}{K}\sum_{k=1}^K \hat{\Delta}_k.$$

Raw graph data $\Gamma_k$ and features $X_t^{(k)}$ remain on client devices, so only compressed parameter updates ever leave a client.
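The round structure above can be sketched as follows. This is a hedged illustration only: plain gradient steps on a user-supplied `grad_fn` stand in for local GNN training, top-k magnitude selection and uniform symmetric quantization are assumed forms of the sparsifier and quantizer, and the server uses an unweighted mean. The helper names are hypothetical and not taken from the source.

```python
import numpy as np

def sparsify_top_k(delta, frac):
    """Keep the `frac` fraction of entries with largest magnitude; zero the rest."""
    k = max(1, int(round(frac * delta.size)))
    idx = np.argpartition(np.abs(delta), -k)[-k:]
    out = np.zeros_like(delta)
    out[idx] = delta[idx]
    return out

def quantize_uniform(delta, bits=8):
    """Uniform symmetric quantization, returned in dequantized form."""
    scale = np.abs(delta).max()
    if scale == 0.0:
        return delta.copy()
    levels = 2 ** (bits - 1) - 1
    return np.round(delta / scale * levels) / levels * scale

def client_update(theta_global, grad_fn, lr, local_steps, frac, bits):
    """Local training followed by compression of the parameter delta."""
    theta = theta_global.copy()
    for _ in range(local_steps):
        theta -= lr * grad_fn(theta)
    delta = theta - theta_global
    return quantize_uniform(sparsify_top_k(delta, frac), bits)

def server_aggregate(theta_global, compressed_deltas):
    """Apply the unweighted mean of the uploaded deltas to the global model."""
    return theta_global + np.mean(compressed_deltas, axis=0)

# Toy usage: each client minimizes 0.5 * ||theta - c_k||^2, so grad = theta - c_k.
rng = np.random.default_rng(0)
centers = [rng.normal(size=8) for _ in range(4)]
theta = np.zeros(8)
for _ in range(20):
    deltas = [client_update(theta, lambda th, c=c: th - c, lr=0.1,
                            local_steps=5, frac=0.5, bits=8) for c in centers]
    theta = server_aggregate(theta, deltas)
```

The sketch omits refinements that practical systems often add, such as accumulating the discarded residual for the next round or weighting clients by local data size.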

4. Loss Function and Regularization

The per-client loss optimized during local training is composed of a clustering term and a temporal smoothness regularizer:

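$$\mathcal{L}_k(\theta_k;\Gamma_k) = \sum_{t=1}^T \operatorname{Tr}\bigl(F_t^{(k)\top} L_t^{(k)} F_t^{(k)}\bigr) + \lambda \sum_{t=2}^T \bigl\|F_t^{(k)} - F_{t-1}^{(k)}\bigr\|_F^2.$$

Here $F_t^{(k)}$ is the soft assignment produced by the model with parameters $\theta_k$, $L_t^{(k)}$ is the Laplacian of snapshot $t$, and $\lambda$ is the smoothness weight. Optionally, an $\ell_2$ penalty may be applied to $\theta_k$:

$$\mathcal{L}_k^{\text{total}} = \mathcal{L}_k + \frac{\mu}{2}\,\|\theta_k\|_2^2,$$

with $\mu \ge 0$ the penalty weight. Global optimization minimizes the average total loss $\frac{1}{K}\sum_{k=1}^K \mathcal{L}_k^{\text{total}}$ via the federated loop.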
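To make the two loss terms concrete, the sketch below evaluates one plausible reading of the per-client loss: a spectral trace term $\operatorname{Tr}(F^\top L F)$ per snapshot, a squared Frobenius penalty on consecutive assignment differences, and an optional $\ell_2$ penalty on the parameters. The unnormalized Laplacian $L = D - A$, the fixed node set across snapshots, and the names `local_loss`, `lam`, and `mu` are assumptions made for illustration.

```python
import numpy as np

def graph_laplacian(A):
    """Unnormalized Laplacian L = D - A (assumed form)."""
    return np.diag(A.sum(axis=1)) - A

def local_loss(F_seq, A_seq, lam, theta=None, mu=0.0):
    """Clustering term plus temporal smoothness, with optional l2 penalty.

    F_seq: list of (n, C) soft assignments, one per snapshot
    A_seq: list of (n, n) adjacency matrices, one per snapshot
    """
    # Small when strongly connected nodes share a cluster.
    clustering = sum(np.trace(F.T @ graph_laplacian(A) @ F)
                     for F, A in zip(F_seq, A_seq))
    # Small when assignments change little between consecutive snapshots.
    smoothness = sum(np.sum((F_seq[t] - F_seq[t - 1]) ** 2)
                     for t in range(1, len(F_seq)))
    loss = clustering + lam * smoothness
    if theta is not None:
        loss += 0.5 * mu * np.sum(theta ** 2)
    return loss

# Two connected nodes: split across clusters at t=0, merged at t=1.
A = np.array([[0.0, 1.0], [1.0, 0.0]])
F0 = np.array([[1.0, 0.0], [0.0, 1.0]])
F1 = np.array([[1.0, 0.0], [1.0, 0.0]])
value = local_loss([F0, F1], [A, A], lam=0.5)  # clustering 2.0 + 0.5 * smoothness 2.0 = 3.0
```

In the toy case, splitting the two connected nodes costs 2 in the trace term while merging them costs 0, and the switch between snapshots is charged by the smoothness term.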

5. Experimental Protocol and Performance

Experiments are conducted on a range of real-world temporal graph datasets partitioned across $K$ clients:

  • DBLP (co-author network)
  • Brain (functional connectivity)
  • Patent (citation network)
  • School (contact network)

Key experimental hyperparameters include the temporal window size $w$, cluster count $C$ (dataset-dependent), local epochs $E$, rounds $R$, learning rate $\eta$, and compression sparsity $p$. Evaluation metrics encompass Clustering Accuracy (ACC), Normalized Mutual Information (NMI), Adjusted Rand Index (ARI), and F1-score (F1).

Dataset   ACC [%]   NMI [%]   ARI [%]   F1 [%]
DBLP        49.50     38.00     23.50    46.00
Brain       45.00     51.00     31.00    45.00
Patent      51.00     26.00     19.50    39.50
School      99.80     99.50     99.40    99.80

FTGC, trained across $K$ clients, consistently matches or outperforms centralized methods such as TGC and TREND, without centralizing raw data.
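Of the metrics reported above, ACC is the one that needs care: predicted cluster ids are arbitrary, so they must first be matched one-to-one to ground-truth labels. The sketch below shows that matching by brute force over permutations, which is only practical for a small number of clusters; larger problems usually use the Hungarian algorithm (for example `scipy.optimize.linear_sum_assignment`). The function name is hypothetical, and the source does not specify how ACC was computed.

```python
from collections import Counter
from itertools import permutations

def clustering_accuracy(y_true, y_pred):
    """Best accuracy over one-to-one mappings from cluster ids to labels."""
    if len(y_true) != len(y_pred) or len(y_true) == 0:
        raise ValueError("y_true and y_pred must be non-empty and equally long")
    labels = sorted(set(y_true))
    clusters = sorted(set(y_pred))
    counts = Counter(zip(y_pred, y_true))  # (cluster, label) -> co-occurrences
    # Pad with None so surplus clusters can be left unmatched.
    padded = labels + [None] * max(0, len(clusters) - len(labels))
    best = 0
    for perm in permutations(padded, len(clusters)):
        matched = sum(counts[(c, l)] for c, l in zip(clusters, perm))
        best = max(best, matched)
    return best / len(y_true)

# Cluster ids are permuted relative to the labels but the partition is identical.
acc_perfect = clustering_accuracy([0, 0, 1, 1, 2, 2], [1, 1, 0, 0, 2, 2])  # 1.0
# One node lands in the wrong cluster.
acc_partial = clustering_accuracy([0, 0, 1, 1], [1, 1, 1, 0])              # 0.75
```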

6. Communication Efficiency and Privacy Protection

Communication overhead is minimized via:

  • Transmission of only parameter deltas ($\Delta_k$) instead of full model weights
  • Sparsification to transmit only the top $p\%$ of update entries
  • Quantization $\mathcal{Q}(\cdot)$ to 8/16-bit precision

Clients perform multiple local updates before transmitting, reducing synchronization frequency. Data privacy is maintained since neither graph structures $G_t^{(k)}$ nor node features $X_t^{(k)}$ are uploaded. Additional protections, such as secure aggregation or differential privacy noise addition to $\Delta_k$, can further enhance privacy properties as needed.
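A rough back-of-the-envelope calculation shows how sparsification and quantization combine. The numbers below are illustrative assumptions rather than values from the source: a model with one million parameters, 10% of entries kept, 8-bit values, and a 32-bit index sent alongside each kept entry.

```python
import math

def payload_bytes(num_params, keep_frac=1.0, value_bits=32, index_bits=0):
    """Upload size in bytes for one client update.

    index_bits is the per-entry cost of encoding which positions were kept;
    it is 0 for a dense upload where positions are implicit.
    """
    kept = int(round(num_params * keep_frac))
    return math.ceil(kept * (value_bits + index_bits) / 8)

dense = payload_bytes(1_000_000)                                   # 4,000,000 bytes
compressed = payload_bytes(1_000_000, keep_frac=0.10,
                           value_bits=8, index_bits=32)            # 500,000 bytes
reduction = dense / compressed                                     # 8.0
```

Under these assumptions the index overhead dominates the compressed payload, so more compact position encodings would matter more than lowering the value precision further.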


FTGC establishes a robust framework for federated clustering of dynamic graphs, balancing synchronization efficiency, privacy, and clustering quality in a decentralized setting (Zhou et al., 2024).
