VL-PUB Module Overview
- VL-PUB Module is the publisher-side computation unit in the PubSub-VFL framework that enables privacy-preserving vertical federated learning through split learning.
- It employs asynchronous embedding publication, Gaussian differential privacy noise injection, and gradient backpropagation with synchronized parameter updates to handle heterogeneous resources.
- Empirical results demonstrate a 2–7× training speedup and high resource utilization while ensuring linear convergence under strict privacy guarantees.
The VL-PUB module is the publisher-side orchestration and computation unit in the PubSub-VFL framework for two-party vertical federated learning (VFL). Engineered for efficient, privacy-preserving split learning between organizations with heterogeneous resources, VL-PUB operationalizes asynchronous embedding publication, local differential privacy (DP) protection, gradient backpropagation, and synchronized parameter updates. Achieving substantial resource utilization and training acceleration, it forms the passive-party counterpart to the subscriber, functioning as both an independent compute engine and a tightly integrated publisher node in the Publisher/Subscriber (Pub/Sub) layering of PubSub-VFL (Liu et al., 14 Oct 2025).
1. Architectural and Functional Overview
VL-PUB executes on the side of the passive participant in a two-party VFL arrangement. Its duties include:
- Sampling local feature minibatches and computing embeddings using the bottom model.
- Applying Gaussian DP noise: $\tilde{h} = h + \xi$, where $\xi \sim \mathcal{N}(0, \sigma^2 I)$.
- Publishing these noisy embeddings into an embedding channel keyed by batch identifier.
- On receipt of the corresponding embedding gradient from the subscriber, performing backpropagation to obtain parameter gradients and pushing updated parameters to the local Parameter Server (PS).
The module interfaces asynchronously with both the VL-SUB ("active" party), via dedicated embedding and gradient Pub/Sub channels, and with the local PS via scheduled aggregation/broadcast. All updates and synchronizations accommodate data and system heterogeneity, bounded staleness, and strict privacy constraints.
Data and Gradient Workflow
- VL-PUB samples a minibatch $x_B$ of local features.
- Computes the embedding $h = f(x_B; \theta_p)$ with the bottom model.
- Perturbs it with Gaussian DP noise: $\tilde{h} = h + \xi$, $\xi \sim \mathcal{N}(0, \sigma^2 I)$.
- Publishes $\tilde{h}$ to the embedding channel, keyed by batch identifier.
- After VL-SUB computes the top-model loss, pulls the embedding gradient $\partial \ell / \partial \tilde{h}$ from the gradient channel.
- Backpropagates through the bottom model, updates $\theta_p$, and pushes the result to the local PS.
- At semi-async intervals, the PS aggregates local model copies and rebroadcasts fresh parameters.
Diagrammatic flow: sample minibatch → bottom-model forward → DP perturbation → publish $\tilde{h}$ → pull $\partial \ell / \partial \tilde{h}$ → backpropagate → PS sync.
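The per-batch publish step above can be sketched as follows. This is a minimal illustration, not the paper's implementation: the linear `bottom_model`, the dict-based channel, and the noise scale `sigma` are all assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)

def bottom_model(x, theta):
    # Toy linear bottom model: embedding h = x @ theta.
    return x @ theta

def publish_embedding(channel, batch_id, x, theta, sigma):
    h = bottom_model(x, theta)                    # forward pass on local features
    noise = rng.normal(0.0, sigma, size=h.shape)  # Gaussian DP perturbation
    channel[batch_id] = h + noise                 # publish keyed by batch identifier
    return channel[batch_id]

channel = {}
x = rng.standard_normal((4, 8))      # minibatch: 4 samples, 8 local features
theta = rng.standard_normal((8, 3))  # bottom-model weights -> 3-dim embedding
h_tilde = publish_embedding(channel, batch_id=0, x=x, theta=theta, sigma=0.1)
```

The dict stands in for the FIFO Pub/Sub channel; a real deployment would replace it with the broker client.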
2. Hierarchical Asynchronous Update Logic
VL-PUB supports a two-level asynchronous paradigm:
- Pub/Sub Asynchrony (Cross-Party): Embedding and gradient exchanges between publisher and subscriber occur via decoupled, FIFO Pub/Sub channels.
- Semi-Asynchronous PS Updates (Within-Party): The local PS aggregates worker parameter updates every $\tau$ steps, broadcasting the average to all party workers.
The VL-PUB worker loop interleaves these stages: sample and embed a minibatch, perturb and publish, pull the matching embedding gradient, backpropagate, and periodically synchronize with the local PS.
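A minimal single-process sketch of such a worker loop, assuming a linear bottom model and a subscriber simulated inline by a quadratic stand-in loss (all names, constants, and the averaging rule are illustrative, not the paper's algorithm):

```python
import numpy as np

rng = np.random.default_rng(1)
theta = rng.standard_normal((8, 3)) * 0.1  # bottom-model weights
ps_copy = theta.copy()                     # local Parameter Server state
tau, eta, sigma = 4, 0.05, 0.01            # PS-sync interval, step size, DP noise std

for step in range(12):
    x = rng.standard_normal((16, 8))               # 1) sample local minibatch
    h = x @ theta                                  # 2) bottom-model forward
    h_tilde = h + rng.normal(0.0, sigma, h.shape)  # 3) Gaussian DP perturbation
    # 4) publish h_tilde; the subscriber's top model is simulated here by the
    #    loss 0.5*||h_tilde||^2, whose gradient w.r.t. h_tilde is h_tilde itself.
    grad_h = h_tilde                               # 5) pulled embedding gradient
    grad_theta = x.T @ grad_h / len(x)             # 6) backprop through bottom model
    theta = theta - eta * grad_theta               # 7) local SGD update
    if (step + 1) % tau == 0:                      # 8) semi-async PS aggregation:
        ps_copy = 0.5 * (ps_copy + theta)          #    average with the PS copy...
        theta = ps_copy.copy()                     #    ...and pull back fresh params
```

In the real system steps 4-5 cross the Pub/Sub broker asynchronously, so the pulled gradient may be up to $\tau_{\max}$ steps stale.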
Update Equations:
$$\theta_{t+1} = \theta_t - \eta \, g\big(\theta_{t-\tau_t};\, \tilde{h}\big)$$
where $\tau_t \le \tau_{\max}$ (maximum staleness) and $\xi \sim \mathcal{N}(0, \sigma^2 I)$ is the DP noise carried by the perturbed embedding $\tilde{h}$.
Staleness Bound: every gradient applied at step $t$ was computed from parameters at most $\tau_{\max}$ steps old, i.e., $\tau_t \le \tau_{\max}$ for all $t$.
3. Heterogeneity-Aware Optimization Problem
VL-PUB is engineered to adaptively optimize performance under resource and data heterogeneity by formalizing and solving a discrete minimax latency problem:
Objective:
$$\min \; \max\big(T_{\mathrm{pub}},\, T_{\mathrm{sub}}\big)$$
subject to DP privacy, memory, and resource constraints.
- $T_{\mathrm{sub}}$: end-to-end per-batch time for the subscriber; includes forward and backward passes, the top model, and gradient communication.
- $T_{\mathrm{pub}}$: end-to-end per-batch time for the publisher; includes forward and backward passes and embedding communication.
Subcomponent Latencies:
$$T_{\mathrm{pub}} = T_{\mathrm{fwd}} + T_{\mathrm{bwd}} + T_{\mathrm{comm}}$$
(similar expressions for the active side).
Communication times: payload size over link bandwidth, e.g., $T_{\mathrm{comm}} = B d \, b / W$ for batch size $B$, embedding dimension $d$, $b$ bytes per float, and bandwidth $W$.
Privacy Constraint:
$$\sigma \ge \frac{\Delta}{\mu}$$
for $\mu$-GDP, where $\Delta$ is the (clipped) sensitivity of the published embedding.
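The Gaussian mechanism underlying this constraint can be sketched as follows; clipping each embedding row to bound sensitivity is a standard recipe, but the function names and the per-row clipping convention are illustrative assumptions, not the paper's exact calibration (which the text attributes to its Eq. (25)).

```python
import numpy as np

def gdp_sigma(sensitivity, mu):
    # Gaussian mechanism: noise std sigma = Delta / mu satisfies mu-GDP.
    return sensitivity / mu

def clip_and_perturb(h, clip_norm, mu, rng):
    # Clip each embedding row to L2 norm <= clip_norm so the sensitivity
    # of the published tensor is bounded, then add calibrated noise.
    norms = np.linalg.norm(h, axis=1, keepdims=True)
    h_clipped = h * np.minimum(1.0, clip_norm / np.maximum(norms, 1e-12))
    sigma = gdp_sigma(clip_norm, mu)
    return h_clipped + rng.normal(0.0, sigma, size=h.shape)

rng = np.random.default_rng(2)
h = rng.standard_normal((5, 4)) * 10.0       # raw embeddings, large norms
h_priv = clip_and_perturb(h, clip_norm=1.0, mu=0.5, rng=rng)
```

A smaller privacy parameter $\mu$ forces a larger $\sigma$, which is exactly the variance-floor trade-off in the convergence analysis below.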
Dynamic Programming Solution: the discrete minimax problem is solved by a dynamic program that enumerates feasible configurations (e.g., batch sizes and per-party parallelism) and selects the one minimizing the larger of $T_{\mathrm{pub}}$ and $T_{\mathrm{sub}}$.
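For small configuration spaces the selection reduces to exhaustive enumeration, sketched below; the latency models `t_pub`/`t_sub` and the candidate grids are toy assumptions standing in for the profiled latencies.

```python
def best_config(batch_sizes, worker_counts, t_pub, t_sub):
    """Enumerate (B, n) pairs; pick the one minimizing max(publisher, subscriber) latency."""
    best = None
    for B in batch_sizes:
        for n in worker_counts:
            cost = max(t_pub(B, n), t_sub(B, n))
            if best is None or cost < best[0]:
                best = (cost, B, n)
    return best

# Toy latency models: the publisher is faster per sample but has higher fixed overhead.
t_pub = lambda B, n: B / (10 * n) + 0.5
t_sub = lambda B, n: B / (5 * n) + 0.2
best = best_config([32, 64], [1, 2, 4], t_pub, t_sub)  # picks B=32, n=4 here
```

The paper's dynamic program avoids full enumeration, but the objective being minimized is the same max-latency expression.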
4. Convergence and Privacy Guarantees
Convergence of the VL-PUB module (and thus the overall PubSub-VFL framework) is rigorously characterized:
Theorem (5.1 (Liu et al., 14 Oct 2025), under standard convexity, smoothness, bounded variance, staleness, DP):
$$\mathbb{E}\big[F(\theta_T) - F(\theta^\star)\big] \;\le\; (1 - \eta \mu_F)^{T}\,\big(F(\theta_0) - F(\theta^\star)\big) \;+\; \mathcal{O}\!\big(\eta\,(\sigma_{\mathrm{sgd}}^2 + \sigma_{\mathrm{dp}}^2)\big)$$
Thus, the process achieves linear convergence up to a variance floor determined by the sum of the SGD and DP noise variances.
DP compatibility follows from the independence and zero-mean property of the injected GDP noise; the only consequence is a higher asymptotic variance. Convergence and privacy results remain intact for a sufficiently small stepsize $\eta$, even under bounded staleness.
5. Empirical Acceleration and Resource Utilization
VL-PUB yields substantial practical speedup and efficient hardware utilization:
| Method | Time (s) | Speedup | CPU Util (%) |
|---|---|---|---|
| AVFL-PS | 885.01 | 1.0× | 76.2 |
| PubSub-VFL | 124.01 | 7.14× | 89.97 |
Across five benchmark datasets, PubSub-VFL (with VL-PUB) achieved a 2–7× speedup and CPU utilization up to 91.07%, with comparable or improved test accuracy (Liu et al., 14 Oct 2025).
Per-batch computational costs at VL-PUB:
- Forward: one bottom-model pass over the minibatch.
- Backward: similar cost to the forward pass.
- Communication: transmit/receive $B \times d$ floats (embedding out, embedding gradient in).
- PS aggregation: one parameter average every $\tau$ steps.
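The communication cost is easy to size in concrete terms; the batch size and embedding dimension below are assumed illustrative values, not figures from the paper.

```python
# Per-batch payload for one embedding tensor (same size again for its gradient).
B, d = 1024, 128                           # assumed batch size and embedding dim
bytes_per_float = 4                        # float32
payload_bytes = B * d * bytes_per_float    # floats transmitted per batch
payload_kib = payload_bytes / 1024         # convert to KiB
```

At these settings each direction carries 512 KiB per batch, which is what makes the comm terms in the latency model non-negligible on constrained links.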
6. Implementation and Integration
Basic Integration:
- VL-PUB functions as an asynchronous publisher within any VFL codebase.
- Each worker requires:
- Pub/Sub client for the embedding and gradient channels
- Small FIFO buffer for in-flight batches
- Waiting deadline for pulling matching gradients
- DP noise generator ($\sigma$ calibrated via Eq. (25))
- Semi-async PS-sync interval $\tau$
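The per-worker requirements above map naturally onto a small configuration object; every field name and default below is an illustrative assumption, not part of the published interface.

```python
from dataclasses import dataclass

@dataclass
class PubWorkerConfig:
    """Illustrative knobs mirroring the worker-requirements list (names assumed)."""
    embed_channel: str = "emb"      # Pub/Sub embedding channel
    grad_channel: str = "grad"      # Pub/Sub gradient channel
    fifo_capacity: int = 8          # small in-flight batch buffer
    wait_deadline_s: float = 5.0    # give up on a missing gradient after this
    dp_sigma: float = 0.1           # to be calibrated via the paper's Eq. (25)
    ps_sync_interval: int = 16      # semi-async PS aggregation period (steps)

cfg = PubWorkerConfig(ps_sync_interval=32)  # override knobs per deployment
```

Keeping these knobs in one place makes it straightforward to re-run the latency solver and reconfigure workers when heterogeneity shifts.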
Hyperparameter Selection:
- Run a synchronous profiling round to measure per-component compute and communication latencies.
- Solve the minimax latency problem (or use the provided dynamic-programming routine) for the optimal configuration; fall back to reasonable defaults otherwise.
- Monitor channel latencies; adjust the PS-sync interval $\tau$ or buffer sizes as needed.
- Re-solve the latency program if data imbalance emerges.
VL-PUB thereby implements lightweight yet robust asynchronous publication, gradient consumption, and parameter updates with bounded staleness, strict privacy via GDP, and explicit resource-heterogeneity adaptation. It achieves notable end-to-end acceleration (2–7×) and near-optimal hardware efficiency (up to 91.07% CPU utilization) in empirical testing (Liu et al., 14 Oct 2025).