Generative Semantic Communication via Alternating Dual-Domain Posterior Sampling

Published 18 Apr 2026 in cs.CV, cs.IT, and eess.SP | (2604.16796v1)

Abstract: Generative semantic communication (SemCom) harnesses pretrained generative priors to improve the perceptual quality of wireless image transmission. Existing generative SemCom receivers, however, rely on maximum a posteriori (MAP) estimation, which fundamentally cannot preserve the data distribution and thus limits achievable perceptual quality. Moreover, current diffusion-based approaches using single-domain guidance face significant limitations: latent-domain guidance is sensitive to channel noise, while image-domain guidance inherits decoder bias. Simply combining both domains simultaneously yields an overconfident pseudo-posterior. In this paper, we formulate semantic decoding as a Bayesian inverse problem and prove that posterior sampling achieves optimal perceptual quality by preserving the data distribution. Building on this insight, we propose alternating dual-domain posterior sampling (ADDPS), a diffusion-based SemCom receiver that alternately enforces latent-domain and image-domain consistency during the sampling process. This alternating strategy decomposes joint posterior sampling into simpler subproblems, avoiding gradient conflicts while retaining the complementary strengths of both domains. Experiments on FFHQ demonstrate that the proposed ADDPS achieves superior perceptual quality compared with existing methods.


Authors (2)

Summary

The paper introduces ADDPS, a novel method that alternates between latent and image domain guidance to preserve the original data distribution in semantic communications.
It reformulates image reconstruction as a Bayesian inverse problem to overcome limitations of traditional MAP-based approaches.
Empirical results on the FFHQ dataset show that ADDPS outperforms prior methods under severe compression and low SNR, achieving lower FID and improved perceptual metrics.

Introduction

The paper "Generative Semantic Communication via Alternating Dual-Domain Posterior Sampling" (2604.16796) addresses a persistent limitation of wireless image semantic communication (SemCom): preserving both semantic fidelity and perceptual quality under extreme channel conditions. Existing generative SemCom receivers, which rely on MAP-based estimators or on diffusion guidance in a single domain, fail to preserve the underlying data distribution and degrade considerably under severe bandwidth compression and low SNR. This work reformulates decoding as a Bayesian inverse problem and introduces Alternating Dual-Domain Posterior Sampling (ADDPS), a theoretically motivated, diffusion-based method that alternates consistency constraints between the latent and image domains during reverse diffusion. The alternating scheme decomposes the intractable joint posterior sampling problem into simpler subproblems and empirically outperforms the prior state of the art under severe compression and low SNR.

Background and Problem Formalization

Semantic communication uses neural-network-based encoder-decoder pairs to transmit only salient semantic information over noisy channels, in contrast to conventional bit-oriented approaches. For wireless image transmission, the encoder maps the input $\bm{x}$ to a complex-valued channel symbol $\bm{z}$ , which is perturbed by additive white Gaussian noise (AWGN), yielding the receiver input $\hat{\bm{z}}$ . Reconstruction at the receiver is increasingly handled by generative models, notably score-based generative diffusion models (GDMs), which supply the prior needed for high perceptual quality.
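The channel model above can be sketched in a few lines. This is a minimal illustration, assuming unit-free complex symbols and an SNR defined as average signal power over total complex noise power; the function names `noise_variance` and `awgn` are hypothetical and do not come from the paper.

```python
import math
import random

def noise_variance(snr_db, signal_power=1.0):
    """Total complex noise variance for a given SNR in dB (assumed definition:
    SNR = signal_power / noise_variance)."""
    return signal_power / (10.0 ** (snr_db / 10.0))

def awgn(symbols, snr_db, seed=0):
    """Add circularly symmetric complex Gaussian noise to a list of symbols."""
    rng = random.Random(seed)
    power = sum(abs(s) ** 2 for s in symbols) / len(symbols)
    # Split the total noise variance evenly between real and imaginary parts.
    std = math.sqrt(noise_variance(snr_db, power) / 2.0)
    return [s + complex(rng.gauss(0.0, std), rng.gauss(0.0, std)) for s in symbols]

z = [1 + 0j, 0 + 1j, -1 + 0j, 0 - 1j]   # toy channel symbols
z_hat = awgn(z, snr_db=1.0)             # noisy receiver input
```

At SNR = 1 dB and unit signal power, the noise variance is about 0.79, so the noise is almost as strong as the signal itself.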

The Bayesian inverse problem formulation reveals that most existing works pursue only a MAP solution:

$\hat{\bm{x}}_{\mathrm{MAP}} = \arg\max_{\bm{x}}\ p(\bm{x}|\hat{\bm{z}})$

As the paper shows rigorously, MAP estimates cannot, in general, preserve the data distribution: the distribution of MAP outputs contracts relative to the original (see Propositions 1 and 2 in the paper). In contrast, posterior sampling draws $\hat{\bm{x}} \sim p(\bm{x}|\hat{\bm{z}})$ , so the reconstructions follow the data distribution and achieve optimal perceptual quality, provided the generative prior is well matched.
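The contraction effect can be seen in a toy scalar Gaussian model. The sketch below is an illustrative assumption, not the paper's experiment: the source is $x \sim \mathcal{N}(0,1)$, the observation is $z = x + n$ with Gaussian noise, and the posterior is then Gaussian with a known mean and variance, so the MAP estimate coincides with the posterior mean. The function name `map_vs_sampling` is hypothetical.

```python
import random
import statistics

def map_vs_sampling(noise_var=1.0, n=20000, seed=0):
    """Return (variance of MAP outputs, variance of posterior samples)."""
    rng = random.Random(seed)
    shrink = 1.0 / (1.0 + noise_var)          # posterior mean is shrink * z
    post_var = noise_var / (1.0 + noise_var)  # posterior variance
    map_out, sample_out = [], []
    for _ in range(n):
        x = rng.gauss(0.0, 1.0)                    # source with unit variance
        z = x + rng.gauss(0.0, noise_var ** 0.5)   # noisy observation
        mean = shrink * z
        map_out.append(mean)                       # MAP = posterior mean here
        sample_out.append(mean + rng.gauss(0.0, post_var ** 0.5))
    return statistics.pvariance(map_out), statistics.pvariance(sample_out)

map_var, sample_var = map_vs_sampling()
```

With `noise_var = 1.0`, the MAP outputs have variance close to 0.5, half that of the source, while the posterior samples recover a variance close to 1. In this toy setting, only posterior sampling reproduces the source distribution.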

Dual-Domain Posterior Sampling

Diffusion models, trained as generative priors, enable posterior sampling via score-based SDEs. In SemCom, the posterior score decomposes, by Bayes' rule, into a prior score and a likelihood score:

$\nabla_{\bm{x}_t} \log p_t(\bm{x}_t|\hat{\bm{z}}) = \nabla_{\bm{x}_t} \log p_t(\bm{x}_t) + \nabla_{\bm{x}_t} \log p_t(\hat{\bm{z}}|\bm{x}_t)$

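where the first term is supplied by the pretrained diffusion model and the second term, the likelihood score, is obtained as the negative gradient of the negative log-likelihood that relates the received latent channel signal to the reconstructed image.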
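For completeness, the decomposition follows from Bayes' rule in three lines. The only assumption is the standard one that the evidence $p(\hat{\bm{z}})$ does not depend on $\bm{x}_t$ , so its gradient vanishes.

```latex
\begin{align}
p_t(\bm{x}_t \mid \hat{\bm{z}})
  &= \frac{p_t(\bm{x}_t)\, p_t(\hat{\bm{z}} \mid \bm{x}_t)}{p(\hat{\bm{z}})} \\
\log p_t(\bm{x}_t \mid \hat{\bm{z}})
  &= \log p_t(\bm{x}_t) + \log p_t(\hat{\bm{z}} \mid \bm{x}_t) - \log p(\hat{\bm{z}}) \\
\nabla_{\bm{x}_t} \log p_t(\bm{x}_t \mid \hat{\bm{z}})
  &= \nabla_{\bm{x}_t} \log p_t(\bm{x}_t)
   + \nabla_{\bm{x}_t} \log p_t(\hat{\bm{z}} \mid \bm{x}_t)
\end{align}
```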

However, guidance in the latent (Z) domain is prone to noise amplification at low SNR, while purely image (X) domain guidance is robust but discards channel-specific semantic cues and inherits decoder bias, ultimately reducing diversity and realism. Combining both gradients simultaneously (as in HiFi-DiffCom) yields a pseudo-posterior that double-counts correlated evidence, resulting in overconfident and sometimes degraded reconstructions. The paper also demonstrates numerically that such combined guidance is ill-posed.
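The overconfidence caused by double-counting can be illustrated with a scalar Gaussian example. The sketch assumes an idealized case in which the image-domain evidence carries exactly the same information as $\hat{\bm{z}}$ , so treating the two as independent amounts to counting one observation twice. The function name `posterior_variance` is hypothetical.

```python
def posterior_variance(prior_var, noise_var, n_obs):
    """Posterior variance of a Gaussian mean when n_obs observations with the
    same noise variance are treated as independent."""
    return 1.0 / (1.0 / prior_var + n_obs / noise_var)

true_var = posterior_variance(prior_var=1.0, noise_var=1.0, n_obs=1)
pseudo_var = posterior_variance(prior_var=1.0, noise_var=1.0, n_obs=2)
```

Here `true_var` is 1/2 while `pseudo_var` is 1/3: the pseudo-posterior is narrower than the evidence justifies, which is the overconfidence described above.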

Figure 1: Illustration of the ADDPS method, showing the alternating scheduling of reverse diffusion updates with Z-domain and X-domain guidance to leverage complementary strengths and avoid noise amplification.

Alternating Dual-Domain Posterior Sampling (ADDPS)

ADDPS alternates between enforcing consistency with the received latent signal (Z-domain) and the decoder-side image (X-domain) on consecutive diffusion time-steps. At each diffusion stage $t$ , the posterior score gradient is approximated as:

Z-guidance (even $t$ ): $\nabla_{\bm{x}_t} \log p_t(\bm{x}_t|\hat{\bm{z}}) \approx \bm{s}_\omega(\bm{x}_t,t) - \zeta_t \nabla_{\bm{x}_t} \|\hat{\bm{z}}-\mathcal{E}_\theta(\hat{\bm{x}}_{0,t})\|^2$
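X-guidance (odd $t$ ): $\nabla_{\bm{x}_t} \log p_t(\bm{x}_t|\hat{\bm{z}}) \approx \bm{s}_\omega(\bm{x}_t,t) - \xi_t \nabla_{\bm{x}_t} \|\mathcal{D}_\phi(\hat{\bm{z}})-\hat{\bm{x}}_{0,t}\|^2$

Here $\mathcal{D}_\phi(\hat{\bm{z}})$ denotes the decoder-side image and $\xi_t$ the X-domain guidance weight; the source rendering of this expression is damaged, so both symbols are assumed by analogy with the Z-guidance term.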

Alternation prevents direct aggregation of statistically dependent evidence, mitigates noise amplification and gradient conflict, and leverages the complementary strengths of Z-domain fidelity and X-domain robustness. The result is a theoretically principled and practically robust scheme for posterior sampling in generative SemCom systems.
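The alternating schedule can be sketched on a scalar toy problem. Everything below is an assumption made for illustration rather than the authors' implementation: the data prior is $\mathcal{N}(0,1)$ so the score is analytic, the "encoder" is the linear map $x \mapsto a x$ , the "decoder" is its inverse, the guidance weights `zeta` and `xi` are constants, and the sampler is a plain Euler-Maruyama discretization of a variance-exploding reverse SDE. All function names are hypothetical.

```python
import math
import random

def prior_score(x_t, sigma):
    """Score of p_t = N(0, 1 + sigma^2): a toy N(0, 1) prior blurred by noise sigma."""
    return -x_t / (1.0 + sigma ** 2)

def denoise(x_t, sigma):
    """Tweedie estimate of the clean sample: x0_hat = x_t + sigma^2 * score."""
    return x_t + sigma ** 2 * prior_score(x_t, sigma)

def guided_score(x_t, sigma, step, z_hat, a, zeta, xi):
    """Prior score plus ONE guidance term, chosen by the parity of the step."""
    x0_hat = denoise(x_t, sigma)
    jac = 1.0 / (1.0 + sigma ** 2)  # d x0_hat / d x_t for this toy prior
    if step % 2 == 0:
        # Z-guidance: gradient of (z_hat - E(x0_hat))^2 with toy encoder E(x) = a * x
        grad = -2.0 * a * (z_hat - a * x0_hat) * jac
        return prior_score(x_t, sigma) - zeta * grad, "Z"
    # X-guidance: gradient of (D(z_hat) - x0_hat)^2 with toy decoder D(z) = z / a
    grad = -2.0 * (z_hat / a - x0_hat) * jac
    return prior_score(x_t, sigma) - xi * grad, "X"

def addps_toy(z_hat, a=1.0, zeta=0.5, xi=0.5, n_steps=50,
              sigma_max=10.0, sigma_min=0.01, seed=0):
    """Reverse diffusion with alternating Z/X guidance; returns (sample, schedule)."""
    rng = random.Random(seed)
    # Geometric noise schedule from sigma_max down to sigma_min.
    sigmas = [sigma_max * (sigma_min / sigma_max) ** (i / (n_steps - 1))
              for i in range(n_steps)]
    x = rng.gauss(0.0, math.sqrt(1.0 + sigma_max ** 2))  # start from p_T
    schedule = []
    for step in range(n_steps - 1):
        s_cur, s_next = sigmas[step], sigmas[step + 1]
        score, kind = guided_score(x, s_cur, step, z_hat, a, zeta, xi)
        dvar = s_cur ** 2 - s_next ** 2  # variance removed in this step
        x = x + dvar * score + math.sqrt(dvar) * rng.gauss(0.0, 1.0)
        schedule.append(kind)
    return x, schedule

sample, schedule = addps_toy(z_hat=0.8, seed=1)
```

The point of the sketch is the structure of `guided_score`: each step adds exactly one likelihood gradient to the prior score, so the two correlated sources of evidence are never summed within a single update. In a real receiver the analytic score would be replaced by the pretrained network $\bm{s}_\omega$ and the gradients would be obtained by automatic differentiation.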

Experimental Evaluation

Empirical results are reported on the FFHQ dataset under severe bandwidth compression (BCR = 1/192) and SNR = 1 dB. The comparison spans classical DeepJSCC, GAN-inversion baselines, and HiFi-DiffCom with single- and dual-guidance. Principal findings are as follows:

ADDPS achieves the lowest FID (56.94) and PIEAPP (1.293), and near-best DISTS/LPIPS, demonstrating strong distributional fidelity under extremely challenging noise and compression conditions.
DeepJSCC and DeepJSCC-LPIPS attain higher PSNR/MS-SSIM but fail in FID, indicating over-smoothing and poor perceptual realism.
Ablation studies illustrate that:
- Z-guidance alone is unstable and underperforms.
- X-guidance alone is robust but loses diversity and suffers decoder bias.
- Simultaneous guidance leads to worse FID versus alternation.
- ADDPS (alternating) yields the best perceptual scores with only marginal PSNR reduction.

These results support the claim that alternating between the two domains, rather than using either one alone or both simultaneously, gives the most robust posterior sampling and the best perceptual quality in the high-compression, low-SNR regime.
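To give a sense of how severe BCR = 1/192 is, the arithmetic below converts a bandwidth compression ratio into a number of channel symbols. It assumes the definition commonly used in the DeepJSCC literature, BCR = k / n with k complex channel symbols and n source values (height x width x colour channels). The 256 x 256 RGB image is purely hypothetical and chosen only to make the numbers concrete; the function name `channel_symbols` is also hypothetical.

```python
from fractions import Fraction

def channel_symbols(height, width, channels, bcr):
    """Number of complex channel symbols k = n * BCR for an image with
    n = height * width * channels source values (assumed BCR definition)."""
    n = height * width * channels
    k = n * bcr
    if k.denominator != 1:
        raise ValueError("BCR does not give an integer number of symbols")
    return int(k)

# Hypothetical 256 x 256 RGB image, not a figure taken from the paper.
k = channel_symbols(256, 256, 3, Fraction(1, 192))
```

Under these assumptions, 196,608 pixel values are represented by 1,024 complex symbols, each of which is then corrupted by noise nearly as strong as the signal at 1 dB SNR.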

Practical and Theoretical Implications

ADDPS offers a structural template for generative SemCom receivers that integrates Bayesian inference, deep generative priors, and information-theoretic constraints. On the theoretical side, the paper proves that posterior sampling preserves the data distribution, which removes the perceptual bottleneck of MAP-centric approaches.

Practically, ADDPS operates as a plug-in receiver compatible with fixed encoders/decoders and pretrained generative priors, without necessitating retraining, making it attractive for deployment in edge-intelligence and bandwidth-limited wireless scenarios. Its alternating optimization architecture is extensible to other inverse problems, especially where multi-view or multi-domain constraints are present.

Future Directions

Open research trajectories stemming from this work include:

Learning dynamic or adaptive alternation schedules, possibly integrating uncertainty estimation or channel-aware gating.
Generalizing to non-AWGN channels, MIMO settings, sequential/multimodal semantic sources, and joint source-channel coding with closed-loop feedback.
Extending alternating consensus principles to other inverse problems involving multi-view data or multimodal generative priors.
Practical system integration and protocol design for real-world semantic wireless networks leveraging generative AI.

Conclusion

This paper provides a comprehensive, mathematically principled and empirically validated method for semantic image communication under adverse channel conditions. By formalizing the generative SemCom receiver as a Bayesian inverse problem and developing the ADDPS framework for dual-domain posterior sampling, the authors demonstrate compelling perceptual gains and delineate a future path for both the theory and practice of semantic-aware communication system design.
