Rejection Sampling is Optimal for Relative Entropy Coding

Published 25 Apr 2026 in cs.IT | (2604.23076v1)

Abstract: In relative entropy coding, a sender aims to design a stochastic code such that, on input $X \sim P_X$, the receiver can generate a sample $Y \sim P_{Y \mid X}$. It is a standard result that (1) this requires at least $I(X; Y)$ bits, (2) the lower bound is achievable within a logarithmic gap, and (3) this gap cannot be reduced in general. The necessity of the gap suggests that the mutual information is not the correct information measure to quantify the rate of relative entropy coding. A potential alternative emerged in the work of Flamich et al. (2025), who proved a tighter lower bound of $I_F(X \to Y)$, a quantity we call the functional information. In this paper, we show that this lower bound is tight by constructing the ring toss code, an encoding method for rejection sampling which uses at most $I_F(X \to Y) + \log e$ bits. This demonstrates that rejection sampling is optimal for relative entropy coding. Our result implies that the classical mutual information lower bound is achievable within $\log(I(X; Y) + 1) + 2.45$ bits in general and within $1.45$ bits for singular channels, which are both the tightest bounds of their kind to date. Moreover, our one-shot result also recovers Sriramu and Wagner's asymptotic results on the second-order redundancy of relative entropy codes.

Summary

  • The paper introduces the ring toss code, which adapts rejection sampling to reach the lower bound in one-shot relative entropy coding.
  • It proves that, under a bounded density ratio condition, the expected codeword length is at most $F(X;Y) + \log e$ bits, matching the functional information lower bound up to an additive constant.
  • The work establishes both non-asymptotic and asymptotic tight redundancy bounds, highlighting significant implications for stochastic channel simulation and data compression.

Optimality of Rejection Sampling for Relative Entropy Coding

Overview

This paper resolves a longstanding question in information theory about the simulation of stochastic channels: the fundamental rate limit of relative entropy coding. It shows that classical rejection sampling, when coupled with the authors' “ring toss code,” is optimal for one-shot (single-use) channel simulation with respect to a newly characterized measure, the “functional information.” Key contributions include closing the gap between the functional information lower bound and what is achievable, characterizing when the redundancy over the mutual information is constant, and providing the tightest non-asymptotic and asymptotic bounds to date.

Relative Entropy Coding and Rate Measures

Relative entropy coding addresses the problem of simulating a conditional distribution $P_{Y|X}$: given a source symbol $X \sim P_X$, the sender transmits a codeword from which the receiver, possibly using shared randomness, generates a sample $Y \sim P_{Y|X}$. The classical lower bound on the expected codeword length is the mutual information $I(X;Y)$. Poisson functional representations achieve this bound up to an additive logarithmic term, and prior work has shown that this gap is unavoidable for general channels.
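As a concrete reference point for the classical bound, the mutual information of a finite channel can be computed directly. The following is a minimal sketch, assuming a discrete channel given as a row-stochastic matrix and logarithms in base 2; the function name `mutual_information_bits` and the binary erasure channel parameters are illustrative choices, not taken from the paper.

```python
import math

def mutual_information_bits(p_x, channel):
    """I(X;Y) in bits for input distribution p_x and channel[i][j] = P(Y=j | X=i)."""
    n_out = len(channel[0])
    # Output marginal P_Y(j) = sum_i P_X(i) * P(Y=j | X=i).
    p_y = [sum(p_x[i] * channel[i][j] for i in range(len(p_x))) for j in range(n_out)]
    total = 0.0
    for i, px in enumerate(p_x):
        for j, w in enumerate(channel[i]):
            if px > 0 and w > 0:
                # Each term is P(x, y) * log2 of the density ratio P(y|x) / P(y).
                total += px * w * math.log2(w / p_y[j])
    return total

# Binary erasure channel with erasure probability eps; outputs are (0, erased, 1).
eps = 0.5
bec = [[1 - eps, eps, 0.0],
       [0.0, eps, 1 - eps]]
print(mutual_information_bits([0.5, 0.5], bec))
```

For the binary erasure channel with a uniform input, the result should agree with the textbook value $1 - \epsilon$ bits.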

Flamich et al. [flamich2025redundancy] proved a refined lower bound, here termed the functional information $F(X;Y)$ (written $I_F(X \to Y)$ in the abstract), based on the channel simulation divergence. It is always at least as tight as the mutual information lower bound, but was previously known to be achievable only in special cases.

Ring Toss Code and Main Result

The authors introduce the ring toss code, an encoding method adapted from rejection sampling, which uses an alternative encoding of the sample index to achieve an expected rate of at most $F(X;Y) + \log e$ bits for all channels satisfying a bounded density ratio condition. This result establishes that rejection sampling is rate-optimal for relative entropy coding, in the sense that it matches the lower bound on expected codeword length up to an additive constant.

The central theorem states that:

  • For random variables $(X, Y) \sim P_{X,Y}$ and proposal distribution $P_Y$, with $dP_{Y|X}/dP_Y$ bounded by $M$, the ring toss code achieves an expected codeword length of at most $F(X;Y) + \log e$ bits, given common randomness shared by the sender and receiver.
  • For singular channels (where $dP_{Y|X}/dP_Y$, wherever it is nonzero, is a function only of $Y$ almost everywhere), $F(X;Y) = I(X;Y)$, so constant redundancy is achieved over the mutual information bound.

Notably, the ring toss code does not rely on Poisson or greedy Poisson representations, but rather directly encodes the conditional distribution derived from the shared randomness and proposal samples.
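To make the setting concrete, the sketch below shows plain rejection sampling with shared randomness on a finite alphabet. The sender and receiver draw the same proposal sequence from $P_Y$ using a common seed, the sender finds the first accepted proposal, and the receiver reproduces the sample from its index. This is only the baseline procedure whose sample index has to be encoded; it is not the ring toss code, whose index encoding is not detailed here. The names `encode` and `decode`, the seed-based shared randomness, and the example distributions are assumptions made for illustration.

```python
import random

def encode(p_target, p_proposal, bound_m, seed):
    """Sender: return (index, sample) of the first accepted proposal.

    Requires p_target[y] <= bound_m * p_proposal[y] for every symbol y.
    """
    rng = random.Random(seed)  # stands in for the shared randomness
    symbols = list(p_proposal)
    weights = [p_proposal[s] for s in symbols]
    k = 0
    while True:
        k += 1
        y = rng.choices(symbols, weights)[0]
        u = rng.random()
        # Accept with probability p_target(y) / (bound_m * p_proposal(y)).
        if u * bound_m * p_proposal[y] < p_target.get(y, 0.0):
            return k, y

def decode(p_proposal, k, seed):
    """Receiver: replay the shared proposal sequence and return the k-th proposal."""
    rng = random.Random(seed)
    symbols = list(p_proposal)
    weights = [p_proposal[s] for s in symbols]
    y = None
    for _ in range(k):
        y = rng.choices(symbols, weights)[0]
        rng.random()  # consume the acceptance uniform to stay in sync with the sender
    return y

target = {"a": 0.7, "b": 0.3}      # plays the role of P_{Y|X=x}
proposal = {"a": 0.5, "b": 0.5}    # plays the role of P_Y
index, sample = encode(target, proposal, bound_m=1.4, seed=7)
assert decode(proposal, index, seed=7) == sample
```

In this baseline each proposal is accepted with probability $1/M$, so the transmitted index is geometrically distributed; the cost of describing that index is the quantity a code built on rejection sampling has to control.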

Implications and Tight Numerical Bounds

The established rate immediately recovers and strengthens previously known one-shot and asymptotic results:

  • The gap above the classical mutual information lower bound is tightened to $\log(I(X;Y) + 1) + 2.45$ bits for general channels, and to $1.45$ bits (i.e., $\log e$ rounded up) for singular channels.
  • For blocklength $n$, the asymptotic redundancy is characterized, with the coefficient of the logarithmic term being $0$ for singular channels and $1/2$ for nonsingular channels.

These provide the tightest known achievable bounds for both one-shot and asymptotic regimes and refine the connection between entropy coding rates, mutual information, and the newly formalized functional information.
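The one-shot bounds quoted from the abstract are easy to evaluate numerically. The following is a small sketch, assuming all logarithms are base 2 so that rates are in bits; the function names are hypothetical, and the constants $2.45$ and $1.45$ are the ones stated in the abstract.

```python
import math

LOG2_E = math.log2(math.e)  # about 1.4427, which the abstract rounds up to 1.45

def general_upper_bound_bits(mi_bits):
    """Upper bound I + log2(I + 1) + 2.45 on expected length, for general channels."""
    return mi_bits + math.log2(mi_bits + 1.0) + 2.45

def singular_upper_bound_bits(mi_bits):
    """Upper bound I + 1.45 on expected length, for singular channels."""
    return mi_bits + 1.45

for mi in (0.5, 1.0, 8.0, 64.0):
    print(mi, general_upper_bound_bits(mi), singular_upper_bound_bits(mi))
```

The overhead of the general bound grows only logarithmically in $I(X;Y)$, while the singular-channel overhead stays constant.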

Structural and Theoretical Insights

The paper identifies the channel simulation divergence (also known as the functional information divergence) as a fundamental quantity for measuring the cost of simulating stochastic channels in the presence of side or shared randomness. The width function, which underpins this divergence, links the geometry of probability densities with achievable code rates.

For singular channels, which include classical examples such as the binary erasure and continuous uniform additive noise channels, the ring toss code demonstrates constant redundancy. Lemma III.1 provides a full characterization of when redundancy vanishes ($F(X;Y) = I(X;Y)$) in terms of channel singularity.
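For finite channels, singularity can be checked mechanically. The sketch below assumes the usual discrete reading of the definition: for each output $y$, the transition probability $P(y \mid x)$ takes a single value over all inputs $x$ for which it is positive, so that the density ratio depends only on $y$ on its support. The function name `is_singular` and the tolerance are illustrative, and this criterion is an assumption rather than a statement of the paper's Lemma III.1.

```python
def is_singular(channel, tol=1e-12):
    """Check singularity of channel[i][j] = P(Y=j | X=i) under the assumed criterion."""
    n_out = len(channel[0])
    for j in range(n_out):
        # Collect the positive transition probabilities into output j.
        positives = [row[j] for row in channel if row[j] > 0]
        # All positive entries in a column must coincide.
        if positives and max(positives) - min(positives) > tol:
            return False
    return True

eps, delta = 0.3, 0.1
# Binary erasure channel; outputs are (0, erased, 1).
bec = [[1 - eps, eps, 0.0],
       [0.0, eps, 1 - eps]]
# Binary symmetric channel with crossover probability delta.
bsc = [[1 - delta, delta],
       [delta, 1 - delta]]

print(is_singular(bec), is_singular(bsc))
```

Under this criterion the binary erasure channel qualifies, in line with the example above, whereas a binary symmetric channel with crossover probability other than $0$, $1/2$, or $1$ does not.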

Importantly, the ring toss code's search-based interpretation, which changes the domain being searched, distinguishes it algorithmically from existing entropy codes and elucidates the relationship between the code structure and the underlying stochastic process.

Future Directions

Practical and theoretical extensions suggested include:

  • Removing or relaxing the boundedness condition on the density ratio $dP_{Y|X}/dP_Y$ to handle unbounded or heavy-tailed distributions.
  • Determining whether there exist broader channel families, beyond singular channels, for which the redundancy $F(X;Y) - I(X;Y)$ is sub-logarithmic or negligible.
  • Developing efficient computational methods for the exact evaluation or approximation of $F(X;Y)$ in high-dimensional or structured probabilistic settings.

Given its relation to channel simulation, lossless and lossy compression, and the strong functional representation lemma, this work lays groundwork for further advances in the optimality of stochastic codes across data compression, machine learning, and information theory.

Conclusion

This paper proves that rejection sampling, when its sample index is encoded with the ring toss code, achieves the best possible rate for relative entropy coding up to an additive constant, as measured by the functional information. The work resolves an open problem on the achievability of the functional information lower bound and delivers the tightest non-asymptotic and asymptotic bounds to date for channel simulation redundancy (2604.23076).
