
Gaussian Generator Contract

Updated 10 December 2025
  • Gaussian Generator Contract is a finite-precision specification that guarantees generated samples exactly match the cumulative probabilities of a user-supplied Gaussian CDF.
  • It employs the Knuth-Yao scheme for optimal, unbiased coin-flip mapping, significantly reducing entropy usage compared to traditional methods.
  • The contract provides a robust API with formal guarantees on precision, overflow-avoidance, and reproducibility, making it suitable for high-performance sampling.

A Gaussian Generator Contract, as realized in the universal framework described by "Random Variate Generation with Formal Guarantees" (Saad et al., 17 Jul 2025), is an explicit, finite-precision specification and implementation of a random variate generator targeting the Gaussian distribution. It enforces that generated outputs realize exactly the cumulative probabilities prescribed by a user-supplied Gaussian CDF, with guarantees on precision, overflow-avoidance, and entropy-optimality. The contract encapsulates both mathematical formalism and a low-level software API, yielding a reproducible and auditable method for exact random sampling in floating-point, fixed-point, or posit-based systems.

1. Formal Specification of the Target Distribution

The contract is constructed upon a finite-precision CDF program defined over a binary number format $B = (n, \gamma_B, \phi_B)$, where bit-strings of length $n$ encode real values in $\mathbb{R} \cap [0, 1]$. Monotonicity and strict endpoint requirements ($F(\phi_B(0^n)) = 0$, $F(\phi_B(1^n)) = 1$) guarantee well-formedness. For the Gaussian case, $F(x)$ is realized as

$$F(x) = \frac{1}{2} \, \mathrm{erfc}\left(-\frac{x}{\sigma \sqrt{2}}\right)$$

with all operations implemented in IEEE-754 binary64, including correct clipping and NaN-handling. This discrete mapping defines a probability mass function on the set of representable floats, fully characterizing the desired generator's output law in terms of CDF differences $F(b) - F(\mathrm{pred}(b))$ for each bit-string $b$.

2. Construction of Entropy-Optimal Generator

Given a finite-precision CDF $F$, the generator algorithm (denoted "Opt") operates using unbiased coin-flips (RandBit()), mapping entropy directly onto output via the information-theoretically optimal Knuth-Yao scheme:

  • For each prefix $b$ of length $\leq n$, the binary-coded distribution $p_F(b)$ is computed as

$$p_F(b) = F(b1^{n - |b|}) - F(\mathrm{pred}(b0^{n - |b|}))$$

where $p_F(\varepsilon) = 1$ and $p_F$ decomposes as $p_F(b) = p_F(b0) + p_F(b1)$ (see Prop. 4.6).

  • The algorithm iteratively refines the output prefix, querying the leading bits of the relevant probability differences; the search proceeds down an implicit DDG (discrete distribution generating) tree with expected coin-flip cost bounded by the entropy $H(p)$ plus two bits (Theorem 2.1).
  • Fast integer operations extract bitfields corresponding to floating-point subtractions, entirely avoiding arbitrary-precision arithmetic (Prop. 4.8).
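
The prefix probabilities can be illustrated on a toy format. The sketch below assumes a hypothetical 3-bit format whose bit-strings are ordered as unsigned integers, with a made-up CDF table stored in units of 1/16; neither the table nor the name prefix_mass comes from the paper.

```c
#include <assert.h>
#include <stdint.h>

enum { N_BITS = 3 };  /* toy word length n */

/* Toy CDF in units of 1/16, indexed by the integer value of the bit-string;
 * monotone, with F(000) = 0 and F(111) = 16/16 as the contract requires. */
static const uint32_t TOY_CDF[1u << N_BITS] = {0, 1, 3, 6, 10, 13, 15, 16};

/* p_F(b) for a prefix b of length len (0 <= len <= N_BITS), in 1/16 units. */
uint32_t prefix_mass(uint32_t prefix, int len) {
    assert(len >= 0 && len <= N_BITS);
    assert(prefix < (1u << len));
    int free_bits = N_BITS - len;
    uint32_t lo = prefix << free_bits;                 /* b followed by zeros */
    uint32_t hi = lo | ((1u << free_bits) - 1u);       /* b followed by ones  */
    uint32_t below = (lo == 0) ? 0 : TOY_CDF[lo - 1];  /* F(pred(lo)); 0 at the bottom */
    return TOY_CDF[hi] - below;
}
```

Because the upper end of the range for b0 is the predecessor of the lower end of the range for b1, the two child masses telescope to the parent mass, which is the decomposition used to descend the tree one bit at a time.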

The procedure realizes exact sampling for any distribution specified by a numerical CDF, restricted to the precision used in its definition, maintaining strict resource bounds per output sample.
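
For intuition, here is a minimal sketch of classic Knuth-Yao sampling over an explicit table of dyadic weights. It is not the paper's Opt algorithm, which walks an implicit tree derived from the CDF; the bit_source type, which replays the bits of a fixed word, is a hypothetical stand-in for RandBit() so that the behavior is reproducible.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical bit source: replays the bits of a fixed word, LSB first. */
struct bit_source { uint64_t word; int used; };

int rand_bit(struct bit_source *s) {
    assert(s->used < 64);
    int bit = (int)((s->word >> s->used) & 1u);
    s->used++;
    return bit;
}

/* Knuth-Yao sampling from weights w[0..count-1], each a numerator over 2^k
 * with w[i] < 2^k, and the weights summing to exactly 2^k. */
int knuth_yao_sample(const uint32_t *w, int count, int k, struct bit_source *s) {
    int d = 0;  /* index of the current node within its tree level */
    for (int col = k - 1; col >= 0; col--) {  /* most significant digit first */
        d = 2 * d + rand_bit(s);
        for (int row = count - 1; row >= 0; row--) {
            d -= (int)((w[row] >> col) & 1u);  /* each 1 digit is a leaf here */
            if (d < 0) return row;
        }
    }
    return -1;  /* unreachable when the weights sum to 2^k */
}
```

Each 1 in binary digit j of a weight becomes a leaf at depth j, so an outcome with probability 1/2 is decided after a single coin flip, while rarer outcomes consume more bits.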

3. Instantiation for the Gaussian Distribution

Within double precision ($n = 64$), the contract is instantiated by providing:

  • A CDF Program:
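    #include <math.h>

    /* Gaussian CDF with mean 0 and standard deviation sigma (sigma > 0). */
    double gaussian_cdf(double x, double sigma) {
        double z = x / (sigma * sqrt(2.0));
        return 0.5 * erfc(-z);   /* same as 0.5 * (1 + erf(z)) */
    }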
  • Generator Invocation: The macro GENERATE_FROM_CDF(gaussian_cdf, sigma) synthesizes the optimal generator via the above algorithm. Each sample requires only the precision and arithmetic already present in the CDF code.
  • Accuracy Guarantee: The output distribution matches the induced discrete law exactly; there is no added error from the generator itself. The output float $x$ fulfills

$$\Pr[\text{return} \leq x] = F(x)$$

even for subnormal and boundary values.

4. Contractual API and Operational Semantics

The contract exposes a consistent API—typically in C—enabling direct use in high-performance contexts:

double rng_gaussian_from_cdf(struct rng_state *R, double sigma);
Where:

  • R serves as a source of unbiased random bits,
  • sigma is required to be positive,
  • The output $x$ in binary64 satisfies the exact cumulative probability constraint as per the provided CDF,
  • Entropy utilization is theoretically bounded ($\leq 54$ bits per sample for double, Cor. 4.15); buffer and alignment strategies can be tuned further.
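
The source does not show the layout of struct rng_state. As a sketch only, one hypothetical layout buffers 64 bits at a time and hands them out one by one, so that no generated bit is discarded. The SplitMix64 generator used to refill the buffer is a swapped-in pseudorandom stand-in for a real entropy source, and the names rng_init and rng_rand_bit are assumptions.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical layout for the rng_state named in the signature above. */
struct rng_state {
    uint64_t seed;    /* SplitMix64 counter */
    uint64_t buffer;  /* unread random bits */
    int available;    /* number of unread bits left in buffer */
};

/* SplitMix64 step: a pseudorandom stand-in for a true entropy source. */
uint64_t splitmix64_next(uint64_t *state) {
    uint64_t z = (*state += 0x9E3779B97F4A7C15ULL);
    z = (z ^ (z >> 30)) * 0xBF58476D1CE4E5B9ULL;
    z = (z ^ (z >> 27)) * 0x94D049BB133111EBULL;
    return z ^ (z >> 31);
}

void rng_init(struct rng_state *R, uint64_t seed) {
    assert(R != NULL);
    R->seed = seed;
    R->buffer = 0;
    R->available = 0;
}

/* Return one bit, refilling 64 bits at a time so none are thrown away. */
int rng_rand_bit(struct rng_state *R) {
    assert(R != NULL);
    if (R->available == 0) {
        R->buffer = splitmix64_next(&R->seed);
        R->available = 64;
    }
    int bit = (int)(R->buffer & 1u);
    R->buffer >>= 1;
    R->available--;
    return bit;
}
```

Handing out single bits matters for the entropy accounting: a sampler that needs about 25 bits on average should not pay for a full 64-bit word on every call.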

Preconditions:

  • Input CDF must be monotone (non-decreasing), return values only in $[0, 1]$, and be robust to NaN inputs.

Postconditions:

  • Each output is distributed precisely according to the finite-precision CDF, with no error introduced in the sampling process.

5. Performance and Entropy Efficiency

Empirical evaluations (Table 5 in (Saad et al., 17 Jul 2025)) demonstrate:

  • Entropy Utilization:
    • The OPT (optimal) sampler uses $\approx 25$ bits per Gaussian sample, well under the theoretical upper bound of $m + 2 = 54$ bits for $m = 52$ mantissa bits in binary64.
    • Standard routines such as Box-Muller or Ziggurat typically consume two uniform samples (each $\approx 53$ bits), totaling $106$ bits, so the contract uses roughly a quarter of the entropy ($106 / 25 \approx 4.2$).
  • Throughput:
    • The C prototype generator achieves $2 \times 10^5$ samples/sec on 3 GHz x86_64; with optimized CDF and inlined bit-extraction, rates approach one million samples/sec while retaining exactness.
    • For comparison, GSL's Ziggurat implementation reaches $1.4 \times 10^6$ samples/sec but does not enforce strict CDF fidelity.
  • Theoretical Bounds:
    • With $m = 52$, the expected coin-flips per sample satisfy $E[\text{flips}] \leq 54$ (Cor. 4.15), consistent with the Knuth-Yao guarantee of at most two bits above the entropy (Theorem 2.1).

6. Theoretical Guarantees and Correctness Arguments

The generator's soundness and optimality are rigorously established:

  • Correctness:
    • By explicit carry analysis (Props. 3.9, 3.10), the generated samples realize exactly the target probability mass function (Thm 3.14).
  • Optimality:
    • Each generated path matches the DDG tree predicted by Knuth-Yao (Thm 3.17); resource consumption is bounded as $O(n)$ per sample in both time and space, with all operations confined to machine word sizes.
  • Overflow and Precision:
    • Bit-extraction and arithmetic for CDF differences never exceed the width of an input word (Prop. 4.8), precluding overflows or loss of fidelity.
  • Automation and Universality:
    • The contract and methods are universal; any finite-precision CDF (floating-point, fixed-point, posit) can be used as input, enabling fully automated generator synthesis.
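
To give a flavor of word-sized bit extraction, the sketch below reads a single binary digit of a probability stored as a double, using only integer operations on its IEEE-754 fields. It is an illustration under assumptions: the helper fraction_bit is hypothetical and handles one value in [0, 1), whereas the paper's Prop. 4.8 concerns digits of a difference of two CDF values.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Return binary digit k (weight 2^-k, k >= 1) of a double p in [0, 1). */
int fraction_bit(double p, int k) {
    assert(p >= 0.0 && p < 1.0 && k >= 1);
    uint64_t bits;
    memcpy(&bits, &p, sizeof bits);              /* reinterpret the bytes safely */
    int exp_field = (int)((bits >> 52) & 0x7FF);
    uint64_t frac = bits & 0xFFFFFFFFFFFFFULL;   /* 52 stored fraction bits */
    /* Integer significand M and exponent E such that p = M * 2^E. */
    uint64_t M = exp_field ? (frac | (1ULL << 52)) : frac;
    int E = (exp_field ? exp_field : 1) - 1075;  /* subnormals use exponent 1 */
    int pos = -k - E;                            /* bit of M with weight 2^-k */
    if (pos < 0 || pos > 52) return 0;
    return (int)((M >> pos) & 1u);
}
```

For example, 0.625 is 0.101 in binary, so digits 1 through 4 are 1, 0, 1, 0. Every intermediate value fits in a 64-bit word, which is the property the overflow guarantee relies on.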

The contract for a Gaussian generator thereby encapsulates a precise, entropy-optimal, and formally auditable mechanism for sampling, distinguished by both its exactness (w.r.t. user-specified CDF) and its efficient resource profile (Saad et al., 17 Jul 2025).
