
Dyadic Probability Tree: Structure & Applications

Updated 1 February 2026
  • A dyadic probability tree is a combinatorial structure representing dyadic PMFs through full prefix-free code trees, where each probability is a negative power of two.
  • Geometric Huffman Coding constructs optimal dyadic PMFs, minimizing KL divergence from capacity-achieving distributions with O(m log m) complexity.
  • The framework underpins modulation and distribution-matching applications, mapping uniform bit streams to shaped channel symbols for near-capacity performance.

A dyadic probability tree is a combinatorial structure that encodes dyadic probability mass functions (PMFs) through its correspondence with full prefix-free code trees. Within the domain of discrete memoryless channels (DMCs) and memoryless discrete noiseless channels (DNCs), dyadic probability trees facilitate the generation and manipulation of PMFs that conform to dyadic constraints, i.e., each probability is a negative power of two. The framework directly links the design of such PMFs to prefix-free coding and enables efficient approaches to minimizing information-theoretic divergence from capacity-achieving input distributions. This construction is essential in scenarios where input symbols are mapped from streams of independent, equiprobable bits using modulator architectures.

1. Definition and Structure of Dyadic PMFs

A probability mass function $p = (p_1, \dots, p_m)$ is classified as dyadic if each $p_i$ can be expressed as $p_i = 2^{-\ell_i}$, where $\ell_i \in \mathbb{N}$ and $\sum_{i=1}^m 2^{-\ell_i} = 1$. There is a canonical bijection between dyadic PMFs and full prefix-free codes on $m$ symbols. In the associated prefix-free code tree, each symbol $i$ is assigned a codeword of length $\ell_i$, satisfying Kraft's equality:

$$\sum_{i=1}^m 2^{-\ell_i} = 1$$

Interpreted probabilistically, a descent from the root that picks each branch with probability $1/2$ emits codeword $i$ with probability $2^{-\ell_i}$. Thus, every full prefix-free tree induces the dyadic PMF $p_i = 2^{-\ell_i}$.
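This correspondence can be checked numerically; a minimal sketch (the four-symbol code below is our own illustration, not from the source):

```python
import random

# Hypothetical full prefix-free code on four symbols: every binary
# path from the root terminates in exactly one codeword.
code = {'0': 0, '10': 1, '110': 2, '111': 3}

# Induced dyadic PMF: p_i = 2^{-len(codeword_i)}.
pmf = {sym: 2 ** -len(cw) for cw, sym in code.items()}
assert abs(sum(pmf.values()) - 1) < 1e-12   # Kraft's equality: tree is full

# Walking the tree with fair coin flips emits symbol i w.p. 2^{-l_i}.
random.seed(0)
counts = {s: 0 for s in pmf}
for _ in range(100_000):
    cw = ''
    while cw not in code:
        cw += random.choice('01')
    counts[code[cw]] += 1
```

On uniform input bits, the empirical symbol frequencies approach $(1/2, 1/4, 1/8, 1/8)$, matching the dyadic PMF induced by the tree.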

2. Capacity Gap and KL Divergence Minimization

Let $\mathsf{C}$ denote the channel capacity of a DMC and $p^* = \arg\max_p \mathcal{I}(p)$ be the unique capacity-achieving PMF. When employing a dyadic PMF $p$, the achieved mutual information obeys the relation (see Gallager '68):

$$\mathcal{I}(p) = \mathsf{C} - D(r \Vert r^*) \geq \mathsf{C} - D(p \Vert p^*)$$

where $r$ and $r^*$ are the output PMFs induced by $p$ and $p^*$, respectively, and $D(p \Vert q) = \sum_i p_i \log \frac{p_i}{q_i}$ denotes the Kullback–Leibler divergence. To minimize the loss from a non-optimal input distribution, the optimal dyadic PMF solves:

$$\min_{\substack{p\ \text{dyadic}\\ p_i = 0 \text{ if } p^*_i = 0}} D(p \Vert p^*)$$

For DNCs, a weighted KL minimization arises:

$$\bar{H}(p) = \frac{H(p)}{\sum_i p_i w_i} = \mathsf{C} - \frac{D(p \Vert p^*)}{\sum_i p_i w_i}$$

where $w_i$ is the cost (e.g., duration) of symbol $i$ and $p^*_i = 2^{-\mathsf{C} w_i}$ is the capacity-achieving PMF of the DNC.

Hence, in both scenarios, optimal dyadic PMFs are characterized by minimizing suitable KL distances relative to (possibly weighted) capacity-achieving PMFs.
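As a concrete sketch (the target PMF and dyadic candidate below are illustrative values of our own choosing), the divergence objective can be evaluated directly:

```python
import math

def kl(p, q):
    """Kullback-Leibler divergence D(p || q) in bits, with 0 log 0 = 0."""
    return sum(pi * math.log2(pi / qi) for pi, qi in zip(p, q) if pi > 0)

p_star = [0.4, 0.35, 0.25]     # hypothetical capacity-achieving PMF
p_dyadic = [0.5, 0.25, 0.25]   # a dyadic candidate (all entries 2^{-l})
loss = kl(p_dyadic, p_star)    # upper-bounds the capacity gap C - I(p)
```

Here `loss` is about 0.04 bits, the quantity the optimal dyadic PMF minimizes.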

3. Geometric Huffman Coding for Dyadic PMFs

Traditional Huffman coding constructs prefix trees by repeatedly merging the two least probable symbols, minimizing expected codeword length for a given PMF. Geometric Huffman Coding (GHC) modifies the merge rule: instead of summing weights, it employs a geometric strategy. For $x_{m-1} \geq x_m$, GHC merges as follows:

$$x' = \begin{cases} x_{m-1}, & x_{m-1} \geq 4x_m \\ 2\sqrt{x_{m-1} x_m}, & x_{m-1} < 4x_m \end{cases}$$

In the first case the smallest weight $x_m$ is assigned probability zero and removed. This procedure, applied recursively to a sorted list and encoded as a prefix tree, yields lengths $\ell_i$ whose induced PMF $p_i = 2^{-\ell_i}$ minimizes $D(p \Vert x)$ over all dyadic PMFs.

Geometric Huffman Coding (GHC) Pseudocode:

  1. Sort $x_1 \ge x_2 \ge \dots \ge x_m$.
  2. While more than one symbol remains:
    • Let $(x_{m-1}, x_m)$ be the two smallest.
    • Form $x'$ via the geometric rule above.
    • Remove $x_{m-1}, x_m$, insert $x'$, and re-sort.
  3. The depth of each leaf defines $\ell_i$, establishing $p_i = 2^{-\ell_i}$.
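The steps above can be sketched in Python as follows. This is an illustrative implementation, assuming the convention that the first merge case assigns the smallest symbol probability zero:

```python
import math

def ghc(x):
    """Geometric Huffman coding (sketch): codeword lengths for weights x,
    so that p_i = 2^{-lengths[i]} approximately minimizes D(p || x) over
    dyadic PMFs. None marks a symbol assigned probability zero."""
    # Each entry: (weight, list of original leaf indices under this node).
    items = sorted(((xi, [i]) for i, xi in enumerate(x)),
                   key=lambda t: t[0], reverse=True)
    depth = [0] * len(x)
    dropped = set()
    while len(items) > 1:
        w2, s2 = items.pop()          # smallest weight
        w1, s1 = items.pop()          # second smallest
        if w1 >= 4 * w2:
            dropped.update(s2)        # x' = x_{m-1}: smallest gets p = 0
            items.append((w1, s1))
        else:
            for i in s1 + s2:         # both subtrees move one level deeper
                depth[i] += 1
            items.append((2 * math.sqrt(w1 * w2), s1 + s2))
        items.sort(key=lambda t: t[0], reverse=True)
    return [None if i in dropped else depth[i] for i in range(len(x))]
```

For example, `ghc([0.5, 0.25, 0.25])` returns `[1, 2, 2]`, recovering an already-dyadic input exactly, while `ghc([0.9, 0.1])` drops the rare symbol, since $p = (1, 0)$ is closer in KL divergence to $(0.9, 0.1)$ than $(1/2, 1/2)$ is.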

Inductive proofs establish that the optimal merge assigns identical maximal length to $(x_{m-1}, x_m)$ and replaces them with an effective cost $\tfrac{u_{m-1} + u_m}{2} - 1$, where $u_i = -\log_2 x_i$.
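The effective cost follows from taking $-\log_2$ of the geometric merge $2\sqrt{x_{m-1} x_m}$; a quick numeric check (weights chosen arbitrarily):

```python
import math

x1, x2 = 0.3, 0.2                   # arbitrary weights with x1 < 4*x2
u1, u2 = -math.log2(x1), -math.log2(x2)
merged = 2 * math.sqrt(x1 * x2)     # geometric merge rule
# -log2 of the merged weight equals the effective cost (u1 + u2)/2 - 1:
assert abs(-math.log2(merged) - ((u1 + u2) / 2 - 1)) < 1e-12
```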

4. Algorithmic Complexity

Each merge operation entails removal of two minimal elements and insertion of a new value into a collection of at most $m$ elements. Using a priority queue or heap, per-merge complexity is $O(\log m)$, and there are $m - 1$ merges, yielding an overall GHC runtime of $O(m \log m)$. This matches the computational order of classical Huffman coding.

5. Block Coding and Asymptotic Capacity

Dyadic probability trees can be extended to block coding. For blocks of length $k$, the joint capacity-achieving PMF is $p^{(k)*} = p^* \times \cdots \times p^*$, and GHC is applied to $p^{(k)*}$ to obtain the dyadic approximation $p^{(k)}$. The divergence satisfies:

$$D(p^{(k)} \Vert p^{(k)*}) \leq \log 2$$

and thus:

$$\frac{1}{k} D(p^{(k)} \Vert p^{(k)*}) \xrightarrow{k \to \infty} 0$$

resulting in

$$\frac{1}{k}\,\mathcal{I}(p^{(k)}) \longrightarrow \mathsf{C}$$

as $k \to \infty$, guaranteeing the asymptotic capacity-achieving property of the construction.
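A small numeric illustration of the per-symbol divergence shrinking (the target PMF and the hand-picked dyadic approximations below are our own and not necessarily the GHC optima):

```python
import math
from itertools import product

def kl(p, q):
    """KL divergence D(p || q) in bits, with 0 log 0 = 0."""
    return sum(pi * math.log2(pi / qi) for pi, qi in zip(p, q) if pi > 0)

p_star = [2/3, 1/3]                          # hypothetical target PMF

# k = 1: the closest dyadic PMF on two symbols is (1/2, 1/2).
d1 = kl([0.5, 0.5], p_star)

# k = 2: product target (4/9, 2/9, 2/9, 1/9); dyadic (1/2, 1/4, 1/8, 1/8).
p2_star = [a * b for a, b in product(p_star, repeat=2)]
d2 = kl([0.5, 0.25, 0.125, 0.125], p2_star)

# Per-symbol divergence shrinks as the block length grows.
assert d2 / 2 < d1
```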

6. Applications in Modulation and Distribution Matching

In digital modulation schemes where nonuniform input distributions (e.g., shaped QAM or APSK constellations) are desirable, dyadic probability trees provide a tractable means to induce symbol distributions approximating the channel's capacity-optimizing PMF. The process turns a stream of uniformly distributed bits into appropriately distributed channel symbols. Construction of a full prefix tree, with leaves mapped to constellation points, yields dyadic PMFs that closely approach the theoretical optimum for mutual information or entropy rate. The structural and computational properties of GHC, namely $O(m \log m)$ complexity and asymptotic optimality, make dyadic probability trees highly effective for practical distribution matching in coded modulation systems (Böcherer et al., 2010).
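A toy matcher in this spirit (the codebook and symbol labels are hypothetical): a uniform bit stream is parsed with a full prefix-free code whose leaves are constellation points.

```python
# Hypothetical codebook from a full prefix-free tree for the dyadic
# PMF (1/2, 1/4, 1/4); leaves map to constellation points A, B, C.
codebook = {'0': 'A', '10': 'B', '11': 'C'}

def match(bits):
    """Map a uniform bit string to shaped symbols by prefix parsing."""
    symbols, cw = [], ''
    for b in bits:
        cw += b
        if cw in codebook:            # a leaf is reached: emit its symbol
            symbols.append(codebook[cw])
            cw = ''
    return symbols
```

For example, `match('010110')` yields `['A', 'B', 'C', 'A']`; on long i.i.d. uniform input, 'A' is emitted with frequency approaching 1/2 and 'B', 'C' with 1/4 each, matching the dyadic target.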
