
Minimal Maximum Expected Length

Updated 2 January 2026
  • Minimal maximum expected length refers to a family of extremal problems about expected length functionals, arising in both prefix coding and longest-common-subsequence settings.
  • It employs key methodologies including Schur-concavity, spectral analysis, and quadratic programming to derive optimal bounds and distributions.
  • The analysis reveals that the uniform distribution maximizes the minimal expected codeword length of a prefix code, while specific non-uniform distributions can minimize the expected LCS of two i.i.d. random permutations.

Minimal maximum expected length refers to extremal problems involving the expected value of a length functional in probabilistic or information-theoretic settings. Two prominent lines of research address these questions. The first investigates the maximal minimal expected codeword length for prefix codes under variable source distributions, focusing on Schur-concavity and code-tree constructions. The second explores the minimum expected length of the longest common subsequence (LCS) between two i.i.d. random permutations drawn from an optimally chosen distribution, using spectral and combinatorial techniques.

1. Minimum Expected Length in Prefix Coding

Let $\mathcal{P}_n$ denote the set of all probability mass functions (PMFs) $(p_1, p_2, \ldots, p_n)$ on $n$ symbols, with $p_i > 0$ and $\sum_{i=1}^{n} p_i = 1$. For an integer $D \geq 2$, let $L_D(P)$ be the minimum expected codeword length of a $D$-ary prefix code for the discrete memoryless source $P$. For codeword length vectors $\ell = (\ell_1, \ldots, \ell_n) \in \mathbb{Z}_{\geq 0}^n$ satisfying the $D$-ary Kraft inequality $\sum_{i=1}^{n} D^{-\ell_i} \leq 1$, the functional

$$L_D(P) = \min_{\ell:\, \sum_{i=1}^{n} D^{-\ell_i} \leq 1} \ \sum_{i=1}^{n} p_i \, \ell_i$$

gives the minimal expected length over all prefix codes. For each $P$, the minimum is achieved by a Huffman code [1903.03755].
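For $D = 2$, $L_D(P)$ can be computed by Huffman merging, using the classic identity that the sum of the weights created by the merges equals the optimal expected length. A minimal sketch (the two example PMFs are illustrative choices, not taken from [1903.03755]; the second is majorized by the first, so Schur-concavity predicts a larger expected length for it):

```python
import heapq
from fractions import Fraction

def binary_huffman_expected_length(pmf):
    """L_2(P): expected codeword length of an optimal binary prefix code.

    Uses the identity that the sum of the weights created by the Huffman
    merges equals sum_i p_i * l_i for the optimal code.
    """
    heap = list(pmf)
    heapq.heapify(heap)
    total = Fraction(0)
    while len(heap) > 1:
        merged = heapq.heappop(heap) + heapq.heappop(heap)
        total += merged
        heapq.heappush(heap, merged)
    return total

P1 = [Fraction(1, 2), Fraction(1, 4), Fraction(1, 4)]    # optimal lengths (1, 2, 2)
P2 = [Fraction(2, 5), Fraction(3, 10), Fraction(3, 10)]  # majorized by P1
print(binary_huffman_expected_length(P1))  # 3/2
print(binary_huffman_expected_length(P2))  # 8/5, larger, as Schur-concavity predicts
```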

2. Maximal Minimal Expected Length and Its Attaining Distributions

The mapping $L_D(\cdot)$ is Schur-concave: since the uniform distribution $U_n = (1/n, \ldots, 1/n)$ is majorized by every $P \in \mathcal{P}_n$, the maximum is attained at $U_n$. Its value is determined as follows. Let $m$ be the unique integer such that $D^m \leq n < D^{m+1}$.

  • If $n = D^m$, then $L_D(U_n) = m$: the full $D$-ary tree of depth $m$ gives all $n$ codewords length $m$.
  • If $D^m < n < D^{m+1}$, the optimal code for $U_n$ uses exactly the two codeword lengths $m$ and $m+1$, obtained from the full depth-$m$ tree by expanding leaves one level deeper until the tree has $n$ leaves, with the number of expansions chosen so that Kraft's inequality is as tight as possible. In this case $m < L_D(U_n) < m + 1 = \lceil \log_D n \rceil$ (for example, $L_2(U_3) = 5/3$, from lengths $(1, 2, 2)$).

Consequently,

$$
\max_{P \in \mathcal{P}_n} L_D(P) =
\begin{cases}
m & \text{if } n = D^m, \\
L_D(U_n) \in \left( m, \lceil \log_D n \rceil \right) & \text{otherwise.}
\end{cases}
$$

For $D = 2$ the second case is explicit: $L_2(U_n) = m + 2 - 2^{m+1}/n$.

If $n \neq D^m$, $U_n$ is the unique maximizer; any deviation strictly reduces $L_D(P)$ due to strict Schur-concavity. If $n = D^m$, every $P$ for which the smallest $D$ probabilities sum to at least the largest, i.e., $\sum_{i=n-D+1}^{n} p_i \geq p_1$ (with $p_1 \geq \cdots \geq p_n$), also attains the bound; these correspond to the distributions admitting a full $D$-ary code tree of depth $m$ [1903.03755].

| Case | Maximizing distributions | Maximum value |
| --- | --- | --- |
| $n \neq D^m$ | $U_n$ only | $L_D(U_n)$, strictly between $m$ and $\lceil \log_D n \rceil$ |
| $n = D^m$ | any $P$ with $\sum_{i=n-D+1}^{n} p_i \geq p_1$ | $m$ |
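The full-tree condition in the $n = D^m$ row can be illustrated numerically: a binary ($D = 2$) Huffman construction on $n = 4 = 2^2$ symbols yields a full depth-2 tree exactly when the two smallest probabilities sum to at least the largest. A sketch (the PMFs below are illustrative choices, not taken from the paper):

```python
import heapq
from fractions import Fraction

def binary_huffman_lengths(pmf):
    """Sorted codeword lengths of an optimal binary (D = 2) prefix code."""
    depths = [0] * len(pmf)
    # Heap entries: (subtree weight, tie-breaker, symbol indices in the subtree).
    heap = [(p, i, [i]) for i, p in enumerate(pmf)]
    heapq.heapify(heap)
    tie = len(pmf)
    while len(heap) > 1:
        wa, _, sa = heapq.heappop(heap)
        wb, _, sb = heapq.heappop(heap)
        for s in sa + sb:  # merging pushes every symbol in both subtrees one level deeper
            depths[s] += 1
        heapq.heappush(heap, (wa + wb, tie, sa + sb))
        tie += 1
    return sorted(depths)

# Smallest two probabilities sum to 2/5 >= 3/10, the largest: full tree of depth m = 2.
print(binary_huffman_lengths([Fraction(3, 10), Fraction(3, 10), Fraction(1, 5), Fraction(1, 5)]))
# -> [2, 2, 2, 2]

# Condition fails (1/5 + 1/10 < 2/5): the optimal tree is unbalanced.
print(binary_huffman_lengths([Fraction(2, 5), Fraction(3, 10), Fraction(1, 5), Fraction(1, 10)]))
# -> [1, 2, 3, 3]
```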

3. Minimal Expected Length for Random LCS of Permutations

Let $S_n$ be the set of permutations of $[n]$, and $\mathcal{L}(\pi, \sigma)$ denote the length of the longest common subsequence (LCS) of $\pi, \sigma \in S_n$. Given a probability distribution $\mu$ on $S_n$, let $\pi, \sigma$ be i.i.d. from $\mu$. Define

$$
E_{\min}(n) = \min_{\mu} \mathbb{E}_{\pi, \sigma \sim \mu}[\mathcal{L}(\pi, \sigma)]
$$

This corresponds to the minimal possible expected LCS length when the underlying distribution $\mu$ is optimized [1703.07691].

Writing $\mu$ as a vector $P \in \mathbb{R}^{n!}$ of probabilities over a fixed enumeration $\pi_1, \ldots, \pi_{n!}$ of $S_n$, the expectation is the quadratic form $\mathbb{E}_{\pi,\sigma \sim \mu}[\mathcal{L}(\pi,\sigma)] = P^{\mathsf{T}} L^{(n)} P$, where $L^{(n)}$ is the symmetric $n! \times n!$ matrix with entries $\ell_{ij} = \mathcal{L}(\pi_i, \pi_j)$.
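For small $n$ the matrix $L^{(n)}$ and the quadratic form can be computed directly. A pure-Python sketch for $n = 3$ (the enumeration order of $S_3$ is arbitrary; for the uniform distribution the quadratic form reduces to the average entry of the matrix):

```python
from fractions import Fraction
from itertools import permutations

def lcs(a, b):
    """Length of the longest common subsequence, by the standard DP."""
    dp = [[0] * (len(b) + 1) for _ in range(len(a) + 1)]
    for i, x in enumerate(a):
        for j, y in enumerate(b):
            dp[i + 1][j + 1] = dp[i][j] + 1 if x == y else max(dp[i][j + 1], dp[i + 1][j])
    return dp[-1][-1]

def lcs_matrix(n):
    """The n! x n! matrix L^(n) over a fixed enumeration of S_n."""
    perms = list(permutations(range(n)))
    return [[lcs(p, q) for q in perms] for p in perms]

def expected_lcs(P, L):
    """Quadratic form P^T L P = E_{pi,sigma ~ P}[LCS(pi, sigma)]."""
    return sum(P[i] * L[i][j] * P[j] for i in range(len(P)) for j in range(len(P)))

L3 = lcs_matrix(3)
U = [Fraction(1, 6)] * 6
print(expected_lcs(U, L3))  # 2: the average LCS of two uniform permutations of [3]
```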

4. Uniform vs. Non-Uniform Distributions in LCS Problems

In contrast to the coding setting, the uniform distribution $U = (1/n!, \ldots, 1/n!)$ does not always minimize $\mathbb{E}[\mathcal{L}(\pi, \sigma)]$. For $n \geq 4$, $L^{(n)}$ has a strictly negative eigenvalue. Taking a unit-norm eigenvector $R_1^{(n)}$ for such an eigenvalue (its entries sum to zero, being orthogonal to the all-ones eigenvector, so the perturbation preserves total probability) and $c > 0$ sufficiently small, $P_0 = U + c\,R_1^{(n)}$ is a distribution with $\mathbb{E}_{P_0}[\mathcal{L}(\pi,\sigma)] < \mathbb{E}_U[\mathcal{L}(\pi,\sigma)]$. For $n = 2, 3$, the uniform distribution does minimize the expected LCS [1703.07691].
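The negative-eigenvalue phenomenon can be checked numerically for $n = 4$. The sketch below is not the paper's derivation: it uses shifted power iteration restricted to the sum-zero subspace to approximate the most negative eigenvalue of $L^{(4)}$, then perturbs the uniform distribution along the resulting eigenvector:

```python
from itertools import permutations

def lcs(a, b):
    """Length of the longest common subsequence, by the standard DP."""
    dp = [[0] * (len(b) + 1) for _ in range(len(a) + 1)]
    for i, x in enumerate(a):
        for j, y in enumerate(b):
            dp[i + 1][j + 1] = dp[i][j] + 1 if x == y else max(dp[i][j + 1], dp[i + 1][j])
    return dp[-1][-1]

perms = list(permutations(range(4)))
N = len(perms)                                   # 24
L = [[lcs(p, q) for q in perms] for p in perms]  # the matrix L^(4)

def quad(v):
    """Quadratic form v^T L v."""
    return sum(v[i] * L[i][j] * v[j] for i in range(N) for j in range(N))

def project(v):
    """Project out the all-ones direction, keeping v in the sum-zero subspace."""
    s = sum(v) / N
    return [x - s for x in v]

# Power iteration on c*I - L (c exceeds every eigenvalue of L), restricted to
# the sum-zero subspace: converges toward the most negative eigenvalue of L there.
c = max(sum(row) for row in L)
v = project([((7919 * i) % 23) - 11.0 for i in range(N)])  # arbitrary start vector
for _ in range(3000):
    w = [c * v[i] - sum(L[i][j] * v[j] for j in range(N)) for i in range(N)]
    w = project(w)
    norm = sum(x * x for x in w) ** 0.5
    v = [x / norm for x in w]

lam = quad(v)  # Rayleigh quotient of the unit vector v
print(lam)     # strictly negative for n = 4, per [1703.07691]

# Perturbing uniform along v beats it: the cross term vanishes (constant row
# sums make all-ones an eigenvector), so quad(P0) = quad(U) + eps^2 * lam.
eps = 1e-3     # small enough to keep P0 entrywise positive, since |v_i| <= 1
U = [1.0 / N] * N
P0 = [1.0 / N + eps * x for x in v]
print(quad(P0) < quad(U))
```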

5. Lower Bounds, Techniques, and Conjectures

By an inequality of Beame and Huynh-Ngoc, for any triple $\pi_1, \pi_2, \pi_3 \in S_n$,

$$
\mathcal{L}(\pi_1, \pi_2) \cdot \mathcal{L}(\pi_2, \pi_3) \cdot \mathcal{L}(\pi_3, \pi_1) \geq n
$$

which, averaged via AM–GM over an i.i.d. triple $\pi_1, \pi_2, \pi_3 \sim \mu$, yields the cubic-root lower bound $E_{\min}(n) \geq n^{1/3}$. The Bukh–Zhou conjecture proposes a universal lower bound of $c\sqrt{n}$ for the minimum expected LCS, but only the cubic-root bound is established to date. The main methodologies involve quadratic programming, spectral analysis, eigenvalue interlacing, and combinatorial product inequalities [1703.07691]. When $\mu$ is uniform, $\mathbb{E}[\mathcal{L}(\pi,\sigma)] = \mathbb{E}[\mathrm{LIS}(\sigma)] \sim 2\sqrt{n}$, which is consistent with, but does not prove, a $\sqrt{n}$ lower bound for $E_{\min}(n)$.
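The triple-product inequality can be verified exhaustively for small $n$; by AM–GM it gives, pointwise, $\tfrac{1}{3}(\mathcal{L}_{12} + \mathcal{L}_{23} + \mathcal{L}_{31}) \geq (\mathcal{L}_{12}\mathcal{L}_{23}\mathcal{L}_{31})^{1/3} \geq n^{1/3}$, and taking expectations over an i.i.d. triple yields $E_{\min}(n) \geq n^{1/3}$. A brute-force check of the product bound for $n = 4$:

```python
from itertools import permutations

def lcs(a, b):
    """Length of the longest common subsequence, by the standard DP."""
    dp = [[0] * (len(b) + 1) for _ in range(len(a) + 1)]
    for i, x in enumerate(a):
        for j, y in enumerate(b):
            dp[i + 1][j + 1] = dp[i][j] + 1 if x == y else max(dp[i + 1][j], dp[i][j + 1])
    return dp[-1][-1]

n = 4
perms = list(permutations(range(n)))
M = [[lcs(p, q) for q in perms] for p in perms]

# Exhaustive check of LCS(p1,p2) * LCS(p2,p3) * LCS(p3,p1) >= n over all triples.
worst = min(M[i][j] * M[j][k] * M[k][i]
            for i in range(len(perms))
            for j in range(len(perms))
            for k in range(len(perms)))
print(worst)  # 4: the bound is attained, e.g. by id, its reversal, and 2143
```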

6. Significance and Implications

The extremal characterization of the minimum expected codeword length guides source coding design, particularly in optimal code assignment problems and information theory. The precise conditions under which uniform or specific non-uniform PMFs maximize this expectation clarify the interplay between source entropy, code structure, and expected description length. In the LCS context, the finding that non-uniform distributions can minimize expectation—contrasting with classical random coding results—has implications for combinatorial optimization and complexity theory. The techniques deployed, particularly spectral and quadratic programming, provide templates for analyzing related symmetric functionals over probability simplices, while combinatorial inequalities like the Beame–Huynh-Ngoc lemma demonstrate the structural richness of LCS problems. These results establish sharp boundaries between achievable extremal values and guide future inquiries toward tighter probabilistic and combinatorial bounds [1903.03755, 1703.07691].
