
Zero-Error Horizon (ZEH)

Updated 24 January 2026
  • Zero-Error Horizon (ZEH) is a metric that defines the maximum scale for guaranteed error-free performance in systems like LLMs, numerical algorithms, and communication channels.
  • It establishes an explicit boundary—the ZEH limiter—beyond which error-free operation fails, offering a clear audit trail for reliability and safety assessments.
  • Empirical analyses reveal a strong correlation between ZEH and accuracy, with optimized computational techniques enabling efficient evaluation across diverse domains.

The Zero-Error Horizon (ZEH) is a rigorous metric that delineates the maximal regime in which a model, communication channel, or algorithm performs without a single error under prescribed conditions. The concept captures the precise input size, problem scale, or coding block length for which total correctness holds, providing an explicit and auditable guarantee of reliability. ZEH is foundational across several domains, including trustworthy LLM evaluation, robust numerical algorithms, and zero-error communication theory, revealing intrinsic capability boundaries and informing practical deployment strategies (Sato, 22 Jan 2026, Battaglia et al., 1 May 2025, 0911.5300).

1. Formal Definitions and Instantiations

ZEH is task- and system-dependent, but its archetypal definition is as follows:

  • Let $M$ be a fixed system (e.g., an LLM with fixed prompt and decoding).
  • Let $B_n$ denote all problem instances of size $n$ (as appropriate for the domain).
  • $T_n = \bigcup_{i=1}^n B_i$ is the collection of all instances up to size $n$.
  • $C$ is the set of instances answered (or solved, or transmitted) correctly by $M$.

The Zero-Error Horizon is

$$\mathrm{ZEH}(M) = \max\left\{ n \;\middle|\; T_n \subseteq C \right\}.$$

If $n^* = \mathrm{ZEH}(M)$, all instances up to size $n^*$ are guaranteed error-free, and there exists some instance $x \in B_{n^*+1}$ on which $M$ fails. The first such $x$ is referred to as a ZEH limiter (Sato, 22 Jan 2026).
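The step from the definition to the existence of a limiter can be spelled out as below. This is a short derivation sketch using only the symbols defined above; it assumes the maximum exists (some finite size has a failure). The parity count in the last line additionally assumes that $B_n$ is the set of all binary strings of length $n$, which the text does not state explicitly.

```latex
\begin{align*}
n^* = \mathrm{ZEH}(M)
  &\;\Longrightarrow\; T_{n^*} \subseteq C \ \text{ and } \ T_{n^*+1} \not\subseteq C \\
  &\;\Longrightarrow\; \exists\, x \in T_{n^*+1} \setminus T_{n^*} \ \text{ with } \ x \notin C \\
  &\;\Longrightarrow\; \exists\, x \in B_{n^*+1} \setminus C
     \qquad \text{since } T_{n^*+1} \setminus T_{n^*} \subseteq B_{n^*+1}. \\[4pt]
\text{Example (parity, } B_n = \{0,1\}^n\text{):}\quad
  & \mathrm{ZEH} = 4 \ \text{ certifies } \ |T_4| = 2 + 4 + 8 + 16 = 30 \ \text{ instances.}
\end{align*}
```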

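In communication theory, the zero-error horizon $n_0(\mathcal{N})$ of a channel $\mathcal{N}$ is the minimal block length $n$ such that at least two messages can be transmitted with zero error in $n$ uses:

$$n_0(\mathcal{N}) = \min\left\{ n \ge 1 \;\middle|\; c_0\!\left(\mathcal{N}^{\otimes n}\right) \ge 2 \right\},$$

where $c_0(\cdot)$ denotes the classical one-shot zero-error capacity of a given channel, counted as a number of messages (0911.5300).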

In robust numerical methods, the error horizon is the radius of the smallest ball around the true solution to which an iterative algorithm can converge in the presence of perturbations; a zero-error horizon (i.e., a radius of zero) indicates exact recovery under certain corruption models (Battaglia et al., 1 May 2025).

2. ZEH in Trustworthy LLM Evaluation

ZEH for LLMs is operationalized as the largest input size for which a model answers every instance of a canonical task correctly under a fixed prompt and greedy decoding.

Key empirical ZEHs for GPT-5.2 (Sato, 22 Jan 2026):

  • Multiplication: ZEH = 126 (limiter: an instance of size 127 answered incorrectly).
  • Parity of Binary Strings: ZEH = 4 (limiter: "11000" misclassified).
  • Balanced Parentheses: ZEH = 10 (limiter: "((((())))))", a string of 11 parentheses).
  • Graph Chromatic Number: ZEH = 4 (5-vertex graph miscolored).
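The parity and parentheses limiters listed above are easy to check against ground truth. The following is a minimal sketch of reference checkers for those two tasks; the function names and the convention that parity 1 means an odd number of ones are assumptions made for illustration, not taken from the cited paper.

```python
def parity(bits: str) -> int:
    """Return 1 if the binary string contains an odd number of '1's, else 0."""
    return bits.count("1") % 2


def is_balanced(s: str) -> bool:
    """Return True if every ')' closes an earlier '(' and nothing stays open."""
    depth = 0
    for ch in s:
        if ch == "(":
            depth += 1
        elif ch == ")":
            depth -= 1
            if depth < 0:  # a ')' with no matching '('
                return False
    return depth == 0


# Ground truth for the two limiter strings quoted in the list.
print(parity("11000"))             # 0: two '1's, so even parity
print(is_balanced("((((())))))"))  # False: five '(' but six ')'
```

Both limiters sit exactly one size step past the reported horizon: the parity string has length 5 (ZEH = 4) and the parentheses string has length 11 (ZEH = 10).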

For Qwen2.5-Instruct, ZEH generally increases with model size, with one dip between the 1.5B and 3B models:

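| Model size | ZEH (multiplication) | Accuracy |
|------------|----------------------|----------|
| 0.5B       | 0                    | 55.0%    |
| 1.5B       | 20                   | 75.9%    |
| 3B         | 15                   | 79.3%    |
| 7B         | 22                   | 93.2%    |
| 14B        | 26                   | 97.1%    |
| 32B        | 33                   | 98.6%    |
| 72B        | 42                   | 98.6%    |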

Prompt-variation experiments yield only limited variation in ZEH, confirming stability.

ZEH is tightly correlated with accuracy but reveals "holes" (i.e., error outliers) invisible to mean performance metrics, providing concrete counterexamples for auditability and baseline safety. Emergent algorithmic behaviors are mirrored in ZEH growth: small models exhibit unpredictable failures (memory-based), whereas large models show structured errors (e.g., multiplication carry mistakes), with logistic regression quantifying improved carry robustness with increasing size. Spearman correlations quantify the decoupling between rote corpus memorization and algorithmic generalization, with ZEH strongly indicating the latter (Sato, 22 Jan 2026).
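As a rough illustration of how tightly ZEH tracks accuracy, a rank correlation can be computed directly from the Qwen2.5-Instruct figures listed above. This sketch is not the Spearman analysis from the cited paper (which concerns memorization versus generalization); it only applies the same statistic to the seven ZEH and accuracy pairs. The helper names are illustrative.

```python
def ranks(values):
    """Average ranks (1-based); tied values share the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1  # mean of 1-based positions i+1 .. j+1
        for k in range(i, j + 1):
            result[order[k]] = avg
        i = j + 1
    return result


def spearman(xs, ys):
    """Spearman rank correlation: Pearson correlation of the rank vectors."""
    rx, ry = ranks(xs), ranks(ys)
    mx, my = sum(rx) / len(rx), sum(ry) / len(ry)
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    sx = sum((a - mx) ** 2 for a in rx) ** 0.5
    sy = sum((b - my) ** 2 for b in ry) ** 0.5
    return cov / (sx * sy)


# Values for 0.5B, 1.5B, 3B, 7B, 14B, 32B, 72B.
zeh = [0, 20, 15, 22, 26, 33, 42]
accuracy = [55.0, 75.9, 79.3, 93.2, 97.1, 98.6, 98.6]

rho = spearman(zeh, accuracy)
print(round(rho, 3))  # 0.955
```

The correlation is high but not perfect: the 3B model has higher accuracy than the 1.5B model yet a lower ZEH, which is the kind of "hole" that mean accuracy hides.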

3. ZEH in Robust Numerical Linear Algebra

In iterative algorithms for linear equations (e.g., randomized Kaczmarz), the classical error horizon is the radius of the residual error ball around the true solution under corruptions. It is governed by the matrix condition number and the corruption vector.

Quantile-based variants (qRK, dqRK) provide strict error-horizon reductions. Their horizons are expressed through constants that encapsulate spectral and quantile-related terms, together with a dense “small” noise component. Zero-error horizon conditions hold when the dense noise component vanishes and the fraction of sparse corruptions is within the quantile exclusion. No analogous strict ZEH exists for classical RK unless there is zero corruption (Battaglia et al., 1 May 2025).

This yields robust convergence against arbitrarily large sparse corruption, with empirical results demonstrating that the error horizon remains small and stable for quantile-based methods but explodes for classical RK as corruption increases.
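The contrast between classical and quantile-based Kaczmarz can be reproduced on a toy problem. The following is a simplified sketch, not the qRK or dqRK algorithm as analyzed in the cited paper: it assumes unit-norm rows, computes the quantile over all rows at every step, and uses hypothetical parameters (200 rows, 5 unknowns, 20 corrupted entries of size 100, quantile 0.7) chosen only for illustration.

```python
import math
import random


def make_system(m, n, n_corrupt, magnitude, rng):
    """Random consistent system with unit-norm rows, plus sparse corruptions in b."""
    x_true = [rng.uniform(-1.0, 1.0) for _ in range(n)]
    rows = []
    for _ in range(m):
        row = [rng.gauss(0.0, 1.0) for _ in range(n)]
        norm = math.sqrt(sum(v * v for v in row))
        rows.append([v / norm for v in row])
    b = [sum(a * x for a, x in zip(row, x_true)) for row in rows]
    for i in rng.sample(range(m), n_corrupt):
        b[i] += magnitude  # large sparse corruption
    return rows, b, x_true


def residual(row, b_i, x):
    return sum(a * v for a, v in zip(row, x)) - b_i


def kaczmarz(rows, b, iters, rng, q=None):
    """Randomized Kaczmarz; if q is set, skip rows whose |residual| exceeds the q-quantile."""
    m, n = len(rows), len(rows[0])
    x = [0.0] * n
    for _ in range(iters):
        i = rng.randrange(m)
        r_i = residual(rows[i], b[i], x)
        if q is not None:
            all_abs = sorted(abs(residual(rows[j], b[j], x)) for j in range(m))
            if abs(r_i) > all_abs[int(q * m)]:
                continue  # residual too large: treat the row as corrupted
        # Project onto the hyperplane <row_i, x> = b_i (rows have unit norm).
        x = [v - r_i * a for v, a in zip(x, rows[i])]
    return x


def error(x, x_true):
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(x, x_true)))


rows, b, x_true = make_system(200, 5, 20, 100.0, random.Random(0))
err_rk = error(kaczmarz(rows, b, 2000, random.Random(1)), x_true)
err_qrk = error(kaczmarz(rows, b, 2000, random.Random(1), q=0.7), x_true)
print(f"classical RK error: {err_rk:.3e}")
print(f"quantile RK error:  {err_qrk:.3e}")
```

In this setup the corrupted rows always carry the largest residuals, so the quantile test rejects them and the iterate converges to the true solution, while classical RK is repeatedly thrown off whenever it samples a corrupted row.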

4. ZEH in Zero-Error Information Theory

In classical and quantum communication, the zero-error horizon $n_0(\mathcal{N})$ of a channel $\mathcal{N}$ is the minimum block length needed for nonzero error-free capacity. The underlying machinery is the channel’s confusability graph $G$, whose independence number $\alpha(G)$ gives the one-shot zero-error capacity.

Formally,

$$n_0(\mathcal{N}) = \min\left\{ n \ge 1 \;\middle|\; \alpha\!\left(G^{\boxtimes n}\right) \ge 2 \right\},$$

where $G^{\boxtimes n}$ is the $n$-fold strong product of $G$, i.e., the confusability graph of $n$ channel uses.
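For small confusability graphs, the horizon (the smallest number of channel uses for which at least two inputs are mutually non-confusable) can be checked by brute force. The sketch below is illustrative only: graphs are plain adjacency dictionaries, the function names are assumptions, and the search is exponential, so it is meant for toy cases such as the 5-cycle.

```python
from itertools import product


def cycle_graph(k):
    """Adjacency sets of the k-cycle on vertices 0..k-1."""
    return {v: {(v - 1) % k, (v + 1) % k} for v in range(k)}


def complete_graph(k):
    return {v: set(range(k)) - {v} for v in range(k)}


def strong_product(g, h):
    """Distinct pairs are adjacent iff each coordinate is equal or adjacent."""
    verts = list(product(g, h))
    adj = {v: set() for v in verts}
    for a1, b1 in verts:
        for a2, b2 in verts:
            if (a1, b1) == (a2, b2):
                continue
            if (a1 == a2 or a2 in g[a1]) and (b1 == b2 or b2 in h[b1]):
                adj[(a1, b1)].add((a2, b2))
    return adj


def independence_number(adj, candidates=None):
    """Size of a maximum independent set, by branching on a max-degree vertex."""
    if candidates is None:
        candidates = set(adj)
    if not candidates:
        return 0
    v = max(candidates, key=lambda u: len(adj[u] & candidates))
    if not adj[v] & candidates:
        return len(candidates)  # no edges left among the candidates
    without_v = independence_number(adj, candidates - {v})
    with_v = 1 + independence_number(adj, candidates - {v} - adj[v])
    return max(with_v, without_v)


def zero_error_horizon(adj, n_max):
    """Smallest n <= n_max whose n-fold strong power has two non-adjacent vertices."""
    power = adj
    for n in range(1, n_max + 1):
        if independence_number(power) >= 2:
            return n
        if n < n_max:
            power = strong_product(power, adj)
    return None


c5 = cycle_graph(5)
print(independence_number(c5))                      # 2
print(independence_number(strong_product(c5, c5)))  # 5
print(zero_error_horizon(c5, 2))                    # 1
print(zero_error_horizon(complete_graph(3), 2))     # None
```

The complete-graph case shows why assistance matters: strong powers of a complete graph stay complete, so without extra resources no block length yields two distinguishable messages.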

Entanglement-assisted and, more generally, non-signalling-assisted schemes can strictly reduce the zero-error horizon:

  • For certain constructions (e.g., Bell–Kochen–Specker channels), sharing entanglement yields a strictly larger one-shot zero-error capacity than is achievable classically.
  • Non-signalling resources can enable zero-error transmission even where unassisted classical coding cannot, as determined by a hypergraph fractional packing condition (0911.5300).

This demonstrates that ZEH is not intrinsic to the bare channel but contingent on available shared resources (shared randomness, entanglement, NS correlations), providing a unifying framework for resource-oriented separations in zero-error tasks.

5. Algorithmic and Computational Aspects

ZEH measurement commonly involves exhaustive enumeration:

  1. Fix system (e.g., LLM+prompt+decoding), task, and input-size definition.
  2. For $n = 1, 2, \dots$:
    • Enumerate all $x \in B_n$.
    • Apply $M$ to $x$; validate the outcome.
    • If an error occurs, $n - 1$ is the ZEH, and the first failing $x$ is the ZEH limiter (Sato, 22 Jan 2026).
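The enumeration procedure above can be sketched in a few lines. This is a minimal illustration under stated assumptions: `measure_zeh`, `binary_strings`, `true_parity`, and `toy_model` are hypothetical names, and `toy_model` is a deliberately flawed stand-in for an LLM call, not a real model.

```python
from itertools import product
from typing import Callable, Iterable, Optional, Tuple


def measure_zeh(
    model: Callable[[str], str],
    instances_of_size: Callable[[int], Iterable[str]],
    reference: Callable[[str], str],
    max_size: int,
) -> Tuple[int, Optional[str]]:
    """Return (ZEH, limiter); limiter is None if no failure is found up to max_size."""
    for n in range(1, max_size + 1):
        for x in instances_of_size(n):
            if model(x) != reference(x):
                return n - 1, x  # first failure at size n: the horizon is n - 1
    return max_size, None  # only a lower bound on the true ZEH


def binary_strings(n: int) -> Iterable[str]:
    """All binary strings of length n, in lexicographic order."""
    return ("".join(bits) for bits in product("01", repeat=n))


def true_parity(s: str) -> str:
    return str(s.count("1") % 2)


def toy_model(s: str) -> str:
    """Hypothetical flawed system: it only ever looks at the first four bits."""
    return true_parity(s[:4])


print(measure_zeh(toy_model, binary_strings, true_parity, max_size=8))
# (4, '00001')
```

The early return matters in practice: once a limiter is found, no larger sizes need to be enumerated. When the loop finishes without a failure, the returned value is only a lower bound on the ZEH.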

For large-scale problems (e.g., LLMs), direct enumeration is computationally prohibitive. Several optimizations are available:

  • Teacher Forcing: Short-circuits token-by-token decoding.
  • Lookahead Batching: Batches instances by size for GPU utilization and early exit on failures.
  • Prompt KV-Cache Prefilling: Re-uses context attention cache across many instances sharing a prompt.
  • Tree-Structured Decoding (FlashTree): Collapses computation along shared-autoregressive answer suffixes.
  • Empirical metrics: FlashTree yields a speedup over naive methods for LLM ZEH computation.
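The general idea behind sharing computation across instances can be illustrated by counting tokens. The sketch below assumes a prefix trie over character-level tokens, which is an assumption made for illustration and not necessarily how FlashTree or the KV-cache prefilling in the cited work is implemented. It compares the number of token positions processed when every sequence is handled separately against the number of distinct trie nodes.

```python
def naive_token_count(sequences):
    """Token positions processed when each sequence is handled independently."""
    return sum(len(seq) for seq in sequences)


def trie_token_count(sequences):
    """Token positions processed when common leading tokens are computed once."""
    root = {}
    nodes = 0
    for seq in sequences:
        node = root
        for tok in seq:
            if tok not in node:
                node[tok] = {}
                nodes += 1  # a new trie node is one unit of new computation
            node = node[tok]
    return nodes


# Hypothetical multiplication prompts "a*b=" for two-digit operands 10..19.
prompts = [f"{a}*{b}=" for a in range(10, 20) for b in range(10, 20)]

print(naive_token_count(prompts))  # 600
print(trie_token_count(prompts))   # 231
```

In this toy case, grouping by shared leading tokens cuts the processed positions by more than half; the gain grows when a long fixed prompt precedes every instance.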

6. Implications, Limitations, and Future Directions

ZEH delivers a guaranteed boundary of all-correct performance: within the ZEH, no failures occur under fixed settings; immediately beyond it, at least one failure (the ZEH limiter) is known to exist. This allows concrete auditing via ZEH limiters and supports warning signals for out-of-horizon inputs (e.g., safety-critical pipelines that fall back to reliable systems or human intervention).
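An out-of-horizon guard of the kind described above might look as follows. This is a minimal sketch: `with_horizon_guard`, the size function, and the parity stand-ins are hypothetical names introduced for illustration, and the guard is only meaningful under the same fixed settings used when the ZEH was measured.

```python
def with_horizon_guard(model, fallback, size_of, zeh):
    """Route inputs within the measured horizon to `model`, all others to `fallback`."""
    def run(x):
        if size_of(x) <= zeh:
            return model(x), "model"
        return fallback(x), "fallback"
    return run


def exact_parity(s):
    """Trusted reference implementation used as the fallback."""
    return s.count("1") % 2


def untrusted_parity(s):
    """Hypothetical system that is only verified up to length 4."""
    return s[:4].count("1") % 2


guarded_parity = with_horizon_guard(untrusted_parity, exact_parity, len, zeh=4)

print(guarded_parity("1100"))   # (0, 'model')
print(guarded_parity("11001"))  # (1, 'fallback')
```

Returning the route alongside the answer keeps an audit trail of which inputs left the verified regime.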

Limitations include:

  • High sensitivity to prompt and context (valid only under fixed settings).
  • Deterministic decoding requirements—stochastic decoding may yield different horizons.
  • Combinatorial explosion for all but toy tasks, necessitating sampling or formal verification for realistic use cases.
  • ZEH fragility (“collapse”) due to single-bug or randomness-induced brittleness.

Research directions target efficient approximate ZEH estimation (statistical/adversarial sampling), formal methods for symbolic guarantees, extension to stateful or interactive systems, and context-dependent ZEH metrics suitable for complex building blocks (multi-step reasoning, programmatic outputs) (Sato, 22 Jan 2026, Battaglia et al., 1 May 2025).

7. Comparative Perspectives and Conceptual Unification

ZEH bridges disparate fields through a common lens of guaranteed error-free range:

  • In LLM evaluation, it complements accuracy by exposing singular critical failures.
  • In robust linear solvers, it informs the maximal tolerable adversarial corruption before breakdown.
  • In channel coding, it quantifies capacity onset under various side-resources.

The unifying principle is the focus on guaranteed total correctness, not expected or averaged performance, yielding critical insights into capability, audit, and reliability boundaries (Sato, 22 Jan 2026, Battaglia et al., 1 May 2025, 0911.5300).


References

  • "Even GPT-5.2 Can't Count to Five: The Case for Zero-Error Horizons in Trustworthy LLMs" (Sato, 22 Jan 2026)
  • "Quantile-RK and Double Quantile-RK Error Horizon Analysis" (Battaglia et al., 1 May 2025)
  • "Improving zero-error classical communication with entanglement" (0911.5300)
