Finite-State Dimension in Automata Theory

Updated 8 February 2026
  • Finite-state dimension is a measure that quantifies the asymptotic information density of infinite sequences as perceived by finite automata using block-entropy rates and compressibility methods.
  • It admits multiple equivalent characterizations via finite-state compressors, block entropy, martingale strategies, and automatic Kolmogorov complexity, linking algorithmic information theory with ergodic theory.
  • Extensions to multihead and relative models reveal a nuanced hierarchy of randomness, underpinning applications in fractal geometry and predictive modeling of symbolic data.

Finite-state dimension quantifies the asymptotic information density of infinite sequences as perceived by finite automata. It is the prototypical quantitative effectivization of classical Hausdorff dimension for symbolic data, forming a rigorous bridge between algorithmic information theory, ergodic theory, and automata-theoretic complexity. Finite-state dimension has robust equivalent characterizations in terms of block entropy rates, finite-state compressibility, finite-state gambling strategies (martingales), and automatic Kolmogorov complexity, and admits powerful extensions to multihead and relative (conditional) variants. Its study reveals both the automata-theoretic foundation of Borel normality and a fine-grained hierarchy of "randomness" and compressibility for infinite sequences.

1. Equivalent Characterizations

There are several mathematically equivalent ways to define the finite-state dimension $\dim_{FS}(X)$ of an infinite sequence $X \in \Sigma^\omega$ over a finite alphabet $\Sigma$, all of which yield a value in $[0,1]$.

Block-Entropy Rate Definition:

Let $k \geq 1$. For aligned blocks of length $k$ in $X$, form the empirical distribution $P_{k,N}$ over all $w \in \Sigma^k$ in the first $N$ $k$-blocks, and define the Shannon entropy $H(P_{k,N}) = -\sum_{w \in \Sigma^k} P_{k,N}(w) \log_{|\Sigma|} P_{k,N}(w)$. The lower block entropy rate is $\underline{H}_k(X) = \liminf_{N \to \infty} \frac{1}{k} H(P_{k,N})$. Then

$$\dim_{FS}(X) = \inf_{k \geq 1} \underline{H}_k(X)$$

The use of non-aligned (sliding-window) blocks or disjoint blocks yields the same limit (Kozachinskiy et al., 2017, Becher et al., 2024).
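The block-entropy quantity above can be estimated on a finite prefix. The following is a minimal sketch, not a reference implementation: the function name `block_entropy_rate` and its parameters are hypothetical, and a finite prefix only approximates the limit inferior and the infimum over $k$ in the definition.

```python
from collections import Counter
from math import log


def block_entropy_rate(prefix: str, k: int, alphabet_size: int = 2) -> float:
    """Per-symbol entropy of the aligned k-block distribution of a finite prefix.

    Hypothetical helper: logarithms are taken to base `alphabet_size`,
    so the result lies in [0, 1].
    """
    n_blocks = len(prefix) // k
    if n_blocks == 0:
        raise ValueError("prefix is shorter than one block")
    # Empirical distribution over the first n_blocks aligned k-blocks.
    counts = Counter(prefix[i * k:(i + 1) * k] for i in range(n_blocks))
    entropy = sum((c / n_blocks) * log(n_blocks / c, alphabet_size)
                  for c in counts.values())
    return entropy / k


periodic = "01" * 5000
rate_k1 = block_entropy_rate(periodic, 1)  # symbols are balanced, so this is 1
rate_k2 = block_entropy_rate(periodic, 2)  # only the block "01" occurs, so this is 0
print(rate_k1, rate_k2)
```

The periodic example shows why the infimum over block lengths matters: single-symbol frequencies look uniform, but blocks of length 2 already expose the regularity.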

Finite-State Compressor (Automatic Kolmogorov Complexity):

An information-lossless finite-state compressor $C$ is a finite automaton transducer whose output and terminal state uniquely determine the input. The finite-state dimension is

$$\dim_{FS}(X) = \inf_{C} \liminf_{n \to \infty} \frac{|C(X \upharpoonright n)|}{n}$$

where $X \upharpoonright n$ is the length-$n$ prefix of $X$, output length is measured in symbols of $\Sigma$, and the infimum is over all such $C$ (Kozachinskiy et al., 2017, Mayordomo, 2022).
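As an illustration of an information-lossless finite-state compressor, the sketch below implements one small transducer over the binary alphabet. Everything in it is an assumption made for the example: the two-state design (either no pending bit or one pending bit), the prefix-free table `PAIR_CODE`, and the function names. It gives an upper bound on the compression ratio for one particular compressor, not the infimum over all compressors.

```python
# Hypothetical prefix-free code for pairs of input bits.
PAIR_CODE = {"00": "0", "01": "100", "10": "101", "11": "11"}


def compress(bits):
    """Run the transducer and return (output, final_state)."""
    state = ""  # "" means no pending bit; otherwise the pending input bit
    out = []
    for b in bits:
        if state == "":
            state = b                       # remember the bit, emit nothing
        else:
            out.append(PAIR_CODE[state + b])  # emit the codeword for the pair
            state = ""
    return "".join(out), state


def decompress(output, final_state):
    """Recover the input from the output and the terminal state."""
    decode = {code: pair for pair, code in PAIR_CODE.items()}
    bits, word = [], ""
    for c in output:
        word += c
        if word in decode:
            bits.append(decode[word])
            word = ""
    return "".join(bits) + final_state


def compression_ratio(bits):
    output, _ = compress(bits)
    return len(output) / len(bits)


sparse = ("0" * 9 + "1") * 1000
print(compression_ratio(sparse))  # 7 output bits per 10 input bits
```

Because the code table is prefix-free and the terminal state records any unpaired bit, `decompress(*compress(x))` returns `x`, which is the information-lossless property described above.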

Finite-State Gambling (Gales):

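A finite-state $s$-gale is a betting function $d$ (a martingale when $s = 1$) implementable by a finite automaton, such that $d(w) = 2^{-s}[d(w0) + d(w1)]$ for all $w \in \{0,1\}^*$ (binary case). The dimension is

$$\dim_{FS}(X) = \inf\{\, s : \text{some finite-state } s\text{-gale } d \text{ satisfies } \limsup_{n \to \infty} d(X \upharpoonright n) = \infty \,\}$$

where $X \upharpoonright n$ is the length-$n$ prefix of $X$. A strong variant using liminf instead of limsup defines the strong finite-state dimension $\mathrm{Dim}_{FS}(X)$ (Kozachinskiy et al., 2017, Lutz et al., 2021).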
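To make the gale condition concrete, the sketch below tracks the capital of the simplest possible finite-state gambler: a single state that always stakes the same fraction on the next bit being 0. The function name `log2_capital` and the parameter `p_zero` are hypothetical, and the capital is kept in log scale to avoid overflow. A one-state gambler only witnesses an upper bound on the dimension; richer automata can succeed at smaller $s$.

```python
from math import log2


def log2_capital(prefix: str, s: float, p_zero: float) -> float:
    """log2 of d(prefix) for a one-state s-gale with d(empty string) = 1.

    After each bit the capital is multiplied by 2**s * bet, where bet is
    p_zero for '0' and 1 - p_zero for '1'. Summing over both outcomes gives
    2**s * d(w), which is exactly the s-gale condition.
    """
    total = 0.0
    for b in prefix:
        bet = p_zero if b == "0" else 1.0 - p_zero
        total += s + log2(bet)
    return total


x = "0001" * 1000                      # three quarters of the bits are 0
print(log2_capital(x, 0.9, 0.75) > 0)  # capital grows at s = 0.9
print(log2_capital(x, 0.7, 0.75) > 0)  # capital shrinks at s = 0.7
```

For this gambler the growth threshold is the Shannon entropy of the symbol frequencies (about 0.811 here), matching the $k = 1$ block-entropy bound.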

Automatic Kolmogorov Complexity and Superadditivity:

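Let $R$ be a rational relation on $\Sigma^*$ defined by a finite automaton; $K_R(x) = \min\{|p| : (p, x) \in R\}$ is the minimal length of a description mapping to $x$ under $R$. Then

$$\dim_{FS}(X) = \inf_{R} \liminf_{n \to \infty} \frac{K_R(X \upharpoonright n)}{n}$$

where $X \upharpoonright n$ is the length-$n$ prefix of $X$ and the infimum is over all such automatic description modes (Kozachinskiy et al., 2017).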

2. Foundational Properties and Normality

Borel normality, the property that every possible block of a given length appears with the expected limiting frequency, is exactly the condition for maximal finite-state dimension: a sequence $X$ is normal if and only if $\dim_{FS}(X) = 1$.

Periodic or ultimately periodic sequences have $\dim_{FS}(X) = 0$. Intermediate dimensions $s \in (0,1)$ can be realized by mixing periodic and normal segments, and via explicit constructions such as Liouville numbers with prescribed finite-state dimension (Nandakumar et al., 2012).

For saturated sets $S_\alpha$ of all sequences over $\Sigma$ with limiting symbol frequencies $\alpha$, $\dim_{FS}(S_\alpha) = H(\alpha)$, where $H(\alpha)$ is the Shannon entropy. In such sets, every individual sequence $X \in S_\alpha$ satisfies $\dim_{FS}(X) \leq H(\alpha)$; this is a pointwise, not merely almost-everywhere, property [0703085].
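As a worked instance of the entropy value in the saturated-set statement, take the binary alphabet and the illustrative frequency vector $\alpha = (1/4, 3/4)$ (chosen here only as an example), with logarithms to base $|\Sigma| = 2$:

```latex
\begin{aligned}
H(\alpha) &= -\tfrac{1}{4}\log_2\tfrac{1}{4} \;-\; \tfrac{3}{4}\log_2\tfrac{3}{4} \\
          &= \tfrac{1}{2} + \tfrac{3}{4}\left(2 - \log_2 3\right) \\
          &= 2 - \tfrac{3}{4}\log_2 3 \;\approx\; 0.811 .
\end{aligned}
```

So the class of binary sequences in which the symbol 0 has limiting frequency $1/4$ has finite-state dimension about $0.811$, strictly between the periodic case and the normal case.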

3. Information-Theoretic and Markov Chain Characterizations

Recent developments establish Markov chain-based and information-theoretic perspectives:

Markov Chain Characterization:

For $X \in \Sigma^\omega$, drive every irreducible Markov chain $M$ with the bits of $X$ and consider the set of limiting empirical edge-state distributions that result. The dimension $\dim_{FS}(X)$ is then determined by the conditional Kullback-Leibler divergence of these limiting distributions from the stationary distribution of $M$, optimized over all such chains. For strong dimension, the roles of infimum and supremum are exchanged (Bienvenu et al., 21 Oct 2025).

This broadens the Schnorr-Stimm correspondence: Borel normality is equivalent to stationarity of all such simulated Markov chains. Finite-state dimension quantifies the degree of statistical divergence from this ideal (Bienvenu et al., 21 Oct 2025).

Gambling and Prediction Equivalence:

The equivalence of block-entropy, finite-state martingales, automatic Kolmogorov complexity, and compressor formulations is established via duality arguments, superadditivity, Kraft-type inequalities, and convexity of Shannon entropy (Kozachinskiy et al., 2017, S, 10 Feb 2025). These connections make explicit the precise sense in which finite automata can exploit regularities for compression, prediction, or statistical bias.

4. Multihead, Multi-bet, and Relative Dimensions

Finite-state dimension generalizes via more complex automata models:

Multihead Finite-State Dimension:

Multihead finite-state gamblers have $h$-head architectures with oblivious, one-way movement. For each $h \geq 1$, the $h$-head finite-state dimension $\dim_{FS}^{(h)}$ is defined via $h$-FSGs (finite-state gamblers), and

$$\dim_{FS}^{\mathrm{MH}}(X) = \inf_{h \geq 1} \dim_{FS}^{(h)}(X)$$

The 1-head case recovers classical $\dim_{FS}$, and a strict hierarchy holds: for each $h$, there exist sequences for which $\dim_{FS}^{(h+1)}(X) < \dim_{FS}^{(h)}(X)$. Multihead dimension is stable under finite unions, but fixed-$h$ predimensions are not (Huang et al., 26 Sep 2025, Lutz, 20 Oct 2025).

Multi-bet (Product Gales) Dimension:

Product gales and $k$-bet finite-state gamblers spread bets across $k$ separate accounts at each symbol. Multi-bet finite-state dimension matches classical finite-state dimension, and both are characterized by the limiting sliding (or disjoint) block entropy rate (S, 10 Feb 2025).

Relative and Conditional Dimension:

Relative finite-state dimension $\dim_{FS}^{Y}(X)$ allows the automaton constant look-ahead or oracle access to another sequence $Y$. This is precisely characterized via conditional block entropies of $X$ given $Y$. This framework unifies conditional normality and multidimensional selection principles, underpinning symmetries akin to van Lambalgen's theorem in algorithmic randomness (Nandakumar et al., 2023, Shen, 2024).
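A conditional block entropy can be estimated from two finite prefixes with the chain rule $H(A \mid B) = H(A, B) - H(B)$ applied to aligned blocks. The sketch below is an assumption-laden illustration: the function name, the use of aligned blocks at the same positions in both sequences, and the normalization by block length are choices made here, not the precise definition used in the cited papers.

```python
from collections import Counter
from math import log


def _entropy(counts, total, base):
    """Shannon entropy of an empirical distribution given as counts."""
    return sum((c / total) * log(total / c, base) for c in counts.values())


def conditional_block_entropy_rate(x, y, k, alphabet_size=2):
    """Per-symbol entropy of aligned k-blocks of x given the matching k-blocks of y."""
    n_blocks = min(len(x), len(y)) // k
    if n_blocks == 0:
        raise ValueError("prefixes are shorter than one block")
    pairs = [(x[i * k:(i + 1) * k], y[i * k:(i + 1) * k]) for i in range(n_blocks)]
    joint = Counter(pairs)
    marginal_y = Counter(y_block for _, y_block in pairs)
    # Chain rule: H(X-block | Y-block) = H(X-block, Y-block) - H(Y-block).
    h = _entropy(joint, n_blocks, alphabet_size) - _entropy(marginal_y, n_blocks, alphabet_size)
    return max(0.0, h) / k  # clamp tiny negative values caused by rounding


x = "01" * 1000
print(conditional_block_entropy_rate(x, x, 1))           # y reveals x completely
print(conditional_block_entropy_rate(x, "0" * 2000, 1))  # a constant y reveals nothing
```

The two calls bracket the possible behavior: conditioning on the sequence itself removes all uncertainty, while conditioning on a constant sequence leaves the unconditional block entropy unchanged.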

5. Structural Results, Examples, and Selection Principles

The dimension is robust under standard transformations:

  • Rational translations and scalings (e.g., $x \mapsto qx + r$ with rational $q \neq 0$ and rational $r$) preserve $\dim_{FS}(x)$ (Nandakumar et al., 2012, Clanin et al., 3 Jun 2025).
  • Polynomial images: Linear rational-coefficient polynomials preserve finite-state dimension, but higher-degree polynomials or real coefficients can alter it arbitrarily (Clanin et al., 3 Jun 2025).
  • Saturated sets: For the symbol-frequency-constrained class $S_\alpha$ of sequences with limiting symbol frequencies $\alpha$, $\dim_{FS}(S_\alpha) = H(\alpha)$, matching Hausdorff and packing dimensions [0703085].

Selection Principles:

  • Agafonov's theorem and its extensions: For any regular (finite-state) selection rule, the dimension of any subsequence selected from $X$ is preserved up to sharply computable bounds via stationary mass, with equality in the normal/full-dimension case (Bienvenu et al., 21 Oct 2025).
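  • Arithmetic progression subsequences: For $X \in \Sigma^\omega$ and its $d$-AP subsequences (the symbols of $X$ at positions $i, i+d, i+2d, \ldots$ for each offset $0 \leq i < d$), $\dim_{FS}(X)$ satisfies an inequality in terms of the dimensions of these subsequences, with equality characterizing normality and facilitating strong converses to Wall's theorem (Nandakumar et al., 2023).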

Liouville Numbers:

For every rational $s \in [0,1]$ and base $b$, there exists a Liouville number whose base-$b$ expansion has finite-state dimension $s$, demonstrating the existence of transcendental numbers with prescribed compressibility profiles (Nandakumar et al., 2012).

6. Connections to Fractal Geometry, Entropy, and Computational Aspects

  • For saturated classes and many invariant subshifts, finite-state dimension coincides with Hausdorff dimension and (in the strong variant) packing dimension, ensuring alignment with geometric fractal measures [0703085].
  • The entropy-rate interpretation of finite-state dimension situates it as a lower asymptotic density, i.e., the minimal compression rate achievable via finite automata, or equivalently, the minimal average uncertainty per symbol as measured by empirical block frequencies (Becher et al., 2024, Lutz et al., 2021).
  • Rauzy's sliding-window predictor mismatch rates and their sharp relationship to block entropies provide algorithmic, computable bounds, yielding practical methods for lower and upper bounding $\dim_{FS}(X)$ on observed data (Becher et al., 2024).

7. Point-to-Set Principle, Generalizations, and Modern Directions

Point-to-Set Principle:

Finite-state dimension admits a point-to-set characterization: for a set $E$ of points, $\dim_{FS}(E) = \inf_{f} \sup_{x \in E} \dim_{FS}^{f}(x)$, where $f$ runs over "separator enumerators" densely enumerating points in the underlying space. This mirrors the classical result for Hausdorff dimension and provides an operational route to assessing dimension through automata-encoded precision information (Mayordomo, 2022).

Generalizations and Further Structure:

  • Multihead and multi-bet extensions allow increasingly rich automata architectures, revealing strict hierarchies and novel forms of stability and instability for unions and transformations (Huang et al., 26 Sep 2025, Lutz, 20 Oct 2025, S, 10 Feb 2025).
  • Weyl's criterion admits an extension: finite-state dimension can be characterized in terms of the infimum of lower average entropies of all weak subsequential limits of empirical measures from Weyl exponential sums, providing a bridge to harmonic analysis and number-theoretic randomness (Lutz et al., 2021).

Finite-state dimension is characterized by several equivalent definitions, invariance under standard transformations, and quantitative sensitivity to the algorithmic and statistical properties of sequences. Its ongoing generalizations (multihead, multi-bet, and relative models) broaden the interface between automata theory, information theory, fractal geometry, and the algorithmic foundations of randomness.