
MSO k-ary Queries & Extensions

Updated 13 December 2025
  • MSO k-ary queries are defined using monadic second-order logic with k free variables to express relations over words, trees, and other structures.
  • The analysis reveals that plain MSO cannot define k-hashing properties, necessitating minimal extensions like equi-cardinality predicates for exact counting.
  • Parameterization results and automata techniques enable encoding query outputs efficiently, impacting coding theory, transductions, and logical query optimization.

Monadic second-order (MSO) logic provides a canonical formalism for expressing queries and properties over words, trees, and other relational structures. MSO k-ary queries are the relations expressible by MSO formulas with $k$ free first-order variables: such a formula maps each structure to the set of $k$-tuples that satisfy it. These queries underlie a wide range of definability, expressiveness, and model-theoretic phenomena at the interface of logic, automata, and combinatorics, with applications in coding theory, string transductions, and expressive power characterizations. Recent advances delineate the boundaries of such definability, with implications both for structural complexity and for the practical design of logical formalisms.

1. Formal Definitions: MSO k-ary Queries and Key Examples

Let $\Sigma$ be a finite alphabet and $w \in \Sigma^*$ a finite word. The MSO structure on $w$ has domain $\{1, \ldots, |w|\}$, the natural order $<$, and one unary predicate $P_a$ for each letter $a \in \Sigma$, where $P_a(x)$ holds iff position $x$ carries the letter $a$. Monadic second-order logic allows quantification over both individual positions and sets of positions. A $k$-ary MSO query is any formula $\varphi(x_1,\ldots,x_k)$ whose free variables $x_1,\ldots,x_k$ are first-order variables interpreted as word positions.

For each $w$, the result set is

$$Q_\varphi(w) = \{ (i_1, \ldots, i_k) \mid w \models \varphi(i_1,\ldots,i_k) \}.$$

Such definable relations encompass a wide range of combinatorial properties. Example: for $\Sigma = \{0,1\}$, let $\eta_0(x) := P_0(x)$ and $\eta_1(x) := P_1(x)$. The ternary query $\varphi(i,j,d) := P_0(i) \wedge P_1(j) \wedge \text{Even}_0^d(i+1,j-1)$, where $\text{Even}_0^d(i+1,j-1)$ expresses that the number of 0's strictly between positions $i$ and $j$ is congruent to $d$ mod $2$, is MSO-definable; its result set has size $O(\#0(w)\cdot\#1(w))$, where $\#a(w)$ denotes the number of occurrences of the letter $a$ in $w$ (Nguyên et al., 6 Dec 2025).
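To make the result set concrete, here is a minimal brute-force sketch that enumerates it for the example query. It is an illustration, not an MSO evaluation algorithm. Two assumptions are made: the third component `d` is read as a parity bit in {0, 1} rather than as an arbitrary position, and the interval strictly between `i` and `j` is taken to be empty whenever `j <= i + 1`. The function names are hypothetical.

```python
from itertools import product

def zeros_strictly_between(w: str, i: int, j: int) -> int:
    """Count letters '0' at positions p with i < p < j (positions are 1-based)."""
    return sum(1 for p in range(i + 1, j) if w[p - 1] == "0")

def phi(w: str, i: int, j: int, d: int) -> bool:
    """P_0(i) and P_1(j) and Even_0^d(i+1, j-1); d is a parity bit (assumption)."""
    return (
        w[i - 1] == "0"
        and w[j - 1] == "1"
        and zeros_strictly_between(w, i, j) % 2 == d
    )

def result_set(w: str) -> set:
    """Q_phi(w): all (i, j, d) with w |= phi(i, j, d), found by exhaustive search."""
    positions = range(1, len(w) + 1)
    return {
        (i, j, d)
        for i, j, d in product(positions, positions, (0, 1))
        if phi(w, i, j, d)
    }

w = "0101"
answers = result_set(w)
print(sorted(answers))
# The size never exceeds (#0's) * (#1's): each pair (i, j) admits exactly one parity d.
print(len(answers), "<=", w.count("0") * w.count("1"))
```

Under these assumptions every pair consisting of a 0-position and a 1-position contributes exactly one triple, which is why the size bound by the product of the two letter counts holds for this query.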

2. Definability and Limitations: The k-Hashing Problem

Certain $k$-ary relations of interest in computer science are not MSO-definable. For integers $n \geq 1$ and $1 \leq k \leq b$, let $(w_0,\ldots,w_{k-1}) \in (\{0,1,\ldots,b-1\}^n)^k$ be a $k$-tuple of words. The tuple is $(b,k)$-hashed if there exists a coordinate $1 \leq \ell \leq n$ such that all $k$ words have pairwise distinct symbols at position $\ell$.
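The definition can be checked directly outside of any logic. The sketch below tests whether a tuple of words is hashed, and whether a code is $k$-hashing in the usual sense that every choice of $k$ distinct codewords is hashed. Words are represented as Python strings over digit symbols, and the function names are illustrative assumptions.

```python
from itertools import combinations

def is_hashed(words: list) -> bool:
    """True iff some coordinate carries pairwise distinct symbols across all words."""
    n = len(words[0])
    if any(len(word) != n for word in words):
        raise ValueError("all words must have the same length n")
    # A coordinate is a witness when its column contains as many symbols as there are words.
    return any(len({word[pos] for word in words}) == len(words) for pos in range(n))

def is_k_hash_code(code: list, k: int) -> bool:
    """True iff every k-subset of the codewords is hashed (brute force over subsets)."""
    return all(is_hashed(list(subset)) for subset in combinations(code, k))

print(is_hashed(["012", "001", "020"]))   # second coordinate reads 1, 0, 2
print(is_hashed(["01", "00", "11"]))      # no coordinate separates all three words
```

The brute-force subset check is exponential in $k$; it is meant only to pin down the property whose logical definability is at issue.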

Testing whether a code is $k$-hashing, and computing the maximal size of $k$-hash codes, arises in combinatorics and information theory. In the MSO framework, words are modeled as paths in the infinite $b$-ary tree, with sets $X_0,\ldots,X_{k-1}$ encoding the $k$ words as sets of tree nodes. An "obvious" MSO formula attempts to witness the $k$-hash property by existentially guessing a level $\ell$ with $k$ distinct next-edge moves, corresponding to the $k$ symbols, but this approach ultimately fails in full generality.
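The tree encoding itself is easy to picture in code. In the sketch below, a node of the $b$-ary tree is represented as the tuple of edge labels leading to it from the root, and a word is encoded as the set of its prefixes. Both choices are assumptions made for illustration. The check is ordinary Python and says nothing about MSO definability; it only shows what "guess a level and compare the incoming edges" means on this encoding.

```python
def path_nodes(word: tuple) -> set:
    """Set of tree nodes on the root-to-leaf path spelled by `word` (all its prefixes)."""
    return {word[:depth] for depth in range(len(word) + 1)}

def hashed_via_paths(paths: list, n: int) -> bool:
    """Try each level 1..n and test whether the k edges entering that level differ."""
    for level in range(1, n + 1):
        # Each path has exactly one node at a given depth; its last symbol labels the edge taken.
        edge_labels = [
            next(node for node in path if len(node) == level)[-1] for path in paths
        ]
        if len(set(edge_labels)) == len(paths):
            return True
    return False

words = [(0, 1, 2), (0, 0, 1), (0, 2, 0)]
paths = [path_nodes(word) for word in words]
print(hashed_via_paths(paths, n=3))
```

Note that comparing the nodes at a level is not enough: two paths can sit at different nodes of a level because they diverged earlier while still taking the same last edge, so the comparison has to be on edge labels.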

A central theorem establishes that for all $n$ (finite or infinite words), no MSO formula defines the $k$-hashing relation (Costa et al., 16 Sep 2025). The proof uses Ehrenfeucht–Fraïssé games for MSO: even over paths differing at a unique coordinate, Duplicator has strategies ensuring that no MSO formula of bounded rank can distinguish the tuples $(X_0,X_1,\ldots,X_{k-1})$ and $(Y_0,X_1,\ldots,X_{k-1})$ with $X_0 \neq Y_0$, which rules out any candidate defining formula.

3. Overcoming Limitations via Counting Extensions

The inexpressibility of $k$-hashing in plain MSO is traced to the inability of MSO to express exact cardinalities of sets in the absence of counting quantifiers. While MSO can existentially assert the presence of paths through a given level, it cannot constrain the number of such intersections to be exactly one per set, nor can it enforce that the selected nodes are all distinct.

Adding an equi-cardinality predicate $\mathrm{eqcard}(U, V)$, true iff $|U| = |V| < \infty$, yields an extension denoted MSO+$\mathrm{eqcard}$. In MSO+$\mathrm{eqcard}$, one can define:

  • Singletons: $U$ is a singleton iff $\exists x\, \bigl(x \in U \wedge \mathrm{eqcard}(U, \{x\})\bigr)$.
  • $k$-distinction: $\bigwedge_{i<j} \neg\,\mathrm{eqcard}(U_i, U_j)$.

Thus, the $(b,k)$-hashing property becomes MSO+$\mathrm{eqcard}$-definable by existentially guessing a level $D$, picking a singleton intersection $U_i$ for each path, and requiring all $U_i$ to be mutually distinct nodes. This extension suffices; in fact, exact cardinality, or more generally the ability to express $|U| = |V| + c$ for small $c$, is the minimal necessary augmentation (Costa et al., 16 Sep 2025). Modulo-counting quantifiers (CMSO) or full Presburger arithmetic also suffice.
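The intended semantics of the equi-cardinality predicate on finite sets can be spelled out in a few lines. The sketch below is only a model of the predicate and of the singleton test built from it, with sets represented as Python `frozenset` values; the function names, including `differs_by` for the offset variant, are hypothetical.

```python
def eqcard(u: frozenset, v: frozenset) -> bool:
    """eqcard(U, V): both sets are finite and have the same number of elements."""
    return len(u) == len(v)  # Python sets are always finite

def is_singleton(u: frozenset) -> bool:
    """Some x in U satisfies eqcard(U, {x}); the witness x also rules out the empty set."""
    return any(eqcard(u, frozenset({x})) for x in u)

def differs_by(u: frozenset, v: frozenset, c: int) -> bool:
    """The offset variant |U| = |V| + c for a fixed constant c."""
    return len(u) == len(v) + c

print(is_singleton(frozenset({7})))
print(is_singleton(frozenset({1, 2})))
print(differs_by(frozenset({1, 2, 3}), frozenset({9}), 2))
```

The existential witness matters: without requiring some `x` in `U`, the comparison against a one-element set would have nothing to range over when `U` is empty.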

| Logic | Can define $k$-hashing? | Counting feature |
|---|---|---|
| MSO | No | None |
| MSO+$\mathrm{eqcard}$ | Yes | Equi-cardinality (exact counting) |
| CMSO | Yes | Modulo counting |

4. Parameterization and MSO k-ary Query Structure

A complementary direction is provided by the finer reparameterisation theorem for MSO/FO queries on strings (Nguyên et al., 6 Dec 2025). Suppose $\varphi(\vec{x})$ is a $k$-ary MSO query over words, and $\eta_1(x),\ldots,\eta_\ell(x)$ are unary MSO formulas. Write $|\varphi(w)|$ for the size of the result set $Q_\varphi(w)$, and likewise $|\eta_m(w)|$ for each $\eta_m$. If $|\varphi(w)| = O(|\eta_1(w)| \cdots |\eta_\ell(w)|)$ for all $w$, then each solution $\vec{i}$ to $\varphi$ can be MSO-definably encoded via an $\ell$-tuple $(j_1,\ldots,j_\ell)$ with $w \models \eta_m(j_m)$ for each $m$, up to $O(1)$ ambiguity. Formally, an MSO formula $\psi(\vec{x}; \vec{y})$ establishes a total function from $Q_\varphi(w)$ to tuples of positions witnessing the $\eta$'s.
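The statement can be illustrated on the ternary example from Section 1, where the two unary formulas select the 0-positions and the 1-positions. The sketch below checks, by brute force, that mapping each solution $(i,j,d)$ to the pair $(i,j)$ lands in positions satisfying the unary formulas and has bounded ambiguity. It illustrates the conclusion of the theorem on one query and is not the construction from the paper; the encoding `encode` and the reading of `d` as a parity bit are assumptions.

```python
from collections import Counter
from itertools import product

def solutions(w: str) -> set:
    """(i, j, d): P_0(i), P_1(j), and the number of 0's strictly between has parity d."""
    pos = range(1, len(w) + 1)
    return {
        (i, j, d)
        for i, j, d in product(pos, pos, (0, 1))
        if w[i - 1] == "0"
        and w[j - 1] == "1"
        and sum(w[p - 1] == "0" for p in range(i + 1, j)) % 2 == d
    }

def encode(solution: tuple) -> tuple:
    """Hypothetical parameterization: keep the two anchor positions (j_1, j_2) = (i, j)."""
    i, j, _d = solution
    return (i, j)

def max_ambiguity(w: str) -> int:
    """Largest number of solutions that share one parameter tuple."""
    codes = Counter()
    for s in solutions(w):
        j1, j2 = encode(s)
        # The parameters must witness the unary formulas: eta_0(j_1) and eta_1(j_2).
        assert w[j1 - 1] == "0" and w[j2 - 1] == "1"
        codes[(j1, j2)] += 1
    return max(codes.values(), default=0)

print(max_ambiguity("0101"))
print(max_ambiguity("00110"))
```

For this query the encoding is injective because the parity `d` is determined by the pair of positions, so the ambiguity constant is 1 whenever at least one solution exists.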

This result leverages automata-theoretic machinery: recognition by a monoid, construction of factorization forests (Simon's theorem), and a "points-to" graph encoding dependencies among tuple components. Pumping arguments and Hall's theorem underlie the boundedness and surjectivity conditions. A key consequence is "dimension minimization": if an FO string-to-string interpretation of dimension $d$ yields outputs of size $O(|w|^\ell)$, then one can find a dimension-$\ell$ FO interpretation with the same behavior (Nguyên et al., 6 Dec 2025).
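Of the ingredients listed, recognition by a monoid is the simplest to show in isolation. The toy sketch below recognizes the language of binary words with an even number of 0's through a morphism into the two-element additive monoid $\mathbb{Z}/2\mathbb{Z}$. The language and the names `h` and `in_language` are chosen here for illustration; factorization forests and the points-to graph are not sketched.

```python
def h(word: str) -> int:
    """Morphism from {0,1}* to Z/2Z: parity of the number of 0's in the word."""
    return word.count("0") % 2

def in_language(word: str, accepting: frozenset = frozenset({0})) -> bool:
    """A word belongs to the language iff its image lies in the accepting subset."""
    return h(word) in accepting

# Morphism property: the image of a concatenation is the monoid product of the images.
u, v = "0110", "01"
assert h(u + v) == (h(u) + h(v)) % 2
assert h("") == 0  # the empty word maps to the identity element

print(in_language("0110"))
print(in_language("0"))
```

Because membership depends only on the image in a finite monoid, any factorization of a word can be evaluated piece by piece, which is the property that factorization forests exploit.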

5. Practical and Theoretical Implications

Expressiveness boundaries for MSO k-ary queries have consequences in several areas:

  • Coding theory: The inexpressibility of $k$-hashing in MSO pinpoints why certain code properties, such as trifference, cannot be fully captured in tree or word MSO, and why exact counting must be imported as a logical primitive (Costa et al., 16 Sep 2025).
  • Transductions and query optimization: The parameterization result clarifies that output tuple structure (for queries and transductions) can, under boundedness hypotheses, be definably funneled through a small number of anchors or "parameter" positions. This has both descriptive and computational ramifications for automata-based transformations and canonical representations (Nguyên et al., 6 Dec 2025).
  • Descriptive complexity and decision procedures: Knowing precisely which $k$-ary queries become definable under which logical extensions enables the design of optimal logical fragments for specification, verification, and synthesis systems, and determines the necessity (or redundancy) of counting features.

The phenomena observed for $k$-ary queries generalize in several directions:

  • Pairwise distinctness and rainbow witnesses: General $k$-ary relations demanding the existence of coordinates with $k$-wise distinct values evade MSO expressibility by the same mechanism as $(b,k)$-hashing (Costa et al., 16 Sep 2025). These patterns encompass "rainbow" positions for colorings and generalizations of mutual distinctness constraints.
  • Extensions and minimality: MSO extended by equi-cardinality ($\mathrm{eqcard}$), CMSO (modulo counting), and full Presburger arithmetic each supply the counting needed. The minimality result, that equi-cardinality is both necessary and sufficient for $k$-hashing, sharpens the landscape of definability.

A plausible implication is that results here demarcate the necessary features for any logical framework intended to capture fine-grained combinatorial or coding-theoretic constraints via logical queries, and guide subsequent research on logics for combinatorial structure analysis.
