
An Optimal Sauer Lemma Over $k$-ary Alphabets

Published 14 Apr 2026 in cs.LG, math.CO, and stat.ML | (2604.12952v1)

Abstract: The Sauer-Shelah-Perles Lemma is a cornerstone of combinatorics and learning theory, bounding the size of a binary hypothesis class in terms of its Vapnik-Chervonenkis (VC) dimension. For classes of functions over a $k$-ary alphabet, namely the multiclass setting, the Natarajan dimension has long served as an analogue of VC dimension, yet the corresponding Sauer-type bounds are suboptimal for alphabet sizes $k>2$. In this work, we establish a sharp Sauer inequality for multiclass and list prediction. Our bound is expressed in terms of the Daniely–Shalev-Shwartz (DS) dimension, and more generally with its extension, the list-DS dimension (the combinatorial parameters that characterize multiclass and list PAC learnability). Our bound is tight for every alphabet size $k$, list size $\ell$, and dimension value, replacing the exponential dependence on $\ell$ in the Natarajan-based bound by the optimal polynomial dependence, and improving the dependence on $k$ as well. Our proof uses the polynomial method. In contrast to the classical VC case, where several direct combinatorial proofs are known, we are not aware of any purely combinatorial proof in the DS setting. This motivates several directions for future research, which are discussed in the paper. As consequences, we obtain improved sample complexity upper bounds for list PAC learning and for uniform convergence of list predictors, sharpening the recent results of Charikar et al. (STOC 2023), Hanneke et al. (COLT 2024), and Brukhim et al. (NeurIPS 2024).

Summary

  • The paper introduces an optimal Sauer Lemma for k-ary alphabets, establishing a tight upper bound via the DS dimension.
  • It employs the polynomial method to replace exponential dependencies on the list size with polynomial ones, enhancing sample complexity in multiclass and list learning.
  • The results deepen our understanding of multiclass hypothesis structures and highlight open questions regarding a purely combinatorial proof.

Optimal Sauer-Shelah-Perles Inequality for $k$-ary Alphabets

Introduction and Motivation

The paper "An Optimal Sauer Lemma Over $k$-ary Alphabets" (2604.12952) addresses the classical problem of bounding the size of hypothesis classes in learning theory, generalizing the celebrated Sauer-Shelah-Perles Lemma from the binary setting to multiclass scenarios, i.e., classes of functions in $[k]^n$. While the binary case is governed by the VC dimension, the multiclass setting lacks a precise analog, with prior bounds (notably those based on the Natarajan dimension) exhibiting suboptimal dependencies on alphabet and list sizes. The work establishes a tight combinatorial inequality for multiclass and list learning via the Daniely–Shalev-Shwartz (DS) dimension and its list extension, resolving longstanding deficiencies in the literature.

Classical and Multiclass Sauer Bounds

The classical Sauer-Shelah-Perles Lemma bounds the size of a binary hypothesis class $\mathcal{H} \subseteq \{0,1\}^n$ of VC dimension $d$:

$$|\mathcal{H}| \leq \sum_{i=0}^{d} \binom{n}{i}$$

This underpins PAC learnability and uniform convergence for binary classification. For multiclass settings, Natarajan's dimension was proposed, but its associated Sauer-type inequality

$$|\mathcal{H}| \lesssim \ell^{n-d} n^d k^{(\ell+1)d}$$

(where $\ell$ is a list size parameter) is only tight for $k=2$, and fails to capture the optimal growth for larger $k$ and $\ell$. The dependence on $\ell$ is worst-case exponential, and the dependence on $k$ is unnecessarily pessimistic.
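As a sanity check of the classical binary bound, the following minimal Python sketch computes the VC dimension of a small class by brute force and compares the class size with the Sauer-Shelah-Perles sum. The function names (`sauer_bound`, `is_shattered`, `vc_dimension`) and the example class are illustrative assumptions, not taken from the paper.

```python
from itertools import combinations
from math import comb


def sauer_bound(n: int, d: int) -> int:
    """Classical Sauer-Shelah-Perles bound: sum_{i=0}^{d} C(n, i)."""
    return sum(comb(n, i) for i in range(d + 1))


def is_shattered(hypotheses, coords) -> bool:
    """True if the class realizes all 2^|coords| binary patterns on coords."""
    patterns = {tuple(h[i] for i in coords) for h in hypotheses}
    return len(patterns) == 2 ** len(coords)


def vc_dimension(hypotheses, n: int) -> int:
    """Brute-force VC dimension of a nonempty class of binary n-tuples."""
    d = 0
    for size in range(1, n + 1):
        # Subsets of shattered sets are shattered, so we can stop at the
        # first size for which no coordinate set is shattered.
        if any(is_shattered(hypotheses, c) for c in combinations(range(n), size)):
            d = size
        else:
            break
    return d


# Example class (hypothetical): all vectors in {0,1}^4 with at most one 1.
n = 4
H = [tuple(1 if j == i else 0 for j in range(n)) for i in range(n)] + [(0,) * n]
d = vc_dimension(H, n)
print(d, len(H), sauer_bound(n, d))
```

For this example the class has VC dimension 1 and exactly $1 + 4 = 5$ elements, so it meets the classical bound with equality.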

DS Dimension and Sharp Sauer-Type Bound

Recent advances show that the Daniely–Shalev-Shwartz (DS) dimension (the $\ell$-DS dimension when list prediction is considered) precisely characterizes multiclass and list learnability. The paper introduces the notion of $\ell$-pseudo-cubes, generalizing cubes, and defines the $\ell$-DS dimension as the maximal cardinality of subsets shattered by $\ell$-pseudo-cubes. The main theorem establishes a tight upper bound on $|\mathcal{H}|$ in terms of $n$, $k$, $\ell$, and $d$, where $d$ is the $\ell$-DS dimension. This result aligns with extremal constructions and is tight for all relevant parameters, correcting the exponential dependence on $\ell$ from prior art to a polynomial one, and also improving the dependence on $k$.
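To make the pseudo-cube notion concrete, here is a minimal brute-force sketch. It assumes the definition commonly used in the list-learning literature, which may differ in details from the paper's: a nonempty finite set of vectors is an $\ell$-pseudo-cube if every vector has, in every coordinate, at least $\ell$ neighbors in the set that differ from it in that coordinate only. The function names are hypothetical, and the search is exponential, so it is only meant for tiny examples.

```python
from itertools import combinations


def count_neighbors(h, i, vectors) -> int:
    """Number of g in vectors that differ from h in coordinate i and nowhere else."""
    return sum(
        1
        for g in vectors
        if g[i] != h[i] and all(g[j] == h[j] for j in range(len(h)) if j != i)
    )


def max_pseudo_cube(vectors, ell: int = 1):
    """Largest ell-pseudo-cube contained in `vectors` (empty set if none exists).

    Repeatedly discards vectors with fewer than ell neighbors in some coordinate.
    """
    current = set(vectors)
    changed = True
    while changed:
        changed = False
        for h in list(current):
            if any(count_neighbors(h, i, current) < ell for i in range(len(h))):
                current.discard(h)
                changed = True
    return current


def list_ds_dimension(hypotheses, n: int, ell: int = 1) -> int:
    """Largest |S| such that the projection onto S contains an ell-pseudo-cube."""
    best = 0
    for size in range(1, n + 1):
        for coords in combinations(range(n), size):
            projection = {tuple(h[i] for i in coords) for h in hypotheses}
            if max_pseudo_cube(projection, ell):
                best = size
                break
    return best


# Example (hypothetical): a 6-element "cycle" in [3]^2 that is a pseudo-cube
# although it contains no 2 x 2 sub-cube.
cycle = [(0, 0), (0, 1), (1, 1), (1, 2), (2, 2), (2, 0)]
print(list_ds_dimension(cycle, 2, ell=1), list_ds_dimension(cycle, 2, ell=2))
```

In this example every vector has exactly one neighbor in each coordinate, so the class has DS dimension 2 under the assumed definition, while its 2-DS dimension is only 1.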

Proof Technique and Structural Observations

The argument is rooted in the polynomial method, constructing suitable vector spaces and sets of indicator functions and monomials such that the dimension counting yields the claimed combinatorial bound. Notably, this proof is algebraic, diverging from the rich family of combinatorial proofs known for the binary Sauer Lemma. The lack of a purely combinatorial proof in the DS setting is identified as an important open problem, with implications for sample complexity in multiclass PAC learning.
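For intuition about dimension counting, the sketch below illustrates the well-known linear-algebraic argument for the classical binary lemma, not the paper's proof for the DS setting. For a class $\mathcal{H} \subseteq \{0,1\}^n$ of VC dimension $d$, the multilinear monomials of degree at most $d$, evaluated on $\mathcal{H}$, form a matrix whose rows are linearly independent, so $|\mathcal{H}|$ is at most the number of such monomials. The helper names and example classes are assumptions made for illustration, and the VC dimension is supplied by hand.

```python
from fractions import Fraction
from itertools import combinations


def monomial_matrix(hypotheses, n: int, d: int):
    """Rows: hypotheses. Columns: monomials prod_{i in S} x_i with |S| <= d."""
    subsets = [s for size in range(d + 1) for s in combinations(range(n), size)]
    return [
        [Fraction(int(all(h[i] == 1 for i in s))) for s in subsets]
        for h in hypotheses
    ]


def rank(matrix) -> int:
    """Rank over the rationals via exact Gaussian elimination."""
    rows = [row[:] for row in matrix]
    n_cols = len(rows[0]) if rows else 0
    rk = 0
    for col in range(n_cols):
        pivot = next((r for r in range(rk, len(rows)) if rows[r][col] != 0), None)
        if pivot is None:
            continue
        rows[rk], rows[pivot] = rows[pivot], rows[rk]
        for r in range(rk + 1, len(rows)):
            factor = rows[r][col] / rows[rk][col]
            if factor != 0:
                rows[r] = [a - factor * b for a, b in zip(rows[r], rows[rk])]
        rk += 1
        if rk == len(rows):
            break
    return rk


# Example (hypothetical): even-weight vectors in {0,1}^3, which has VC dimension 2.
H = [(0, 0, 0), (1, 1, 0), (0, 1, 1), (1, 0, 1)]
M = monomial_matrix(H, n=3, d=2)
print(rank(M), len(H), len(M[0]))
```

The rank equals the number of hypotheses, which can never exceed the number of columns $\sum_{i \le d} \binom{n}{i}$; this is how the classical bound arises from counting dimensions.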

The paper also details connections between pseudo-cubes, bipartite graphs, and classical Turán-type extremal problems, particularly highlighting gaps in existing Natarajan-based bounds through explicit examples.

Applications: PAC Learning and Uniform Convergence

The sharpened DS-based combinatorial inequality yields strong quantitative consequences in learning theory:

  • List PAC Learning: The sample complexity of $\ell$-list PAC learning for concept classes $\mathcal{H}$ of finite $\ell$-DS dimension $d$ is improved. The new bound removes any dependence on the alphabet size $k$ and makes the dependence on the list size $\ell$ polynomial. It significantly improves on previous bounds, e.g., those based on the Natarajan dimension, such as [charikar2023characterization] and [brukhim2024multiclass].

  • List Uniform Convergence: The sample complexity for uniform convergence in classes of $\ell$-list predictors with $\ell$-DS dimension $d$ is likewise improved. Again, prior work incurred a quadratic dependence that the new bound improves upon.

These results sharpen the learning-theoretic guarantees for multiclass and list learning, and are consequential for applications such as recommendation systems and top-$\ell$ loss classification, where outputting lists of candidates is essential.

Combinatorial and Algebraic Open Directions

The paper explores relationships between the Natarajan and DS dimensions, maximum classes for the DS dimension, and the persistence of structural richness analogous to VC-maximum classes. The authors emphasize the importance of developing combinatorial proofs for the DS Sauer Lemma, which may lead to further reductions in sample complexity and a deeper understanding of extremal multiclass classes.

Additionally, connections with algebraic approaches to Sauer-type bounds for various combinatorial dimensions (such as Recursive Teaching dimension and Graph dimension) are discussed, suggesting avenues for unification and deeper analysis.

Implications and Future Perspectives

The optimal Sauer-type bound for $k$-ary alphabets via the DS dimension enables:

  • Tighter theoretical analysis for multiclass PAC and list learning, directly influencing statistical learning theory and practical algorithm design.
  • Reduction in sample complexity for high-cardinality multiclass and top-$\ell$ settings, removing unnecessary exponential penalties and mitigating the curse of dimensionality.
  • Enhanced understanding of extremal hypothesis class structure, which may inform approaches to sample compression, boosting, and data-dependent learning.
  • Potential further improvements contingent on combinatorial insights, particularly for closing the gap between upper and lower sample complexity bounds in multiclass PAC learning.

The polynomial method's success here reaffirms the efficacy of algebraic combinatorics in learning theory, but the pursuit of combinatorial proof techniques remains both theoretically significant and practically promising.

Conclusion

This work fundamentally refines and optimizes Sauer-type bounds for multiclass and list prediction problems, transitioning from the Natarajan dimension to the DS dimension as the governing combinatorial parameter. The resulting inequality is tight, yields improved sample complexity and uniform convergence rates, and has broad ramifications for both machine learning and extremal combinatorics. The absence of an explicit combinatorial proof in the DS setting, and the structural properties of DS-maximum classes, are highlighted as compelling directions for further research.
