Representation, Approximation and Learning of Submodular Functions Using Low-rank Decision Trees (1304.0730v1)

Published 2 Apr 2013 in cs.LG, cs.CC, and cs.DS

Abstract: We study the complexity of approximate representation and learning of submodular functions over the uniform distribution on the Boolean hypercube $\{0,1\}^n$. Our main result is the following structural theorem: any submodular function is $\epsilon$-close in $\ell_2$ to a real-valued decision tree (DT) of depth $O(1/\epsilon^2)$. This immediately implies that any submodular function is $\epsilon$-close to a function of at most $2^{O(1/\epsilon^2)}$ variables and has a spectral $\ell_1$ norm of $2^{O(1/\epsilon^2)}$. It also implies the closest previous result, which states that submodular functions can be approximated by polynomials of degree $O(1/\epsilon^2)$ (Cheraghchi et al., 2012). Our result is proved by constructing an approximation of a submodular function by a DT of rank $4/\epsilon^2$, together with a proof that any rank-$r$ DT can be $\epsilon$-approximated by a DT of depth $\frac{5}{2}(r+\log(1/\epsilon))$. We show that these structural results can be exploited to give an attribute-efficient PAC learning algorithm for submodular functions running in time $\tilde{O}(n^2) \cdot 2^{O(1/\epsilon^4)}$. The best previous algorithm for the problem requires $n^{O(1/\epsilon^2)}$ time and examples (Cheraghchi et al., 2012), although it also works in the agnostic setting. In addition, we give improved learning algorithms for a number of related settings. We also prove that our PAC and agnostic learning algorithms are essentially optimal via two lower bounds: (1) an information-theoretic lower bound of $2^{\Omega(1/\epsilon^{2/3})}$ on the complexity of learning monotone submodular functions in any reasonable model; (2) a computational lower bound of $n^{\Omega(1/\epsilon^{2/3})}$ based on a reduction to learning of sparse parities with noise, widely believed to be intractable. These are the first lower bounds for learning of submodular functions over the uniform distribution.
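For concreteness, here is a minimal sketch (not from the paper) of the central object: a submodular function on the Boolean hypercube, verified by brute force against the diminishing-returns inequality $f(S \cup \{i\}) - f(S) \ge f(T \cup \{i\}) - f(T)$ for all $S \subseteq T$ and $i \notin T$. The coverage function and the ground sets below are illustrative choices.

```python
from itertools import product

# Illustrative ground sets; coverage functions are a standard example
# of submodular functions.
SETS = [{0, 1}, {1, 2}, {2, 3}, {0, 3}]
n = len(SETS)

def f(x):
    """Coverage function on {0,1}^n: size of the union of the chosen sets."""
    covered = set()
    for i, bit in enumerate(x):
        if bit:
            covered |= SETS[i]
    return len(covered)

def is_submodular(g, n):
    """Brute-force test of diminishing returns:
    g(S + i) - g(S) >= g(T + i) - g(T) for all S <= T coordinatewise, i not in T."""
    for S in product((0, 1), repeat=n):
        for T in product((0, 1), repeat=n):
            if any(s > t for s, t in zip(S, T)):
                continue  # require S to be a subset of T
            for i in range(n):
                if T[i]:
                    continue  # i must lie outside T
                Si = S[:i] + (1,) + S[i + 1:]
                Ti = T[:i] + (1,) + T[i + 1:]
                if g(Si) - g(S) < g(Ti) - g(T):
                    return False
    return True

print(is_submodular(f, n))  # prints True
```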
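The rank parameter in the structural theorem is the standard decision-tree rank in the Ehrenfeucht-Haussler sense: a leaf has rank 0, and an internal node whose subtrees have ranks $r_0, r_1$ has rank $\max(r_0, r_1)$ if they differ and $r_0 + 1$ if they are equal, so a depth-$d$ tree has rank at most $d$ while a rank-$r$ tree may be far deeper. A small sketch follows, using a hypothetical nested-tuple encoding of trees chosen here for illustration, not the paper's notation.

```python
# A tree is a real-valued leaf or a tuple (variable_index, subtree_if_0, subtree_if_1).
def rank(tree):
    if not isinstance(tree, tuple):  # leaf
        return 0
    _, t0, t1 = tree
    r0, r1 = rank(t0), rank(t1)
    return max(r0, r1) if r0 != r1 else r0 + 1

# Example: a complete depth-2 tree has rank 2; a "path" tree has rank 1.
complete = (0, (1, 0.0, 1.0), (1, 1.0, 0.0))
path = (0, 0.0, (1, 0.0, (2, 0.0, 1.0)))
print(rank(complete), rank(path))  # prints: 2 1
```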

Citations (30)
