Learning pseudo-Boolean k-DNF and Submodular Functions (1208.2294v1)
Abstract: We prove that any submodular function f: {0,1}n -> {0,1,...,k} can be represented as a pseudo-Boolean 2k-DNF formula. Pseudo-Boolean DNFs are a natural generalization of DNF representation for functions with integer range. Each term in such a formula has an associated integral constant. We show that an analog of Hastad's switching lemma holds for pseudo-Boolean k-DNFs if all constants associated with the terms of the formula are bounded. This allows us to generalize Mansour's PAC-learning algorithm for k-DNFs to pseudo-Boolean k-DNFs, and hence gives a PAC-learning algorithm with membership queries under the uniform distribution for submodular functions of the form f:{0,1}n -> {0,1,...,k}. Our algorithm runs in time polynomial in n, k{O(k \log k / \epsilon)}, 1/\epsilon and log(1/\delta) and works even in the agnostic setting. The line of previous work on learning submodular functions [Balcan, Harvey (STOC '11), Gupta, Hardt, Roth, ULLMan (STOC '11), Cheraghchi, Klivans, Kothari, Lee (SODA '12)] implies only n{O(k)} query complexity for learning submodular functions in this setting, for fixed epsilon and delta. Our learning algorithm implies a property tester for submodularity of functions f:{0,1}n -> {0, ..., k} with query complexity polynomial in n for k=O((\log n/ \loglog n){1/2}) and constant proximity parameter \epsilon.
- Sofya Raskhodnikova (29 papers)
- Grigory Yaroslavtsev (26 papers)