Agnostic learning of disjunctions on symmetric distributions
Abstract
We consider the problem of approximating and learning disjunctions (or equivalently, conjunctions) on symmetric distributions over $\{0,1\}^n$. Symmetric distributions are distributions whose PDF is invariant under any permutation of the variables. We prove that for every symmetric distribution $D$, there exists a set $S$ of $n^{O(\log(1/\epsilon))}$ functions such that for every disjunction $c$, there is a function $p$, expressible as a linear combination of functions in $S$, such that $p$ $\epsilon$-approximates $c$ in $\ell_1$ distance on $D$, that is, $\mathbf{E}_{x \sim D}[|c(x) - p(x)|] \leq \epsilon$. This implies an agnostic learning algorithm for disjunctions on symmetric distributions that runs in time $n^{O(\log(1/\epsilon))}$. The best previously known bound is $n^{O(1/\epsilon^4)}$ and follows from the approximation of the more general class of halfspaces (Wimmer, 2010). We also show that there exists a symmetric distribution $D$ such that the minimum degree of a polynomial that $1/3$-approximates the disjunction of all $n$ variables in $\ell_1$ distance on $D$ is $\Omega(\sqrt{n})$. Therefore, the learning result above cannot be achieved via $\ell_1$-regression with a polynomial basis, which is used in most other agnostic learning algorithms. Our technique also gives a simple proof that for every product distribution $D$ and every disjunction $c$, there exists a polynomial $p$ of degree $O(\log(1/\epsilon))$ such that $p$ $\epsilon$-approximates $c$ in $\ell_1$ distance on $D$. This was first proved by Blais et al. (2008) via a more involved argument.
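The learning algorithm alluded to above follows the standard $\ell_1$-regression recipe: evaluate a fixed basis of functions on the sample, find the linear combination minimizing the empirical $\ell_1$ error (a linear program), and threshold the fit to obtain a Boolean hypothesis. The Python sketch below illustrates that recipe under stated assumptions; it is not the paper's algorithm. In particular, it uses a low-degree monomial basis as a hypothetical stand-in, which per the abstract suffices for product distributions but, by the $\Omega(\sqrt{n})$ lower bound, not for all symmetric distributions, where the larger basis $S$ is needed. The function names and the degree parameter are illustrative.

```python
# Minimal sketch of agnostic learning via L1-regression over a function basis.
# Assumptions: labels are in {0,1}; the basis (monomials of degree <= d) is a
# placeholder, not the basis S constructed in the paper.
from itertools import combinations

import numpy as np
from scipy.optimize import linprog


def monomial_basis(X, d):
    """Evaluate all monomials of degree <= d on the rows of X (shape m x n)."""
    m, _n = X.shape
    cols = [np.ones(m)]  # degree-0 monomial (constant function)
    for k in range(1, d + 1):
        for idx in combinations(range(_n), k):
            cols.append(np.prod(X[:, list(idx)], axis=1))
    return np.column_stack(cols)


def l1_regression(Phi, y):
    """min_w sum_i |Phi_i . w - y_i|, via the standard LP reformulation."""
    m, k = Phi.shape
    # Variables: [w (k coefficients, free), t (m slacks, >= 0)]; minimize sum(t)
    # subject to -t_i <= Phi_i . w - y_i <= t_i.
    c = np.concatenate([np.zeros(k), np.ones(m)])
    A_ub = np.block([[Phi, -np.eye(m)], [-Phi, -np.eye(m)]])
    b_ub = np.concatenate([y, -y])
    bounds = [(None, None)] * k + [(0, None)] * m
    res = linprog(c, A_ub=A_ub, b_ub=b_ub, bounds=bounds)
    return res.x[:k]


def learn_disjunction(X, y, d=2):
    """Fit the L1-optimal combination of basis functions, threshold at 1/2."""
    Phi = monomial_basis(X, d)
    w = l1_regression(Phi, y.astype(float))
    return lambda X_new: (monomial_basis(X_new, d) @ w >= 0.5).astype(int)
```

An $\ell_1$-approximation guarantee of $\epsilon$ for the basis translates into agnostic error $\mathrm{opt} + O(\epsilon)$ for the thresholded hypothesis, which is why the size of the basis (here, $n^{O(d)}$ monomials) dominates the running time.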