On the hardness of learning sparse parities

Arnab Bhattacharyya; Ameet Gadekar; Suprovat Ghoshal; Rishi Saket

doi:10.4230/LIPIcs.ESA.2016.11

ESA 2016

Conference paper

01 Aug 2016

On the hardness of learning sparse parities

View publication

Abstract

This work investigates the hardness of computing sparse solutions to systems of linear equations over F2. Consider the k-EVENSET problem: given a homogeneous system of linear equations over F2 on n variables, decide if there exists a nonzero solution of Hamming weight at most k (i.e. a k-sparse solution). While there is a simple O(nk/2)-time algorithm for it, establishing fixed parameter intractability for k-EVENSET has been a notorious open problem. Towards this goal, we show that unless k-CLIQUE can be solved in no(k) time, k-EVENSET has no polynomial time algorithm when k = ω(log2 n). Our work also shows that the non-homogeneous generalization of the problem - which we call k-VECTORSUM - is W[1]-hard on instances where the number of equations is O(k log n), improving on previous reductions which produced Ω(n) equations. We use the hardness of k-VECTORSUM as a starting point to prove the result for k-EVENSET, and additionally strengthen the former to show the hardness of approximately learning k-juntas. In particular, we prove that given a system of O(exp(O(k)) · logn) linear equations, it is W[1]-hard to decide if there is a k-sparse linear form satisfying all the equations or any function on at most k-variables (a k-junta) satisfies at most (1/2 + ϵ)-fraction of the equations, for any constant ϵ > 0. In the setting of computational learning, this shows hardness of approximate non-proper learning of k-parities. In a similar vein, we use the hardness of k-EVENSET to show that that for any constant d, unless k-CLIQUE can be solved in no(k) time, there is no poly(m, n) · 2o(√k) time algorithm to decide whether a given set of m points in F2n satisfies: (i) there exists a non-trivial k-sparse homogeneous linear form evaluating to 0 on all the points, or (ii) any non-trivial degree d polynomial P supported on at most k variables evaluates to zero on ≈ PrF2n[P(z) = 0] fraction of the points i.e., P is fooled by the set of points. Lastly, we study the approximation in the sparsity of the solution. Let the GAP-k-VECTORSUM problem be: given an instance of k-VECTORSUM of size n, decide if there exist a k-sparse solution, or every solution is of sparsity at least k′ = (1 + δ0)k. Assuming the Exponential Time Hypothesis, we show that for some constants c0, δ0 ≥ 0 there is no poly(n) time algorithm for GAP-k-VECTORSUM when k = ω((loglogn)co).

Paper