Document

**Published in:** LIPIcs, Volume 284, 43rd IARCS Annual Conference on Foundations of Software Technology and Theoretical Computer Science (FSTTCS 2023)

In recent years the framework of learning from label proportions (LLP) has been gaining importance in machine learning. In this setting, the training examples are aggregated into subsets or bags and only the average label per bag is available for learning an example-level predictor. This generalizes traditional PAC learning which is the special case of unit-sized bags. The computational learning aspects of LLP were studied in recent works [R. Saket, 2021; R. Saket, 2022] which showed algorithms and hardness for learning halfspaces in the LLP setting. In this work we focus on the intractability of LLP learning Boolean functions. Our first result shows that given a collection of bags of size at most 2 which are consistent with an OR function, it is NP-hard to find a CNF of constantly many clauses which satisfies any constant-fraction of the bags. This is in contrast with the work of [R. Saket, 2021] which gave a (2/5)-approximation for learning ORs using a halfspace. Thus, our result provides a separation between constant clause CNFs and halfspaces as hypotheses for LLP learning ORs.
Next, we prove the hardness of satisfying more than 1/2 + o(1) fraction of such bags using a t-DNF (i.e. DNF where each term has ≤ t literals) for any constant t. In usual PAC learning such a hardness was known [S. Khot and R. Saket, 2008] only for learning noisy ORs. We also study the learnability of parities and show that it is NP-hard to satisfy more than (q/2^{q-1} + o(1))-fraction of q-sized bags which are consistent with a parity using a parity, while a random parity based algorithm achieves a (1/2^{q-2})-approximation.

Venkatesan Guruswami and Rishi Saket. Hardness of Learning Boolean Functions from Label Proportions. In 43rd IARCS Annual Conference on Foundations of Software Technology and Theoretical Computer Science (FSTTCS 2023). Leibniz International Proceedings in Informatics (LIPIcs), Volume 284, pp. 37:1-37:15, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2023)

Copy BibTex To Clipboard

@InProceedings{guruswami_et_al:LIPIcs.FSTTCS.2023.37, author = {Guruswami, Venkatesan and Saket, Rishi}, title = {{Hardness of Learning Boolean Functions from Label Proportions}}, booktitle = {43rd IARCS Annual Conference on Foundations of Software Technology and Theoretical Computer Science (FSTTCS 2023)}, pages = {37:1--37:15}, series = {Leibniz International Proceedings in Informatics (LIPIcs)}, ISBN = {978-3-95977-304-1}, ISSN = {1868-8969}, year = {2023}, volume = {284}, editor = {Bouyer, Patricia and Srinivasan, Srikanth}, publisher = {Schloss Dagstuhl -- Leibniz-Zentrum f{\"u}r Informatik}, address = {Dagstuhl, Germany}, URL = {https://drops.dagstuhl.de/entities/document/10.4230/LIPIcs.FSTTCS.2023.37}, URN = {urn:nbn:de:0030-drops-194106}, doi = {10.4230/LIPIcs.FSTTCS.2023.37}, annote = {Keywords: Learning from label proportions, Computational learning, Hardness, Boolean functions} }

Document

**Published in:** LIPIcs, Volume 93, 37th IARCS Annual Conference on Foundations of Software Technology and Theoretical Computer Science (FSTTCS 2017)

This paper studies the stochastic variant of the classical k-TSP problem where rewards at the vertices are independent random variables which are instantiated upon the tour's visit. The objective is to minimize the expected length of a tour that collects reward at least k. The solution is a policy describing the tour which may (adaptive) or may not (non-adaptive) depend on the observed rewards.
Our work presents an adaptive O(log k)-approximation algorithm for Stochastic k-TSP, along with a non-adaptive O(log^2 k)-approximation algorithm which also upper bounds the adaptivity gap by O(log^2 k). We also show that the adaptivity gap of Stochastic k-TSP is at least e, even in the special case of stochastic knapsack cover.

Alina Ene, Viswanath Nagarajan, and Rishi Saket. Approximation Algorithms for Stochastic k-TSP. In 37th IARCS Annual Conference on Foundations of Software Technology and Theoretical Computer Science (FSTTCS 2017). Leibniz International Proceedings in Informatics (LIPIcs), Volume 93, pp. 27:1-27:14, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2018)

Copy BibTex To Clipboard

@InProceedings{ene_et_al:LIPIcs.FSTTCS.2017.27, author = {Ene, Alina and Nagarajan, Viswanath and Saket, Rishi}, title = {{Approximation Algorithms for Stochastic k-TSP}}, booktitle = {37th IARCS Annual Conference on Foundations of Software Technology and Theoretical Computer Science (FSTTCS 2017)}, pages = {27:1--27:14}, series = {Leibniz International Proceedings in Informatics (LIPIcs)}, ISBN = {978-3-95977-055-2}, ISSN = {1868-8969}, year = {2018}, volume = {93}, editor = {Lokam, Satya and Ramanujam, R.}, publisher = {Schloss Dagstuhl -- Leibniz-Zentrum f{\"u}r Informatik}, address = {Dagstuhl, Germany}, URL = {https://drops.dagstuhl.de/entities/document/10.4230/LIPIcs.FSTTCS.2017.27}, URN = {urn:nbn:de:0030-drops-83910}, doi = {10.4230/LIPIcs.FSTTCS.2017.27}, annote = {Keywords: Stochastic TSP, algorithms, approximation, adaptivity gap} }

Document

**Published in:** LIPIcs, Volume 93, 37th IARCS Annual Conference on Foundations of Software Technology and Theoretical Computer Science (FSTTCS 2017)

A hypergraph is k-rainbow colorable if there exists a vertex coloring using k colors such that each hyperedge has all the k colors. Unlike usual hypergraph coloring, rainbow coloring becomes harder as the number of colors increases. This work studies the rainbow colorability of hypergraphs which are guaranteed to be nearly balanced rainbow colorable. Specifically, we show that for any Q,k >= 2 and \ell <= k/2, given a Qk-uniform hypergraph which admits a k-rainbow coloring satisfying:
- in each hyperedge e, for some \ell_e <= \ell all but 2\ell_e colors occur exactly Q times and the rest (Q +/- 1) times,
it is NP-hard to compute an independent set of (1 - (\ell+1)/k + \eps)-fraction of vertices, for any constant \eps > 0. In particular, this implies the hardness of even (k/\ell)-rainbow coloring such hypergraphs.
The result is based on a novel long code PCP test that ensures the strong balancedness property desired of the k-rainbow coloring in the completeness case. The soundness analysis relies on a mixing bound based on uniform reverse hypercontractivity due to Mossel, Oleszkiewicz, and Sen, which was also used in earlier proofs of the hardness of \omega(1)-coloring 2-colorable 4-uniform hypergraphs due to Saket, and k-rainbow colorable 2k-uniform hypergraphs due to Guruswami and Lee.

Venkatesan Guruswami and Rishi Saket. Hardness of Rainbow Coloring Hypergraphs. In 37th IARCS Annual Conference on Foundations of Software Technology and Theoretical Computer Science (FSTTCS 2017). Leibniz International Proceedings in Informatics (LIPIcs), Volume 93, pp. 33:1-33:15, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2018)

Copy BibTex To Clipboard

@InProceedings{guruswami_et_al:LIPIcs.FSTTCS.2017.33, author = {Guruswami, Venkatesan and Saket, Rishi}, title = {{Hardness of Rainbow Coloring Hypergraphs}}, booktitle = {37th IARCS Annual Conference on Foundations of Software Technology and Theoretical Computer Science (FSTTCS 2017)}, pages = {33:1--33:15}, series = {Leibniz International Proceedings in Informatics (LIPIcs)}, ISBN = {978-3-95977-055-2}, ISSN = {1868-8969}, year = {2018}, volume = {93}, editor = {Lokam, Satya and Ramanujam, R.}, publisher = {Schloss Dagstuhl -- Leibniz-Zentrum f{\"u}r Informatik}, address = {Dagstuhl, Germany}, URL = {https://drops.dagstuhl.de/entities/document/10.4230/LIPIcs.FSTTCS.2017.33}, URN = {urn:nbn:de:0030-drops-83905}, doi = {10.4230/LIPIcs.FSTTCS.2017.33}, annote = {Keywords: Fourier analysis of Boolean functions, hypergraph coloring, Inapproximability, Label Cover, PCP} }

Document

**Published in:** LIPIcs, Volume 57, 24th Annual European Symposium on Algorithms (ESA 2016)

This work investigates the hardness of computing sparse solutions to systems of linear equations over F_2. Consider the k-EventSet problem: given a homogeneous system of linear equations over $\F_2$ on $n$ variables, decide if there exists a nonzero solution of Hamming weight at most k (i.e. a k-sparse solution). While there is a simple O(n^{k/2})-time algorithm for it, establishing fixed parameter intractability for k-EventSet has been a notorious open problem. Towards this goal, we show that unless \kclq can be solved in n^{o(k)} time, k-EventSet has no polynomial time algorithm when k = omega(log^2(n)).
Our work also shows that the non-homogeneous generalization of the problem - which we call k-VectorSum - is W[1]-hard on instances where the number of equations is O(k*log(n)), improving on previous reductions which produced Omega(n) equations. We use the hardness of k-VectorSum as a starting point to prove the result for k-EventSet, and additionally strengthen the former to show the hardness of approximately learning k-juntas. In particular, we prove that given a system of O(exp(O(k))*log(n)) linear equations, it is W[1]-hard to decide if there is a k-sparse linear form satisfying all the equations or any function on at most k-variables (a k-junta) satisfies at most (1/2 + epsilon)-fraction of the equations, for any constant epsilon > 0. In the setting of computational learning, this shows hardness of approximate non-proper learning of k-parities.
In a similar vein, we use the hardness of k-EventSet to show that that for any constant d, unless k-Clique can be solved in n^{o(k)} time, there is no poly(m,n)*2^{o(sqrt{k})} time algorithm to decide whether a given set of $m$ points in F_2^n satisfies: (i) there exists a non-trivial k-sparse homogeneous linear form evaluating to 0 on all the points, or (ii) any non-trivial degree d polynomial P supported on at most k variables evaluates to zero on approx Pr_{F_2^n}[P({z}) = 0] fraction of the points i.e., P is fooled by the set of points.
Lastly, we study the approximation in the sparsity of the solution. Let the Gap-k-VectorSum problem be: given an instance of k-VectorSum of size n, decide if there exist a k-sparse solution, or every solution is of sparsity at least k' = (1+delta_0)k. Assuming the Exponential Time Hypothesis, we show that for some constants c_0, delta_0 > 0 there is no poly(n) time algorithm for Gap-k-VectorSum when k = omega((log(log( n)))^{c_0}).

Arnab Bhattacharyya, Ameet Gadekar, Suprovat Ghoshal, and Rishi Saket. On the Hardness of Learning Sparse Parities. In 24th Annual European Symposium on Algorithms (ESA 2016). Leibniz International Proceedings in Informatics (LIPIcs), Volume 57, pp. 11:1-11:17, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2016)

Copy BibTex To Clipboard

@InProceedings{bhattacharyya_et_al:LIPIcs.ESA.2016.11, author = {Bhattacharyya, Arnab and Gadekar, Ameet and Ghoshal, Suprovat and Saket, Rishi}, title = {{On the Hardness of Learning Sparse Parities}}, booktitle = {24th Annual European Symposium on Algorithms (ESA 2016)}, pages = {11:1--11:17}, series = {Leibniz International Proceedings in Informatics (LIPIcs)}, ISBN = {978-3-95977-015-6}, ISSN = {1868-8969}, year = {2016}, volume = {57}, editor = {Sankowski, Piotr and Zaroliagis, Christos}, publisher = {Schloss Dagstuhl -- Leibniz-Zentrum f{\"u}r Informatik}, address = {Dagstuhl, Germany}, URL = {https://drops.dagstuhl.de/entities/document/10.4230/LIPIcs.ESA.2016.11}, URN = {urn:nbn:de:0030-drops-63628}, doi = {10.4230/LIPIcs.ESA.2016.11}, annote = {Keywords: Fixed Parameter Tractable, Juntas, Minimum Distance of Code, Psuedorandom Generators} }

Document

**Published in:** LIPIcs, Volume 57, 24th Annual European Symposium on Algorithms (ESA 2016)

We study the natural problem of estimating the expansion of subsets of vertices on one side of a bipartite graph. More precisely, given a bipartite graph G(U,V,E) and a parameter beta, the goal is to find a subset V' subseteq V containing beta fraction of the vertices of V which minimizes the size of N(V'), the neighborhood of V'. This problem, which we call Bipartite Expansion, is a special case of submodular minimization subject to a cardinality constraint, and is also related to other problems in graph partitioning and expansion. Previous to this work, there was no hardness of approximation known for Bipartite Expansion.
In this paper we show the following strong inapproximability for Bipartite Expansion: for any constants tau, gamma > 0
there is no algorithm which, given a constant beta > 0 and a bipartite graph G(U,V,E), runs in polynomial time and decides whether
- (YES case) There is a subset S^* subseteq V s.t. |S^*| >= beta*|V| satisfying |N(S^*)| <= gamma |U|, or
- (NO case) Any subset S subseteq V s.t. |S| >= tau*beta*|V| satisfies |N(S)| >= (1 - gamma)|U|, unless
NP subseteq intersect_{epsilon > 0}{DTIME}(2^{n^epsi;on}) i.e. NP has subexponential time algorithms.
We note that our hardness result stated above is a vertex expansion analogue of the Small Set (Edge) Expansion Conjecture of
Raghavendra and Steurer 2010.

Subhash Khot and Rishi Saket. Hardness of Bipartite Expansion. In 24th Annual European Symposium on Algorithms (ESA 2016). Leibniz International Proceedings in Informatics (LIPIcs), Volume 57, pp. 55:1-55:17, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2016)

Copy BibTex To Clipboard

@InProceedings{khot_et_al:LIPIcs.ESA.2016.55, author = {Khot, Subhash and Saket, Rishi}, title = {{Hardness of Bipartite Expansion}}, booktitle = {24th Annual European Symposium on Algorithms (ESA 2016)}, pages = {55:1--55:17}, series = {Leibniz International Proceedings in Informatics (LIPIcs)}, ISBN = {978-3-95977-015-6}, ISSN = {1868-8969}, year = {2016}, volume = {57}, editor = {Sankowski, Piotr and Zaroliagis, Christos}, publisher = {Schloss Dagstuhl -- Leibniz-Zentrum f{\"u}r Informatik}, address = {Dagstuhl, Germany}, URL = {https://drops.dagstuhl.de/entities/document/10.4230/LIPIcs.ESA.2016.55}, URN = {urn:nbn:de:0030-drops-63971}, doi = {10.4230/LIPIcs.ESA.2016.55}, annote = {Keywords: inapproximability, bipartite expansion, PCP, submodular minimization} }

Document

**Published in:** LIPIcs, Volume 8, IARCS Annual Conference on Foundations of Software Technology and Theoretical Computer Science (FSTTCS 2010)

We study the problem of 2-Catalog Segmentation which is one of the several variants of segmentation problems, introduced by Kleinberg et al., that naturally arise in data mining applications. Formally, given a bipartite graph $G = (U, V, E)$ and parameter $r$, the goal is to output two subsets $V_1, V_2 subseteq V$, each of size $r$, to maximize, $sum_{u \in U} max {|E(u, V_1)|, |E(u, V_2)|},$ where $E(u, V_i)$ is the set of edges between $u$ and the vertices in $V_i$ for $i = 1, 2$. There is a simple 2-approximation for this problem, and stronger approximation factors are known for the special case when $r = |V|/2$. On the other hand, it is known to be NP-hard, and Feige showed a constant factor hardness based on an assumption of average case hardness of random 3SAT.
In this paper we show that there is no PTAS for $2$-Catalog Segmentation assuming that NP does not have subexponential time probabilistic algorithms, i.e. NP $\not\subseteq \cap_{\eps > 0}$ BPTIME($2^{n^\eps}$). In order to prove our result we strengthen the analysis of the Quasi-Random PCP of Khot, which we transform into an instance of $2$-Catalog Segmentation. Our improved analysis of the Quasi-Random PCP proves stronger properties of the PCP which might be useful in other applications.

Rishi Saket. Quasi-Random PCP and Hardness of 2-Catalog Segmentation. In IARCS Annual Conference on Foundations of Software Technology and Theoretical Computer Science (FSTTCS 2010). Leibniz International Proceedings in Informatics (LIPIcs), Volume 8, pp. 447-458, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2010)

Copy BibTex To Clipboard

@InProceedings{saket:LIPIcs.FSTTCS.2010.447, author = {Saket, Rishi}, title = {{Quasi-Random PCP and Hardness of 2-Catalog Segmentation}}, booktitle = {IARCS Annual Conference on Foundations of Software Technology and Theoretical Computer Science (FSTTCS 2010)}, pages = {447--458}, series = {Leibniz International Proceedings in Informatics (LIPIcs)}, ISBN = {978-3-939897-23-1}, ISSN = {1868-8969}, year = {2010}, volume = {8}, editor = {Lodaya, Kamal and Mahajan, Meena}, publisher = {Schloss Dagstuhl -- Leibniz-Zentrum f{\"u}r Informatik}, address = {Dagstuhl, Germany}, URL = {https://drops.dagstuhl.de/entities/document/10.4230/LIPIcs.FSTTCS.2010.447}, URN = {urn:nbn:de:0030-drops-28858}, doi = {10.4230/LIPIcs.FSTTCS.2010.447}, annote = {Keywords: Hardness of Approximation, PCPs, Catalog Segmentation} }

X

Feedback for Dagstuhl Publishing

Feedback submitted

Please try again later or send an E-mail