LIPIcs.FSTTCS.2010.447.pdf
- Filesize: 455 kB
- 12 pages
We study the problem of 2-Catalog Segmentation, one of several variants of segmentation problems introduced by Kleinberg et al. that arise naturally in data mining applications. Formally, given a bipartite graph $G = (U, V, E)$ and a parameter $r$, the goal is to output two subsets $V_1, V_2 \subseteq V$, each of size $r$, that maximize $\sum_{u \in U} \max \{|E(u, V_1)|, |E(u, V_2)|\}$, where $E(u, V_i)$ is the set of edges between $u$ and the vertices in $V_i$ for $i = 1, 2$. There is a simple 2-approximation for this problem, and stronger approximation factors are known for the special case $r = |V|/2$. On the other hand, the problem is known to be NP-hard, and Feige showed constant-factor hardness based on an assumption about the average-case hardness of random 3SAT. In this paper we show that there is no PTAS for 2-Catalog Segmentation, assuming that NP does not have subexponential-time probabilistic algorithms, i.e. $\mathrm{NP} \not\subseteq \bigcap_{\epsilon > 0} \mathrm{BPTIME}(2^{n^{\epsilon}})$. To prove our result we strengthen the analysis of the Quasi-Random PCP of Khot, which we transform into an instance of 2-Catalog Segmentation. Our improved analysis establishes stronger properties of the Quasi-Random PCP, which might be useful in other applications.
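To make the objective concrete, the following is a minimal Python sketch, not taken from the paper. It evaluates $\sum_{u \in U} \max \{|E(u, V_1)|, |E(u, V_2)|\}$ for a given pair of catalogs and compares it with one natural baseline: using the $r$ highest-degree vertices of $V$ as both catalogs. The names `segmentation_value`, `top_degree_catalog`, and `adj` are hypothetical, the adjacency-dictionary representation is an assumption, and the baseline is only assumed to be in the spirit of the simple 2-approximation mentioned above (it also assumes $V_1 = V_2$ is allowed).

```python
from collections import Counter

def segmentation_value(adj, v1, v2):
    """Objective value: each u in U is credited with the larger of its
    edge counts into v1 and v2. `adj` maps u to its set of neighbours in V
    (hypothetical representation, chosen for illustration)."""
    v1, v2 = set(v1), set(v2)
    return sum(max(len(nbrs & v1), len(nbrs & v2)) for nbrs in adj.values())

def top_degree_catalog(adj, r):
    """Baseline: the r vertices of V with the highest degree.
    Vertices of V with no edges never appear in `adj`, so the result
    can have fewer than r elements on very sparse inputs."""
    degree = Counter(v for nbrs in adj.values() for v in nbrs)
    return {v for v, _ in degree.most_common(r)}

# Toy instance: two groups of customers with disjoint interests.
adj = {
    "u1": {"a", "b"},
    "u2": {"a", "b"},
    "u3": {"c", "d"},
    "u4": {"c", "d"},
}
r = 2

two_catalogs = segmentation_value(adj, {"a", "b"}, {"c", "d"})
single = top_degree_catalog(adj, r)
baseline = segmentation_value(adj, single, single)
print(two_catalogs, baseline)
```

On this toy instance the two tailored catalogs score 8 while the single-catalog baseline scores 4, so the factor of 2 is attained. The reason the baseline never does worse than half the optimum is that $\max\{a, b\} \le a + b$, so the value of any pair $(V_1, V_2)$ is at most the sum of the degrees in $V_1$ plus the sum of the degrees in $V_2$, each of which is at most the total degree of the $r$ highest-degree vertices.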