Abstract 1 Introduction 2 Preliminaries 3 W[1]-hardness of projective clustering in the Euclidean plane 4 W[2]-hardness of covering points with subspaces 5 XP algorithm for Projective Clustering 6 Conclusion References

Line Cover and Related Problems

Matthias Bentert ORCID TU Berlin, Germany
University of Bergen, Norway
Fedor V. Fomin ORCID University of Bergen, Norway Petr A. Golovach ORCID University of Bergen, Norway Souvik Saha ORCID The Institute of Mathematical Sciences, HBNI, Chennai, India Sanjay Seetharaman ORCID The Institute of Mathematical Sciences, HBNI, Chennai, India Anannya Upasana ORCID The Institute of Mathematical Sciences, HBNI, Chennai, India
Abstract

We study several extensions of the classic Line Cover problem of covering a set of n points in the plane with k lines. Line Cover is known to be NP-hard and our focus is on two natural generalizations: (1) Line Clustering, where the objective is to find k lines in the plane that minimize the sum of squares of distances of a given set of input points to the closest line, and (2) Hyperplane Cover, where the goal is to cover n points in ℝd by k hyperplanes. We also consider the more general Projective Clustering problem, which unifies both of these and has numerous applications in machine learning, data mining, and computational geometry. In this problem one seeks k affine subspaces of dimension r minimizing the sum of squares of distances of a given set of n points in ℝd to the closest point within one of the k affine subspaces.

Our main contributions reveal interesting differences in the parameterized complexity of these problems. While Line Cover is fixed-parameter tractable parameterized by the number k of lines in the solution, we show that Line Clustering is W⁑[1]-hard when parameterized by k and rule out algorithms of running time no⁒(k) under the Exponential Time Hypothesis. Hyperplane Cover is known to be NP-hard even when d=2 and by the work of Langerman and Morin [Discrete & Computational Geometry, 2005], it is FPT parameterized by k and d. We complement this result by establishing that Hyperplane Cover is W⁑[2]-hard when parameterized by only k.

We complement our hardness results by presenting an algorithm for Projective Clustering. We show that this problem is solvable in nπ’ͺ⁒(d⁒k⁒(r+1)) time. Not only does this yield an upper bound for Line Clustering that asymptotically matches our lower bound, but it also significantly extends the seminal work on k-Means Clustering (the special case r=0) by Inaba, Katoh, and Imai [SoCG 1994].

Keywords and phrases:
Point Line Cover, Projective Clustering, W-hardness, XP algorithm
Funding:
Matthias Bentert: Supported by the European Research Council (ERC) under the European Union’s Horizon 2020 research and innovation program (grant agreement No. 819416) and the German Research Foundation (DFG, grant 1-5003036: SPP 2378 – ReNO-2).
Fedor V. Fomin: Supported by the Research Council of Norway under BWCA project, reference 314528, and European Research Council (ERC) via grant NewPC, reference 101199930.
Petr A. Golovach: Supported by the Research Council of Norway under BWCA (grant no. 314528) and Extreme-Algorithms (grant no 355137) projects.
Copyright and License:
[Uncaptioned image] © Matthias Bentert, Fedor V. Fomin, Petr A. Golovach, Souvik Saha,
Sanjay Seetharaman, and Anannya Upasana; licensed under Creative Commons License CC-BY 4.0
2012 ACM Subject Classification:
Theory of computation β†’ Random projections and metric embeddings
; Theory of computation β†’ Computational geometry ; Theory of computation β†’ Parameterized complexity and exact algorithms ; Theory of computation β†’ Problems, reductions and completeness
Related Version:
Full Version: https://arxiv.org/abs/2512.17268
Editors:
Meena Mahajan, Florin Manea, Annabelle McIver, and Nguyα»…n Kim ThαΊ―ng

1 Introduction

The Line Cover problem asks for the smallest number k of lines that cover a given set of n points in the Euclidean plane ℝ2. This problem is NP-hard and is well-studied in the context of parameterized complexity, particularly because it admits a simple kernelization, often highlighted in introductory lectures on kernelization [15, 22, 34].

In this paper, we study two natural and well-established generalizations of Line Cover, namely Line Clustering and Hyperplane Cover, along with their overarching generalization Projective Clustering. Our results reveal an interesting contrast between the parameterized complexities of these problems and that of Line Cover.

Line Clustering is an optimization variant of Line Cover, in which the goal is to find k lines that best fit the points in the β„“2 norm, rather than merely covering them. In other words, we search for k lines such that the sum of the squares distances from each point to the closest point on one of the lines is minimized.

Line Clustering

Input: A set S={𝐱1,𝐱2,…,𝐱n} of points in ℝ2 and integers k and B.

Question: Are there k lines L1,L2,…,Lk such that βˆ‘i=1nmin1≀j≀kdist(𝐱i,Lj)2≀B?

Figure 1 provides an example of Line Clustering with k=3. In Operations Research, this problem is known as Line Facility Location [33]. The problem has been studied for convex and piecewise linear functions [7] as well as in the weighted setting [12].

Figure 1: An instance of Line Clustering with k=3 and a solution (the three colored lines). The points are colored based on the color of the nearest line in the selected solution. The dashed line show the region where points have equal distance from two solution lines.

The second natural generalization of Line Cover that we consider in this paper is the extension to d-dimensional spaces. In the Hyperplane Cover problem, the task is to cover n points in ℝd by k hyperplanes (affine subspaces of dimension dβˆ’1).

Hyperplane Cover

Input: A set S={𝐱1,𝐱2,…,𝐱n} of points in ℝd and an integer k.

Question: Is it possible to cover all points of S by k hyperplanes?

Langerman and Morin [31] initiated the study of the parameterized complexity of Hyperplane Cover (as part of a more general algorithmic framework). They showed that the problem is FPT when parameterized by k+d, providing several algorithms running in kπ’ͺ⁒(d⁒k)⁒nπ’ͺ⁒(1) time.

Finally, we consider the problem Projective Clustering which extends both Line Clustering and Hyperplane Cover, as well as many other well-studied problems. In this problem, we are given a family S of n points in ℝd and integers r and k; note that we allow points to have the same coordinates. The task is to find a set H of k affine subspaces of dimension r that minimizes the sum of squares of distances of each point in S to the closest point in the union of all subspaces in H. It is also sometimes called Affine Subspace Clustering as the objective is to approximate points using a set of low-dimensional affine subspaces. The difference between linear and affine subspaces is that a linear subspace should always contain the origin of the coordinates. Affine subspaces can always be embedded into linear subspaces of one extra dimension.

Projective Clustering

Input: A multiset S={𝐱1,𝐱2,…,𝐱n} of points in ℝd and integers r, k, and B.

Question: Are there k r-dimensional affine subspaces A1,A2,…,Ak whose union H satisfies βˆ‘i=1ndist(𝐱i,H)2≀B?

Here, dist⁑(𝐱i,H) is the Euclidean distance of the point 𝐱i to the closest point in one of the affine subspaces in H. In particular, Line Clustering is the special case with d=2 and r=1 and Hyperplane Cover is the variant where r=dβˆ’1 and B=0.

Projective Clustering naturally finds many applications in fields such as unsupervised learning [37], data mining [3, 36], artificial intelligence [46], computational biology [37], database management [10], computer vision [38], image segmentation [47], motion segmentation [43], face clustering [24], image processing [25], systems theory [42], and others [20, 44]. It also generalizes a number of fundamental and well-studied problems such as k-Means Clustering [32], Principal Component Analysis (PCA) [19], and Hyperplane Approximation [28].

Our contribution.

First, we show that Line Clustering is W⁑[1]-hard when parameterized by the number k of lines in the solution. Our reduction also implies a lower bound based on the Exponential Time Hypothesis (ETH).

Theorem 1.

Line Clustering is W⁑[1]-hard parameterized by k. Assuming the ETH, it cannot be solved in no⁒(k) time.

We consider Theorem 1 to be particularly interesting for several reasons. First, it underscores a clear contrast between Line Cover– which is FPT when parameterized by k – and Line Clustering. Second, Line Clustering is closely related to k-means clustering, where instead of fitting k lines, one seeks k points (i.e., zero-dimensional subspaces) that best fit the data in the L2 norm. Notably, the parameterized complexity of k-means clustering in ℝ2 (with parameter k) remains a longstanding open question [14].

Our second lower bound concerns Hyperplane Cover. Since in ℝ2 this problem is equivalent to Line Cover, which is NP-complete, we know that the problem is Paraβˆ’NP-complete when parameterized by d. On the other hand, by the work of Langerman and Morin [31], Hyperplane Cover is FPT when parameterized by k and d. Theorem 2 refines this complexity picture by proving that Hyperplane Cover is W⁑[2]-hard when parameterized by k alone (and d is large compared to k).

Theorem 2.

Hyperplane Cover is W⁑[2]-hard parameterized by k. Assuming the ETH, it cannot be solved in no⁒(k) time.

The lower bounds provided in Theorem 1 rule out the existence of an algorithm for Line Clustering of running time no⁒(k). We complement this lower bound by providing an algorithm of running time nO⁒(k). The algorithm is provided for the more general problem Projective Clustering. To the best of our knowledge, this is the first XP algorithm for Projective Clustering parameterized by k and d.111As is common in computational geometry, we assume a real RAM model for computation. Note that r≀d.

Theorem 3.

Projective Clustering can be solved in nπ’ͺ⁒(min⁑{d⁒k⁒(r+1),d⁒k⁒(dβˆ’r+1)}) time.

Since k-Means Clustering is the (very) special case of Projective Clustering with r=0, Theorem 3 significantly extends the work of Inaba et al. [27], who gave an algorithm for k-Means Clustering running in nπ’ͺ⁒(k⁒d) time. It is also worth noting that the tools we use to prove Theorem 3 are quite different from those employed by Inaba et al., who relied on weighted Voronoi diagrams. In contrast, the proof of Theorem 3 leverages a fundamental result from computational algebraic geometry, which computes a set of sample points that meets every realizable sign condition of a family of polynomials restricted to an algebraic set.

Related Work.

Line Cover is NP-hard and APX-hard [4, 9, 33]. A number of results in parameterized complexity and kernelization in the literature is devoted to Line Cover and Hyperplane Cover. Langerman and Morin [31] proposed an kd⁒k+d⁒nπ’ͺ⁒(1)-time algorithm for Hyperplane Cover. Later, the running time was improved by Wang et al. [45] to k(dβˆ’1)⁒k1.3k⁒nπ’ͺ⁒(1). Afshani et al. [1] gave an algorithm of running time O⁒(2n). They also improved on the FPT algorithm for Hyperplane Cover in ℝ3. Their improved algorithm runs in (C⁒k2log⁑k1/5)k⁒nπ’ͺ⁒(1) time. They also presented a (C⁒klog⁑k)k⁒nπ’ͺ⁒(1)-time algorithm for Line Cover. Kratsch et al. show that the well-known π’ͺ⁒(k2) kernel for Line Cover is tight under standard complexity assumptions [29]. A similar result exists for Hyperplane Cover [8].

Due to its importance in theory as well as applications, the study of Projective Clustering enjoys a long and rich history. Various approaches have been used to solve Projective Clustering: kernel sets [2, 23], coresets [23, 41], total sensitivity [21, 30, 41], and dimension reduction via sampling [16, 39].

2 Preliminaries

An r-dimensional affine subspace A (also called an r-flat) of a d-dimensional Euclidean space ℝd is a subset of ℝd of the form 𝐩+V={𝐩+𝐯:𝐯∈V}, where V is an r-dimensional linear subspace of ℝd and 𝐩 is a fixed point in ℝd. In other words, an r-dimensional affine subspace A is a set of points 𝐱=(x1,x2,…,xd)Tβˆˆβ„d represented by the system of linear equations 𝐀𝐱=𝐩, where π€βˆˆβ„(dβˆ’r)Γ—d is a matrix representing the coefficients of the equations and π©βˆˆβ„(dβˆ’r)Γ—1 is a vector of constants.

For a point or vector 𝐱=(x1,x2,…,xd)T, we use 𝐱⁒[i] to denote the ith entry xi. For two points 𝐱=(x1,x2,…,xd)T,𝐲=(y1,y2,…,yd)Tβˆˆβ„d, we denote the Euclidean distance between 𝐱 and 𝐲 by dist⁑(𝐱,𝐲)=βˆ‘i=1d(x⁒[i]βˆ’y⁒[i])2. The Euclidean distance dist⁑(𝐱,A) from a point 𝐱 to an affine subspace A is the distance between 𝐱 and its orthogonal projection on the subspace A (also known as the perpendicular distance). Let H be a union of k affine subspaces A1,A2,…,Ak. For a point π±βˆˆβ„d, we define the distance dist⁑(𝐱,H) as the Euclidean distance from 𝐱 to the closest Ai, that is dist⁑(𝐱,H)=min1≀i≀k⁑dist⁑(𝐱,Ai). We denote the square of the Frobenius norm of a matrix A by ‖𝐀‖F2=βˆ‘i,jai,j2.

Complexity.

A parameterized problem is a language QβŠ†Ξ£βˆ—Γ—β„•, where Ξ£βˆ— is the set of strings over a finite alphabet Ξ£. Specifically, an input of Q is a pair (I,k), where IβˆˆΞ£βˆ— and kβˆˆβ„•; k is the parameter of the problem. A parameterized problem Q is fixed-parameter tractable (FPT) if it can be decided whether (I,k)∈Q in f⁒(k)⁒|I|π’ͺ⁒(1) time for some computable function f that depends only on the parameter k. The parameterized complexity class FPT consists of all fixed-parameter tractable problems. A parameterized problem LβˆˆΞ£βˆ—Γ—β„• is called slice-wise polynomial (XP) if there exists an algorithm π’œ and two computable functions f,g:β„•β†’β„• such that, given an instance (I,k)βˆˆΞ£βˆ—Γ—β„•, the algorithm π’œ correctly decides whether (x,k)∈L in f⁒(k)β‹…|I|g⁒(k) time. The parameterized complexity class containing all slice-wise polynomial problems is called XP.

The 𝖢-hierarchy is a collection of computational complexity classes; we omit the technical definitions here. It is widely believed that FPTβ‰ W⁑[1], and hence, if a problem is hard for a class W⁑[i] with iβ‰₯1, then it is considered to be fixed-parameter intractable. For a detailed introduction to parameterized complexity, we refer readers to the textbooks by Cygan et al. and by Downey and Fellows [15, 18]. We also provide conditional lower bounds by making use of the following complexity hypothesis formulated by Impagliazzo, Paturi, and Zane [26].

Exponential Time Hypothesis (ETH): There is a positive real s such that 3-CNF-SAT with n variables and m clauses cannot be solved in time 2s⁒n⁒(n+m)π’ͺ⁒(1).

3 W[1]-hardness of projective clustering in the Euclidean plane

In this section, we show that Line Clustering is W⁑[1]-hard when parameterized by the number k of lines in the solution.

Theorem 1. [Restated, see original statement.]

Line Clustering is W⁑[1]-hard parameterized by k. Assuming the ETH, it cannot be solved in no⁒(k) time.

Proof.

We give a reduction from Regular Multicolored Independent Set. Therein, one is given a graph G=(V,E) and a partition of the n vertices into V1,V2,…,Vβ„“ (often referred to as colors of vertices) such that |Vi|=Ξ½=n/β„“ for each i∈[β„“] and each vertex v has exactly q neighbors (none of which have the same color as v). The question is whether there exists an independent set containing exactly one vertex of each color. It is known that Regular Multicolored Independent Set is W⁑[1]-hard parameterized by β„“ [15, Proposition 13.5.] and cannot be solved in no⁒(β„“) time unless the ETH fails [11]. We will assume without loss of generality that Ξ½ is divisible by 4, Ξ½>β„“3, β„“>10, and Vi={w1i,w2i,…,wΞ½i}.

The crucial step is a polynomial gap-reduction that constructs an instance (S,k,B) of Line Clustering where S is a multiset of points in ℝ2 (that is, we allow points to have the same coordinates) with integer coordinates and k=2⁒ℓ+4 with the following property:

  • β– 

    If the original instance of Regular Multicolored Independent Set is a yes-instance then there are k lines such that the sum of squares of distances from all points in S to the closest line in the solution is at most B.

  • β– 

    Otherwise, if we have a no-instance of Regular Multicolored Independent Set then for any k lines, the sum of squares of distances from all points in S to the closest line is at least B+2.

We use this gap-reduction to argue that Line Clustering is W[1]-hard when S is a set of points. For this, we apply the gap-reduction to an instance G of Regular Multicolored Independent Set and denote by (S,k,B) the obtained instance of Line Clustering where N is the number of points. Let Ξ΄=1/3⁒B⁒N. Then for every inclusion maximal multiset of points XβŠ†S with the same coordinates (x,y), we construct a set of |X| points YX with distinct rational coordinates such that every point p∈YX is at distance at most Ξ΄ from (x,y). As |X|≀N, YX can be constructed in such a way that the denominators of the coordinates of each point are at most N/Ξ΄=3⁒B⁒N2, and the absolute values of the numerators are at most N+3⁒B⁒N2⁒max⁑{|x|,|y|}. Thus, the encoding of the coordinates of the point of YX is polynomial. Notice also that the sets YX constructed for distinct XβŠ†S are disjoint. We denote by Sβ€² the obtained set of distinct points ℝ2, and let Bβ€²=B+1.

Consider the instance (Sβ€²,k,Bβ€²) of Line Clustering. Suppose that G has a multicolored independent set. Then, there are k lines such that the sum of squares of distances from all points in S to the closest line is at most B. Suppose that a point p∈S is on distance h from the closest line. Because p is at distance at most Ξ΄ from the corresponding point pβ€²βˆˆSβ€², the distance from pβ€² to the closest line is at most h+Ξ΄. Since the distance between each point of S and the closest line is at most B, (h+Ξ΄)2=h2+2⁒h⁒δ+Ξ΄2≀h2+2⁒B⁒δ+Ξ΄2. By taking the sum over all points in Sβ€², we obtain that the sum of squares of distances from the points in Sβ€² to the closest line is at most B+N⁒δ2+2⁒N⁒B⁒δ=B+1/9⁒B2⁒N+2/3<B+1=Bβ€². Hence, (Sβ€²,k,Bβ€²) is a yes-instance. Assume now that G has no multicolored independent set. Then, for any any k lines, the sum of squares of distances from all points in S to the closest line is at least B+2. If a point p∈S is at distance h from some line then the corresponding point pβ€²βˆˆSβ€² is at distance at least hβˆ’Ξ΄ from the same line. For the square of the distance, we have that it is at least (hβˆ’Ξ΄)2=h2βˆ’2⁒h⁒δ+Ξ΄2β‰₯h2βˆ’2⁒B⁒δ+Ξ΄2β‰₯h2βˆ’2⁒B⁒δ. This implies that for any k lines, the sum of squares of distances from all points in Sβ€² to the closest line is at least (B+2)βˆ’2⁒B⁒N⁒δ>(B+2)βˆ’1=B+1=Bβ€². Thus, (Sβ€²,k,Bβ€²) is a no-instance of Line Clustering. This proves that Line Clustering is W[1]-hard when S is a set of points.

In the remaining part of the proof, we provide the gap-reduction for a multiset of input points. Before we present the reduction in detail, we first give a high-level overview. Our aim is to build a grid-like structure where any optimal solution contains β„“ vertical and β„“ horizontal lines. Each horizontal line and each vertical line correspond to picking a vertex in a solution. The aim is then to construct a set of points such that (i) for each horizontal line in the solution there is a vertical line in the solution that encodes the same vertex (to ensure that a solution selects precisely β„“ vertices) and (ii) for each pair of horizontal and vertical lines in a solution, the corresponding vertices are not adjacent in the input graph. We do this by placing points on the intersection of two such lines whenever the two respective vertices share an edge or are two distinct vertices of the same color. Otherwise, we place a point on the horizontal line close to the intersection. Imagine for now that these points have distinct distances from a β€œsolution line” such that no vertical line contains more than one such point. See Figure 2 for an example.

Figure 2: A simple input instance of Regular Multicolored Independent Set with β„“=2, Ξ½=3, and q=1 on the left. The right shows a set of points such that any set of β„“ vertical and β„“ horizontal lines covering at least β„“2⁒ν+ℓ⁒(Ξ½βˆ’1+q)=18 points corresponds to a colorful independent set for the left instance. The colored dashed lines represent the independent set {a,e}. Note that whenever a vertical and a horizontal line of the solution cross, there are no points on the intersection.

Now, any independent set of size β„“ containing exactly one vertex from each set Vi corresponds to a set of β„“ horizontal and β„“ vertical lines containing a maximum number of points.

With this intuition in mind, there are two main obstacles to overcome. First, we want to ensure that the solution lines are actually (almost) horizontal and vertical and second, covering a maximum number of points is different from minimizing the sum of squares of distances from each point to the closest line. We start with the second obstacle. Assume that there are two lines (one vertical and one horizontal) that is part of any optimal solution that is far enough away from the previously described set of points so that they are never the closest line to any of these points. Let W be some sufficiently large (but also not too large) number. We place all vertical lines corresponding to vertices within the same set Vi closer together than lines corresponding to vertices in different sets (as already indicated in Figure 2). We do the same for all horizontal lines. By making the distance ds between lines corresponding to vertices in different sets large enough, we can ensure that for any set Vi any optimal solution contains one horizontal line and one vertical line that corresponds to a vertex in Vi (assuming that all solution lines are horizontal or vertical). Next, we compute for each horizontal line the sum of squares of distances of all points on horizontal lines corresponding to vertices within the same set Vi except for those points that lie on the vertical line corresponding to the same vertex. For a line h, let that number be ϕ⁒(h). Then, we place Wβˆ’Ο•β’(h) points on line h at a distance of one from the vertical line that is part of any optimal solution. We also place W points on each vertical line corresponding to picking a vertex at distance one from the horizontal line that is part of any solution and can now adjust the previous construction such that all points are covered by 2⁒ℓ⁒ν vertical lines, that is, we no longer need to imagine that points placed because of edges in the input graph (or for pairs of vertices in the same set Vi) have distinct distances. See Figure 3 for an illustration of the updated construction.

Figure 3: The updated construction (for the same instance as in Figure 2). The violet lines represent lines that are part of any optimal solution and the blue dots with numbers next to them represent a collection of points at the same position. The value of ds is at least 500. The colored dashed lines (together with the violet lines) represent an optimal solution corresponding to the independent set {a,e}. The horizontal line representing vertex a is the closest line in the solution to 4 points each on the horizontal lines representing b and c. Since the distance to points on the line for b is one, this adds a cost of four to the budget. Points on the line for c have distance 2, so each points contributes 4. Summed up over all of these points, this gives an additional budget of 20 which is balanced out by the blue vertex on the line representing a. For the edge representing e, there are 8 points at distance one (four on the lines for d and f, respectively). Note that the blue point on the line for e represents 8 points. Figure not drawn to scale due to the orders of magnitude difference in the distances.

We will show that these newly added points ensure that any set of β„“ vertical and β„“ horizontal lines that minimize the sum of squares of distances to all points are also lines that cover a maximum number of points.

Finally, we want to ensure that lines in the solution are horizontal and/or vertical. Unfortunately, we were not able to ensure this property but the following construction ensures that all solution lines are almost vertical or horizontal. Consider the construction in Figure 4 where dβ„“ is much larger than ds. We will show that any line that covers as many points as there are points in one row or column of the constructed point set, has to contain the points of exactly one row or one column. We will then pick a suitable number p, duplicate each point in the construction in Figure 3 p times, and set the budget B correspondingly such that any line that encodes a vertex in some set Vi has a sum of squares of distances to points in the ith row or column of the construction in Figure 4 that is negligible while any line that is not almost vertical or horizontal has a point in any row or column of the construction such that the distance between this point and the line alone is larger than the budget B. We also place a lot of vertices in the eight β€œcorners” of the construction to ensure that there are two horizontal and two vertical lines that are part of any solution. This concludes the overview of the reduction. We now present the construction in more detail.

We start by defining p=n10, W=n30, ds=n40, dβ„“=n90, and the functions

ϑ⁒(i) =βˆ‘a=1i(3⁒(iβˆ’a))2+βˆ‘b=iΞ½(3⁒(Ξ½βˆ’b))2, φ⁒(i) =p⁒ℓ⁒(Ξ½βˆ’1)⁒ϑ⁒(i),Β and ⁒φ′⁒(i)=p⁒ℓ⁒ν⁒ϑ⁒(i).

We set k=2⁒ℓ+4 and B=n7+(nβˆ’β„“)⁒W+β„“β’βˆ‘i=1Ξ½(W+φ⁒(i))βˆ’β„“β’W+ℓ⁒p⁒(nβˆ’Ξ½+1βˆ’qβˆ’β„“). Note that B≀n32. We then construct a set F of 8⁒n90+k2+2⁒k points as follows. We place points in a k2Γ—(k+2) grid Gh where the vertical distance between points is ds and the horizontal distance is dβ„“. We leave (β„“+1)⁒ds extra space between columns k+2/2 and k+2/2+1 (this is where we will place the main part of the construction depicted in Figure 3). We place n90 points at each of the four corners of the grid (on top of the already existing point). We then repeat the process with a (k+2)Γ—k2 grid Gv where the vertical distance between points is dβ„“ and the horizontal distance is ds. We place the two grids such that the additional extra space forms a square in the middle of both grids. We call the set of all points placed so far F and the vertical/horizontal lines that each go exactly through 2⁒n90+k+2 points the fixed lines (we will show later that we can assume these four lines to be part of any optimal solution). An illustration of the points in F is given in Figure 4.

Figure 4: An illustration of the construction of points in F for β„“=3 (and k=10). The blue dots represent n90+1 points at the same coordinate. The four purple lines show the fixed lines and the rest of the construction happens in the box in the middle. Figure not drawn to scale as dβ„“ is larger than ds by many orders of magnitude.

Next, we construct the rest of the instance in the middle gap. We start with β„“ bundles of Ξ½ horizontal lines each. That is, for each i∈[β„“] and each j∈[Ξ½], we define a horizontal line hij. The height of the Ξ½/2th line in the ith bundle is equal to the height of points in F in the i+1st row of Gh. The distance between hij and hij+1 is 3 where hij is the higher line. Next, we define vertical lines vij and sij for each i∈[β„“] and each j∈[Ξ½] as follows. The x-position of line viΞ½/2 is equal to the x-positions of points in F in the i+1st column of Gv. The distance between vij and vij+1 is 10⁒n2 where vij is to the left. Line sij is one to the left of vij. An illustration of these lines (and the input points we will place on these lines next) is given in Figure 5.

Figure 5: An illustration of the construction of points in Sβˆ–F for an instance with β„“=2 and Ξ½=3 vertices of each color. Black dots represent p points at the same coordinate, blue dots represent roughly W points at the same coordinate, and the violet lines represent the fixed lines. Figure not drawn to scale since ds is too big.

For each pair wji,wjβ€²iβ€² of vertices in the input graph G with iβ‰ iβ€², we place p points on the intersection of hij and siβ€²jβ€² if {wji,wjβ€²iβ€²}∈E and on the intersection of hij and viβ€²jβ€² otherwise. Note that we also place points symmetrically on the intersection of hiβ€²jβ€² and sij if the two vertices are adjacent and on the intersection of hiβ€²jβ€² and vij if they are not. For i=iβ€², we place p points on the intersection of hij and viβ€²jβ€² if jβ‰ jβ€² and on the intersection of hij and siβ€²jβ€² if j=jβ€². Let X be the set of points placed in the previous step. Note that we place exactly n⁒p=n11 points of X on each line hij, exactly (q+Ξ½βˆ’1)⁒p<n11 points on each line sij, and exactly (nβˆ’qβˆ’Ξ½+1)⁒p<n11 points on each line vij. Finally, for each i∈[β„“] and each j∈[Ξ½], we place W4 points on sij one above and below each of the two horizontal fixed lines (that is, we place W points on each line sij). Let Zv be the set of all of these points. Note that |Zv|=n⁒W. We also place W+φ⁒(j)4β‰₯W4 points on each line hij one to the left and one to the right of each of the two vertical fixed lines. Let Zh be the set of all these points. Note that |Zh|=ℓ⁒(βˆ‘j=1Ξ½W+φ⁒(j)). This concludes the construction.

Since the construction can be computed in polynomial (in n) time as there are a polynomial number of points, all coordinates are integers and the largest distance between points is in O⁒(ℓ⁒dβ„“)βŠ†poly⁑(n) and since k∈O⁒(β„“), it only remains to show correctness.

Claim 4 (⋆).

(G,(V1,…,Vβ„“)) is a yes-instance of Regular Multicolored Independent Set if and only if (S,k,B) is a yes-instance of Line Clustering.

For space reasons, the proof of correctness is deferred to a full version. β—€

4 W[2]-hardness of covering points with subspaces

In this section, we show that Hyperplane Cover is W⁑[2]-hard when parameterized by k even if B=0. To this end, we present a parameterized reduction from Dominating Set parameterized by solution size kβ€². In Dominating Set, one is given a graph G=(V,E) and an integer kβ€² and the question is whether there exists a subset S of vertices such that S intersects with the closed neighborhood of every vertex. Dominating Set parameterized by kβ€² is known to be W[2]-hard [15]. The basic idea of the following reduction is to capture the closed neighborhood of a vertex through an affine subspace.

Theorem 2. [Restated, see original statement.]

Hyperplane Cover is W⁑[2]-hard parameterized by k. Assuming the ETH, it cannot be solved in no⁒(k) time.

Consider an instance (G=(V,E),kβ€²) of Dominating Set. We assume without loss of generality that V=[d], that d,kβ€²>1, and that no vertex in G has degree dβˆ’1 (in which case the problem becomes trivial). For each v∈[d], we create d⁒kβ€² points {𝐩𝐯,𝐒}i∈[d⁒kβ€²], each in ℝd, as follows: for each i∈[d⁒kβ€²], we set the jt⁒h coordinate of 𝐩𝐯,𝐒 as 1 (that is, pv,i⁒[j]=1) if j∈N⁒[v]; the remaining entries will be set later. There are d2⁒kβ€² points in total and we interpret them as an d2⁒kβ€²Γ—d matrix M. Consider the d2⁒kβ€²Γ—d2⁒kβ€² Vandermonde matrix 𝐕 defined by the numbers 2,…,d2⁒kβ€²+1. A key component of the reduction is the fact that the determinant of any square sub-matrix of 𝐕 is non-zero [35]. Consider columns 2 to d+1 in 𝐕, that is, the sub-matrix of 𝐕 containing the columns from 2 to 2+dβˆ’1 and all rows. Now, we set any unset entries M⁒[i,j] as 𝐕⁒[i,1+j]. The matrix M now defines the d2⁒kβ€² points 𝐩𝟏,𝟏,…,𝐩𝐝,𝐝𝐀′ in the instance. Next, we set r=dβˆ’1, B=0, and k=kβ€², that is, the number of allowed hyperplanes in a solution equals the number of allowed vertices in a dominating set. The constructed instance of Hyperplane Cover is therefore ({𝐩𝟏,𝟏,…,𝐩𝐝,𝐝𝐀},k). Notice that even if we change the norm considered over the distances between points and their closest subspaces, the goal remains to find k affine subspaces of dimension dβˆ’1 that cover all the points since the distances are nonnegative. The proof of the following lemma showing the correctness is deferred to a full version due to space constraints.

Lemma 5 (⋆).

The instance (G=(V,E),kβ€²) is a yes-instance of Dominating Set if and only if ({𝐩𝟏,𝟏,…,𝐩𝐝,𝐝𝐀},k) is a yes-instance of Hyperplane Cover.

We mention that the reduction above also rules out a no⁒(k)-time algorithm for Hyperplane Cover unless the ETH fails [11].

5 XP algorithm for Projective Clustering

In this section, we prove Theorem 3. Its proof uses a fundamental result from algebraic geometry. We need a few definitions to state this result. Let ℝ⁒[X1,X2,…,Xd] be the ring of polynomials in variables X1,…,Xd with coefficients in ℝ. Let V={xβˆˆβ„d∣Q⁒(x1,…,xd)=0} denote an algebraic set in ℝd defined by Qβˆˆβ„β’[X1,X2,…,Xd]. The sign condition for a set of polynomials is defined as follows. Let 𝒫={P1,…⁒Ps}βŠ†β„β’[X1,X2,…,Xd] be a subset of s polynomials. The sign condition for 𝒫 is specified by a sign vector Οƒβˆˆ{βˆ’1,0,+1}s and the sign condition is non-empty over V with respect to 𝒫 if there is x∈V such that Οƒ=(π—Œπ—‚π—€π—‡β’(P1⁒(x)),…,π—Œπ—‚π—€π—‡β’(Ps⁒(x))), where π—Œπ—‚π—€π—‡β’(y) is the sign function defined as

π—Œπ—‚π—€π—‡β’(y)={1,Β if ⁒y>0,0,Β if ⁒y=0,Β andβˆ’1,Β if ⁒y<0,

for yβˆˆβ„.

Refer to caption
Figure 6: A family of 3 polynomials on a single variable x. The red dotted line partitions the x-axis into cells. The sign condition of some cells are given. The yellow box shows the sign condition of the cell consisting only of the point βˆ’1. The cells are (βˆ’βˆž,βˆ’3),{βˆ’3},(βˆ’3,βˆ’2) and so on.

The realization space of Οƒβˆˆ{βˆ’1,0,+1}s over V is the set

R⁒(Οƒ)={x∣x∈V,Οƒ=(π—Œπ—‚π—€π—‡β’(P1⁒(x)),…,π—Œπ—‚π—€π—‡β’(Ps⁒(x)))}.

If R⁒(Οƒ) is not empty, then each of its non-empty semi-algebraically connected (which is equivalent to just connected on semi-algebraic sets as proven in [5, Theorem 5.22]) components is a cell of 𝒫 over V. Figure 6 shows an example of a family of polynomials on a single variable partitioning the x-axis into cells. For an algebraic set W, its real dimension is the maximal integer dβ€² such that there is a homeomorphism of [0,1]dβ€² in W. Naturally if WβŠ‚β„d, then dβ€²<d. The following theorem by Basu et al. [5] gives an algorithm to compute a point in each cell of 𝒫 over V.

Proposition 6 ([5, Theorem 13.22]).

Let V be an algebraic set in ℝd of real dimension dβ€² defined by Q⁒(X1,…,Xd)=0, where Q is a polynomial in ℝ⁒[X1,…,Xd] of degree at most b, and let π’«βŠ‚β„β’[X1,X2,…,Xd] be a finite set of s polynomials with each Pβˆˆπ’« also of degree at most b. Let D be a ring generated by the coefficients of Q and the polynomials in 𝒫. There is an algorithm that takes as input Q,dβ€² and 𝒫 and computes a set of points meeting every non-empty cell of V over 𝒫. The algorithm uses at most sd′⁒bπ’ͺ⁒(d) arithmetic operations in D.

We next present our algorithm. We start by analyzing the special case where k=1. Here and in the more general case where k>1, we will distinguish between the cases r=0 and rβ‰₯1. Recall that for r=0, the problem is equivalent to the k-means problem. For k=1, the solution is the centroid of the given points which can be computed in π’ͺ⁒(n⁒d) time. The centroid C of a set of points {xi}i=1n in d-dimensional space is the mean of all the points. It is calculated using the formula C=1nβ’βˆ‘i=1nxi. For r>0, we have the following proposition.

Proposition 7.

For k=1, Projective Clustering is solvable in time π’ͺ⁒(d⁒n2+d3).

The proof of Proposition 7 follows from the well-known reduction that allows reducing the problem of finding the affine subspace that minimizes the sum of squares of distances to a given point set to the problem of finding a subspace with the same property for a similar point set. Indeed, any r-dimensional affine space that minimizes the sum of squared perpendicular distances to the data points must pass through the centroid of these points (see, e.g., [6]). Thus, by centering the data – i.e., by subtracting the centroid from each data point – we ensure that the best-fit affine subspace passes through the origin. Consequently, the problem of finding the best-fit r-dimensional subspace is identical to the classic principal component analysis (PCA). By the Eckart-Young theorem [19], PCA can be solved efficiently via Singular Value Decomposition (SVD), which can be implemented in π’ͺ⁒(d⁒n2+d3) time [40].

Now, we will provide a brief outline of our algorithm for Projective Clustering for k>1. Any optimal solution A1,A2,…⁒Ak will partition the set of points into k clusters where each cluster corresponds to an affine subspace Ai which is the closest affine subspace for each point in that cluster. Thus, a trivial algorithm will be to guess the partition of the input points into k clusters and find the k affine subspaces by running the algorithm of Proposition 7 for each cluster. Since the number of partitions of the n points is at most kn we have reduced Projective Clustering to kn instances of PCA. In order to reduce the search space significantly, we make use of Proposition 6.

We parameterize each of the k unknown affine subspaces of dimension r by r+1 d-dimensional variables. For each input point 𝐩 and two unknown affine subspaces Ai and Aj, we can write as a polynomial inequality the condition that 𝐩 is closer to Ai than to Aj. With this idea, we encode the problem as the solution to systems of inequalities of n⁒k2 polynomials of degree at most 4 and real dimension d⁒k⁒(r+1). For affine subspaces corresponding to variables from the same cell of the corresponding algebraic set, the set of closest points are the same. This defines the partition of the points into clusters. For each such cluster we can find the best fit by making use of Proposition 7. Finally, by Proposition 6, the number of such partitions is nπ’ͺ⁒(d⁒k⁒(r+1)).

Theorem 3. [Restated, see original statement.]

Projective Clustering can be solved in nπ’ͺ⁒(min⁑{d⁒k⁒(r+1),d⁒k⁒(dβˆ’r+1)}) time.

Proof.

We distinguish between two cases: r=0 and r>0. In the case where r=0, note that nπ’ͺ⁒(d⁒k⁒(r+1))=nπ’ͺ⁒(d⁒k). Moreover, the case is equivalent to k-Means. We can therefore use the nπ’ͺ⁒(d⁒k)-time algorithm of Inaba et al. [27].

We next explain our algorithm for the case r>0. We will map k-tuples of r-dimensional affine subspaces with points of a certain algebraic set. We can represent an r-dimensional affine subspace of ℝd by a dΓ—(r+1) matrix 𝐕 where the first r columns represent an orthonormal basis of a linear subspace of dimension r and column (r+1) represents the offset point. Consider the matrix space ℝdΓ—k⁒(r+1) and a matrix π•βˆˆβ„dΓ—k⁒(r+1). Let 𝐕={vi⁒j} for 1≀i≀d and 1≀j≀k⁒(r+1). Consider the following polynomial conditions.

Qi,j,hO⁒(𝐕) β‰”βˆ‘β„“=1dvℓ⁒i⁒vℓ⁒j=0⁒ for each ⁒h∈[0,kβˆ’1]⁒ and each ⁒h⁒r+1≀i<j≀(h+1)⁒r
QiN⁒(𝐕) ≔(βˆ‘β„“=1dvℓ⁒i2)βˆ’1=0⁒ for each ⁒1≀i≀k⁒r

The condition Qi,j,hO⁒(𝐕)=0 requires the columns i,j of 𝐕 where h⁒r+1≀i<j≀(h+1)⁒r to be pairwise orthogonal and the condition QiN⁒(𝐕)=0 requires the column i to have length 1. We will next show how a matrix π•βˆˆβ„dΓ—k⁒(r+1) which satisfies all of the above conditions represents k-affine subspaces of dimension r. For h∈[0,kβˆ’1] the columns h⁒r+1 to (h+1)⁒r represent the orthonormal basis of the linear subspace corresponding to the hth affine subspace of rank r and column (k⁒r+h+1) represents the corresponding offset point. We combine all of the above equations into a single one by taking the sum of squares

Q⁒(𝐕)=βˆ‘h⁒r+1≀i<j≀(h+1)⁒rh∈[k](Qi,j,hO⁒(𝐕))2+βˆ‘i=1k⁒r(QiN⁒(𝐕))2. (1)

Note that Q⁒(𝐕)=0 if and only if Qi,j,hO⁒(𝐕)=0 for all h,i,j and QiN⁒(𝐕)=0 for all i. Consider the algebraic set WβŠ†β„dΓ—k⁒(r+1) of all matrices satisfying Equation 1. We view each element of W as a representation of k affine subspaces of dimension r. Let c⁒o⁒l1,c⁒o⁒l2,…⁒c⁒o⁒ldΓ—k⁒(r+1) be the columns of an element π•βˆˆW. Then, the hth affine subspace is represented by the linear subspace spanned by the columns c⁒o⁒lh⁒r+1 to c⁒o⁒l(h+1)⁒r+1 and the point corresponding to the column c⁒o⁒lk⁒r+h+1. Let 𝐁𝐑 denote the submatrix formed by columns c⁒o⁒lh⁒r+1 to c⁒o⁒l(h+1)⁒r+1. The squared distance from a point π²βˆˆβ„d to an r-dimensional affine subspace 𝐩+𝐁⁒x is given by β€–π²βˆ’π©βˆ’ππT⁒𝐲‖F.

Now consider the family of polynomials 𝒫={Pi,j,f}1≀i<j≀k,1≀f≀n defined over W by

Pi,j,f⁒(𝐕)=‖𝐱fβˆ’c⁒o⁒lk⁒r+i+1βˆ’ππ’β’ππ’T⁒𝐱fβ€–F2βˆ’β€–π±fβˆ’c⁒o⁒lk⁒r+j+1βˆ’ππ£β’ππ£T⁒𝐱fβ€–F2.

The polynomial Pi,j,f⁒(𝐕)>0 if the jth affine space represented by 𝐕 is closer to 𝐱f than the ith affine space represented by 𝐕 and vice-versa. For any particular π•βˆˆW, the sign condition with respect to 𝒫 gives an ordering of the affine subspaces with respect to the Euclidean distance from each of the input points. Hence, we can partition the n input points based on the nearest affine subspace. Let π’ž be the partition of W on cells over 𝒫. For each cell C, the sign condition with respect to 𝒫 is constant over C, which implies that the partitioning of the input points is the same over all points in C. Also, let A1,A2,…⁒Ak be an optimal solution for Projective Clustering. We can represent Ai by 𝐩𝐒 and 𝐁𝐒 where ππ’βˆˆβ„dΓ—r and the columns of 𝐁𝐒 are orthonormal. Here, 𝐩𝐒 is the offset point and 𝐁𝐒 is the r-dimensional linear subspace. Hence, there exists π•βˆˆW that represents A1,A2,…⁒Ak. Thus if we run the classic PCA algorithm for each partition generated by every non-empty cell C, we will find an optimal solution. Thus our algorithm proceeds as follows:

  • β– 

    Use the algorithm from Proposition 6 to obtain a point 𝐕𝐂 from each cell C of the algebraic set W over the family of polynomials 𝒫.

  • β– 

    Run the PCA algorithm for each of the k clusters generated by 𝐕𝐂 to get k affine subspaces.

  • β– 

    Return the k-affine subspaces that minimize the sum of squares of distances.

The degree of Q and polynomials from 𝒫 is at most 4, |𝒫|=n⁒k2, and the real dimension of W is at most d⁒k⁒(r+1) which is the dimension of ℝdΓ—k⁒(r+1)βŠ‡W. The algorithm from Proposition 6 does at most t=(n⁒k2)d⁒k⁒(r+1)⁒2π’ͺ⁒(d⁒k⁒(r+1)) operations and produces at most t points. Our algorithm runs k instances of PCA for each such point and hence we have reduced our problem to solving k⁒(n⁒k2)d⁒k⁒(r+1)⁒2π’ͺ⁒(d⁒k⁒(r+1))=nπ’ͺ⁒(d⁒k⁒(r+1)) instances of PCA. Since PCA can be solved in polynomial time the overall running time of our algorithm is in nπ’ͺ⁒(d⁒k⁒(r+1)).

We can reduce the problem to solving k⁒(n⁒k2)d⁒k⁒(dβˆ’r+1)⁒2π’ͺ⁒(d⁒k⁒(d⁒r+1))=nπ’ͺ⁒(d⁒k⁒(dβˆ’r+1)) instances of PCA using a slightly different characterization of the k r-dimensional affine subspaces. Note that every r-dimensional affine subspace of a d-dimensional Euclidean space ℝd is defined by y=𝐩+𝐕⁒x, where 𝐩 is a point in ℝd and π•βˆˆβ„dΓ—r represents the orthonormal basis of an r-dimensional linear subspace of ℝd. Let π•π‚βˆˆβ„dΓ—(dβˆ’r) be the orthogonal complement of 𝐕. Thus for every r-dimensional affine subspace A defined by y=𝐩+𝐕⁒x, we have the corresponding unique affine subspace π–Όπ—ˆπ—†π—‰β’(A) defined by y=𝐩+𝐕𝐂⁒x. The points in the algebraic set will now represent π–Όπ—ˆπ—†π—‰β’(A) instead of the affine subspace A.

To this end, we consider the matrix space ℝdΓ—k⁒(dβˆ’r+1) and a matrix π•π‚βˆˆβ„dΓ—k⁒(dβˆ’r+1). Let 𝐕𝐂={vi⁒j}i,j and consider the following polynomial conditions:

Q~i,j,hO⁒(𝐕𝐂)β‰”βˆ‘β„“=1dvℓ⁒i⁒vℓ⁒j=0⁒ for each ⁒h∈[0,kβˆ’1]⁒ and
each ⁒h⁒(dβˆ’r)+1≀i<j≀(h+1)⁒(dβˆ’r)
Q~iN⁒(𝐕𝐂)≔(βˆ‘β„“=1dvℓ⁒i2)βˆ’1=0⁒ for each ⁒1≀i≀k⁒(dβˆ’r)

As above, we combine all equations into the following equivalent one:

Q~⁒(𝐕𝐂)=βˆ‘h⁒(dβˆ’r)+1≀i<j≀(h+1)⁒(dβˆ’r)h∈[k](Q~i,j,hO⁒(𝐕𝐂))2+βˆ‘i=1k⁒(dβˆ’r)(Q~iN⁒(𝐕𝐂))2.

Again, let W~βŠ†β„dΓ—k⁒(dβˆ’r+1) be the algebraic set of all matrices satisfying the above equation. We view each element of W~ as a representation of k affine subspaces of dimension dβˆ’r. For every element π•π‚βˆˆW~ and for each h∈[0,kβˆ’1], the columns h⁒(dβˆ’r)+1 to (h+1)⁒(dβˆ’r) represents the basis of the linear subspace corresponding to the hth affine subspace of rank dβˆ’r and column k⁒(dβˆ’r)+h+1 represents the corresponding offset point. Let c⁒o⁒l1,c⁒o⁒l2,…⁒c⁒o⁒ldΓ—k⁒(dβˆ’r+1) be the columns of an element π•π‚βˆˆW~. The hth affine subspace is represented by the linear subspace spanned by the columns c⁒o⁒lh⁒(dβˆ’r)+1 to c⁒o⁒l(h+1)⁒(dβˆ’r)+1 and the point corresponding to column c⁒o⁒lk⁒(dβˆ’r)+h+1. We again denote the corresponding submatrix by 𝐁𝐑. The square of the distance from a point π²βˆˆβ„d to the orthogonal complement of an (dβˆ’r)-dimensional affine subspace defined by 𝐩+𝐕𝐂⁒x is given by βˆ‘i=1dβˆ’r(vi⁒(π²βˆ’π©))2=‖𝐕𝐂⁒(π²βˆ’π©)β€–2. Now we define the family 𝒫~={P~i,j,f}1≀i<j≀k,1≀f≀n of polynomials over W as

P~i,j,f⁒(𝐕𝐂)=‖𝐁𝐒⁒(𝐱fβˆ’c⁒o⁒lk⁒(dβˆ’r)+i+1)β€–F2βˆ’β€–ππ’β’(𝐱fβˆ’c⁒o⁒lk⁒(dβˆ’r)+j+1)β€–F2.

Note that P~i,j,f⁒(𝐕𝐂)>0 corresponds to the case that the jth r-dimensional affine subspace represented by (𝐕𝐂)𝐂=𝐕 is closer to 𝐱f than the ith r-dimensional affine subspace represented by 𝐕. By the same argument as above, we can iterate over all non-empty cells of sign conditions and for each cell, we compute a solution by solving k instances of PCA. The degree of Q~ and polynomials from 𝒫~ is at most 4, |𝒫~|=n⁒k2 and the real dimension of W is at most d⁒k⁒(dβˆ’r+1) which is the dimension of ℝdΓ—k⁒(dβˆ’r+1)βŠ‡W. The algorithm from Proposition 6 does at most t=(n⁒k2)d⁒k⁒(dβˆ’r+1)⁒2π’ͺ⁒(d⁒k⁒(r+1)) operations and produces at most t points. Our algorithm runs k instances of PCA for each such point and hence we have reduced our problem to k⁒(n⁒k2)d⁒k⁒(dβˆ’r+1)⁒2π’ͺ⁒(d⁒k⁒(r+1))=nπ’ͺ⁒(d⁒k⁒(dβˆ’r+1)) instances of PCA. The overall running time in this case is nπ’ͺ⁒(d⁒k⁒(dβˆ’r+1)). As we can choose the most efficient way to represent the affine subspace the running time of our algorithm is in nπ’ͺ⁒(m⁒i⁒n⁒{d⁒k⁒(r+1),d⁒k⁒(dβˆ’r+1)}). β—€

6 Conclusion

In this work, we showed that Line Clustering is W[1]-hard when parameterized by k and that Projective Clustering is in XP when parameterized by both k and d. Complementing this result, we also show that the special case Hyperplane Cover is W[2]-hard when parameterized by k alone (and it is known to be NP-hard for d=2).

We conclude with several open questions. Our first question concerns the FPT-approximability of Projective Clustering. To the best of our knowledge, the existence of approximation algorithms with approximation ratio f⁒(d,k) and running time g⁒(d,k)β‹…nπ’ͺ⁒(1) is open for any computable functions f and g only depending on k and d. The parameterization by k and d also remains open. For the case when r and k are constants, Deshpande et al. [16] gave a polynomial-time approximation scheme (PTAS) for Projective Clustering.

Another interesting open question concerns the existence of an exact XP algorithm for Projective Clustering parameterized by d and k for β„“p-norms with pβ‰ 2. Already the base case with a single subspace differs with the case of p=2. Clarkson and Woodruff [13] show that for every p∈[1,2), the problem of finding an r-dimensional subspace F that minimizes (or even (1+1/dπ’ͺ⁒(1))-approximates) βˆ‘i=1ndist⁒(𝐱i,F)p is NP-hard. For p>2, Deshpande et al. [17] show NP-hardness and UGC-hardness.

Finally by Theorem 2, Hyperplane Cover is W⁑[2]-hard when parameterized by k. We do not know whether the problem is in XP when parameterized by k, that is, whether it admits a polynomial-time algorithm for every constant k.

References

  • [1] Peyman Afshani, Edvin Berglin, Ingo van Duijn, and Jesper Sindahl Nielsen. Applications of incidence bounds in point covering problems. In 32nd International Symposium on Computational Geometry (SoCG), pages 60:1–60:15. Schloss Dagstuhl – Leibniz-Zentrum fΓΌr Informatik, 2016. doi:10.4230/LIPIcs.SOCG.2016.60.
  • [2] Pankaj K. Agarwal, Sariel Har-Peled, and Kasturi R. Varadarajan. Approximating extent measures of points. J. ACM, 51(4):606–635, 2004. doi:10.1145/1008731.1008736.
  • [3] Charu C. Aggarwal, Cecilia Magdalena Procopiuc, Joel L. Wolf, Philip S. Yu, and Jong Soo Park. Fast algorithms for projected clustering. In Proceedings ACM SIGMOD International Conference on Management of Data, pages 61–72. ACM Press, 1999. doi:10.1145/304182.304188.
  • [4] V. S. Anil Kumar, Sunil Arya, and H. Ramesh. Hardness of set cover with intersection 1. In 27th International Colloquium on Automata, Languages and Programming (ICALP), pages 624–635. Springer, 2000. doi:10.1007/3-540-45022-X_53.
  • [5] Saugata Basu, Richard Pollack, and Marie-FranΓ§oise Roy. Algorithms in Real Algebraic Geometry. Springer-Verlag, 2006.
  • [6] Avrim Blum, John Hopcroft, and Ravindran Kannan. Foundations of Data Science. Cambridge University Press, 2020.
  • [7] Leon Bobrowski. K-lines clustering with convex and piecewise linear (cpl) functions. IFAC Proceedings Volumes, 45(2):108–111, 2012.
  • [8] Jean-Daniel Boissonnat, Kunal Dutta, Arijit Ghosh, and Sudeshna Kolay. Tight kernels for covering and hitting: Point hyperplane cover and polynomial point hitting set. In Proceedings of the 13th Latin American Symposium on Theoretical Informatics (LATIN), pages 187–200. Springer, 2018. doi:10.1007/978-3-319-77404-6_15.
  • [9] BjΓΆrn BrodΓ©n, Mikael Hammar, and Bengt J. Nilsson. Guarding lines and 2-link polygons is APX-hard. In Proceedings of the 13th Canadian Conference on Computational Geometry (CCCG), pages 45–48, 2001. URL: http://www.cccg.ca/proceedings/2001/mikael-2351.ps.gz.
  • [10] Kaushik Chakrabarti and Sharad Mehrotra. Local dimensionality reduction: A new approach to indexing high dimensional spaces. In VLDB 2000, Proceedings of 26th International Conference on Very Large Data Bases, pages 89–100. Morgan Kaufmann, 2000. URL: http://www.vldb.org/conf/2000/P089.pdf.
  • [11] Jianer Chen, Xiuzhen Huang, Iyad A. Kanj, and Ge Xia. Strong computational lower bounds via parameterized complexity. Journal of Computer and System Sciences, 72(8):1346–1367, 2006. doi:10.1016/J.JCSS.2006.04.007.
  • [12] Yam Ki Cheung and Ovidiu Daescu. Line facility location in weighted regions. J. Comb. Optim., 22(1):52–70, 2011. doi:10.1007/S10878-009-9272-3.
  • [13] Kenneth L. Clarkson and David P. Woodruff. Input sparsity and hardness for robust subspace approximation. In IEEE 56th Annual Symposium on Foundations of Computer Science (FOCS), pages 310–329. IEEE Computer Society, 2015. doi:10.1109/FOCS.2015.27.
  • [14] Vincent Cohen-Addad, Arnaud De Mesmay, Eva Rotenberg, and Amir Roytman. The bane of low-dimensionality clustering. In Proceedings of the Twenty-Ninth Annual ACM-SIAM Symposium on Discrete Algorithms (SODA), pages 441–456. SIAM, 2018. doi:10.1137/1.9781611975031.30.
  • [15] Marek Cygan, Fedor V. Fomin, Lukasz Kowalik, Daniel Lokshtanov, DΓ‘niel Marx, Marcin Pilipczuk, MichaΕ‚ Pilipczuk, and Saket Saurabh. Parameterized Algorithms. Springer, 2015. doi:10.1007/978-3-319-21275-3.
  • [16] Amit Deshpande, Luis Rademacher, Santosh S. Vempala, and Grant Wang. Matrix approximation and projective clustering via volume sampling. Theory Comput., 2(12):225–247, 2006. doi:10.4086/TOC.2006.V002A012.
  • [17] Amit Deshpande, Madhur Tulsiani, and Nisheeth K. Vishnoi. Algorithms and hardness for subspace approximation. In Proceedings of the Twenty-Second Annual ACM-SIAM Symposium on Discrete Algorithms (SODA), pages 482–496. SIAM, 2011. doi:10.1137/1.9781611973082.39.
  • [18] Rodney G. Downey and Michael R. Fellows. Parameterized complexity. Springer-Verlag, New York, 1999.
  • [19] Carl Eckart and G. Marion Young. The approximation of one matrix by another of lower rank. Psychometrika, 1:211–218, 1936.
  • [20] Ehsan Elhamifar and RenΓ© Vidal. Sparse subspace clustering: Algorithm, theory, and applications. IEEE transactions on pattern analysis and machine intelligence, 35(11):2765–2781, 2013. doi:10.1109/TPAMI.2013.57.
  • [21] Dan Feldman and Michael Langberg. A unified framework for approximating and clustering data. In Proceedings of the 43rd Annual ACM Symposium on Theory of Computing (STOC), pages 569–578. ACM, 2011. doi:10.1145/1993636.1993712.
  • [22] Fedor V. Fomin, Daniel Lokshtanov, Saket Saurabh, and Meirav Zehavi. Kernelization: theory of parameterized preprocessing. Cambridge University Press, 2019.
  • [23] Sariel Har-Peled and Yusu Wang. Shape fitting with outliers. SIAM J. Comput., 33(2):269–285, 2004. doi:10.1137/S0097539703427963.
  • [24] J. Ho, M. H. Yang, J. Lim, K. C. Lee, and D. Kriegman. Clustering appearances of objects under varying illumination conditions. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR), pages 11–18. IEEE, 2003.
  • [25] W. Hong, J. Wright, K. Huang, and Y. Ma. Multi-scale hybrid linear models for lossy image representation. IEEE Transactions on Image Processing, 15(12):3655–3671, 2006. doi:10.1109/TIP.2006.882016.
  • [26] Russell Impagliazzo, Ramamohan Paturi, and Francis Zane. Which problems have strongly exponential complexity. J. Computer and System Sciences, 63(4):512–530, 2001. doi:10.1006/JCSS.2001.1774.
  • [27] Mary Inaba, Naoki Katoh, and Hiroshi Imai. Applications of weighted Voronoi diagrams and randomization to variance-based k-clustering. In Proceedings of the 10th annual Symposium on Computational Geometry (SoCG), pages 332–339. ACM, 1994.
  • [28] Nikolai M. Korneenko and Horst Martini. Hyperplane approximation and related topics. In New trends in discrete and computational geometry, pages 135–161. Springer, 1993.
  • [29] Stefan Kratsch, Geevarghese Philip, and Saurabh Ray. Point line cover: The easy kernel is essentially tight. In Proceedings of the 24th Annual ACM-SIAM Symposium on Discrete Algorithms (SODA), pages 1596–1606. SIAM, 2014. doi:10.1137/1.9781611973402.116.
  • [30] Michael Langberg and Leonard J. Schulman. Universal epsilon-approximators for integrals. In Proceedings of the Twenty-First Annual ACM-SIAM Symposium on Discrete Algorithms, SODA, pages 598–607. SIAM, 2010. doi:10.1137/1.9781611973075.50.
  • [31] Stefan Langerman and Pat Morin. Covering things with things. Discrete & Computational Geometry, 33:717–729, 2005. doi:10.1007/S00454-004-1108-4.
  • [32] Stuart P. Lloyd. Least squares quantization in PCM. IEEE Trans. Inf. Theory, 28(2):129–136, 1982. doi:10.1109/TIT.1982.1056489.
  • [33] Nimrod Megiddo and Arie Tamir. On the complexity of locating linear facilities in the plane. Operations Research Letters, 1(5):194–197, 1982. doi:10.1016/0167-6377(82)90039-6.
  • [34] Rolf Niedermeier. Invitation to fixed-parameter algorithms. Oxford University Press, 2006.
  • [35] Halil OruΓ§ and George M. Phillips. Explicit factorization of the vandermonde matrix. Linear Algebra and its Applications, 315(1-3):113–123, 2000.
  • [36] Lance Parsons, Ehtesham Haque, and Huan Liu. Subspace clustering for high dimensional data: a review. ACM SIGKDD explorations newsletter, 6(1):90–105, 2004. doi:10.1145/1007730.1007731.
  • [37] Cecilia M. Procopiuc. Projective clustering. In Encyclopedia of Machine Learning, pages 806–811. Springer, 2010. doi:10.1007/978-0-387-30164-8_676.
  • [38] Cecilia Magdalena Procopiuc, Michael Jones, Pankaj K. Agarwal, and T. M. Murali. A monte carlo algorithm for fast projective clustering. In Proceedings of the 2002 ACM SIGMOD International Conference on Management of Data, pages 418–427. ACM, 2002. doi:10.1145/564691.564739.
  • [39] Nariankadu D. Shyamalkumar and Kasturi R. Varadarajan. Efficient subspace approximation algorithms. Discret. Comput. Geom., 47(1):44–63, 2012. doi:10.1007/S00454-011-9384-2.
  • [40] Lloyd N. Trefethen and David Bau III. Numerical Linear Algebra. Society for Industrial and Applied Mathematics, 1997.
  • [41] Kasturi R. Varadarajan and Xin Xiao. On the sensitivity of shape fitting problems. In IARCS Annual Conference on Foundations of Software Technology and Theoretical Computer Science, FSTTCS, LIPIcs, pages 486–497. Schloss Dagstuhl – Leibniz-Zentrum fΓΌr Informatik, 2012. doi:10.4230/LIPIcs.FSTTCS.2012.486.
  • [42] R. Vidal, S. Soatto, Y. Ma, and S. Sastry. An algebraic geometric approach to the identification of a class of linear hybrid systems. In Proceedings of the Conference on Decision and Control (CDC), pages 167–172. IEEE, 2003. doi:10.1109/CDC.2003.1272554.
  • [43] R. Vidal, R. Tron, and R. Hartley. Multiframe motion segmentation with missing data using power factorization and gpca. International Journal of Computer Vision, 79(1):85–105, 2008. doi:10.1007/S11263-007-0099-Z.
  • [44] RenΓ© Vidal. Subspace clustering. IEEE Signal Processing Magazine, 28(2):52–68, 2011. doi:10.1109/MSP.2010.939739.
  • [45] J. Wang, W. Li, and J. Chen. A parameterized algorithm for the hyperplane-cover problem. Theoretical Computer Science, 411(44–46):4005–4009, 2010. doi:10.1016/J.TCS.2010.08.012.
  • [46] Yue Wang, Wenqing Li, Esha Sarkar, Muhammad Shafique, Michail Maniatakos, and Saif Eddin Jabari. A Subspace Projective Clustering Approach for Backdoor Attack Detection and Mitigation in Deep Neural Networks . IEEE Transactions on Artificial Intelligence, 5(07):3497–3509, 2024. doi:10.1109/TAI.2024.3373720.
  • [47] A. Yang, J. Wright, Y. Ma, and S. Sastry. Unsupervised segmentation of natural images via lossy data compression. Computer Vision and Image Understanding, 110(2):212–225, 2008. doi:10.1016/J.CVIU.2007.07.005.