On Closest Pair in Euclidean Metric: Monochromatic is as Hard as Bichromatic

Given a set of $n$ points in $\mathbb R^d$, the (monochromatic) Closest Pair problem asks to find a pair of distinct points in the set that are closest in the $\ell_p$-metric. Closest Pair is a fundamental problem in Computational Geometry and understanding its fine-grained complexity in the Euclidean metric when $d=\omega(\log n)$ was raised as an open question in recent works (Abboud-Rubinstein-Williams [FOCS'17], Williams [SODA'18], David-Karthik-Laekhanukit [SoCG'18]). In this paper, we show that for every $p\in\mathbb R_{\ge 1}\cup\{0\}$, under the Strong Exponential Time Hypothesis (SETH), for every $\varepsilon>0$, the following holds: $\bullet$ No algorithm running in time $O(n^{2-\varepsilon})$ can solve the Closest Pair problem in $d=(\log n)^{\Omega_{\varepsilon}(1)}$ dimensions in the $\ell_p$-metric. $\bullet$ There exists $\delta = \delta(\varepsilon)>0$ and $c = c(\varepsilon)\ge 1$ such that no algorithm running in time $O(n^{1.5-\varepsilon})$ can approximate the Closest Pair problem to a factor of $(1+\delta)$ in $d\ge c\log n$ dimensions in the $\ell_p$-metric. At the heart of all our proofs is the construction of a dense bipartite graph with low contact dimension, i.e., we construct a balanced bipartite graph on $n$ vertices with $n^{2-\varepsilon}$ edges whose vertices can be realized as points in a $(\log n)^{\Omega_\varepsilon(1)}$-dimensional Euclidean space such that every pair of vertices which have an edge in the graph are at distance exactly 1 and every other pair of vertices are at distance greater than 1. This graph construction is inspired by the construction of locally dense codes introduced by Dumer-Micciancio-Sudan [IEEE Trans. Inf. Theory'03].

In particular, our first result is shown by establishing the computational equivalence of the bichromatic Closest Pair problem and the (monochromatic) Closest Pair problem (up to an $n^{\varepsilon}$ factor in the running time) for $d=(\log n)^{\Omega_{\varepsilon}(1)}$ dimensions.
Additionally, under SETH, we rule out nearly-polynomial factor approximation algorithms running in subquadratic time for the (monochromatic) Maximum Inner Product problem, where we are given a set of $n$ points in $n^{o(1)}$-dimensional Euclidean space and are required to find a pair of distinct points in the set that maximize the inner product.

Introduction
The Closest Pair of Points problem or Closest Pair problem (CP) is a fundamental problem in computational geometry: given n points in a d-dimensional metric space, find a pair of distinct points with the smallest distance between them. The Closest Pair problem for points in the Euclidean plane [SH75,BS76] stands at the origins of the systematic study of the computational complexity of geometric problems [PS85,Man89,KT05,CLRS09]. Since then, this problem has found abundant applications in geographic information systems [Hen06], clustering [Zah71,Alp10], and numerous matching problems (such as stable marriage [WTFX07]).
The trivial algorithm for CP examines every pair of points in the point-set and runs in time $O(n^2 d)$. Over the decades, there have been a series of developments on CP in low dimensional space for the Euclidean metric [Ben80,HNS88,KM95,SH75,BS76], leading to a deterministic $O(2^{O(d)} n \log n)$-time algorithm [BS76] and a randomized $O(2^{O(d)} n)$-time algorithm [Rab76,KM95]. For low (i.e., constant) dimensions, these algorithms are tight as a matching lower bound of $\Omega(n \log n)$ was shown by Ben-Or [Ben83] and Yao [Yao91] in the algebraic decision tree model, thus settling the complexity of CP in low dimensions. On the other hand, for very high dimensions (i.e., $d = \Omega(n)$) there are subcubic algorithms [GS16,ILLP04] in the $\ell_1$, $\ell_2$, and $\ell_\infty$-metrics using fast matrix multiplication algorithms [Gal14]. However, CP in medium dimensions, i.e., $d = \mathrm{polylog}(n)$, and in various $\ell_p$-metrics, has been a focus of study in machine learning and analysis of Big Data [Kle97], and it is surprising that, even with the tools and techniques that have been developed over many decades, when $d = \omega(\log n)$, there is no known subquadratic-time (i.e., $O(2^{o(d)} n^{2-\varepsilon})$-time) algorithm for CP in any standard distance measure [Ind00,AC09,ILLP04]. The absence of such algorithms was explicitly observed as early as the late nineties by Cohen and Lewis [CL99], but there was no explanation until recently.
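For concreteness, the trivial quadratic algorithm can be sketched as follows. This is only a minimal illustration; the function names `lp_distance` and `closest_pair_bruteforce` are our own and do not come from the cited works, and $p = 0$ is interpreted as the Hamming distance.

```python
from itertools import combinations

def lp_distance(x, y, p):
    """l_p distance between two points; p == 0 counts differing coordinates (Hamming)."""
    if p == 0:
        return sum(1 for xi, yi in zip(x, y) if xi != yi)
    return sum(abs(xi - yi) ** p for xi, yi in zip(x, y)) ** (1.0 / p)

def closest_pair_bruteforce(points, p=2):
    """Examine all n(n-1)/2 index pairs; each distance costs O(d), so O(n^2 d) in total."""
    best = None
    for i, j in combinations(range(len(points)), 2):
        distance = lp_distance(points[i], points[j], p)
        if best is None or distance < best[0]:
            best = (distance, i, j)
    return best  # (distance, index of first point, index of second point)

points = [(0, 0, 0), (3, 4, 0), (1, 0, 0), (5, 5, 5)]
print(closest_pair_bruteforce(points, p=2))
```

The lower bounds discussed in this paper say that, under SETH, this quadratic dependence on $n$ cannot be improved to $n^{2-\varepsilon}$ once the dimension is moderately large.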
David, Karthik, and Laekhanukit [DKL18] showed that for all $p > 2$, assuming the Strong Exponential Time Hypothesis (SETH), for every $\varepsilon > 0$, no algorithm running in $n^{2-\varepsilon}$ time can solve CP in the $\ell_p$-metric, even when $d = \omega(\log n)$. Their conditional lower bound was based on the conditional lower bound (again assuming SETH) of Alman and Williams [AW15] for the Bichromatic Closest Pair problem$^1$ (BCP), where we are given two sets of $n$ points in a $d$-dimensional metric space, and the goal is to find a pair of points, one from each set, with the smallest distance between them. Alman and Williams showed that for all $p \in \mathbb{R}_{\ge 1} \cup \{0\}$, assuming SETH, for every $\varepsilon > 0$, no algorithm running in $n^{2-\varepsilon}$ time can solve BCP in the $\omega(\log n)$-dimensional $\ell_p$-metric space. Given that [AW15] show their lower bound on BCP for all $\ell_p$-metrics, the lower bound on CP of [DKL18] feels unsatisfactory, since the $\ell_2$-metric is arguably the most interesting metric to study CP on. On the other hand, the answer to the complexity of CP in the Euclidean metric might be on the positive side, i.e., there might exist an algorithm that performs well in the $\ell_2$-metric because there are more tools available, e.g., Johnson-Lindenstrauss dimension reduction [JL84]. Thus we have the following question:
Open Question 1.1 (Abboud-Rubinstein-Williams$^2$ [ARW17b], Williams [Wil18a], David-Karthik-Laekhanukit [DKL18]). Is there an algorithm running in time $n^{2-\varepsilon}$ for some $\varepsilon > 0$ which can solve CP in the Euclidean metric when the points are in $\omega(\log n)$ dimensions?
$^1$ We remark that BCP is of independent interest as it is equivalent to finding the Minimum Spanning Tree in the $\ell_p$-metric [AESW91,KLN99]. Moreover, understanding the fine-grained complexity of BCP has led to a better understanding of the query time needed for the Approximate Nearest Neighbor search problem (see Razenshteyn's thesis [Raz17] for a survey about the problem) with polynomial preprocessing time [Rub18].
$^2$ Please see the erratum in [ARW17a].
Even if the answer to the above question is negative, this does not rule out strong approximation algorithms for CP in the Euclidean metric, which might suffice for all applications. Indeed, we do know of subquadratic approximation algorithms for CP. For example, LSH based techniques can solve $(1+\delta)$-CP (i.e., $(1+\delta)$ factor approximate CP) in $n^{2-\Theta(\delta)}$ time [IM98], but cannot do much better [MNP07,OWZ14]. In a recent breakthrough, Valiant [Val15] obtained an approximation algorithm for $(1+\delta)$-CP with runtime of $n^{2-\Theta(\sqrt{\delta})}$. The state of the art is an $n^{2-\tilde{\Theta}(\delta^{1/3})}$-time algorithm by Alman, Chan, and Williams [ACW16]. Can the dependence on $\delta$ be improved indefinitely? For the case of $(1+\delta)$-BCP, assuming SETH, Rubinstein [Rub18] answered the question in the negative. Does $(1+\delta)$-CP also admit the same negative answer?
Open Question 1.2. Is there an algorithm running in time $n^{2-\varepsilon}$ for some $\varepsilon > 0$ which can solve $(1+\delta)$-CP in the Euclidean metric when the points are in $\omega(\log n)$ dimensions for every $\delta > 0$?
Another important geometric problem is the Maximum Inner Product problem (MIP): given $n$ points in the $d$-dimensional Euclidean space, find a pair of distinct points with the largest inner product. This problem along with its bichromatic variant (Bichromatic Maximum Inner Product problem, denoted BMIP) is extensively studied in the literature (see [ARW17b] and references therein). Abboud, Rubinstein, and Williams [ARW17b] showed that assuming SETH, for every $\varepsilon > 0$, no $2^{(\log n)^{1-o(1)}}$-approximation algorithm running in $n^{2-\varepsilon}$ time can solve BMIP when $d = n^{o(1)}$. It is a natural question to ask if their inapproximability result can be extended to MIP:
Open Question 1.3. Is there an algorithm running in time $n^{2-\varepsilon}$ for some $\varepsilon > 0$ which can solve $\gamma$-MIP in $n^{o(1)}$ dimensions for even $\gamma = 2^{(\log n)^{1-o(1)}}$?

Our Results
In this paper we address all three previously mentioned open questions. First, we almost completely resolve Open Question 1.1. In particular, we show the following.
Theorem 1.4 (Subquadratic Hardness of CP; Informal, See Theorem 4.3). Let $p \in \mathbb{R}_{\ge 1} \cup \{0\}$. Assuming SETH, for every $\varepsilon > 0$, no algorithm running in $n^{2-\varepsilon}$ time can solve CP in the $\ell_p$-metric, even when $d = (\log n)^{\Omega_\varepsilon(1)}$.
In particular, we would like to emphasize that the dimension for which we show the lower bound on CP depends on $\varepsilon$. We would also like to remark that our lower bound holds even when the input point-set of CP is a subset of $\{0,1\}^d$. Finally, we note that the centerpiece of the proof of the above theorem (and also the proofs of the other results that will be subsequently mentioned) is the construction of a dense bipartite graph with low contact dimension, i.e., we construct a balanced bipartite graph on $n$ vertices with $n^{2-\varepsilon}$ edges whose vertices can be realized as points in a $(\log n)^{\Omega_\varepsilon(1)}$-dimensional $\ell_p$-metric space such that every pair of vertices which have an edge in the graph are at distance exactly 1 and every other pair of vertices are at distance greater than 1. This graph construction is inspired by the construction of locally dense codes introduced by Dumer, Micciancio, and Sudan [DMS03] and uses special density properties of Reed-Solomon codes. A detailed proof overview is given in Section 2.1.
Next, we improve our result in Theorem 1.4 in some aspects by showing $1 + o(1)$ factor inapproximability of CP even in $O_\varepsilon(\log n)$ dimensions, but can only rule out algorithms running in $n^{1.5-\varepsilon}$ time (as opposed to Theorem 1.4 which rules out exact algorithms for CP running in $n^{2-\varepsilon}$ time). More precisely, we show the following.
Theorem 1.5 (Subquadratic Hardness of gap-CP). Let $p \in \mathbb{R}_{\ge 1} \cup \{0\}$. Assuming SETH, for every $\varepsilon > 0$, there exist $\delta(\varepsilon) > 0$ and $c(\varepsilon) > 1$ such that no algorithm running in $n^{1.5-\varepsilon}$ time can solve $(1+\delta)$-CP in the $\ell_p$-metric, even when $d = c \log n$.
We remark that the $n^{1.5-\varepsilon}$ lower bound on approximate CP is an artifact of our proof strategy and that a different approach, or an improvement in the state-of-the-art bound on the number of minimum weight codewords in algebraic geometric codes (which are used in our proof), will lead to the complete resolution of Open Question 1.2.
It should also be noted that the approximate version of CP and the dimension are closely related. Namely, using standard dimensionality reduction techniques [JL84] for $(1+\delta)$-CP, one can always assume that $d = O_\delta(\log n)$. In other words, hardness of $(1+\delta)$-CP immediately yields a logarithmic dimensionality bound as a byproduct.
Finally, we completely answer Open Question 1.3 by showing the following inapproximability result for MIP, matching the hardness for BMIP from [ARW17b].
Recently, there have been a lot of results connecting BCP or (1 + o(1))-BCP to other problems (see [Rub18,Che18a,Che18b,CW19]). Now such connections can be extended to CP as well. For example, the following conditional lower bound follows from [Rub18] for gap-CP in the edit distance metric and for completeness a proof is given in Appendix A.
Theorem 1.7 (Subquadratic Hardness of gap-CP in edit distance metric). Assuming SETH, for every $\varepsilon > 0$, there exist $\delta(\varepsilon) > 0$ and $c(\varepsilon) > 1$ such that no algorithm running in $n^{1.5-\varepsilon}$ time can solve $(1+\delta)$-CP in the edit distance metric, even when $d = c \log n \log\log n$.

Proof Overview
In this section, we provide an overview of our proofs. For ease of presentation, we will sometimes be informal here; all notions and proofs are formalized in subsequent sections. Our overview is organized as follows. First, in Subsection 2.1, we outline our proof of running time lower bounds for exact CP (Theorem 1.4). Then, in Subsection 2.2, we abstract part of our reduction using error-correcting codes, and relate them back to the works on locally dense codes [DMS03,CW12,Mic14] that inspire our constructions. Finally, in Subsection 2.3, we briefly discuss how to modify the base construction (i.e. code properties) to give conditional lower bounds for approximate CP and MIP (Theorems 1.5 and 1.6).

Conditional Lower Bound on Exact Closest Pair
In this subsection, we provide a proof overview of a slightly weaker version of Theorem 1.4, i.e., we show that assuming SETH, for every $p \in \mathbb{R}_{\ge 1} \cup \{0\}$, no subquadratic time algorithm can solve CP in the $\ell_p$-metric when $d = (\log n)^{\omega(1)}$. We prove such a result by reducing BCP in dimension $d$ to CP in dimension $d + (\log n)^{\omega(1)}$, and the subquadratic hardness for CP follows from the subquadratic hardness of BCP established by [AW15]. Note that the results in this paper remain interesting even if SETH is false, as our reduction shows that BCP and CP are computationally equivalent (up to an $n^{o(1)}$ factor in the running time) when $d = (\log n)^{\omega(1)}$. The conditional lower bound on CP is merely a consequence of this computational equivalence. Finally, we note that a similar equivalence also holds between MIP and BMIP.
Understanding an obstacle of [DKL18]. Our proof builds on the ideas of [DKL18] who showed that assuming SETH, for every $p > 2$, no subquadratic time algorithm can solve CP in the $\ell_p$-metric when $d = \omega(\log n)$. They did so by connecting the complexity of CP and BCP via the contact dimension of the balanced complete bipartite graph (biclique), denoted by $K_{n,n}$. We elaborate on this below.
To motivate the idea behind [DKL18], let us first consider the trivial reduction from BCP to CP: given an instance $A, B$ of BCP, we simply output $A \cup B$ as an instance of CP. This reduction fails because there is no guarantee on the distances of a pair of points both in $A$ (or both in $B$). That is, there could be two points $a, a' \in A$ such that $\|a - a'\|_p$ is much smaller than the optimum of BCP on $A, B$. If we simply solve CP on $A \cup B$, we might find such $a, a'$ as the optimal pair but this does not give the answer to the original BCP problem. In order to circumvent this issue, one needs a gadget that "stretches" pairs of points both in $A$ or both in $B$ further apart while keeping the pairs of points across $A$ and $B$ close (and preserving the optimum of BCP on $A, B$). It turns out that this notion corresponds exactly to the contact dimension of the biclique, which we define below.
Definition 2.1 (Contact Dimension [Pac80]). For any graph $G = (V, E)$, a mapping $\tau : V \to \mathbb{R}^d$ is said to realize $G$ (in the $\ell_p$-metric) if for some $\beta > 0$, the following holds for every pair of distinct vertices $u, v$:
(1) $\|\tau(u) - \tau(v)\|_p = \beta$ if $\{u, v\} \in E$, and
(2) $\|\tau(u) - \tau(v)\|_p > \beta$ otherwise.
The contact dimension (in the $\ell_p$-metric) of $G$, denoted by $\mathrm{cd}_p(G)$, is the minimum $d \in \mathbb{N}$ such that there exists $\tau : V \to \mathbb{R}^d$ realizing $G$ in the $\ell_p$-metric.
In this paper, we will be mainly interested in the contact dimension of bipartite graphs. Specifically, [DKL18] only consider the contact dimension of the biclique $K_{n,n}$. Notice that a realization of the biclique ensures that vertices on the same side are far from each other while vertices on different sides are close to each other, preserving the optimum of BCP; these are exactly the desired properties of the gadget outlined above. Using this, [DKL18] give a reduction from BCP to CP which shows that the two are computationally equivalent whenever $d = \Omega(\mathrm{cd}_p(K_{n,n}))$, as follows.
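Let $A, B \subseteq \mathbb{R}^d$, each of cardinality $n$, be an instance of BCP and let $\tau : A \cup B \to \mathbb{R}^{\mathrm{cd}_p(K_{n,n})}$ be a map realizing the biclique $(A \cup B, A \times B)$ in the $\ell_p$-metric; we may assume w.l.o.g. that $\beta = 1$. Let $\delta$ be the distance between any point in $A$ and any point in $B$ (i.e., $\delta$ is an upper bound on the optimum of BCP). Let $\rho > 0$ be such that $\|\tau(a) - \tau(a')\|_p > 1 + \rho$ and $\|\tau(b) - \tau(b')\|_p > 1 + \rho$ for all distinct $a, a' \in A$ and all distinct $b, b' \in B$ (such a $\rho$ is guaranteed to exist by (2)). Moreover, let $k > \delta/\rho$ be any sufficiently large number. Consider the point-sets $\tilde{A}, \tilde{B} \subseteq \mathbb{R}^{d + \mathrm{cd}_p(K_{n,n})}$ of cardinality $n$ each defined as $\tilde{A} = \{a \circ (k \cdot \tau(a)) : a \in A\}$ and $\tilde{B} = \{b \circ (k \cdot \tau(b)) : b \in B\}$, where $\circ$ denotes the concatenation between two vectors and $k \cdot x$ denotes the usual scalar-vector multiplication (i.e. scaling $x$ up by a factor of $k$). For brevity, we write $\tilde{a}$ and $\tilde{b}$ to denote $a \circ (k \cdot \tau(a))$ and $b \circ (k \cdot \tau(b))$ respectively.
We now argue that, if we can find the closest pair of points in $\tilde{A} \cup \tilde{B}$, then we also immediately solve BCP for $(A, B)$. More precisely, we claim that $(\tilde{a}^*, \tilde{b}^*)$ is a closest pair of $\tilde{A} \cup \tilde{B}$ if and only if $(a^*, b^*)$ is a bichromatic closest pair of $A, B$. To see that this is the case, observe that, for cross pairs $(\tilde{a}, \tilde{b}) \in \tilde{A} \times \tilde{B}$, (1) implies that the distance $\|\tilde{a} - \tilde{b}\|_p$ is exactly $(k^p + \|a - b\|_p^p)^{1/p}$; hence, among these pairs, $(\tilde{a}^*, \tilde{b}^*)$ is a closest pair iff $(a^*, b^*)$ is a bichromatic closest pair in $A, B$. Notice also that, since the bichromatic closest pair in $A, B$ is of distance at most $\delta$, the closest pair in $\tilde{A} \cup \tilde{B}$ is of distance at most $(k^p + \delta^p)^{1/p} \le k + \delta$. On the other hand, for pairs both from $\tilde{A}$ or both from $\tilde{B}$, the distance must be at least $k(1 + \rho)$, which is more than $k + \delta$ from our choice of $k$. As a result, these pairs cannot be a closest pair in $\tilde{A} \cup \tilde{B}$, and this concludes the sketch of the proof.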
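The reduction just described can be tried out on a toy Euclidean instance. The sketch below is only an illustration under our own assumptions: the realization of the biclique (two centered simplices placed in orthogonal subspaces, using $2n$ dimensions) is a convenient choice of ours rather than a gadget taken from [DKL18], and all function names are hypothetical.

```python
from itertools import combinations
from math import dist, sqrt

def closest_pair(points):
    """Brute-force Euclidean closest pair; returns the index pair (i, j) with i < j."""
    return min(combinations(range(len(points)), 2),
               key=lambda ij: dist(points[ij[0]], points[ij[1]]))

def biclique_realization(n):
    """Illustrative realization of K_{n,n} in 2n dimensions (our assumption, n >= 2):
    cross pairs are at distance exactly 1, same-side pairs at sqrt(n / (n - 1)) > 1."""
    scale = 1.0 / sqrt(2.0 * (1.0 - 1.0 / n))
    def vertex(i, side):
        simplex = [scale * ((1.0 if t == i else 0.0) - 1.0 / n) for t in range(n)]
        zeros = [0.0] * n
        return simplex + zeros if side == 0 else zeros + simplex
    return [vertex(i, 0) for i in range(n)], [vertex(j, 1) for j in range(n)]

def bcp_via_cp(A, B):
    """Solve BCP on (A, B) by one call to closest_pair on the lifted point set."""
    n = len(A)
    tau_A, tau_B = biclique_realization(n)
    delta = dist(A[0], B[0])               # any cross distance upper-bounds the BCP optimum
    rho = sqrt(n / (n - 1.0)) - 1.0        # same-side gadget distance equals 1 + rho
    k = 2.0 * delta / rho + 1.0            # any k > delta / rho works; we leave extra slack
    lift = lambda x, t: list(x) + [k * c for c in t]
    lifted = [lift(a, t) for a, t in zip(A, tau_A)] + [lift(b, t) for b, t in zip(B, tau_B)]
    i, j = closest_pair(lifted)            # always a cross pair: i indexes A, j - n indexes B
    return i, j - n

A = [(0.0, 0.0), (0.1, 0.0), (10.0, 10.0)]
B = [(5.0, 5.0), (9.0, 9.0), (0.0, 3.0)]
print(closest_pair(A + B))   # the naive reduction returns two points of A
print(bcp_via_cp(A, B))      # the lifted instance returns the bichromatic closest pair
```

In this example the naive union picks the two nearby points of $A$, whereas the lifted instance returns the pair formed by the third point of $A$ and the second point of $B$. Note that the gadget used here needs $2n$ extra dimensions, which is in line with the $\Theta(n)$ bound for the Euclidean metric discussed next.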
There are a couple of details that we have glossed over here: one is that the gap $\rho$ cannot be too small (e.g., $\rho$ cannot be as small as $1/2^n$) and the other is that we should be able to construct $\tau$ efficiently. Nevertheless, these are typically not an issue. [DKL18] show that $\mathrm{cd}_p(K_{n,n}) = \Theta(\log n)$ when $p > 2$ and that the realization can be constructed efficiently and with sufficiently large $\rho$. This implies the subquadratic hardness of CP (by reduction from BCP) in the $\ell_p$-metric for all $p > 2$ and $d = \omega(\log n)$. However, it was known that $\mathrm{cd}_2(K_{n,n}) = \Theta(n)$ [FM88]. Thus, they could not extend their conditional lower bound to CP in the Euclidean metric even when $d = o(n)$. In fact, this is a serious obstacle as it rules out many natural approaches to reduce BCP to CP in a black-box manner. Elaborating, the lower bound on $\mathrm{cd}_2(K_{n,n})$ rules out local gadget reductions which would replace each point with a composition of that point and a gadget with a small increase in the number of dimensions, as such gadgets can be used to construct a realization of $K_{n,n}$ in the Euclidean metric in a low dimensional space, contradicting the lower bound on $\mathrm{cd}_2(K_{n,n})$.
Overcoming the Obstacle: Beyond Biclique. We overcome the above obstacle by considering dense bipartite graphs, instead of the biclique. More precisely, we show that there exists a balanced bipartite graph $G^* = (A^* \cup B^*, E^*)$ on $2n$ vertices such that $|E^*| \ge n^{2-o(1)}$ and $\mathrm{cd}_p(G^*)$ is small (i.e. $\mathrm{cd}_p(G^*) \le (\log n)^{\omega(1)}$). We give a construction of such a graph below but before we do so, let us briefly argue why this suffices to show that BCP and CP are computationally equivalent (up to an $n^{o(1)}$ multiplicative overhead in the running time) for dimension $d = \Omega(\mathrm{cd}_p(G^*))$.
Let us consider the same reduction which produces $\tilde{A}, \tilde{B}$ as before, but instead of using a realization of the biclique, we use a realization $\tau$ of $G^*$. This reduction is of course incorrect: if $(a^*, b^*)$ is not an edge in $G^*$, then $\|\tau(a^*) - \tau(b^*)\|_p$ could be large and, thus, the corresponding pair of points $(\tilde{a}^*, \tilde{b}^*) \in \tilde{A} \times \tilde{B}$ may not be the closest pair. Nevertheless, we are not totally hopeless: if $(a^*, b^*)$ is an edge, then we are in good shape and the reduction is correct.
With the above observation in mind, consider picking a random permutation $\pi$ of $A \cup B$ such that $\pi(A) = A$ and $\pi(B) = B$ and then initiate the above reduction with the map $(\tau \circ \pi)$ instead of $\tau$. Note that $\tau \circ \pi$ is simply a realization of an appropriate permutation $G'$ of $G^*$ (i.e., $G'$ is isomorphic to $G^*$). Due to this, the probability that we are "lucky" and $(a^*, b^*)$ is an edge in $G'$ is $p := |E^*|/n^2$; when this is the case, solving CP on the resulting instance would give the correct answer for the original BCP instance. If we repeat this $\log n/p = n^{o(1)}$ times, we would find the optimum of the original BCP instance with high probability.
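The number of repetitions can be checked with a one-line calculation. Here $p$ denotes the success probability of a single trial as in the paragraph above (not the index of the norm), the trials use independent random permutations, and $T$ is our own name for the number of trials.

```latex
\Pr[\text{all } T \text{ trials fail}]
  \;=\; (1-p)^{T}
  \;\le\; e^{-pT}
  \;\le\; e^{-\ln n}
  \;=\; \frac{1}{n}
\qquad \text{for } T = \left\lceil \frac{\ln n}{p} \right\rceil .
```

Since the graph has at least $n^{2-o(1)}$ edges, we have $1/p \le n^{o(1)}$, so $T = n^{o(1)}$ instances of CP suffice.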
To recap, even when $G^*$ is not a biclique, we can still use it to give a reduction from BCP to CP, except that the reduction produces multiple (i.e. $O(n^2/|E^*|)$) instances of CP. We remark here that the reduction can be derandomized: we can deterministically (and efficiently) pick the permutations so that the permuted graphs cover $K_{n,n}$ (see Lemma 3.11). As a minor digression, we would like to draw a parallel here with a recent work of Abboud, Rubinstein, and Williams [ARW17b]. The obstacle raised in [DKL18] is about the impossibility of certain kinds of many-one gadget reductions. We overcame it by designing a reduction from BCP to CP which not only increased the number of dimensions but also the number of points (by creating multiple instances of CP). This technique is also utilized in [ARW17b] where they showed the impossibility of Deterministic Distributed PCPs (Theorem I.2 in [ARW17b]) but then overcame that obstacle by using an advice (which is then enumerated over, resulting in multiple instances) to build Non-deterministic Distributed PCPs.
Constructing a dense bipartite graph with low contact dimension. We now proceed to construct the desired graph $G^* = (A^* \cup B^*, E^*)$. Note that any construction of a dense bipartite graph with contact dimension $n^{o(1)}$ is non-trivial. This is because it is known that a random graph has contact dimension $\Omega(n)$ in the Euclidean metric with high probability [RRS89,BL05], and therefore our graph construction must be significantly better than a random graph.
Our realization $\tau^*$ of $G^*$ will map into a subset of $\{0,1\}^{(\log n)^{\omega(1)}}$. As a result, we can fix $p = 0$, since a realization of a graph with entries in $\{0,1\}$ in the Hamming metric also realizes the same graph in every $\ell_p$-metric for any $p \ne \infty$.
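The reason a $\{0,1\}$-valued realization transfers from the Hamming metric to every finite $\ell_p$-metric can be spelled out in one line. For $x, y \in \{0,1\}^d$ every coordinate difference $|x_i - y_i|$ is either 0 or 1, so for any finite $p \ge 1$:

```latex
\|x - y\|_p^p \;=\; \sum_{i=1}^{d} |x_i - y_i|^p \;=\; \#\{i : x_i \ne y_i\} \;=\; \|x - y\|_0 ,
\qquad\text{hence}\qquad
\|x - y\|_p \;=\; \|x - y\|_0^{1/p} .
```

Because $t \mapsto t^{1/p}$ is strictly increasing, edges at Hamming distance exactly $\beta$ and non-edges at Hamming distance greater than $\beta$ become edges at $\ell_p$-distance exactly $\beta^{1/p}$ and non-edges at $\ell_p$-distance greater than $\beta^{1/p}$. For $p = \infty$ the argument breaks down, since any two distinct $\{0,1\}$-vectors are at $\ell_\infty$-distance exactly 1.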
Fix $g = \omega(1)$. We associate $[n]$ with $\mathbb{F}_q^h$ where $q = \Theta((\log n)^g)$ is a prime and $h = \Theta\left(\frac{\log n}{g \cdot \log\log n}\right)$. Let $P$ be the set of all univariate polynomials (in $x$) over $\mathbb{F}_q$ of degree at most $h - 1$. We have that $|P| = q^h = n$ and associate $P$ with $A^*$. Let $Q$ be the set of all univariate monic polynomials (in $x$) over $\mathbb{F}_q$ of degree $h$, i.e., $Q = \{x^h + p(x) \mid p(x) \in P\}$. We associate the polynomials in $Q$ with the vertices in $B^*$ (note that $|Q| = n$). In fact, we view the vertices in $A^*$ and $B^*$ as being uniquely labeled by polynomials in $P$ and $Q$ respectively. For notational clarity, we write $p_a$ (resp. $p_b$) to denote the polynomial in $P$ (resp. $Q$) that is associated to $a \in A^*$ (resp. $b \in B^*$).
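For every $a \in A^*$ and $b \in B^*$, we include $(a, b)$ as an edge in $E^*$ if and only if the polynomial $p_b - p_a$ (which is of degree $h$) has $h$ distinct roots. This completes the construction of $G^*$. We now have to show the following two claims about $G^*$: (i) $|E^*| = n^{2-O(1/g)} = n^{2-o(1)}$ and (ii) there is $\tau : A^* \cup B^* \to \{0,1\}^{q^2}$ that realizes $G^*$.
To show (i), let $R$ be the set of all monic polynomials of degree $h$ with $h$ distinct roots. We have that $|R| = \binom{q}{h}$. Fix a vertex $a \in A^*$. Its degree in $G^*$ is exactly $|R| = \binom{q}{h}$. This is because, for every polynomial $r \in R$, $r + p_a$ belongs to $Q$, and therefore the vertex $b \in B^*$ with $p_b = r + p_a$ is a neighbor of $a$. This implies the following bound on $|E^*|$: $|E^*| = n \cdot \binom{q}{h} \ge n \cdot (q/h)^h = n^2/h^h = n^{2-O(1/g)}$.
Next, to show (ii), we construct a realization $\tau^* : A^* \cup B^* \to \mathbb{F}_q^q$ of $G^*$ in the Hamming metric; a realization $\tau$ into $\{0,1\}^{q^2}$ then follows with a factor $q$ blow-up in the dimension (e.g., by replacing each symbol of $\mathbb{F}_q$ with its indicator vector in $\{0,1\}^q$). We define $\tau^*$ as follows.
• For every $a \in A^*$, $\tau^*(a)$ is simply the vector of evaluations of $p_a$ on every element in $\mathbb{F}_q$. More precisely, for every $z \in \mathbb{F}_q$, the coordinate of $\tau^*(a)$ indexed by $z$ is $p_a(z)$.
• Similarly, for every $b \in B^*$, $\tau^*(b)$ is the vector of evaluations of $p_b$ on every element in $\mathbb{F}_q$.
We now show that $\tau^*$ is indeed a realization of $G^*$; specifically, we show that $\tau^*$ satisfies (1) and (2) with $\beta = q - h$. First, consider an edge $(a, b) \in E^*$. Since $p_b - p_a$ has exactly $h$ distinct roots over $\mathbb{F}_q$, the vectors $\tau^*(a)$ and $\tau^*(b)$ differ on exactly $q - h$ coordinates. Next, consider a non-edge $(a, b) \in (A^* \times B^*) \setminus E^*$. Then, we know that $p_b - p_a$ has at most $h - 1$ distinct roots over $\mathbb{F}_q$. Therefore, the polynomial $p_b - p_a$ is non-zero on at least $q - h + 1$ coordinates. The same bound holds for two distinct vertices on the same side, since the difference of their polynomials is a non-zero polynomial of degree at most $h - 1$. This completes the proof sketch for both the claims about $G^*$ and yields Theorem 1.4 for $d = (\log n)^{\omega(1)}$. Finally, we remark that in the actual proof of Theorem 1.4, we will set the parameters in the above construction more carefully and achieve the bound $\mathrm{cd}_p(G^*) = (\log n)^{O_\varepsilon(1)}$.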
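The polynomial-based graph can be checked exhaustively for tiny parameters. The following sketch builds it for the toy values $q = 5$ and $h = 2$ (our choice for illustration only; the text takes $q$ and $h$ growing with $n$) and verifies the realization property in the Hamming metric with $\beta = q - h$. All function names are our own.

```python
from itertools import product
from math import comb

def evaluate(coeffs, x, q):
    """Evaluate a polynomial (coefficients from the constant term up) at x over F_q, q prime."""
    return sum(c * pow(x, i, q) for i, c in enumerate(coeffs)) % q

def hamming(u, v):
    return sum(1 for s, t in zip(u, v) if s != t)

def has_h_distinct_roots(a, b, q, h):
    """Edge rule: p_b - p_a (monic of degree h) vanishes on exactly h points of F_q."""
    diff = tuple((bc - ac) % q for ac, bc in zip(a + (0,), b))
    return sum(1 for x in range(q) if evaluate(diff, x, q) == 0) == h

def verify(q, h):
    A = list(product(range(q), repeat=h))     # polynomials of degree at most h - 1
    B = [a + (1,) for a in A]                 # monic polynomials x^h + p(x) of degree h
    tau = {("A", a): tuple(evaluate(a, x, q) for x in range(q)) for a in A}
    tau.update({("B", b): tuple(evaluate(b, x, q) for x in range(q)) for b in B})
    beta = q - h
    edges = {(a, b) for a in A for b in B if has_h_distinct_roots(a, b, q, h)}
    for a in A:                               # cross pairs: distance beta iff edge
        for b in B:
            d = hamming(tau["A", a], tau["B", b])
            assert (d == beta) if (a, b) in edges else (d > beta)
    for side, vertices in (("A", A), ("B", B)):   # same-side pairs are never edges
        for u, v in product(vertices, repeat=2):
            if u != v:
                assert hamming(tau[side, u], tau[side, v]) > beta
    return len(A), len(edges)

n, num_edges = verify(q=5, h=2)
print(n, num_edges, n * comb(5, 2))
```

With these parameters each side has $n = 25$ vertices and every vertex of $A^*$ has degree $\binom{5}{2} = 10$, so the script reports 250 edges, matching $n \cdot \binom{q}{h}$.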

Abstracting the Construction via Error-Correcting Codes
Before we move on to discuss the proofs of Theorems 1.6 and 1.5, let us give an abstraction of the construction in the previous subsection. This will allow us to easily generalize the construction for the aforementioned theorems, and also to explain where our motivation behind the construction comes from in the first place.

Dense Bipartite Graph with Low Contact Dimension from Codes.
In order to construct a balanced bipartite graph $G^*$ on $2n$ vertices with $n^{2-o(1)}$ edges such that $\mathrm{cd}_p(G^*) \le d^*$, it suffices to have a code $C^*$ with the following properties (for code-related definitions, see Section 3.2):
• $C^* \subseteq \mathbb{F}_q^\ell$ of cardinality $n$ is a linear code with block length $\ell$ over alphabet $\mathbb{F}_q$, and minimum distance $\Delta$.
• There exists a center $s^* \in \mathbb{F}_q^\ell$ and $r^* < \Delta$ such that $|C^*|^{1-o(1)}$ codewords are at Hamming distance exactly $r^*$ from $s^*$ and no codeword is at distance less than $r^*$ from $s^*$.
We also require that C * and s * can be constructed in poly(n) time but we shall ignore this requirement for the ease of exposition.
We describe below how to construct $G^*$ from $C^*$, but first note that the construction of $G^*$ we saw in the previous subsection was just showing that Reed-Solomon codes [RS60] of block length $q = \Theta((\log n)^g)$ and message length $h = \Theta\left(\frac{\log n}{g \cdot \log\log n}\right)$ over alphabet $\mathbb{F}_q$ with minimum distance $q - h + 1$ have the above properties. The center $s^*$ in that construction was the evaluation of the polynomial $x^h$ over $\mathbb{F}_q$, and $r^*$ was $q - h$.
In general, to construct $G^*$ from $C^*$, we first define a subset $S^* \subseteq \mathbb{F}_q^\ell$ of cardinality $n$ as follows: $S^* = \{s^* - c \mid c \in C^*\}$. We associate the vertices in $A^*$ with the codewords of $C^*$ and the vertices in $B^*$ with the strings in $S^*$. For any $(a, b) \in A^* \times B^*$, let $(a, b) \in E^*$ if and only if $\|b - a\|_0 = r^*$. This completes the construction of $G^*$. We now have to show the following claims about $G^*$: (i) $|E^*| = n^{2-o(1)}$ and (ii) there is $\tau : A^* \cup B^* \to \{0,1\}^{q \cdot \ell}$ that realizes $G^*$.
Item (i) follows rather easily from the properties of $C^*$ and $s^*$. Let $T^*$ be the subset of $C^*$ of all codewords which are at distance exactly equal to $r^*$ from $s^*$. From the definition of $s^*$, we have $|T^*| = |C^*|^{1-o(1)}$. Fix $a \in A^*$. Its degree in $G^*$ is $|T^*| = |C^*|^{1-o(1)}$. This is because for every codeword $t \in T^*$ we have that $t - a$ is a codeword in $C^*$ (from the linearity of $C^*$) and thus $s^* - t + a$ is in $S^*$, and therefore $(a, s^* - t + a) \in E^*$.
For item (ii), consider the identity mapping $\tau^* : A^* \cup B^* \to \mathbb{F}_q^\ell$ that maps each string to itself. It is simple to check that $\tau^*$ realizes $G^*$ in the Hamming metric (with $\beta = r^*$).
Recall from the previous subsection that given $\tau^* : A^* \cup B^* \to \mathbb{F}_q^\ell$ that realizes $G^*$ in the Hamming metric, it is easy to construct $\tau : A^* \cup B^* \to \{0,1\}^{q \cdot \ell}$ that realizes $G^*$ in the Hamming metric with a $q$ multiplicative factor blow-up in the dimension. This completes the proof of both the claims about $G^*$ and gives a general way to prove Theorem 1.4 given the construction of $C^*$ and $s^*$.
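One standard way to pass from symbols of $\mathbb{F}_q$ to $\{0,1\}$ with a factor $q$ blow-up is to replace each symbol by its indicator vector. The sketch below assumes this particular encoding (the passage above does not spell out which encoding is used) and uses hypothetical function names. Every differing symbol contributes exactly 2 to the binary Hamming distance, so all distances are doubled and the realization property is preserved.

```python
def indicator_encode(word, q):
    """Replace each symbol s in {0, ..., q-1} by the unit vector e_s in {0,1}^q."""
    bits = []
    for s in word:
        bits.extend(1 if t == s else 0 for t in range(q))
    return bits

def hamming(u, v):
    return sum(1 for s, t in zip(u, v) if s != t)

def dot(u, v):
    return sum(s * t for s, t in zip(u, v))

u, v = (0, 3, 4, 1), (0, 2, 4, 4)          # two words over F_5 differing in 2 positions
bu, bv = indicator_encode(u, 5), indicator_encode(v, 5)
print(len(bu), hamming(u, v), hamming(bu, bv), dot(bu, bv))
```

Under this encoding the inner product of two encoded words equals the number of positions where the original words agree, that is, the block length minus their Hamming distance.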
Finding Center from Another Code. One thing that might not be clear so far is: where does the center $s^*$ come from? Here we provide a systematic way to produce such an $s^*$, by looking at another code that contains $C^*$. More precisely, let $C^* \subseteq \tilde{C}^* \subseteq \mathbb{F}_q^\ell$ be two linear codes with the same block length and alphabet. Suppose that the distance of $C^*$ is $\Delta$, the distance of $\tilde{C}^*$ is $r^*$ and that $r^* < \Delta$. It is easy to see that, by taking $s^*$ to be any element of $\tilde{C}^* \setminus C^*$, it holds that every codeword in $C^*$ is at distance at least $r^*$ from $s^*$, simply because both $s^*$ and the codewords of $C^*$ are codewords of $\tilde{C}^*$.
Hence, we are only left to argue that there are many codewords of $C^*$ that are at distance exactly $r^*$ from $s^*$. While this is not true in general, we can show by an averaging argument that this is true (for some $s^* \in \tilde{C}^*$) if a large fraction (e.g. a $|C^*|^{-o(1)}$ fraction) of the codewords of $\tilde{C}^*$ have Hamming weight exactly $r^*$ (see Lemma 5.1).
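One way such an averaging argument can go is sketched below; this is our own reconstruction and not necessarily the proof of Lemma 5.1. We write $\tilde{C}^*$ for the larger code containing $C^*$ and introduce $W$ (our notation) for the set of codewords of $\tilde{C}^*$ of Hamming weight exactly $r^*$. Since $r^* < \Delta$, no codeword of $C^*$ lies in $W$, so $W$ is spread over the cosets $s + C^*$ with $s \notin C^*$.

```latex
\max_{s \in \tilde{C}^* \setminus C^*} \bigl| W \cap (s + C^*) \bigr|
  \;\ge\; \frac{|W|}{|\tilde{C}^*| / |C^*|}
  \;=\; \frac{|W|}{|\tilde{C}^*|} \cdot |C^*| ,
\qquad
w = s^* + c \in W \;\Longrightarrow\; \| s^* - (-c) \|_0 = \|w\|_0 = r^* .
```

So if $|W| \ge |C^*|^{-o(1)} \cdot |\tilde{C}^*|$, then taking $s^*$ to be a maximizer gives $|C^*|^{1-o(1)}$ codewords $-c \in C^*$ at distance exactly $r^*$ from $s^*$.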
Indeed, viewed in this light, our previous choice of center for the Reed-Solomon code (i.e. the evaluation of $x^h$) is not coincidental: we simply take $\tilde{C}^*$ to be another Reed-Solomon code with message length $h + 1$ (whereas the base code $C^*$ is of message length $h$).
Comparison to Locally Dense Codes. We end this subsection by remarking that the codes that we seek are very similar to locally dense codes [DMS03,CW12,Mic14], which is indeed our inspiration. A locally dense code is a linear code of block length $\ell$ and large minimum distance $\Delta$, admitting a ball centered at $s$ of radius $r < \Delta$ and containing a large (i.e. $\exp(\mathrm{poly}(\ell))$) number of codewords. Such codes are non-trivial to construct and in particular all known constructions of locally dense codes use codes that beat the Gilbert-Varshamov (GV) bound [Gil52,Var57]; in other words we need to do better than random codes to construct them. This is because (as noted in [DMS03]), for a random code $C \subseteq \mathbb{F}_q^\ell$ (or any code that does not beat the GV bound), a random point in $\mathbb{F}_q^\ell$ acting as the center contains in expectation less than one codeword in a ball of radius $\Delta$. Of course, this is simply an intuition and not a formal proof that a locally dense code needs to beat the GV bound, since there may be more sophisticated ways to pick a center.
Although the codes we require are similar to locally dense codes, there are differences between the two. Below we list four such differences: the first two make it harder for us to construct our codes whereas the latter two make it easier for us.
• We seek a center $s^*$ so that no codeword in $C^*$ lies at distance less than $r^*$, as opposed to locally dense codes which allow codewords to be close to $s^*$. This is indeed where our idea of using another code $\tilde{C}^* \supseteq C^*$ comes in, as picking $s^*$ from $\tilde{C}^* \setminus C^*$ ensures that no codeword of $C^*$ is too close to $s^*$.
• Another difference is that we need the number of codewords at distance $r^*$ from $s^*$ to be very large, i.e., $|C^*|^{1-o(1)}$, whereas locally dense codes allow for a much smaller number of codewords. Indeed, the deterministic constructions from [CW12,Mic14] only yield the bound of $2^{O(\sqrt{\log |C^*|})}$. Hence, these do not directly work for us.
• Locally dense codes require $r$ to be at most $(1 - \varepsilon)\Delta$ for some constant $\varepsilon > 0$, whereas we are fine with any $r^* < \Delta$. In fact, our Reed-Solomon code based construction above only yields $r^* = \Delta - 1$ which would not suffice for locally dense codes. Nevertheless, as we will see later for inapproximability of CP, we will also need the ratio $r^*/\Delta$ to be a constant bounded away from 1, and codes with these extraordinary properties are very hard to find. Indeed, in this case we only manage to prove a weaker lower bound on gap-CP.
• Finally, we remark that locally dense codes are required to be efficiently constructed in $\mathrm{poly}(\log |C^*|)$ time, which is part of why they are hard to find. Specifically, while [DMS03] shows that an averaging argument works for a random center, derandomizing this is a big issue and a few subsequent works are dedicated solely to this issue [CW12,Mic14]. (We also note that it remains open whether a center can be deterministically found for a variant of locally dense codes used in hardness of the parameterized version of the minimum distance problem. See [BGKM18] for more details.) On the other hand, brute force search (over all codewords in $\tilde{C}^*$) suffices to find a center for us, as we are allowed construction time of $\mathrm{poly}(|C^*|)$.

Inapproximability of Closest Pair and Maximum Inner Product
In this subsection, we sketch our inapproximability results for MIP and CP. Both these results use the same reduction that we had from BCP to CP, except that we now need stronger properties from the gadget, i.e., the previously used notion of contact dimension does not suffice anymore. Below we sketch the required strengthening of the gadget properties and explain how to achieve them.

Approximate Maximum Inner Product
Observe that the gadget we construct for CP in Subsection 2.2 can also be written in terms of inner product as follows: there exists a dense balanced bipartite graph G * = (A * ∪ B * , E * ), a mapping τ : A * ∪ B * → {0, 1} q·ℓ such that the following holds.
Notice that we wrote the conditions above in a slightly different way than in previous subsections; previously in the contact dimension notation, (ii) and (iii) would be simply written together as: for all non-edges $(a, b)$, $\langle \tau(a), \tau(b) \rangle < \ell - r^*$. This change is intentional, since, to get a gap in our reductions, we only need a gap between the bounds in (i) and (iii) (but not in (ii)). In particular, to get hardness of approximating MIP, we require $\frac{\ell - r^*}{\ell - \Delta}$ to be at least $(1 + \varepsilon)$ for some $\varepsilon > 0$. From our Reed-Solomon construction above, $\ell - \Delta$ and $\ell - r^*$ are exactly the message length of $C^*$ minus one and the message length of $\tilde{C}^*$ minus one respectively. Previously, we selected these two to be $h$ and $h + 1$. Now to obtain the desired gap, we simply take the larger code $\tilde{C}^*$ to be a Reed-Solomon code with larger (i.e. $(1 + \varepsilon)h$) message length⁸.
Finally, we note that even with the above gadget, the reduction only gives a small (i.e. 1 + o(1)) factor hardness of approximating MIP (Theorem 6.2). To boost the gap to near polynomial, we simply tensor the vectors with themselves (see Section 6).
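The gap-boosting step can be illustrated with a small sketch. It relies only on the standard identity $\langle a^{\otimes k}, b^{\otimes k} \rangle = \langle a, b \rangle^k$, so a multiplicative gap of $(1 + \varepsilon)$ between inner products becomes $(1 + \varepsilon)^k$, while the dimension grows from $d$ to $d^k$. The helper names `inner`, `tensor` and `tensor_power` are illustrative and do not come from the paper; the exact tensoring used in Section 6 may differ in its details.

```python
def inner(a, b):
    """Inner product of two equal-length vectors."""
    return sum(x * y for x, y in zip(a, b))


def tensor(a, b):
    """Flattened tensor (Kronecker) product of two vectors."""
    return tuple(x * y for x in a for y in b)


def tensor_power(a, k):
    """k-fold tensor power of a; the dimension becomes len(a) ** k."""
    result = (1,)
    for _ in range(k):
        result = tensor(result, a)
    return result


a = (1, 1, 0, 1)
b = (1, 0, 0, 1)
k = 3
ak, bk = tensor_power(a, k), tensor_power(b, k)

# Boolean vectors stay Boolean, and the inner product is raised to the k-th power.
print(len(ak), inner(a, b), inner(ak, bk))
```

Since the entries remain in {0, 1}, the amplified instance is still a Boolean MIP instance; the price is the dimension blow-up.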

Approximate Closest Pair
Once again, recall that we have the following gadget from Subsection 2.2: there exists a dense balanced bipartite graph G * = (A * ∪ B * , E * ), a mapping τ : A * ∪ B * → {0, 1} q·ℓ such that the following holds.
Once again, we need a $(1 + \varepsilon)$ gap between the bounds in (iii) and (i), i.e., in the ratio $\Delta / r^*$. Unfortunately, we cannot construct such codes using any of the Reed-Solomon code families. We turn to another type of codes that beat the Gilbert-Varshamov bound: Algebraic-Geometric (AG) codes. Similar to the Reed-Solomon code based construction, we take $C^*$ as an AG code and $\tilde{C}^*$ to be a "higher degree" AG code; getting the desired gap simply means that the distance of $C^*$ must be at least $(1 + \varepsilon)$ times the distance of $\tilde{C}^*$.
Recall from Subsection 2.2 also that, to bound the density of G * , we need a lower bound on the number of minimum weight codewords of C * . Such bounds for AG codes are non-trivial and we turn to the bounds from [ABV01,Vlȃ18]. Unfortunately, this only gives G * with density |C * | −1/2−o(1) , instead of |C * | −o(1) as before. This is indeed the reason that our running time lower bound for approximate CP is only n 1.5−ε .
We are not aware of any result on the (asymptotic) tightness of the bounds from [ABV01, Vlȃ18] that we use. However, improving upon such bounds would have other consequences, such as a better bound on the kissing numbers of lattices constructed in [Vlȃ18]. As a result, it seems likely that more understanding of AG codes (and perhaps even new constructions) is needed in order to improve these bounds.

Preliminaries
In this section we define the geometric problems of interest to this paper, give an alternate proof for the conditional lower bound on bichromatic closest pair, and recall the definition of the contact dimension of a graph.

Notations, Problems and Fine-Grained Hypotheses
Distance Measures. For any two vectors $a, b \in \mathbb{R}^d$, the distance between them in the $\ell_p$-metric is denoted by $\|a - b\|_p = \left(\sum_{i=1}^{d} |a_i - b_i|^p\right)^{1/p}$. Their $\ell_\infty$-distance is $\|a - b\|_\infty = \max_{i \in [d]} |a_i - b_i|$, and their $\ell_0$-distance is $\|a - b\|_0 = |\{i \in [d] : a_i \neq b_i\}|$, the number of coordinates on which $a$ and $b$ differ. More generally, for any two vectors $a, b \in \mathbb{R}^d$ in the $\Delta$-metric, we denote by $\Delta(a, b)$ their distance in that metric space. The $\ell_p$-metrics that are well studied in the literature are the Hamming metric ($\ell_0$-metric), the rectilinear metric ($\ell_1$-metric), the Euclidean metric ($\ell_2$-metric), and the Chebyshev metric ($\ell_\infty$-metric). We denote the inner product (associated with the Euclidean space) of $a$ and $b$ by $\langle a, b \rangle = \sum_{i \in [d]} a_i \cdot b_i$. Finally, for every positive integer $d$ we define the edit metric over $\Sigma$ to be the space $\Sigma^d$ endowed with distance function $\mathrm{ed}(a, b)$, which is defined as the minimum number of character substitutions/insertions/deletions to transform $a$ into $b$.
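As a concrete reading of these definitions, the following minimal sketch computes the distance measures on small vectors. The function names are illustrative only. It also checks a fact used repeatedly later: on {0, 1}-vectors, the $p$-th power of the $\ell_p$-distance equals the Hamming distance for every finite $p \ge 1$.

```python
def lp_distance(a, b, p):
    """ell_p distance for a finite p >= 1."""
    return sum(abs(x - y) ** p for x, y in zip(a, b)) ** (1.0 / p)


def hamming_distance(a, b):
    """ell_0 (Hamming) distance: number of differing coordinates."""
    return sum(1 for x, y in zip(a, b) if x != y)


def chebyshev_distance(a, b):
    """ell_infinity distance: largest coordinate-wise difference."""
    return max(abs(x - y) for x, y in zip(a, b))


def inner_product(a, b):
    return sum(x * y for x, y in zip(a, b))


a = (0, 1, 1, 0)
b = (1, 1, 0, 0)
for p in (1, 2, 3):
    # On Boolean vectors, (ell_p distance)^p coincides with the Hamming distance.
    print(p, lp_distance(a, b, p) ** p, hamming_distance(a, b))
print(chebyshev_distance(a, b), inner_product(a, b))
```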

Problems.
Here we give formal definitions of Orthogonal Vectors (OV), Closest Pair (CP) and Bichromatic Closest Pair (BCP) problems, and also Maximum Inner Product (MIP) and Bichromatic Maximum Inner Product (BMIP) problems.
We will also use gap versions of these problems. For any $\delta \ge 0$, we define $(1 + \delta)$-CP (resp. $(1 + \delta)$-BCP) in the $\Delta$-metric to be the problem of distinguishing between the case where there exist distinct $a, b \in P$ (resp. $a \in A$ and $b \in B$) such that $\Delta(a, b) \le \alpha$ and the case where for all distinct $a, b \in P$ (resp. $a \in A$ and $b \in B$) we have $\Delta(a, b) > (1 + \delta) \cdot \alpha$.
Definition 3.4 (Maximum Inner Product Problem, MIP). In MIP, we are given a collection of $n$ points $P \subseteq \mathbb{R}^d$ and a real $\alpha$, and the goal is to find a pair of distinct points $a, b \in P$ such that $\langle a, b \rangle \ge \alpha$.
Definition 3.5 (Bichromatic Maximum Inner Product Problem, BMIP). In BMIP, we are given two collections of $n$ points $A, B \subseteq \mathbb{R}^d$ and a real $\alpha$, and the goal is to find a pair of points $a \in A$, $b \in B$ such that $\langle a, b \rangle \ge \alpha$.
Again we define the gap versions of these problems as follows. For any $\gamma \ge 1$, we define $\gamma$-MIP (resp. $\gamma$-BMIP) to be the problem of distinguishing between the case where there exist distinct $a, b \in P$ (resp. $a \in A$ and $b \in B$) such that $\langle a, b \rangle \ge \alpha$ and the case where for all distinct $a, b \in P$ (resp. $a \in A$ and $b \in B$) we have $\langle a, b \rangle < \alpha/\gamma$.
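To make the promise structure of the gap problems concrete, here is a brute-force sketch for the distance versions (quadratic time, so only a specification and not an algorithm of interest). The names `min_cp`, `min_bcp` and `gap_decide`, and the third return value for inputs outside the promise, are illustrative choices and not part of the paper.

```python
from itertools import combinations, product


def hamming(a, b):
    return sum(1 for x, y in zip(a, b) if x != y)


def min_cp(points, dist):
    """Smallest distance over all pairs of distinct points (monochromatic)."""
    return min(dist(a, b) for a, b in combinations(points, 2))


def min_bcp(A, B, dist):
    """Smallest distance over all pairs (a, b) with a in A and b in B (bichromatic)."""
    return min(dist(a, b) for a, b in product(A, B))


def gap_decide(min_dist, alpha, delta):
    """Decide the (1 + delta)-gap problem from the true minimum distance."""
    if min_dist <= alpha:
        return "YES"
    if min_dist > (1 + delta) * alpha:
        return "NO"
    return "PROMISE VIOLATED"  # neither case of the gap problem applies


P = [(0, 0, 0, 0), (1, 1, 0, 0), (1, 1, 1, 1)]
A = [(0, 0, 0, 0)]
B = [(1, 1, 1, 1), (1, 1, 1, 0)]
print(gap_decide(min_cp(P, hamming), alpha=2, delta=0.5))
print(gap_decide(min_bcp(A, B, hamming), alpha=1, delta=0.5))
```

The MIP and BMIP gap versions follow the same pattern with `max` over inner products and the thresholds $\alpha$ and $\alpha/\gamma$.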
Hypotheses. Finally, we give formal definitions of the relevant fine-grained hypotheses (see [Wil18b] for a survey on the state-of-the-art conditional lower bounds that are known under these hypotheses).  It is known that SETH implies OVH [Wil05], and therefore in the rest of the paper, we base all our conditional lower bounds on OVH.

Error-Correcting Codes
We recall here a few coding theoretic notations since all of our gadgets are based on error-correcting codes. As is standard in error-correcting codes, we will use $\Delta(a, b)$ to denote $\|a - b\|_0$, the Hamming distance of $a$ and $b$, for any $a, b \in \mathbb{F}_q^N$ and we further define $\Delta(a, S) := \min_{b \in S} \Delta(a, b)$ for any $a \in \mathbb{F}_q^N$ and $S \subseteq \mathbb{F}_q^N$. The weight of $a \in \mathbb{F}_q^N$, denoted by $\Delta(a)$, is simply $\|a\|_0 := |\{i \in [N] : a_i \neq 0\}|$. For $a \in \mathbb{F}_q^N$ and $d \in \mathbb{N}$, we use $B(a, d)$ to denote the (closed) Hamming ball of radius $d$ centered at $a$, i.e., $B(a, d) := \{b \in \mathbb{F}_q^N \mid \Delta(a, b) \le d\}$.
An error-correcting code of block length $N$ over alphabet $\mathbb{F}_q$ is simply a collection of codewords $C \subseteq \mathbb{F}_q^N$. The distance of the code $C$, denoted by $\Delta(C)$, is defined as $\min_{a \neq b \in C} \Delta(a, b)$. A code is said to be linear if $C$ is a subspace of $\mathbb{F}_q^N$. For a linear code $C$, its message length is defined to be the dimension of $C$, or equivalently $\log_q |C|$. We often use the notation $[N, K, D]_q$ to denote a linear code of block length $N$, message length $K$, and distance $D$. The rate and relative distance of a linear $[N, K, D]_q$ code $C$ are defined as $K/N$ and $D/N$ respectively. Note also that, for a linear code $C$, $\Delta(C)$ is equal to the minimum weight of a non-zero codeword of $C$. Finally, for any code $C$, we use $A_w(C) := |\{c \in C \mid \Delta(c) = w\}|$ to denote the number of codewords of weight $w$.
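The notation above can be exercised on a tiny example. The sketch below uses the binary even-weight code of block length 3, a $[3, 2, 2]_2$ linear code chosen here purely for illustration; all function names are hypothetical.

```python
import math
from itertools import combinations, product


def hamming(a, b):
    return sum(1 for x, y in zip(a, b) if x != y)


def weight(a):
    """Number of non-zero coordinates, i.e. Delta(a)."""
    return sum(1 for x in a if x != 0)


def dist_to_set(a, S):
    """Delta(a, S): distance from a to the nearest element of S."""
    return min(hamming(a, b) for b in S)


def ball(a, d, q):
    """Closed Hamming ball B(a, d) inside F_q^N."""
    return [b for b in product(range(q), repeat=len(a)) if hamming(a, b) <= d]


def code_distance(C):
    """Delta(C): minimum distance between two distinct codewords."""
    return min(hamming(a, b) for a, b in combinations(C, 2))


def weight_count(C, w):
    """A_w(C): number of codewords of weight exactly w."""
    return sum(1 for c in C if weight(c) == w)


q = 2
C = [(0, 0, 0), (0, 1, 1), (1, 0, 1), (1, 1, 0)]  # even-weight code
N = len(C[0])
K = round(math.log(len(C), q))  # message length log_q |C|
D = code_distance(C)
print((N, K, D), weight_count(C, D), len(ball((0, 0, 0), 1, q)))
```

For this linear code the distance equals the minimum non-zero weight, matching the remark above.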
Let us also recall the Singleton bound and the definition of maximum distance separable (MDS) codes.  We note here that the above bound and notation are well-defined (or can be naturally extended) also for non-linear codes, but we will only use them in context of linear codes in this paper.

Miscellaneous Tools
Covering Biclique by Isomorphic Graphs. A useful fact we use to derandomize our reductions is that the biclique can be covered by any dense bipartite graph G with only a few graphs that are isomorphic to G. To state this more formally, let us first define a few notions.
Definition 3.10. For any graph G = (V G , E G ) and any permutation π : V G → V G , we use G π to denote the graph (V G π , E G π ) where the vertex set V G π is equal to V G and E G π = {(π(a), π(b)) | (a, b) ∈ E G }.
For brevity, we say that a permutation π : A∪B → A∪B of vertices of a bipartite graph G = (A∪B, E G ) is side-preserving if π(A) = A and π(B) = B.
We can now state the result as follows. The proof, which proceeds via a simple set covering argument, is deferred to Appendix B.
Translating Finite Fields Vectors to {0, 1}-Vectors. Another simple fact, which was already mentioned in the proof overview (Section 2), is that we can embed the Hamming metric on an alphabet of size $q$ into the Hamming metric on the Boolean alphabet, with only a multiplicative factor $q$ blow-up in the dimension:
Proposition 3.12. For any $q, N \in \mathbb{N}$, and alphabet $\Sigma$ such that $|\Sigma| = q$, there exists a mapping ψ :
Proof. The mapping $\psi$ simply replaces each coordinate that is equal to $j \in \Sigma$ by the $j$-th standard basis vector in the $q$-dimensional space. More precisely, $\psi(a) := e_{a_1} \bullet \cdots \bullet e_{a_N}$, where $\bullet$ denotes concatenation of vectors and $e_j$ denotes the $j$-th standard basis vector in $\mathbb{R}^q$, i.e., the vector whose $j$-th coordinate is one and the remaining coordinates are zeroes.
It is simple to check that this satisfies the two requirements.
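The mapping from the proof can be sketched as follows, identifying the alphabet with {0, ..., q − 1} (an assumption made here for indexing). The sketch checks two elementary consequences of the one-hot encoding: each differing coordinate contributes exactly 2 to the Hamming distance of the images, and each agreeing coordinate contributes exactly 1 to their inner product.

```python
def psi(a, q):
    """Replace each symbol j in {0, ..., q-1} by the j-th standard basis vector of R^q."""
    out = []
    for symbol in a:
        block = [0] * q
        block[symbol] = 1
        out.extend(block)
    return tuple(out)


def hamming(a, b):
    return sum(1 for x, y in zip(a, b) if x != y)


def inner(a, b):
    return sum(x * y for x, y in zip(a, b))


q = 3
a = (0, 2, 1)
b = (0, 1, 1)
pa, pb = psi(a, q), psi(b, q)

# Dimension is q * N; distances double; inner product counts agreeing coordinates.
print(len(pa), hamming(a, b), hamming(pa, pb), inner(pa, pb))
```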

OVH-hardness of Exact Bichromatic Closest Pair
Alman and Williams [AW15] showed the conditional hardness (under OVH) of exact BCP in every $\ell_p$-metric even when the point-sets are over {0, 1} via a Turing reduction from OV. David, Karthik, and Laekhanukit [DKL18] gave an alternate proof of the same result where point-sets were over $\mathbb{R}$ via a many-one reduction from OV. For independent interest, below we give another proof, which is a many-one reduction and in which the point-sets are over {0, 1}.
For every i ∈ [n], the i th point of A ′ , say a ′ is constructed from the i th point of A, say a by simply applying T A pointwise on each coordinate of a, i.e., a ′ = (T A (a 1 ), . . . , T A (a d )). Similarly we apply T B pointwise on each coordinate of points in B. It is easy to see that there exists (a

Contact Dimension of a Graph
The central gadget in our reduction from BCP to CP is based on the contact dimension of a graph. Below we reproduce its definition from the proof overview (i.e. Definition 2.1) for convenience.

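Definition 3.15 (Contact Dimension [Pac80]). For any graph $G = (V, E)$, a mapping $\tau : V \to \mathbb{R}^d$ is said to realize $G$ (in the $\ell_p$-metric) if for some $\beta > 0$, the following holds: (i) for all $(u, v) \in E$, $\|\tau(u) - \tau(v)\|_p = \beta$, and (ii) for all distinct $u, v \in V$ with $(u, v) \notin E$, $\|\tau(u) - \tau(v)\|_p > \beta$. The contact dimension (in the $\ell_p$-metric) of $G$, denoted by $\mathrm{cd}_p(G)$, is the minimum $d \in \mathbb{N}$ such that there exists $\tau : V \to \mathbb{R}^d$ realizing $G$ in the $\ell_p$-metric.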
We may also say that τ β-realizes G if we wish to emphasize the value of β.
Note here that we may view points in τ(V) as centers of spheres of radius β/2. No two spheres overlap but they may touch, and G has an edge (u, v) if and only if the spheres centered at τ(u) and τ(v) touch.
For a summary of the bounds on cd(G) for various graphs in the Euclidean metric see [Mae85,FM86,FM88,Mae91] and for a summary of the bounds on $\mathrm{cd}(K_{n,n})$ in various metrics see [DKL18]. For this paper, the following bounds are relevant.  In particular, the above two theorems are the obstacles to the approach of [DKL18] for the $\ell_2$ and Hamming metrics respectively. As discussed in the proof overview, we will overcome these barriers by constructing dense bipartite graphs with low contact dimension in every $\ell_p$-metric.
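The touching-spheres picture translates directly into a checker: edges must sit at distance exactly β and all other pairs strictly farther apart. The sketch below is illustrative only (the name `realizes` and the graph encoding are assumptions) and uses the Hamming metric so that distances are integers and exact equality is safe.

```python
from itertools import combinations


def hamming(a, b):
    return sum(1 for x, y in zip(a, b) if x != y)


def realizes(tau, edges, beta, dist):
    """Check whether the mapping tau (vertex -> point) beta-realizes the graph.

    edges is a set of frozensets {u, v}; every other pair is a non-edge.
    """
    for u, v in combinations(tau, 2):
        d = dist(tau[u], tau[v])
        if frozenset((u, v)) in edges:
            if d != beta:  # edges must be at distance exactly beta
                return False
        elif d <= beta:  # non-edges must be strictly farther apart
            return False
    return True


# A path u - v - w realized in {0, 1}^2 with beta = 1.
tau = {"u": (1, 0), "v": (0, 0), "w": (0, 1)}
path = {frozenset(("u", "v")), frozenset(("v", "w"))}
triangle = path | {frozenset(("u", "w"))}
print(realizes(tau, path, 1, hamming), realizes(tau, triangle, 1, hamming))
```

With a real-valued metric such as $\ell_2$, the equality test would need a tolerance.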
As discussed in Section 2.3.2, we need a generalization of contact dimension in order to show inapproximability for CP. This is formally defined below; it should be noted that the definition only makes sense for bipartite graphs, whereas the original contact dimension is well-defined for any graph. Moreover, when λ = 1, the notion of gap contact dimension coincides with the (non-gap) contact dimension in bipartite graphs.
Definition 3.18 (Gap Contact Dimension). For any bipartite graph G = (A∪B, E) and λ ≥ 1, a mapping τ : V → R d is said to λ-gap-realize G (in the ℓ p -metric) if for some β > 0, the following holds: The λ-gap contact dimension (in the ℓ p -metric) of G, denoted by λ-cd p (G), is the minimum d ∈ N such that there exists τ : V → R d λ-gap-realizing G in the ℓ p -metric.
Again, we may say that τ (β, λ)-gap-realizes G to emphasize the value of β.
Finally, we define an analogous notion for inner product:

Definition 3.19 (Gap Inner Product Dimension). For any bipartite graph $G = (A \cup B, E)$ and $\lambda \ge 1$, a mapping $\tau : V \to \mathbb{R}^d$ is said to λ-gap-IP-realize $G$ if for some $\beta > 0$, the following holds:
(iii) For all distinct $u, v$ both from $A$ or both from $B$, $\langle \tau(u), \tau(v) \rangle < \beta/\lambda$.
The λ-gap inner product dimension of G, denoted by λ-ipd(G), is the minimum d ∈ N such that there exists τ : V → R d λ-gap-IP-realizing G.
We may say that τ (β, λ)-gap-IP-realizes G to emphasize the value of β.

Lower Bound on Closest Pair under Orthogonal Vector Hypothesis
In this section, we prove the subquadratic hardness for CP (assuming OVH) using the efficient construction of a realization of a dense bipartite graph. The construction will be formally stated below and the details will be given in Section 5.2.1. First, we define the notion of a log-dense sequence of integers:
Definition 4.1. A sequence $(n_i)_{i \in \mathbb{N}}$ of increasing positive integers is said to be log-dense if there exists a constant $C \ge 1$ such that $\log n_{i+1} \le C \cdot \log n_i$ for all $i \in \mathbb{N}$.
As outlined in Section 2.1, we use Reed-Solomon codes to construct a family of dense bipartite graphs with low contact dimensions. While the construction does not yield a graph for every number of vertices n, it does yield a graph for a log-dense sequence of numbers of vertices, which turns out to be sufficient for the purpose of the reduction. More formally, we will prove the following in Section 5.2.1.
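As a numerical sanity check of log-density, the sketch below looks at the sequence $n_i = q_i^{\lfloor q_i^{\delta} \rfloor}$ (with $q_i$ the $i$-th prime) that is used later in the proof of Theorem 4.2, and reports the largest ratio $\log n_{i+1} / \log n_i$ over a finite prefix. It works with $\log n_i$ directly to avoid huge integers. A bounded ratio on a prefix is only evidence, not a proof; the choice $\delta = 0.5$, the prefix length and the helper names are arbitrary illustrative choices.

```python
import math


def primes(count):
    """First `count` primes by trial division (fine for small counts)."""
    found = []
    candidate = 2
    while len(found) < count:
        if all(candidate % p for p in found if p * p <= candidate):
            found.append(candidate)
        candidate += 1
    return found


def log_n(q, delta):
    """log of n = q ** floor(q ** delta), computed without forming n."""
    return math.floor(q ** delta) * math.log(q)


def max_log_ratio(log_values):
    """Largest ratio log n_{i+1} / log n_i over the given prefix."""
    return max(nxt / cur for cur, nxt in zip(log_values, log_values[1:]))


delta = 0.5
logs = [log_n(q, delta) for q in primes(200)]
print(max_log_ratio(logs))  # a candidate value for the constant C on this prefix
```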

Theorem 4.2. For every $0 < \delta < 1$, there exists a log-dense sequence $(n_i)_{i \in \mathbb{N}}$ such that, for every $i \in \mathbb{N}$, there is a bipartite graph
Notice that we did not specify any $\ell_p$-metric in the notion of contact dimension above. This is intentional, because our point sets $\tau(A_i \cup B_i)$ have coordinate entries in {0, 1}, for which the distances in the Hamming metric are equivalent (up to power of $p$) to distances in any $\ell_p$-metric ($p \neq \infty$). We also adopt this notational convenience below. Specifically, we will prove the following theorem which states that CP is hard even when the points are from $\{0, 1\}^d$; clearly, this also implies Theorem 1.4 due to the aforementioned equivalence to other $\ell_p$-metrics.
Proof. For any $\varepsilon > 0$, let $C_{\exp}$ be the constant such that the dimension guarantee for τ in Theorem 4.2 is at most $(\log n_i)^{C_{\exp}/\varepsilon}$ for $\delta = \varepsilon/2$. We define $s_\varepsilon$ as $2 \cdot C_{\exp}/\varepsilon + 2$.
Assume that there exists $\varepsilon > 0$ and an algorithm A that can solve CP in time $n^{2-\varepsilon}$ in the Hamming metric for any input of $n$ points in $\{0, 1\}^{(\log n)^{s_\varepsilon}}$. We will construct an algorithm A′ that solves any instance of BCP in time $n^{2-\varepsilon'}$ for some constant $\varepsilon' > 0$ (to be specified below), on $n$ points in dimension $d := c_{\varepsilon'} \cdot \log n$ with coordinate entries in {0, 1}. Together with Theorem 3.13, this implies that OVH is false, arriving at a contradiction.

Let
3. We use the algorithm from Lemma 3.11 to find π 1 , . . . ,
4. We assume w.l.o.g.⁹ that $n$ is divisible by $n'$. Partition $A$ and $B$ into $A_1, \ldots, A_{n/n'}$ and $B_1, \ldots, B_{n/n'}$, each of size $n'$. For each $i, j \in [n/n']$, $t \in [k]$, do the following:
(a) Let $\tau_t$ be an appropriate permutation of τ that β-realizes $G'_{\pi_t}$. Label the vertices of $G'_{\pi_t}$ with the points in $A_i \cup B_j$.
i.e., the concatenation of $d + 1$ copies of $v$.
(c) Run A on $(A_i^t \cup B_j^t, \alpha')$. If A outputs YES, then output YES and terminate.
5. If none of the executions of A returns YES, then output NO.
Observe that the bottleneck in the running time of the algorithm is in the executions of A. The number of executions is $(n/n')^2 \cdot k$ and each execution takes $O((n')^{2-\varepsilon})$ time. Hence, in total the running time of the algorithm A′ is $O((n/n')^2 \cdot k \cdot (n')^{2-\varepsilon}) \le O(n^2 \log n \cdot (n')^{-\varepsilon/2})$. Now, from the log-density of the sequence from Theorem 4.2, we have $n' \ge n^{0.1/C_\varepsilon} = n^{10\varepsilon'/\varepsilon}$. As a result, the running time of A′ is at most $O(n^{2-5\varepsilon'} \log n) \le O(n^{2-\varepsilon'})$ as desired.
To see the correctness of the algorithm, first observe that the dimensions of vectors in $A_i^t, B_j^t$ are at most $d + (d + 1) \cdot (\log n')^{C_{\exp}/\varepsilon}$ which is at most $(\log n)^{s_\varepsilon}$ for any sufficiently large $n$; that is, the calls to A are valid. Next, observe that, if $(A, B, \alpha)$ is a YES instance of BCP, there must be $i, j \in [n/n']$ and a * ∈ is a YES instance for CP and A′ outputs YES as desired. Finally, assume that $(A, B, \alpha)$ is a NO instance of BCP. Consider any $i, j \in [n/n']$ and $t \in [k]$. To argue that $(A_i^t \cup B_j^t, \alpha')$ is a NO instance for CP, we have to show that any two points in $A_i^t \cup B_j^t$ have distance more than $\alpha'$. To see this, let us consider two cases.

One of the point is from A t i and the other from
Combining the two implies that the Hamming distance between a • (1 d+1 ⊗ τ t (a)) and b • (1 d+1 ⊗ τ t (b)) is more than α ′ .
Hence, (A t i∪ B t j , α ′ ) must be a NO instance for CP for every t ∈ [k] and i, j ∈ [n/n ′ ]. Thus, A ′ outputs NO as desired.
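The shape of the points used in the reduction, $a \bullet (1_{d+1} \otimes \tau_t(a))$, can be illustrated with a short sketch. Appending $d + 1$ copies of the gadget vector makes the Hamming distance of two composed points equal to $\Delta(a, b) + (d + 1) \cdot \Delta(\tau_t(a), \tau_t(b))$; since $\Delta(a, b) \le d$, a difference of even one unit in the gadget part outweighs anything the original coordinates can contribute. The function name `compose` and the toy vectors are illustrative assumptions.

```python
def hamming(a, b):
    return sum(1 for x, y in zip(a, b) if x != y)


def compose(a, gadget_vec):
    """Return a followed by (d + 1) copies of gadget_vec, where d = len(a)."""
    d = len(a)
    return tuple(a) + tuple(gadget_vec) * (d + 1)


a, tau_a = (0, 1, 1), (1, 0)
b, tau_b = (1, 1, 0), (0, 1)
d = len(a)

lhs = hamming(compose(a, tau_a), compose(b, tau_b))
rhs = hamming(a, b) + (d + 1) * hamming(tau_a, tau_b)
print(lhs, rhs)
```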

Gadget Constructions
In this section, we construct all the gadgets that are used in our reductions, including the basic gadget (Theorem 4.2) and more advanced gadgets used for MIP and approximate version of CP.

Finding a Center of a Code via Another Code
At the heart of all our gadgets is the task of finding a code $C_1$ and a center $s$ such that there are $|C_1|^{1-o(1)}$ many codewords at Hamming distance exactly equal to $r$ (for some $r > 0$) from $s$ but there is no codeword in $C_1$ at distance less than $r$ from $s$. The lemma below is useful in finding such an $s$.
Lemma 5.1. Let $C_1 \subseteq C_2 \subseteq \mathbb{F}_q^N$ be two linear codes with the same block length $N$ and alphabet $\mathbb{F}_q$ such that $\Delta(C_2) < \Delta(C_1)$. Then, there exists a center $s \in \mathbb{F}_q^N$ such that (1) $\Delta(s, C_1) \ge \Delta(C_2)$ and (2) $|B(s, \Delta(C_2)) \cap C_1| / |C_1| \ge A_{\Delta(C_2)}(C_2) / |C_2|$. Moreover, given $C_1, C_2$, such an $s$ can be found in $O(|C_1| \cdot |C_2| \cdot qN)$ time.
Proof. We show that there exists $s \in C_2 \setminus C_1$ such that (2) holds. Note that (1) then holds immediately: for any $c \in C_1$, the difference $s - c$ is a non-zero codeword of $C_2$ (it lies in $C_2$ by linearity and is non-zero because $s \notin C_1$), which implies that $\Delta(s, c) \ge \Delta(C_2)$.
It remains to show that there exists $s \in C_2 \setminus C_1$ such that $|B(s, \Delta(C_2)) \cap C_1| \ge |C_1| \cdot A_{\Delta(C_2)}(C_2) / |C_2|$. We will in fact show a stronger statement: for a random $s \in C_2 \setminus C_1$, we have . Plugging this back into the above equality, we have Thus, there must exist a center $s \in C_2 \setminus C_1$ that satisfies (2) (and also (1)) as desired.
Finally, note that $s$ can be found by a brute force algorithm that tries every $s \in C_2$ and checks whether (2) is satisfied; this algorithm takes $O(|C_1| \cdot |C_2| \cdot qN)$ time.
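The brute-force search can be tried on a toy instance. The sketch below instantiates $C_1 \subseteq C_2$ as Reed-Solomon codes over $\mathbb{F}_5$ with message lengths 2 and 3, evaluated at all five field elements; these parameters and all function names are illustrative assumptions, and the search additionally checks condition (1) explicitly, which is slightly more than the proof requires.

```python
from itertools import product


def rs_code(q, k):
    """All evaluations on F_q (q prime) of polynomials of degree < k."""
    return [tuple(sum(c * pow(x, j, q) for j, c in enumerate(coeffs)) % q
                  for x in range(q))
            for coeffs in product(range(q), repeat=k)]


def hamming(a, b):
    return sum(1 for x, y in zip(a, b) if x != y)


def weight(a):
    return sum(1 for x in a if x != 0)


def find_center(C1, C2):
    """Try every s in C2; return the first one satisfying (1) and (2) of Lemma 5.1."""
    d2 = min(weight(c) for c in C2 if any(c))        # Delta(C2) for a linear code
    a_d2 = sum(1 for c in C2 if weight(c) == d2)     # A_{Delta(C2)}(C2)
    for s in C2:
        dists = [hamming(s, c) for c in C1]
        in_ball = sum(1 for d in dists if d <= d2)   # |B(s, Delta(C2)) ∩ C1|
        if min(dists) >= d2 and in_ball * len(C2) >= a_d2 * len(C1):
            return s, d2, in_ball, a_d2
    return None


C1, C2 = rs_code(5, 2), rs_code(5, 3)
s, d2, in_ball, a_d2 = find_center(C1, C2)
print(s, d2, in_ball, a_d2)
```

Comparing `in_ball * len(C2)` against `a_d2 * len(C1)` keeps the test of condition (2) in exact integer arithmetic.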

Gadgets based on Reed-Solomon Codes
In this subsection, we construct gadgets based on the Reed Solomon codes, which are defined below.
In order to find a good center s, we use the following (well-known) bound on the number of minimum weight codewords of Reed Solomon codes (and more generally MDS codes). For a reference of this bound, see e.g. [MS77, Ch. 11, Theorem 6].

The Basic Gadget: Dense Bipartite Graphs with Low Contact Dimensions
Now we construct a dense bipartite graph with low contact dimension. A proof sketch of this construction was provided in Section 2.1 and was formally stated as Theorem 4.2.
Proof of Theorem 4.2. Let $q_i$ be the $i$-th prime number and let $n_i = q_i^{\lfloor q_i^{\delta} \rfloor}$; it is simple to see that the sequence $(n_i)_{i \in \mathbb{N}}$ is log-dense. For $q = q_i$, consider the Reed-Solomon codes $C_1 = \mathrm{RS}_q[q, K_1]$ and $C_2 = \mathrm{RS}_q[q, K_2]$ where $K_1 = \lfloor q^{\delta} \rfloor$ and $K_2 = K_1 + 1$. Applying Lemma 5.1 with $(C_1, C_2)$ implies that there exists a center $s \in C_2$ such that where the last equality follows from the fact that $|C_1| = q^{K_1}$.
We construct the graph G i = (A i , B i , E i ) and a realization τ as follows. Let can be easily realized by applying the mapping ψ : F q q → {0, 1} q 2 from Proposition 3.12. More precisely, let τ be the restriction of ψ on A i ∪ B i . Below we argue about the density of G i and that τ is a 2∆(C 2 )-realization of G i .
• Second, notice that, for every v 1 , v 2 both from A i or both from . Moreover, the inequality is an equality if and only if ∆(a, b) = ∆(C 2 ), i.e., (a, b) ∈ E i as desired.
As for the running time of constructing $G_i$ and τ, observe that the bottleneck is the running time needed to find the center $s$; according to Lemma 5.1, $s$ can be computed in $O(|C_1| \cdot |C_2| \cdot q^2) = O(n_i^2 \cdot q^2)$ time, which is $n_i^{2+o(1)}$ as desired.
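A toy version of the gadget can be checked by brute force. The sketch below makes several assumptions that are not confirmed by the text above: it takes $A = C_1$ and $B = s + C_1$, declares $(a, b)$ an edge exactly when $\Delta(a, b) = \Delta(C_2)$, fixes the center $s$ to the evaluation of $x^2$ rather than searching for it, and uses the small parameters $q = 5$, $K_1 = 2$, $K_2 = 3$. The embedding is the one-hot mapping ψ, so the expected realization parameter is $\beta = 2\Delta(C_2)$.

```python
from itertools import combinations, product

Q, K1, K2 = 5, 2, 3  # toy parameters (assumption), Q prime


def evaluate(coeffs):
    """Reed-Solomon codeword: evaluations of the polynomial on all of F_Q."""
    return tuple(sum(c * pow(x, j, Q) for j, c in enumerate(coeffs)) % Q
                 for x in range(Q))


def hamming(a, b):
    return sum(1 for x, y in zip(a, b) if x != y)


def one_hot(word):
    """The mapping psi: each symbol becomes a standard basis vector of R^Q."""
    return tuple(1 if j == sym else 0 for sym in word for j in range(Q))


C1 = [evaluate(c) for c in product(range(Q), repeat=K1)]
s = evaluate((0, 0, 1))                       # x^2, a codeword of C2 outside C1
A = C1
B = [tuple((x + y) % Q for x, y in zip(s, c)) for c in C1]
dist_C2 = Q - K2 + 1                          # distance of the RS code C2
edges = {(a, b) for a in A for b in B if hamming(a, b) == dist_C2}
beta = 2 * dist_C2
tau = {v: one_hot(v) for v in A + B}


def is_realization():
    """Edges at distance exactly beta, every other pair strictly farther."""
    for u, v in combinations(A + B, 2):
        d = hamming(tau[u], tau[v])
        is_edge = (u, v) in edges or (v, u) in edges
        if is_edge and d != beta:
            return False
        if not is_edge and d <= beta:
            return False
    return True


print(len(A), len(B), len(edges), is_realization())
```

In this toy instance every vertex has the same degree, since shifting by a codeword of $C_1$ preserves the edge relation.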

A Gadget for Maximum Inner Product
Now, we build gadgets (stated below) which will be used for proving the inapproximability of MIP.
Theorem 5.4. For every $0 < \delta < 1$, there exists a log-dense sequence $(n_i)_{i \in \mathbb{N}}$ such that, for every $i \in \mathbb{N}$, there is a bipartite graph
Proof. The proof here is exactly the same as the proof of Theorem 4.2, except that we will not pick $K_2 = K_1 + 1$, but rather pick $K_2 > 3K_1$ (and $n_i$ accordingly).
More precisely, let $q_i$ be the $i$-th prime number and let $n_i = q_i^{\lfloor q_i^{0.3\delta}/3 \rfloor}$; it is simple to see that the sequence $(n_i)_{i \in \mathbb{N}}$ is log-dense. For $q = q_i$, consider the Reed-Solomon codes $C_1 = \mathrm{RS}_q[q, K_1]$ and $C_2 = \mathrm{RS}_q[q, K_2]$ where $K_1 = \lfloor q^{0.3\delta}/3 \rfloor$ and $K_2 = 3K_1 + 1$. Similar to the proof of Theorem 4.2, applying Lemma 5.1 with $(C_1, C_2)$ implies that there exists $s \in C_2 \setminus C_1$ such that
We construct the graph $G_i = (A_i, B_i, E_i)$ and a realization τ as follows.
}. G i can be easily 3-gap-IP-realized by applying the mapping ψ : F q q → {0, 1} q 2 from Proposition 3.12. More precisely, let τ be the restriction of ψ on A i ∪ B i . Below we argue about the density of G i and that τ is a (K 2 − 1, 3)-gap-IP-realization of G i .
• Second, for every v 1 , v 2 both from A i or both from Moreover, the inequality is an equality if and only if ∆(a, b) = ∆(C 2 ), i.e., (a, b) ∈ E i as desired.

Gadgets based on AG Codes
In this subsection, we construct gadgets based on algebraic geometric (AG) codes. The definitions of AG Codes are well beyond the scope of this work and we refer the readers to [Sti08,VNT07] for more thorough introductions.
Once again, to find a good center, we need a bound on the number of minimum weight codewords. On this front, we use the following bound¹⁰ from [Vlȃ18]. Throughout this subsection, we follow the notations from [Vlȃ18].
Theorem 5.5 (Theorem 4.3 of [Vlȃ18]). Let $q$ be a prime power, $X$ be a curve of genus $g$ over $\mathbb{F}_q$, let $S \subseteq X(\mathbb{F}_q)$ such that $|S| = N$, and let $a \in \mathbb{N}$ with $1 \le a \le N - 1$. Then, there exists an $\mathbb{F}_q$-positive divisor $D \ge 0$, $\deg(D) = a$, such that the corresponding AG Code $C = C(X, D, S)$ has minimum distance $N - a$ and
We also need the following well-known (central) fact about the parameters of AG codes.
Theorem 5.6. Let q be a prime power, X be a curve of genus g over F q , let S ⊆ X(F q ) such that |S| = N, and let a ∈ N with 1 ≤ a ≤ N − 1. Then, the corresponding AG Code C = C(X, D, S) is a linear code over F q with block length N, distance at least N − a and message length k ≥ a − g + 1.
Recall also the tower of functions of Garcia and Stichtenoth [GS96], whose parameters approach the TVZ bound. We note here that, it suffices for us to have the genus approaching Ω(N/ √ q) and there are also other curves that satisfy this.

Theorem 5.7 ([GS96]). For any $\zeta > 0$ and any square of prime $q$, there exists a dense sequence¹¹ $(N_i)_{i \in \mathbb{N}}$ such that there exists a curve $X_i$ with genus at most N i
Plugging the bound from [Vlȃ18] into the above family of curves immediately yields the following:
Lemma 5.8. For any $\zeta > 0$ and any square of prime $q$, there exists a dense sequence $(N_i)_{i \in \mathbb{N}}$ such that the following holds. For any $i \in \mathbb{N}$ and any $a_1, a_2 \in \mathbb{N}$ such that $1 \le a_1 < a_2 \le N_i - 1$, there exist linear codes $C_1 \subseteq C_2 \subseteq \mathbb{F}_q^{N_i}$ such that the following holds, where $g_i = N_i \left( \frac{1}{\sqrt{q} - 1} + \zeta \right)$:
• $C_1$ has message length at least $a_1 - g_i + 1$ and distance at least $N_i - a_1$.
• $C_2$ has message length at least $a_2 - g_i + 1$ and distance exactly $N_i - a_2$ and
Moreover, the generator matrices of $C_1, C_2$ can be computed in O ( N+a 2 −1
Proof. Let $(N_i)_{i \in \mathbb{N}}$ be a dense sequence as in Theorem 5.7. From Theorem 5.5, there exists an $\mathbb{F}_q$-positive divisor $D_2$ of degree $a_2$ such that the corresponding code $C_2 = C(X_i, D_2, S_i)$ (where $S_i \subseteq X_i(\mathbb{F}_q)$ is of size $N_i$) satisfies (3) and that its distance is $N_i - a_2$; from Theorem 5.6, its message length must also be at least $a_2 - g_i + 1$. Next, let $D_1$ be any $\mathbb{F}_q$-positive divisor of degree $a_1$ such that $D_2 - D_1 \ge 0$. Let $C_1 = C(X_i, D_1, S_i)$ be the corresponding AG code; once again, Theorem 5.6 yields the desired bounds on its message length and distance. Finally, observe that $D_2 - D_1 \ge 0$ implies that $C_1 \subseteq C_2$ as desired.
The main bottleneck in algorithmically constructing such codes lies in finding $D_2$. Nevertheless, the total number of degree-$a_2$ $\mathbb{F}_q$-positive divisors is only $\binom{N_i + a_2 - 1}{a_2}$. We can use brute force to enumerate all of them and check whether the corresponding code satisfies (3), which further takes $|C_2|$ time. This results in the claimed running time.
Finally, we can now construct our gadgets, by an appropriate setting of parameters. In particular, a 1 and a 2 will be selected to be close to each other and to both be slightly larger than N/ √ q. This results in the graphs whose degrees are roughly square root of the number of vertices.
Theorem 5.9. For every $0 < \delta < 1$, there exist $\mu > 0$ and a log-dense sequence $(n_i)_{i \in \mathbb{N}}$ such that, for every $i \in \mathbb{N}$, there is a bipartite graph
Proof. Once again, the proof here is similar to those of Theorems 4.2 and 5.4, except that we use the (pairs of) AG codes from Lemma 5.8 instead of Reed-Solomon codes.
Let $q \ge 49$ be any sufficiently large square of prime and $\zeta > 0$ be any sufficiently small positive real number (both to be precisely specified later).
Let $(N_i)_{i \in \mathbb{N}}$ be the sequence guaranteed by Lemma 5.8. Let $a_1 = N_i \cdot \left( \frac{1}{q^{0.5(1-\delta)}} - \frac{1}{q} \right)$ and $a_2 = \frac{N_i}{q^{0.5(1-\delta)}}$. For convenience, we assume that $a_1$ and $a_2$ are integers¹². Let $C_1, C_2$ be the codes given by Lemma 5.8. The sequence $(n_i)_{i \in \mathbb{N}}$ is defined as $n_i = |C_1|$.
Applying Lemma 5.1 to (C 1 , C 2 ) implies that there exists s ∈ C 2 \ C 1 such that where o(1) terms above denote the terms that go to zero as q → ∞ and ζ → 0. As a result, by picking q sufficiently large and ζ sufficiently small, the term in (4) is at least Ω(|C 1 | −0.5−δ ).
We construct the graph G i = (A i , B i , E i ) and a realization τ as follows.
can be easily realized by applying the mapping ψ : F q q → {0, 1} q 2 from Proposition 3.12. More precisely, let τ be the restriction of ψ on A i ∪ B i . Below we argue about the density of G i and that τ is a (2∆( Let us now check that G i and τ satisfy all the claimed properties: ).
Moreover, the running time to construct C 1 and C 2 , as given by Lemma 5.8, is where the last two inequalities are true for any sufficiently large q.

Inapproximability of Maximum Inner Product
In this section, we prove the hardness of approximating MIP. Once again, we show a stronger version (than Theorem 1.6) where every point has Boolean coordinates, as stated below.
The proof proceeds in two steps: first, we show hardness of approximating MIP in low dimension but with a small $(1 + o(1))$ approximation factor. Second, we use the tensor product operation to amplify the gap to be almost polynomial, as stated in Theorem 6.1. More specifically, in the first step, we prove the following:
Theorem 6.2. Assuming OVH, for every $\varepsilon > 0$, there exists $s_\varepsilon > 0$ such that no algorithm running in $O(n^{2-\varepsilon})$ time can solve $\left(1 + \frac{1}{\log\log n}\right)$-MIP even for points in $\{0, 1\}^{(\log n)^{s_\varepsilon}}$.
Note that the factor $\frac{1}{\log\log n}$ is not significant, and this can be replaced by any $o(1)$ factor; we use this just to make the calculations more concrete. Before we move on to the proof of Theorem 6.2, let us first show how it implies Theorem 6.1.
In other words, if there is an $O(n^{2-\varepsilon})$ time algorithm for γ-MIP in $n^{o(1)}$ dimension, then there also exists an $O(n^{2-\varepsilon})$ time algorithm for $\left(1 + \frac{1}{\log\log n}\right)$-MIP in $(\log n)^{s_\varepsilon}$ dimension. Thus, Theorem 6.1 follows from Theorem 6.2.
The rest of this section is devoted to proving Theorem 6.2. To do so, we consider the gap-Additive-BMIP problem.
Definition 6.3 (γ-Additive-BMIP problem). Let $\gamma \ge 0$. In the γ-Additive-BMIP problem we are given two sets $A, B$ each of $n$ points in $\{0, 1\}^d$ and an integer $\alpha \in [d]$ as input, and the goal is to distinguish between the following two cases.
• Completeness. There exists $(a, b) \in A \times B$ such that $\langle a, b \rangle \ge \alpha$.
• Soundness. For every (a, b)
We need the below hardness result from [Rub18]. Note that the result is stated differently in [Rub18]; for how the result in [Rub18] implies the one below, see Section 3.2 of [Che18a].
Proof of Theorem 6.2. For any $\varepsilon > 0$, let $C_{\exp}$ be the constant such that the dimension of τ in Theorem 5.4 is at most $(\log n_i)^{C_{\exp}/\varepsilon}$ for $\delta = \varepsilon/2$. We define $s_\varepsilon$ as $2 \cdot C_{\exp}/\varepsilon + 2$.
Suppose contrapositively that there exists $\varepsilon > 0$ and an algorithm A that can solve $\left(1 + \frac{1}{\log\log n}\right)$-MIP of dimension $(\log n)^{s_\varepsilon}$ in time $n^{2-\varepsilon}$. We will construct an algorithm A′ that solves $(\log n)$-Additive-BMIP in time $n^{2-\varepsilon'}$ for some constant $\varepsilon' > 0$ (to be specified below) for d = (log n log log n) dimensions. Together with Theorem 6.4, this implies that OVH is false, as desired.
Let $C_\varepsilon$ denote the constant of the log-dense sequence from Theorem 5.4 for $\delta = \varepsilon/2$, and let $\varepsilon'$ be $0.01 \cdot \varepsilon / C_\varepsilon$. The algorithm A′ on input $(A, B, \alpha)$ works as follows:
1. Let $n'$ be the largest number in the sequence from Theorem 5.4 with $\delta = \varepsilon/2$ s.t.
Let
3. We use the algorithm from Lemma 3.11 to find π 1 , . . . ,
(c) Run A on $(A_i^t \cup B_j^t, \alpha')$. If A outputs YES, then output YES and terminate.
5. If none of the executions of A returns YES, then output NO.
Observe that the bottleneck in the running time of the algorithm is in the executions of A. The number of executions is $(n/n')^2 \cdot k$ and each execution takes $O((n')^{2-\varepsilon})$ time. Hence, in total the running time of the algorithm A′ is $O((n/n')^2 \cdot k \cdot (n')^{2-\varepsilon}) \le O(n^2 \log n \cdot (n')^{-\varepsilon/2})$. Now, from the log-density of the sequence from Theorem 5.4, we have $n' \ge n^{0.1/C_\varepsilon} = n^{10\varepsilon'/\varepsilon}$. As a result, the running time of A′ is at most $O(n^{2-5\varepsilon'} \log n) \le O(n^{2-\varepsilon'})$ as desired.
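For readers who want the exponent bookkeeping spelled out, the running-time bound above can be unpacked as follows. The first inequality is taken from the text as given; it amounts to $k \le O((n')^{\varepsilon/2} \log n)$, which is assumed here to be the guarantee of Lemma 3.11 for a graph of this density.

```latex
\begin{align*}
\Big(\frac{n}{n'}\Big)^{2} \cdot k \cdot (n')^{2-\varepsilon}
  &= n^{2} \cdot k \cdot (n')^{-\varepsilon} \\
  &\le O\big(n^{2} \log n \cdot (n')^{-\varepsilon/2}\big)
     && \text{(assumed bound on } k\text{)} \\
  &\le O\big(n^{2} \log n \cdot n^{-(10\varepsilon'/\varepsilon)\cdot(\varepsilon/2)}\big)
     && \text{since } n' \ge n^{10\varepsilon'/\varepsilon} \\
  &= O\big(n^{2-5\varepsilon'} \log n\big)
   \;\le\; O\big(n^{2-\varepsilon'}\big).
\end{align*}
```

The identity $0.1/C_\varepsilon = 10\varepsilon'/\varepsilon$ used in the text is just the definition $\varepsilon' = 0.01 \cdot \varepsilon / C_\varepsilon$ rearranged.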
To see the correctness of the algorithm, first observe that the dimensions of vectors in $A_i^t, B_j^t$ are at most $\beta \cdot d + 3d \cdot (\log n')^{C_{\exp}/\varepsilon}$ which is at most $(\log n)^{s_\varepsilon}$ for any sufficiently large $n$; that is, the calls to A are valid. Next, observe that, if $(A, B, \alpha)$ is a YES instance of Additive-BMIP, there must be $i, j \in [n/n']$ and a * ∈ is a YES instance for MIP and A′ outputs YES as desired. Finally, let us assume that $(A, B, \alpha)$ is a NO instance of $(\log n)$-Additive-BMIP. Consider any $i, j \in [n/n']$ and $t \in [k]$. To argue that $(A_i^t \cup B_j^t, \alpha')$ is a NO instance for $\left(1 + \frac{1}{\log\log n'}\right)$-MIP, we have to show that any two points in $A_i^t \cup B_j^t$ have inner product less than $\alpha' / \left(1 + \frac{1}{\log\log n'}\right)$. To see this, let us consider two cases.
1. The two points are either both from $A_i^t$ or both from $B_j^t$. Assume w.l.o.g. that the two points are from $A_i^t$; let them be $(1_\beta \otimes a) \bullet (1_{3d} \otimes \tau_t(a))$ and $(1_\beta \otimes a') \bullet (1_{3d} \otimes \tau_t(a'))$. Recall that, from Theorem 5.4, we must have $\langle \tau_t(a), \tau_t(a') \rangle < \beta/3$. Moreover, since $a, a' \in \{0, 1\}^d$, we have $\langle a, a' \rangle \le d$. Thus, we can conclude that which is less than $\alpha' / \left(1 + \frac{1}{\log\log n'}\right)$ for any sufficiently large $n$.
2. One of the points is from $A_i^t$ and the other from $B_j^t$. Since $(A, B, \alpha)$ is a NO instance of $(\log n)$-Additive-BMIP, we must have $\langle a, b \rangle < \alpha - \log n$. Furthermore, from Theorem 5.4, we must have $\langle \tau_t(a), \tau_t(b) \rangle \le \beta$. Combining the two implies that
where the second-to-last inequality holds for any sufficiently large $n$.
Hence, (A t i∪ B t j , α ′ ) must be a NO instance for 1 + 1 log log n ′ -MIP for every t ∈ [k] and i, j ∈ [n/n ′ ]. Thus, A ′ outputs NO as desired.

Inapproximability of Closest Pair
In this section, we prove the hardness of approximating CP (Theorem 1.5). As usual, we reduce from the bichromatic version of the problem; the lower bound for the bichromatic version is stated below.

Theorem 7.1 (Rubinstein [Rub18]). Assuming OVH, for every $\varepsilon > 0$ there exists $\kappa > 0$ such that there is no algorithm running in $n^{2-\varepsilon}$ time for $(1 + \kappa)$-BCP in the Hamming metric. Moreover, this holds even for instances $(A, B, \alpha)$ of $(1 + \kappa)$-BCP when $d = \Theta_\varepsilon(\log n)$, $\alpha = \Theta_\varepsilon(\log n)$ and $A, B \subseteq \{0, 1\}^d$.
Again, we prove below the inapproximability of the gap-CP problem for Boolean vectors. Clearly, this immediately implies Theorem 1.5.
Theorem 7.2. Assuming OVH, for every $\varepsilon > 0$, there exist $\theta > 0$ and $c > 0$ such that there is no algorithm running in $n^{1.5-\varepsilon}$ time for $(1 + \theta)$-CP in the Hamming metric for point-sets in $\{0, 1\}^{c \cdot \log n}$.
Proof. Assume towards a contradiction that there exist an $\varepsilon > 0$ and an algorithm $\mathcal{A}$ that, for every $\theta > 0$, solves $(1 + \theta)$-CP of dimension $c \cdot \log n$ in time $O(n^{1.5-\varepsilon})$, where $c := c(\varepsilon)$ is a constant that will be specified later. Let $\varepsilon' > 0$ be a small constant (depending on $\varepsilon$) that we will specify below and let $\kappa = \kappa(\varepsilon')$ be as in Theorem 7.1. We construct below an algorithm $\mathcal{A}'$ that solves $(1 + \kappa)$-BCP in time $O(n^{2-\varepsilon'})$ for any instance $(A, B, \alpha)$ such that $A, B \subseteq \{0, 1\}^{O(\log n)}$ and $\alpha = \Theta(\log n)$. Together with Theorem 7.1, this implies that OVH is false, as desired.
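(c) Let $\alpha' = r_1 \cdot \alpha + r_2 \cdot \beta$ and define $A^t_i, B^t_j$ as $A^t_i = \{(\mathbf{1}_{r_1} \otimes a) \circ (\mathbf{1}_{r_2} \otimes \tau_t(a)) : a \in A_i\}$ and $B^t_j = \{(\mathbf{1}_{r_1} \otimes b) \circ (\mathbf{1}_{r_2} \otimes \tau_t(b)) : b \in B_j\}$. Run $\mathcal{A}$ on $(A^t_i \cup B^t_j, \alpha')$. If $\mathcal{A}$ outputs YES, then output YES and terminate.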
5. If none of the executions of $\mathcal{A}$ returns YES, then output NO.
Observe that the bottleneck in the running time of the algorithm is in the executions of $\mathcal{A}$. The number of executions is $(n/n')^2 \cdot k$ and each execution takes $O((n')^{1.5-\varepsilon})$ time. Hence, in total the running time of the algorithm $\mathcal{A}'$ is $O((n/n')^2 \cdot k \cdot (n')^{1.5-\varepsilon}) \le O(n^2 \log n \cdot (n')^{-\varepsilon/2})$. Now, from the log-density of the sequence from Theorem 5.9, we have $n' \ge n^{0.1/C_\varepsilon} = n^{10\varepsilon'/\varepsilon}$. As a result, the running time of $\mathcal{A}'$ is at most $O(n^{2-5\varepsilon'} \log n) \le O(n^{2-\varepsilon'})$ as desired.
To see the correctness of the algorithm, first observe that the dimensions of vectors in $A^t_i, B^t_j$ are at most $r_1 \cdot \alpha + r_2 \cdot \beta$, which is $O(\log n')$; that is, the calls to $\mathcal{A}$ are valid.
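Next, observe that, if $(A, B, \alpha)$ is a YES instance of BCP, there must be $i, j \in [n/n']$ and $a^* \in A_i$, $b^* \in B_j$ such that $\|a^* - b^*\|_0$ is at most $\alpha$. Since $G'_{\pi_1}, \dots, G'_{\pi_k}$ cover $K_{n',n'}$, there must be $t \in [k]$ such that $\|\tau_t(a^*) - \tau_t(b^*)\|_0 \le \beta$. As a result, the points $(\mathbf{1}_{r_1} \otimes a^*) \circ (\mathbf{1}_{r_2} \otimes \tau_t(a^*))$ and $(\mathbf{1}_{r_1} \otimes b^*) \circ (\mathbf{1}_{r_2} \otimes \tau_t(b^*))$ are at distance at most $r_1 \cdot \alpha + r_2 \cdot \beta = \alpha'$; hence $(A^t_i \cup B^t_j, \alpha')$ is a YES instance for CP and $\mathcal{A}'$ outputs YES as desired. Finally, let us assume that $(A, B, \alpha)$ is a NO instance of $(1 + \kappa)$-BCP. Consider any $i, j \in [n/n']$ and $t \in [k]$. To argue that $(A^t_i \cup B^t_j, \alpha')$ is a NO instance for $(1 + \theta)$-CP, we have to show that any two points in $A^t_i \cup B^t_j$ have distance more than $(1 + \theta) \cdot \alpha'$. To see this, let us consider two cases.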
Hence, $(A^t_i \cup B^t_j, \alpha')$ must be a NO instance for $(1 + \theta)$-CP for every $t \in [k]$ and $i, j \in [n/n']$. Thus, $\mathcal{A}'$ outputs NO as desired.
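The overall shape of the reduction in this proof (split $A$ and $B$ into blocks, pad each block with gadget points, and query a monochromatic CP solver) can be sketched as follows. This is only an illustrative Python sketch under stated assumptions: `gadgets` is a hypothetical list of pairs (left points, right points) standing in for the maps $\tau_t$, `brute_force_cp` is a quadratic-time stand-in for the assumed algorithm $\mathcal{A}$, and the parameters `beta`, `r1`, `r2` are supplied by the caller rather than derived from Theorem 5.9.

```python
from itertools import combinations


def hamming(u, v):
    """Hamming distance between two equal-length tuples."""
    return sum(x != y for x, y in zip(u, v))


def embed(x, gadget_point, r1, r2):
    """(1_{r1} ⊗ x) ∘ (1_{r2} ⊗ gadget_point): r1 copies of x, then r2 copies of the gadget point."""
    return tuple(x) * r1 + tuple(gadget_point) * r2


def brute_force_cp(points, threshold):
    """Stand-in for the hypothetical CP algorithm: is some pair within the threshold?"""
    return any(hamming(u, v) <= threshold for u, v in combinations(points, 2))


def solve_bcp(A, B, alpha, beta, r1, r2, gadgets, cp_oracle=brute_force_cp):
    """Decide BCP by calling a monochromatic CP oracle on padded blocks.

    gadgets[t] = (left, right), two lists of n' gadget points each
    (an assumed representation of tau_t on the two sides).
    """
    n_prime = len(gadgets[0][0])
    blocks_a = [A[s:s + n_prime] for s in range(0, len(A), n_prime)]
    blocks_b = [B[s:s + n_prime] for s in range(0, len(B), n_prime)]
    alpha_prime = r1 * alpha + r2 * beta
    for block_a in blocks_a:
        for block_b in blocks_b:
            for left, right in gadgets:
                points = [embed(a, g, r1, r2) for a, g in zip(block_a, left)]
                points += [embed(b, g, r1, r2) for b, g in zip(block_b, right)]
                if cp_oracle(points, alpha_prime):
                    return True
    return False
```

As a toy usage, the single gadget `([(0,0,0,0), (1,1,1,1)], [(1,1,0,0), (0,0,1,1)])` places all cross pairs at Hamming distance 2 and same-side pairs at distance 4, so with `beta=2`, `r1=1`, `r2=2` duplicate points inside $A$ are pushed apart and cannot trigger a false YES.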

Discussion and Open Questions
It remains open to completely resolve Open Questions 1.1 and 1.2. It is still possible that our framework can be used to resolve these problems: we just need to construct gadgets with better parameters! In particular, to resolve Question 1.1, we have to improve the dimension bound in Theorem 4.2 to $O_\delta(\log n_i)$. For Question 1.2, we just have to improve the bound on the number of pairs in (3) of Theorem 5.9 to $\Omega(n_i^{2-\delta})$. Following our observation from Lemma 5.1, this motivates us to ask the following purely coding theoretic question:

Open Question 8.1. For every $0 < \delta < 1$, are there linear codes $C_1 \subseteq C_2 \subseteq \mathbb{F}_q^N$ both of block length $N$ over alphabet $\mathbb{F}_q$ such that the following holds:
• $\Delta(C_1) \ge (1 + f(\delta)) \cdot \Delta(C_2)$, for some $f : (0, 1) \to (0, 1)$.
Apart from the aforementioned questions, Rubinstein [Rub18] pointed out an interesting obstacle, aptly dubbed the "triangle inequality barrier", to obtaining fine-grained lower bounds against 3-approximation algorithms for BCP (see Open Question 3 in [Rub18]). In the case of CP, this barrier turns out to be against 2-approximation algorithms, as noted in [DKL18]. We reiterate this below as an open problem to be resolved:

Open Question 8.2. Can we show that, assuming SETH, for some constant $\varepsilon > 0$, no algorithm running in time $n^{1+\varepsilon}$ can solve 2-CP in any metric when the points are in $\omega(\log n)$ dimensions?
Another interesting direction is to extend the hardness of MIP to the $k$-vector generalization of the problem, called $k$-MIP. In $k$-MIP, we are given a set of $n$ points $P \subseteq \mathbb{R}^d$ and we would like to select $k$ distinct points $a_1, \dots, a_k \in P$ that maximize $\langle a_1, \dots, a_k \rangle := \sum_{j \in [d]} (a_1)_j \cdots (a_k)_j$.
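To make the definition concrete, here is a small Python sketch of the generalized inner product together with a naive exhaustive search for $k$-MIP. The function names are illustrative, and the search takes roughly $n^k \cdot d \cdot k$ time; it is meant only to pin down the objective, not to suggest an efficient algorithm.

```python
from itertools import combinations
from math import prod


def generalized_inner_product(vectors):
    """<a_1, ..., a_k> := sum over coordinates j of (a_1)_j * ... * (a_k)_j."""
    return sum(prod(coords) for coords in zip(*vectors))


def brute_force_k_mip(points, k):
    """Return k distinct points maximizing the generalized inner product (exhaustive search)."""
    return max(combinations(points, k), key=generalized_inner_product)


# Toy instance (hypothetical data).
points = [(1, 1, 0), (1, 1, 1), (0, 1, 1), (1, 0, 0)]
best_pair = brute_force_k_mip(points, 2)
best_triple = brute_force_k_mip(points, 3)
```

For $k = 2$ the objective reduces to the usual inner product, so `brute_force_k_mip(points, 2)` solves ordinary MIP on the toy instance.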
It is known that the $k$-chromatic variant of $k$-MIP is hard to approximate (see Appendix B of [KLM18]) but this is not known to be true for $k$-MIP itself. Our approach seems quite compatible with tackling this problem as well; in particular, if we can construct a certain (natural) generalization of our gadget for MIP, then we would immediately arrive at the inapproximability of $k$-MIP even for $\{0, 1\}$-entry vectors. The issue in constructing this gadget is that we are now concerned about agreements of more than two vectors, which no longer correspond to error-correcting codes, and some additional tools are needed to argue for this more general case.
It should be noted that the hardness of approximating $k$-MIP for $\{0, 1\}$-entry vectors is equivalent to the one-sided $k$-biclique problem [Lin18], in which a bipartite graph is given and the goal is to select $k$ vertices on the right that maximize the number of their common neighbors. The equivalence can be easily seen by viewing the coordinates as the left-hand-side vertices and the vectors as the right-hand-side vertices. The one-sided $k$-biclique problem is shown to be W[1]-hard to approximate by Lin [Lin18], who also showed a lower bound of $n^{\Omega(\sqrt{k})}$ for the problem assuming ETH. If the generalization of our gadget for $k$-MIP works as intended, then this lower bound can be improved to $n^{\Omega(k)}$ under ETH and even $n^{k-o(1)}$ under SETH.
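The equivalence described above can be checked directly on small inputs. The sketch below (with illustrative names, not taken from [Lin18]) builds the bipartite graph whose left vertices are coordinates and whose right vertices are the $\{0, 1\}$-vectors, and compares the number of common neighbors of a chosen set of right vertices with the generalized inner product of the corresponding vectors.

```python
from math import prod


def generalized_inner_product(vectors):
    """Sum over coordinates of the product of the entries of all chosen vectors."""
    return sum(prod(coords) for coords in zip(*vectors))


def to_bipartite(points):
    """Right vertex r (vector r) is adjacent to left vertex j iff points[r][j] == 1."""
    return [{j for j, x in enumerate(p) if x == 1} for p in points]


def common_neighbors(neighborhoods, chosen):
    """Number of left vertices adjacent to every chosen right vertex."""
    return len(set.intersection(*(neighborhoods[r] for r in chosen)))


# Toy instance (hypothetical data).
points = [(1, 1, 0, 1), (1, 0, 1, 1), (1, 1, 1, 0)]
neighborhoods = to_bipartite(points)
```

For any index set `chosen`, `common_neighbors(neighborhoods, chosen)` equals `generalized_inner_product([points[r] for r in chosen])`, since a coordinate contributes 1 to the product exactly when every chosen vector has a 1 there.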
The one-sided $k$-biclique problem is closely related to the (two-sided) $k$-biclique problem, where we are given a bipartite graph and we wish to decide whether it contains $K_{k,k}$ as a subgraph. The $k$-biclique problem was considered a major open problem in parameterized complexity (see e.g., [DF13]) until it was shown by Lin to be W[1]-hard [Lin18]. Nevertheless, the running time lower bound known is still not tight: currently, the best lower bound known for this problem is $n^{\Omega(\sqrt{k})}$ both for the exact version (under ETH) [Lin18] and its approximate variant (under Gap-ETH) [CCK+17]. It remains an interesting open question to close the gap between the above lower bounds and the trivial upper bound of $n^{O(k)}$. Progress on the one-sided $k$-biclique problem could lead to improved lower bounds for the $k$-biclique problem too, although several additional steps have to be taken care of.

of the biclique. By doing so, we guarantee that the process ends in $O(\log n) \cdot n^2/|E_G|$ steps. Note, however, that there are exponentially many isomorphisms and thus we cannot simply enumerate all isomorphisms to find one that covers the desired fraction of uncovered edges. Nevertheless, it is not hard to see that we can use the method of conditional expectation to find one such isomorphism in polynomial time. This is formalized below.
Lemma B.1. For any two bipartite graphs $G = (A \cup B, E_G)$ and $H = (A \cup B, E_H)$, there exists a side-preserving permutation $\pi : A \cup B \to A \cup B$ such that
$$|E_H \cap E_{G_\pi}| \ge \frac{|E_G| \cdot |E_H|}{|A| \cdot |B|}.$$
Moreover, such a permutation $\pi$ can be found (deterministically) in $O((|A| + |B|)^4)$ time.
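Proof. Notice that, if we pick $\pi|_A$ and $\pi|_B$ randomly among all permutations of $A$ and $B$ respectively, then, for a fixed $(a, b) \in E_H$, the probability that $(a, b)$ belongs to $E_{G_\pi}$ is $\frac{|E_G|}{|A| \cdot |B|}$. Thus, by linearity of expectation,
$$\mathbb{E}_\pi\left[|E_H \cap E_{G_\pi}|\right] = \sum_{(a,b) \in E_H} \Pr_\pi\left[(a, b) \in E_{G_\pi}\right] = \frac{|E_G| \cdot |E_H|}{|A| \cdot |B|}.$$
This proves the existence part of the claim. To deterministically find such a $\pi$, we use the method of conditional expectation. Suppose $A \cup B = \{1, \dots, n\}$. The algorithm works as follows:
1. Let $V_{\text{assigned}} \leftarrow \emptyset$.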
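(b) For each $k \in V_{\text{candidate}}$, compute the conditional expectation of $|E_H \cap E_{G_\pi}|$ given that $\pi$ agrees with $\pi^*$ on $V_{\text{assigned}}$ and $\pi(i) = k$. Let $k^*$ be the maximizer for the above conditional expectation. We set $\pi^*(i) = k^*$.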
It is simple to see that the conditional expectation never decreases as we fill in the permutation. As a result, we must have $|E_H \cap E_{G_\pi}| \ge \frac{|E_G| \cdot |E_H|}{|A| \cdot |B|}$ as desired. Moreover, it is easy to see that the conditional expectation can be computed in time $O(|A| \cdot |B|)$ because, for each edge $(a, b) \in E_H$, we can compute the probability that $(a, b) \in E_{G_\pi}$ in $O(1)$ time. As a result, the overall running time of the algorithm is $O((|A| + |B|)^4)$.
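The derandomization in this proof can be illustrated with a short Python sketch. It is an unoptimized version written for clarity: the conditional expectation is recomputed from scratch by summing over the edges of $G$ rather than in $O(1)$ time per edge, so it does not attain the stated $O((|A| + |B|)^4)$ bound. The function names, the convention that $G_\pi$ has edge set $\{(\pi(a), \pi(b)) : (a, b) \in E_G\}$, and the representation of edges as pairs $(a, b)$ with $a \in A$, $b \in B$ are assumptions made for this example.

```python
from fractions import Fraction


def cond_expectation(EG, EH, pa, pb, free_a, free_b):
    """Expected |E_H ∩ E_{G_pi}| when pi extends the partial maps pa (on A) and pb (on B)
    by a uniformly random bijection onto the unused images free_a, free_b."""
    free_edges = sum(1 for (x, y) in EH if x in free_a and y in free_b)
    total = Fraction(0)
    for a, b in EG:
        if a in pa and b in pb:          # both endpoints fixed
            total += int((pa[a], pb[b]) in EH)
        elif a in pa:                    # pi(b) uniform over free_b
            total += Fraction(sum((pa[a], y) in EH for y in free_b), len(free_b))
        elif b in pb:                    # pi(a) uniform over free_a
            total += Fraction(sum((x, pb[b]) in EH for x in free_a), len(free_a))
        else:                            # both images uniform and independent
            total += Fraction(free_edges, len(free_a) * len(free_b))
    return total


def find_permutation(A, B, EG, EH):
    """Greedily fix pi one vertex at a time, never letting the conditional expectation drop."""
    EG, EH = set(EG), set(EH)
    pa, pb = {}, {}
    free_a, free_b = set(A), set(B)
    for vertices, perm, free in ((A, pa, free_a), (B, pb, free_b)):
        for v in vertices:
            best = None
            for k in sorted(free):       # try every unused image for v
                perm[v] = k
                free.remove(k)
                value = cond_expectation(EG, EH, pa, pb, free_a, free_b)
                free.add(k)
                if best is None or value > best[0]:
                    best = (value, k)
            perm[v] = best[1]
            free.remove(best[1])
    return pa, pb


def overlap(EG, EH, pa, pb):
    """Number of edges of G that pi maps onto edges of H."""
    return sum((pa[a], pb[b]) in set(EH) for a, b in EG)
```

Since the best candidate at each step is at least the average over all candidates, the final overlap is at least the initial expectation $|E_G| \cdot |E_H| / (|A| \cdot |B|)$, mirroring the argument in the proof.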
Finally using Lemma B.1, we prove Lemma 3.11 using the strategy outlined earlier in this section.