Embedding Graphs as Euclidean kNN-Graphs

Schibler, Thomas; Suri, Subhash; Xue, Jie

doi:10.4230/LIPIcs.SoCG.2025.73

Thomas Schibler
University of California, Santa Barbara, CA, USA

Subhash Suri
University of California, Santa Barbara, CA, USA

Jie Xue
New York University Shanghai, China

Abstract

Let $G=(V,E)$ be a directed graph on $n$ vertices where each vertex has out-degree $k$ . We say that $G$ is $k$ NN-realizable in $d$ -dimensional Euclidean space if there exists a point set $P=\{p_{1},p_{2},\ldots,p_{n}\}$ in $\mathbb{R}^{d}$ along with a one-to-one mapping $\phi:V\rightarrow P$ such that for any $u,v\in V$ , $u$ is an out-neighbor of $v$ in $G$ if and only if $\phi(u)$ is one of the $k$ nearest neighbors of $\phi(v)$ ; we call the map $\phi$ a $k$ NN-realization of $G$ in $\mathbb{R}^{d}$ . The $k$ NN-realization problem, which aims to compute a $k$ NN-realization of an input graph in $\mathbb{R}^{d}$ , is known to be NP-hard already for $d=2$ and $k=1$ [Eades and Whitesides, Theoretical Computer Science, 1996], and to the best of our knowledge has not been studied in dimension $d=1$ . The main results of this paper are the following:

$\blacksquare$ For any fixed dimension $d\geq 2$ , we can efficiently compute an embedding realizing at least a $1-\varepsilon$ fraction of $G$ ’s edges, or conclude that $G$ is not $k$ NN-realizable in $\mathbb{R}^{d}$ .

$\blacksquare$ For $d=1$ , we can decide in $O(kn)$ time whether $G$ is $k$ NN-realizable and, if so, compute a realization in $O(n^{2.5}\mathsf{poly}(\log n))$ time.

Keywords and phrases: Geometric graphs, $k$ -nearest neighbors, graph embedding, approximation algorithms

2012 ACM Subject Classification: Theory of computation $\rightarrow$ Design and analysis of algorithms

Editors: Oswin Aichholzer and Haitao Wang

Series and Publisher: Leibniz International Proceedings in Informatics, Schloss Dagstuhl – Leibniz-Zentrum für Informatik

1 Introduction

The $k$ NN-graph of a set $P$ of points in $\mathbb{R}^{d}$ is a directed graph with vertex set $P$ and edges defined as follows: there is a directed edge from $a\in P$ to $b\in P$ if and only if $b$ is one of the $k$ -nearest neighbors of $a$ in $P$ (under the Euclidean distance). A directed graph $G=(V,E)$ is $k$ NN-realizable in $\mathbb{R}^{d}$ if it is isomorphic to the $k$ NN-graph of a set of points in $\mathbb{R}^{d}$ . In this paper, we consider the following natural problems regarding $k$ NN-graphs: Can we efficiently check whether a given directed graph is $k$ NN-realizable in $\mathbb{R}^{d}$ ? If so, can we efficiently find a $k$ NN-realization of $G$ ? More formally, a $k$ NN-realization (or $k$ NN-embedding) of $G$ in $\mathbb{R}^{d}$ is a one-to-one mapping $\phi:V\rightarrow P$ from the vertex set $V$ of $G$ to a set $P$ of points in $\mathbb{R}^{d}$ that induces an isomorphism between $G$ and the $k$ NN-graph of $P$ . Throughout the paper, we use the terms embedding and realization interchangeably.
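To make the definition concrete, the following minimal sketch computes the $k$ NN-graph of a small point set by brute force and tests whether a given edge set is realized by it. The function names `knn_graph` and `is_knn_realization` are illustrative and not from the paper, and ties are broken by index, which is an assumption (the definition implicitly concerns point sets without ties).

```python
from math import dist

def knn_graph(points, k):
    """Directed edge set of the kNN-graph of `points` (brute force).

    Vertices are indices into `points`; (a, b) is an edge when b is one of
    the k nearest neighbors of a. Ties are broken by index (an assumption).
    """
    n = len(points)
    edges = set()
    for a in range(n):
        others = sorted((b for b in range(n) if b != a),
                        key=lambda b: (dist(points[a], points[b]), b))
        edges.update((a, b) for b in others[:k])
    return edges

def is_knn_realization(edges, points, k):
    """True when `points` (indexed by vertex) realizes exactly `edges`."""
    return set(edges) == knn_graph(points, k)
```

For example, the three collinear points $(0,0)$, $(1,0)$, $(3,0)$ with $k=1$ yield the edges $(0,1)$, $(1,0)$, and $(2,1)$.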

The $k$ NN-graph is a member of the well-known family of proximity graphs in computational geometry that includes minimum spanning trees, relative neighborhood graphs, Gabriel graphs, and Delaunay triangulations [11]. Over the past several decades, a substantial research effort has been directed to design efficient algorithms to compute these structures for an input set of points. The inverse problem – the focus of our paper – where we want to recover the points that produce a given proximity graph, however, remains less well understood.

An early example of a positive result in this direction is on Euclidean minimum spanning trees. Monma and Suri [22] show that any tree with maximum node degree $5$ can be realized as a minimum spanning tree of points in the plane and, additionally, every planar point set admits a minimum spanning tree with degree at most $5$ . This settles the above problem for non-degenerate planar point sets but if we allow co-circularities, then a planar minimum spanning tree can have maximum node degree $6$ ; in that case, the problem of computing a 1NN-realization was proved to be NP-complete by Eades and Whitesides [13]. Dillencourt [12] considers the problem of recognizing triangulations that can be realized as Delaunay triangulations in the plane, and it is also known that all outerplanar triangulations are realizable [24]. (These proofs are non-constructive, however, and only exponential time algorithms are known for computing the coordinates of the points in the realization [1].) Among other related results, Bose et al. [5] give a characterization of trees that can be realized in the plane as relative neighborhood or Gabriel graphs, and Eppstein et al. [15] establish some structural properties of $k$ NN graphs of random points in the plane.

The $k$ NN-realization is not directly related to the metric embedding problem but there are some obvious similarities. The input to metric embedding is a weighted graph satisfying triangle inequalities and the goal is to find an embedding realizing all pairwise distances. This is not always possible [20] – there are simple metrics that cannot be embedded in any finite dimensional Euclidean space. However, if we allow some distortion (i.e. approximation) of distances, then any $n$ -node metric graph can be embedded in $O(\log n)$ dimensional space with polylog distortion [6], or using the celebrated Johnson-Lindenstrauss theorem [18] in dimension $O(\log n/\varepsilon^{2})$ with distortion $1+\varepsilon$ for any fixed constant $\varepsilon>0$ .

In metric embedding the goal is to realize all pairwise distances (approximately), while in $k$ NN-realization the goal is only to preserve all ordinal neighbor relations in a directed graph: a neighbor of each vertex must be closer than any of its non-neighbors, but the choice of specific distances does not matter. Although there is some work on embedding ordinal relations as well, its main focus is specialized metric spaces and bounds on ordinal distortion. In particular, Alon et al. [2] consider embeddings with ordinal relaxations into ultrametric spaces, and Bădoiu et al. [7] improve their bounds for tree and line embeddings. A different line of research concerns triplet embedding, where the input is a set of triples $(a,b,c)$ specifying constraints of the form $d(a,b)<d(a,c)$ . Even for embedding in one dimension $\mathbb{R}^{1}$ , the general triplet problem as well as the related “betweenness” problem is MAXSNP-hard [9], and an FPTAS is known for maximizing the number of satisfied triplet constraints [17]. (If all $\binom{n}{3}$ triplet constraints are given, then the problem is trivial to solve: once the leftmost point is determined, the rest of the ordering can be easily decided.) In contrast with the general triplet constraints, the pairwise relations in a $k$ NN-realization problem in $\mathbb{R}^{1}$ have a richer structure that allows us to solve the problem efficiently.

The $k$ NN-realization is also related to the sphericity of graphs [19], where the goal is to embed an (undirected) graph $G=(V,E)$ into a Euclidean space $\mathbb{R}^{k}$ such that there is a point $p(u)$ for each vertex $u\in V$ , and $d(p(u),p(v))\leq 1$ iff $(u,v)\in E$ . The smallest dimension $k$ that admits such an embedding is called the sphericity of $G$ [19]. (Another related problem is the dimension of a graph, introduced by Erdős, Harary and Tutte [16]: it is the smallest number $k$ such that $G$ can be embedded into Euclidean $k$ -space with every edge of $G$ having length $1$ .) Along these lines, Chatziafratis and Indyk [8] also show that if we want to preserve the relative distances among the $k$ nearest neighbors of each point, then the embedding dimension must grow linearly with $k$ . Unlike these results, our work is algorithmic: we design efficient algorithms to realize a given graph $G$ in a specified dimension $d$ .

Euclidean embedding of neighbor relations is also studied in social sciences for geometric realization of preference orderings. Given a set $V$ of $n$ voters, a set $C$ of $m$ candidates, and a rank ordering of the candidates by each voter, we say that the preference graph can be realized in Euclidean $d$ -space if each voter’s preferences are consistent with its Euclidean distances to all the candidates. In this case, the smallest dimension must satisfy $d\geq\min\{n,m-1\}$ [4].

1.1 Main results

We study the $k$ NN-realization problem, which takes a directed graph $G$ as input and aims to compute a $k$ NN-realization of $G$ in $\mathbb{R}^{d}$ (or decide that no such realization exists). Throughout, we focus on the unranked $k$ NN-realization problem, where nearest neighbors are realized as an unordered set, but our results also hold (with a minor caveat) for the ranked version, where the edges of $G$ specify each of the $k$ neighbors in order (see Concluding Remarks). Our two main results are the following.

1. Assuming that $G$ is $k$ NN-realizable in $\mathbb{R}^{d}$ , we can find a $d$ -dimensional realization in polynomial time that preserves at least a $1-\varepsilon$ fraction of the edges of $G$ , for any fixed $\varepsilon>0$ . If our algorithm fails, we can also conclude that $G$ is not $k$ NN-realizable in $\mathbb{R}^{d}$ . In particular, our algorithm is an EPTAS (efficient polynomial-time approximation scheme) for the $k$ NN-realization problem for fixed $k$ and $d$ .

2. We give a linear-time algorithm to decide whether $G$ is $k$ NN-realizable in $\mathbb{R}^{1}$ . Specifically, the algorithm runs in $O(kn)$ time, which is linear in the size of the input graph. If $G$ is realizable, our algorithm can compute a realization in $O(n^{2.5}\mathsf{poly}(\log n))$ time.

1.2 Basic definitions

For a (directed or undirected) graph $G$ , we use $V(G)$ and $E(G)$ to denote the set of its vertices and edges, respectively. If $G$ is a directed graph, we say $G$ is $k$ -regular if the out-degree of every vertex of $G$ is exactly equal to $k$ . We use $\mathsf{in}[v]$ (resp., $\mathsf{out}[v]$ ) to denote the set consisting of $v$ itself and all in-neighbors (resp., out-neighbors) of $v$ . A $k$ NN-realization of $G=(V,E)$ in a metric space $\mathcal{M}=(M,\mathsf{dist})$ is an injective map $\phi:V\rightarrow M$ such that $\mathsf{dist}(\phi(v),\phi(u))<\mathsf{dist}(\phi(v),\phi(u^{\prime}))$ , for any distinct triple $v,u,u^{\prime}\in V$ where $(v,u)\in E$ and $(v,u^{\prime})\notin E$ . We say $G$ is $k$ NN-realizable in $\mathcal{M}$ if there exists a $k$ NN-realization of $G$ in $\mathcal{M}$ .

2 Approximate $k$ NN-realization in $\mathbb{R}^{d}$

We first observe that it is easy to decide in polynomial time whether a given graph $G$ is $k$ NN-realizable in a finite-dimensional Euclidean space. We just have to check the acyclicity of an auxiliary graph defined as follows. Let $\varLambda_{G}$ be a directed graph whose vertices correspond to pairs of vertices in $V(G)$ with a directed edge $(\{v,u\},\{v,u^{\prime}\})$ for every distinct triple $v,u,u^{\prime}\in V(G)$ where $(v,u)\in E(G)$ and $(v,u^{\prime})\notin E(G)$ . Intuitively, $\varLambda_{G}$ encodes the $\leq$ -relation among the pairwise distances of points in any (potential) $k$ NN-realization of $G$ . If $\varLambda_{G}$ has a directed cycle, then clearly $G$ is not $k$ NN-realizable. Otherwise, a topological sort of $\varLambda_{G}$ gives a total ordering of the vertex pairs in $G$ . With this ordering, we can find a $k$ NN-realization of $G$ in $\mathbb{R}^{n}$ using the result of Bilu and Linial [3].
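The acyclicity test described above can be sketched as follows. This is a brute-force illustration under stated assumptions, not the paper's implementation: vertices are assumed to be labeled $0,\dots,n-1$, the names `lambda_graph` and `pair_order` are hypothetical, and the sketch stops at the topological order (the subsequent embedding step via [3] is not shown).

```python
from collections import deque
from itertools import combinations, permutations

def lambda_graph(n, edges):
    """Successor sets of Lambda_G; nodes are unordered vertex pairs."""
    edge_set = set(edges)
    succ = {frozenset(p): set() for p in combinations(range(n), 2)}
    for v, u, w in permutations(range(n), 3):
        # (v,u) is an edge and (v,w) is not: dist(v,u) must be below dist(v,w).
        if (v, u) in edge_set and (v, w) not in edge_set:
            succ[frozenset((v, u))].add(frozenset((v, w)))
    return succ

def pair_order(n, edges):
    """Topological order of Lambda_G (Kahn's algorithm), or None on a cycle."""
    succ = lambda_graph(n, edges)
    indeg = {p: 0 for p in succ}
    for p in succ:
        for q in succ[p]:
            indeg[q] += 1
    queue = deque(p for p in succ if indeg[p] == 0)
    order = []
    while queue:
        p = queue.popleft()
        order.append(p)
        for q in succ[p]:
            indeg[q] -= 1
            if indeg[q] == 0:
                queue.append(q)
    return order if len(order) == len(succ) else None
```

For instance, the directed 3-cycle with $k=1$ produces a cycle among the three vertex pairs, so `pair_order` returns `None` and the graph is not $k$ NN-realizable in any dimension.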

On the other hand, deciding whether $G$ is $k$ NN-realizable in $\mathbb{R}^{d}$ , for a specific dimension $d$ , is NP-hard. This is shown by Eades and Whitesides [14] who proved the hardness for $d=2$ and $k=1$ . Therefore, in this section, we explore an approximation algorithm for the realization problem in any fixed dimension $d$ . We first need to define an approximate solution to our problem. There is a natural way to measure how well a map $\phi:V(G)\rightarrow\mathbb{R}^{d}$ approximates a $k$ NN-realization: randomly sample an edge $(u,v)\in E(G)$ and consider the probability that $\phi(v)$ is among the $k$ nearest neighbors of $\phi(u)$ . Formally, we introduce the following definition.

Definition 1 (approximate $k$ NN-realization).

Let $G$ be a $k$ -regular directed graph. For $c\in[0,1]$ , a map $\phi:V(G)\rightarrow\mathbb{R}^{d}$ is a $c$ -approximate $k$ NN-realization of $G$ in $\mathbb{R}^{d}$ if

\sum_{(u,v)\in E(G)}\sigma_{\phi}(u,v)\geq c\cdot|E(G)|,

where $\sigma_{\phi}:V(G)\times V(G)\rightarrow\{0,1\}$ is the indicator function defined as $\sigma_{\phi}(u,v)=1$ if we have $|\{v^{\prime}\in V(G)\backslash\{u\}:\lVert\phi(u)-\phi(v^{\prime})\rVert_{2}\leq\lVert\phi(u)-\phi(v)\rVert_{2}\}|\leq k$ and $\sigma_{\phi}(u,v)=0$ otherwise.
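As an illustration of Definition 1, the sketch below evaluates $\sigma_{\phi}$ and the fraction of realized edges for a concrete map. The names `sigma` and `realized_fraction` are hypothetical, and `phi` is assumed to be a dictionary from vertices to coordinate tuples.

```python
from math import dist

def sigma(phi, k, u, v):
    """Indicator of Definition 1: 1 if at most k vertices other than u lie
    within distance ||phi(u) - phi(v)|| of phi(u), and 0 otherwise."""
    d_uv = dist(phi[u], phi[v])
    count = sum(1 for w in phi if w != u and dist(phi[u], phi[w]) <= d_uv)
    return 1 if count <= k else 0

def realized_fraction(phi, k, edges):
    """Largest c for which phi is a c-approximate kNN-realization."""
    edges = list(edges)
    return sum(sigma(phi, k, u, v) for u, v in edges) / len(edges)
```

With points at $0$, $1$, $3$ on the line and $k=1$, the edge set $\{(0,1),(1,0),(2,1)\}$ is fully realized, whereas replacing $(2,1)$ by $(2,0)$ leaves only two of the three edges realized.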

Our main result is an algorithm for computing a $(1-\varepsilon)$ -approximate $k$ NN-realization of $G$ in $\mathbb{R}^{d}$ with time complexity $f(k,d,\varepsilon)\cdot n^{O(1)}$ , for any given $\varepsilon>0$ (provided that $G$ is $k$ NN-realizable in $\mathbb{R}^{d}$ ), where $f$ is some computable function. In other words, for fixed $k$ and $d$ , we obtain an efficient polynomial-time approximation scheme (EPTAS) for the $k$ NN-realization problem in $\mathbb{R}^{d}$ .

At a high level, our algorithm consists of two main steps. In the first step, it computes a set $E\subseteq E(G)$ of edges such that $|E|\leq\varepsilon|E(G)|$ and each weakly-connected component of $G-E$ contains $O_{\varepsilon}(1)$ vertices. The existence of $E$ follows from the result of Miller et al. [21] on graph separators, and the computation of $E$ relies on the approximate minimum balanced cut algorithm of Chuzhoy et al. [10]. The edges in $E$ are the ones we sacrifice in our approximation and so, in the second step, our algorithm computes a map $\phi:V(G)\rightarrow\mathbb{R}^{d}$ satisfying $\sigma_{\phi}(u,v)=1$ for all edges $(u,v)\in E(G)\backslash E$ . This turns out to be easy, since the weakly-connected components of $G-E$ are of size $O_{\varepsilon}(1)$ and we can work on these components individually. These two steps will be presented in Sections 2.1 and 2.2, respectively.

2.1 Splitting the graph by removing few edges

A balanced cut of an undirected graph $H$ is a subset $E\subseteq E(H)$ such that every connected component of $H-E$ contains at most $|V(H)|/2$ vertices. Every directed graph $G$ naturally corresponds to an undirected graph $G_{0}$ defined as $V(G_{0})=V(G)$ and $E(G_{0})=\{\{u,v\}:(u,v)\in E(G)\text{ or }(v,u)\in E(G)\}$ . We need the following important result.

Lemma 2.

Let $G$ be a directed graph that is $k$ NN-realizable in $\mathbb{R}^{d}$ , and $G_{0}$ be the undirected graph corresponding to $G$ . Then any subgraph $H$ of $G_{0}$ with $|V(H)|\geq 2$ admits a balanced cut of size $O(|V(H)|^{1-\frac{1}{d}})$ , where the constant hidden in $O(\cdot)$ only depends on $k$ and $d$ .

Proof sketch.

Let $k,d\in\mathbb{N}$ be fixed numbers. The lemma follows from two results in [21]. Specifically, it was shown in [21] that if a graph $G$ is $k$ NN-realizable in $\mathbb{R}^{d}$ , then its corresponding undirected graph $G_{0}$ satisfies the following conditions.

1. The degree of every vertex in $G_{0}$ is $O(1)$ .

2. For every subgraph $H$ of $G_{0}$ , there exists $S\subseteq V(H)$ such that $|S|=O(|V(H)|^{1-\frac{1}{d}})$ and every connected component of $H-S$ contains at most $\frac{d+1}{d+2}\cdot|V(H)|$ vertices.

We remark that [21] only claimed condition 2 above for the case $H=G_{0}$ , but the argument extends to any subgraph $H$ of $G_{0}$ . Using the two conditions above, it is fairly easy to construct the desired balanced cut for any subgraph of $G_{0}$ . $\hfill\blacktriangleleft$

A minimum balanced cut refers to a balanced cut consisting of the minimum number of edges. A $c$ -approximate minimum balanced cut refers to a balanced cut whose size is at most $c\cdot\mathsf{opt}$ , where $\mathsf{opt}$ is the size of a minimum balanced cut. The above lemma implies that if $G$ is $k$ NN-realizable in $\mathbb{R}^{d}$ , then the size of a minimum balanced cut of any subgraph $H$ of $G_{0}$ is $O(|V(H)|^{1-\frac{1}{d}})$ . We need the following algorithm by Chuzhoy et al. for computing approximate minimum balanced cuts.

Theorem 3 ([10]).

For any fixed number $\alpha>0$ , there exists an algorithm that, given an undirected graph $H$ of $n$ vertices and $m$ edges, computes a $\log^{O(1/\alpha)}n$ -approximate minimum balanced cut of $H$ in $O(m^{1+\alpha})$ time.

The following lemma is the main result of this section, which states that one can remove a small fraction of edges from $G$ to split $G$ into small components.

Lemma 4.

Let $k,d\in\mathbb{N}$ be fixed numbers. Given a $k$ -regular directed graph $G$ and a number $\varepsilon\in(0,1]$ , one can either compute a subset $E\subseteq E(G)$ such that $|E|\leq\varepsilon|E(G)|$ and each weakly-connected component of $G-E$ contains at most $(\frac{1}{\varepsilon})^{O(1)}$ vertices, or conclude that $G$ is not $k$ NN-realizable in $\mathbb{R}^{d}$ , in $O(|V(G)|^{1+\alpha})$ time for any constant $\alpha>0$ . Here the constants hidden in $O(\cdot)$ depend on $k$ , $d$ , and $\alpha$ .

Proof.

Let $G_{0}$ be the corresponding undirected graph of $G$ , and $\alpha>0$ be a constant. It suffices to compute $E\subseteq E(G_{0})$ such that $|E|\leq\varepsilon|E(G_{0})|$ and each connected component of $G_{0}-E$ contains at most $(\frac{1}{\varepsilon})^{O(1)}$ vertices. Indeed, the edges in $G$ corresponding to $E$ form a subset of $E(G)$ of size $O(\varepsilon|E(G)|)$ which satisfies the desired property. This is fine since our algorithm works for an arbitrary $\varepsilon>0$ .

Pick a constant $\alpha^{\prime}\in(0,\alpha)$ and let $\textsc{ApprxMinCut}(H)$ denote the algorithm in Theorem 3, which returns an approximate minimum balanced cut of $H$ in $O(|V(H)|^{1+\alpha^{\prime}})$ time. Our algorithm, presented in Algorithm 1, computes $E\subseteq E(G_{0})$ by recursively applying ApprxMinCut as a sub-routine. We fix some threshold $\delta=(\frac{1}{\varepsilon})^{c}$ for a sufficiently large $c$ (depending on $k$ , $d$ , and $\alpha$ ). If $|V(G_{0})|\leq\delta$ , then we simply return $E=\emptyset$ . Otherwise, we apply $\textsc{ApprxMinCut}(G_{0})$ to obtain an approximate minimum balanced cut of $G_{0}$ , and add the edges in the cut to $E$ . Then for every connected component $C$ of $G_{0}-E$ , we recursively apply our algorithm on $C$ , which returns a subset $E_{C}\subseteq E(C)$ , and we add the edges in $E_{C}$ to $E$ . Finally, we return $E$ as the output of our algorithm.

Algorithm 1 $\textsc{CutEdge}(G_{0})$ .
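The recursion described in the preceding paragraph can be sketched as follows. This is a reading of the prose description under stated assumptions, not the paper's pseudocode: the balanced-cut routine of Theorem 3 is passed in as a callable `balanced_cut`, and `toy_balanced_cut` is a purely illustrative stand-in (it is NOT the algorithm of [10] and gives no approximation guarantee). Undirected edges are assumed to be stored once as tuples, and `delta` is assumed to be at least 1.

```python
def components(vertices, edges):
    """Connected components of an undirected graph via iterative DFS."""
    adj = {v: set() for v in vertices}
    for u, v in edges:
        adj[u].add(v)
        adj[v].add(u)
    seen, comps = set(), []
    for s in vertices:
        if s in seen:
            continue
        stack, comp = [s], set()
        seen.add(s)
        while stack:
            x = stack.pop()
            comp.add(x)
            for y in adj[x]:
                if y not in seen:
                    seen.add(y)
                    stack.append(y)
        comps.append(comp)
    return comps

def cut_edge(vertices, edges, delta, balanced_cut):
    """Return the removed edge set E, following the description of Algorithm 1."""
    if len(vertices) <= delta:
        return set()
    removed = set(balanced_cut(vertices, edges))
    remaining = [e for e in edges if e not in removed]
    for comp in components(vertices, remaining):
        inside = [(u, v) for u, v in remaining if u in comp and v in comp]
        removed |= cut_edge(comp, inside, delta, balanced_cut)
    return removed

def toy_balanced_cut(vertices, edges):
    """Illustrative stand-in: cut edges between chunks of at most n/2 sorted vertices."""
    order = sorted(vertices)
    size = max(1, len(order) // 2)
    chunk = {v: i // size for i, v in enumerate(order)}
    return {(u, v) for u, v in edges if chunk[u] != chunk[v]}
```

On a path with 8 vertices and `delta = 2`, the sketch removes the middle edge and then the middle edge of each half, leaving components of two vertices each.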

Clearly, every connected component of $G_{0}-E$ contains at most $\delta=(\frac{1}{\varepsilon})^{O(1)}$ vertices. So it suffices to show $|E|\leq\varepsilon|E(G_{0})|$ and analyze the running time of the algorithm. To bound $|E|$ , we observe that when applying ApprxMinCut on any subgraph $H$ of $G_{0}$ , the edge set obtained has size $O(|V(H)|^{1-\frac{1}{d}}\log^{O(1/\alpha^{\prime})}|V(H)|)$ , assuming that $G$ is $k$ NN-realizable in $\mathbb{R}^{d}$ (if a returned cut exceeds this bound, Lemma 2 lets us conclude that $G$ is not $k$ NN-realizable in $\mathbb{R}^{d}$ ). Indeed, a minimum balanced cut of $H$ has size $O(|V(H)|^{1-\frac{1}{d}})$ by Lemma 2, and $\textsc{ApprxMinCut}(H)$ returns a $\log^{O(1/\alpha^{\prime})}|V(H)|$ -approximate minimum balanced cut of $H$ . Since we defined $\delta=(\frac{1}{\varepsilon})^{c}$ for a sufficiently large $c$ , we may assume that the size of $\textsc{ApprxMinCut}(H)$ is at most $|V(H)|^{1-\frac{1}{2d}}$ for any subgraph $H$ of $G_{0}$ such that $|V(H)|>\delta$ . Also, we may assume that $\varepsilon(2^{\frac{1}{3d}}-1)n^{1-\frac{1}{3d}}\geq n^{1-\frac{1}{2d}}$ for all $n>\delta$ . We shall prove that when applying our algorithm on a subgraph $H$ of $G_{0}$ with $|V(H)|=n$ , the size of $E\subseteq E(H)$ returned is at most $\varepsilon(n-n^{1-\frac{1}{3d}})$ . We apply induction on $n$ . If $|V(H)|\leq\delta$ , then our algorithm returns an empty set and thus the statement trivially holds. Assume the statement holds for $|V(H)|\in[n-1]$ , and we consider the case $|V(H)|=n$ (where $n>\delta$ ). Our algorithm applies $\textsc{ApprxMinCut}(H)$ to obtain an approximate minimum balanced cut $E_{0}\subseteq E(H)$ of $H$ . As mentioned above, $|E_{0}|\leq n^{1-\frac{1}{2d}}$ . Let $C_{1},\dots,C_{r}$ be the connected components of $H-E_{0}$ . Set $n_{i}=|V(C_{i})|$ for $i\in[r]$ . We have $n_{i}\leq n/2$ for all $i\in[r]$ , by the definition of a balanced cut. We recursively call our algorithm on each $C_{i}$ to obtain a subset $E_{i}\subseteq E(C_{i})$ . By our induction hypothesis, $|E_{i}|\leq\varepsilon(n_{i}-n_{i}^{1-\frac{1}{3d}})$ . Finally, the algorithm returns $E=\bigcup_{i=0}^{r}E_{i}$ . Thus, we have $|E|=|E_{0}|+\sum_{i=1}^{r}|E_{i}|$ . As $\sum_{i=1}^{r}n_{i}=n$ , $\sum_{i=1}^{r}|E_{i}|\leq\varepsilon n-\varepsilon\sum_{i=1}^{r}n_{i}^{1-\frac{1}{3d}}$ . Furthermore, since $n_{i}\leq n/2$ for all $i\in[r]$ and $\sum_{i=1}^{r}n_{i}=n$ , it holds that $\sum_{i=1}^{r}n_{i}^{1-\frac{1}{3d}}\geq 2(\frac{n}{2})^{1-\frac{1}{3d}}=2^{\frac{1}{3d}}n^{1-\frac{1}{3d}}$ . Combining the bounds for $|E_{0}|$ and $\sum_{i=1}^{r}|E_{i}|$ , we have

|E|\leq n^{1-\frac{1}{2d}}+\varepsilon n-\varepsilon\cdot 2^{\frac{1}{3d}}n^{1-\frac{1}{3d}}.

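Recall our assumption $\varepsilon(2^{\frac{1}{3d}}-1)n^{1-\frac{1}{3d}}\geq n^{1-\frac{1}{2d}}$ . Together with the inequality above, this gives us $|E|\leq\varepsilon(n-n^{1-\frac{1}{3d}})$ , so our induction works. In particular, when we apply Algorithm 1 on $G_{0}$ , the output $E\subseteq E(G_{0})$ satisfies $|E|\leq\varepsilon(|V(G_{0})|-|V(G_{0})|^{1-\frac{1}{3d}})\leq\varepsilon|V(G_{0})|$ . Since every vertex of $G$ has out-degree $k\geq 1$ and each edge of $G_{0}$ accounts for at most two edges of $G$ , we have $|E(G_{0})|\geq|V(G_{0})|/2$ , so $|E|\leq 2\varepsilon|E(G_{0})|$ , which suffices after rescaling $\varepsilon$ by a constant factor.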

To analyze the time complexity of Algorithm 1, we consider the recursion tree $T$ of our algorithm when applied to $G_{0}$ . Each node $t\in T$ corresponds to a recursive call, and we denote by $G_{0}(t)$ the graph handled in that call (which is a subgraph of $G_{0}$ ). The time cost at a node $t\in T$ is $O(|E(G_{0}(t))|^{1+\alpha^{\prime}})$ , which is just $O(|V(G_{0}(t))|^{1+\alpha^{\prime}})$ because $|E(G_{0}(t))|\leq k|V(G_{0}(t))|$ . Since ApprxMinCut always returns a balanced cut, we know that $|V(G_{0}(t))|\leq|V(G_{0}(t^{\prime}))|/2$ if $t^{\prime}$ is the parent of $t$ in $T$ . This implies that the depth of $T$ is $O(\log|V(G_{0})|)$ . Also, the sum of $|V(G_{0}(t))|$ over all nodes $t\in T$ at one level of $T$ is at most $|V(G_{0})|$ . Thus $\sum_{t\in T}|V(G_{0}(t))|\leq|V(G_{0})|\log|V(G_{0})|$ and $\sum_{t\in T}|V(G_{0}(t))|^{1+\alpha^{\prime}}\leq|V(G_{0})|^{1+\alpha^{\prime}}\log|V(G_{0})|$ . The latter implies that the time complexity of Algorithm 1 is $O(|V(G_{0})|^{1+\alpha^{\prime}}\log|V(G_{0})|)$ , which is $O(|V(G_{0})|^{1+\alpha})$ since $\alpha>\alpha^{\prime}$ . We remark that a more careful analysis shows that the running time of our algorithm is actually $O(|V(G_{0})|^{1+\alpha^{\prime}})$ instead of $O(|V(G_{0})|^{1+\alpha^{\prime}}\log|V(G_{0})|)$ . For simplicity, we chose the looser analysis, which is already sufficient for our purpose. $\hfill\blacktriangleleft$

2.2 Computing the approximate realization

After computing the set $E\subseteq E(G)$ of edges using Lemma 4, we consider the graph $G-E$ . Let $C_{1},\dots,C_{r}$ be the weakly-connected components of $G-E$ . We have $|V(C_{i})|=(\frac{1}{\varepsilon})^{O(1)}$ by Lemma 4. For each $i\in[r]$ , we want to compute a “ $k$ NN-realization” of $C_{i}$ in $\mathbb{R}^{d}$ . Of course, here each $C_{i}$ is not necessarily $k$ -regular, and thus a $k$ NN-realization of $C_{i}$ is not defined. But we can slightly generalize the definition of $k$ NN-realization as follows. For a directed graph $C$ , we say a map $\phi:V(C)\rightarrow\mathbb{R}^{d}$ is a quasi- $k$ NN-realization of $C$ in $\mathbb{R}^{d}$ if for every edge $(u,v)\in E(C)$ , $|\{v^{\prime}\in V(C)\backslash\{u\}:\lVert\phi(u)-\phi(v^{\prime})\rVert_{2}\leq\lVert\phi(u)-\phi(v)\rVert_{2}\}|\leq k$ . By this definition, a $k$ NN-realization is also a quasi- $k$ NN-realization. Also, it is clear that if $\phi:V(H)\rightarrow\mathbb{R}^{d}$ is a quasi- $k$ NN-realization of a directed graph $H$ in $\mathbb{R}^{d}$ , then for any subgraph $C$ of $H$ , the map $\phi_{|V(C)}$ is a quasi- $k$ NN-realization of $C$ in $\mathbb{R}^{d}$ . Therefore, if $G$ is $k$ NN-realizable in $\mathbb{R}^{d}$ , then each of $C_{1},\dots,C_{r}$ admits a quasi- $k$ NN-realization in $\mathbb{R}^{d}$ .

Next, we discuss how to compute a quasi- $k$ NN-realization of each $C_{i}$ in $\mathbb{R}^{d}$ (or conclude it does not exist). To this end, we need the following lemma.

Lemma 5.

Let $C$ be a directed graph where $|V(C)|\geq k+1$ . If $C$ admits a quasi- $k$ NN-realization in $\mathbb{R}^{d}$ , then there exists a $k$ -regular supergraph $C^{\prime}$ of $C$ with $V(C^{\prime})=V(C)$ such that $C^{\prime}$ is $k$ NN-realizable in $\mathbb{R}^{d}$ .

Proof.

Assume $\phi:V(C)\rightarrow\mathbb{R}^{d}$ is a quasi- $k$ NN-realization of $C$ in $\mathbb{R}^{d}$ . We can slightly perturb $\phi$ so that for any distinct $u,v,v^{\prime}\in V(C)$ , $\lVert\phi(u)-\phi(v^{\prime})\rVert_{2}\neq\lVert\phi(u)-\phi(v)\rVert_{2}$ . Now define $C^{\prime}$ as a directed graph with $V(C^{\prime})=V(C)$ and $E(C^{\prime})=\{(u,v)\in V(C^{\prime})\times V(C^{\prime}):u\neq v\text{ and }\mathsf{rank}_{\phi}(u,v)\leq k\}$ , where

\mathsf{rank}_{\phi}(u,v)=|\{v^{\prime}\in V(C^{\prime})\backslash\{u\}:\lVert\phi(u)-\phi(v^{\prime})\rVert_{2}\leq\lVert\phi(u)-\phi(v)\rVert_{2}\}|.

The construction guarantees that $C^{\prime}$ is $k$ -regular and $\phi$ is a $k$ NN-realization of $C^{\prime}$ in $\mathbb{R}^{d}$ . Since $\phi$ is a quasi- $k$ NN-realization of $C$ , $E(C)\subseteq E(C^{\prime})$ and thus $C^{\prime}$ is a supergraph of $C$ . $\hfill\blacktriangleleft$
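The construction of $C^{\prime}$ in this proof is easy to carry out once $\phi$ is known. The sketch below is illustrative only: `rank` and `regular_supergraph` are hypothetical names, `phi` is assumed to be a dictionary from vertices to coordinate tuples, and the distances from each vertex are assumed to be pairwise distinct (the perturbation step is not performed).

```python
from math import dist

def rank(phi, u, v):
    """rank_phi(u, v): number of vertices other than u at distance at most
    ||phi(u) - phi(v)|| from phi(u)."""
    d_uv = dist(phi[u], phi[v])
    return sum(1 for w in phi if w != u and dist(phi[u], phi[w]) <= d_uv)

def regular_supergraph(phi, k):
    """Edge set of C': each vertex points to its k nearest neighbors under phi."""
    return {(u, v) for u in phi for v in phi
            if u != v and rank(phi, u, v) <= k}
```

For example, with four points at $0$, $1$, $3$, $7$ on the line and $k=2$, a graph $C$ with edges $(0,1)$ and $(3,2)$ is extended to a $2$-regular supergraph with eight edges.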

If $|V(C_{i})|\leq k$ , then any map from $V(C_{i})$ to $\mathbb{R}^{d}$ is a quasi- $k$ NN-realization of $C_{i}$ . Otherwise, by the above lemma, to compute a quasi- $k$ NN-realization of $C_{i}$ in $\mathbb{R}^{d}$ , it suffices to consider every $k$ -regular supergraph $C_{i}^{\prime}$ of $C_{i}$ with $V(C_{i}^{\prime})=V(C_{i})$ and try to compute a $k$ NN-realization of $C_{i}^{\prime}$ (or conclude that $C_{i}^{\prime}$ is not $k$ NN-realizable). Note that the number of such supergraphs is at most $\exp({(\frac{1}{\varepsilon})^{O(1)}})$ because $|V(C_{i})|=(\frac{1}{\varepsilon})^{O(1)}$ . If we obtain a $k$ NN-realization of some $C_{i}^{\prime}$ , then it is a quasi- $k$ NN-realization of $C_{i}$ . To compute a $k$ NN-realization $\phi:V(C_{i}^{\prime})\rightarrow\mathbb{R}^{d}$ of $C_{i}^{\prime}$ , we formulate the problem as finding a solution to a system of $(\frac{1}{\varepsilon})^{O(1)}$ degree-2 polynomial inequalities on $(\frac{1}{\varepsilon})^{O(1)}$ variables. For each $v\in V(C_{i}^{\prime})$ , we represent the coordinates of $\phi(v)$ by $d$ variables $x_{1}(v),\dots,x_{d}(v)$ . Then for all distinct $u,v,v^{\prime}\in V(C_{i}^{\prime})$ such that $(u,v)\in E(C_{i}^{\prime})$ and $(u,v^{\prime})\notin E(C_{i}^{\prime})$ , we introduce a degree-2 polynomial inequality $\sum_{j=1}^{d}(x_{j}(u)-x_{j}(v))^{2}<\sum_{j=1}^{d}(x_{j}(u)-x_{j}(v^{\prime}))^{2}$ , which expresses $\lVert\phi(u)-\phi(v)\rVert_{2}<\lVert\phi(u)-\phi(v^{\prime})\rVert_{2}$ . Clearly, the solutions to this system of inequalities are in one-to-one correspondence with the $k$ NN-realizations of $C_{i}^{\prime}$ in $\mathbb{R}^{d}$ . Renegar [23] showed that a system of $p$ degree-2 polynomial inequalities on $q$ variables can be solved in $p^{O(q)}$ time. Therefore, we can compute in $(\frac{1}{\varepsilon})^{(\frac{1}{\varepsilon})^{O(1)}}$ time a $k$ NN-realization of $C_{i}^{\prime}$ in $\mathbb{R}^{d}$ (or decide its non-existence). This further implies that we can compute in $(\frac{1}{\varepsilon})^{(\frac{1}{\varepsilon})^{O(1)}}$ time a quasi- $k$ NN-realization of $C_{i}$ in $\mathbb{R}^{d}$ (or decide its non-existence). The total time cost for all $C_{i}$ is then $(\frac{1}{\varepsilon})^{(\frac{1}{\varepsilon})^{O(1)}}\cdot n$ , since $r\leq n$ .
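The polynomial system above can be written down mechanically. The following sketch only enumerates the constraint triples and checks whether a candidate assignment satisfies them; it does not solve the system (Renegar's algorithm [23] is not implemented here), and the names `constraint_triples` and `satisfies_system` are hypothetical.

```python
from itertools import permutations

def constraint_triples(vertices, edges):
    """All (u, v, w) with (u, v) an edge and (u, w) a non-edge; each triple
    stands for the inequality ||x(u) - x(v)||^2 < ||x(u) - x(w)||^2."""
    edge_set = set(edges)
    return [(u, v, w) for u, v, w in permutations(vertices, 3)
            if (u, v) in edge_set and (u, w) not in edge_set]

def sq_dist(p, q):
    """Squared Euclidean distance, i.e. the degree-2 polynomial in the text."""
    return sum((a - b) ** 2 for a, b in zip(p, q))

def satisfies_system(x, triples):
    """Check a candidate assignment x (vertex -> coordinate tuple)."""
    return all(sq_dist(x[u], x[v]) < sq_dist(x[u], x[w]) for u, v, w in triples)
```

Working with squared distances keeps every constraint polynomial of degree 2 and avoids square roots, which is why the system takes the stated form.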

If $C_{i}$ does not admit a quasi- $k$ NN-realization in $\mathbb{R}^{d}$ for some $i\in[r]$ , then we can directly conclude that $G$ is not $k$ NN-realizable in $\mathbb{R}^{d}$ . Otherwise we construct a $(1-\varepsilon)$ -approximate $k$ NN-realization of $G$ in $\mathbb{R}^{d}$ as follows. Let $\phi_{i}:V(C_{i})\rightarrow\mathbb{R}^{d}$ be the quasi- $k$ NN-realization of $C_{i}$ in $\mathbb{R}^{d}$ we compute, for $i\in[r]$ . Define a map $\phi:V(G)\rightarrow\mathbb{R}^{d}$ as follows. Pick a vector $\vec{x}\in\mathbb{R}^{d}$ such that $\lVert\vec{x}\rVert_{2}\gg\max_{i\in[r]}\max_{u,v\in V(C_{i})}\lVert\phi_{i}(u)-\phi_{i}(v)\rVert_{2}$ . Then for each $i\in[r]$ and each $v\in V(C_{i})$ , set $\phi(v)=i\vec{x}+\phi_{i}(v)$ . The choice of $\vec{x}$ guarantees that for a vertex $v\in V(C_{i})$ , the $\phi$ -images of the vertices in $C_{i}$ are closer to $\phi(v)$ than the $\phi$ -images of the vertices outside $C_{i}$ . Also, for any $u,v\in V(C_{i})$ , $\lVert\phi(u)-\phi(v)\rVert_{2}=\lVert\phi_{i}(u)-\phi_{i}(v)\rVert_{2}$ . Therefore, for any $u,v\in V(C_{i})$ , we have

\begin{aligned}&\{v^{\prime}\in V(G)\backslash\{u\}:\lVert\phi(u)-\phi(v^{\prime})\rVert_{2}\leq\lVert\phi(u)-\phi(v)\rVert_{2}\}\\
={}&\{v^{\prime}\in V(C_{i})\backslash\{u\}:\lVert\phi(u)-\phi(v^{\prime})\rVert_{2}\leq\lVert\phi(u)-\phi(v)\rVert_{2}\}\\
={}&\{v^{\prime}\in V(C_{i})\backslash\{u\}:\lVert\phi_{i}(u)-\phi_{i}(v^{\prime})\rVert_{2}\leq\lVert\phi_{i}(u)-\phi_{i}(v)\rVert_{2}\}.\end{aligned}

Recall the function $\sigma_{\phi}:V(G)\times V(G)\rightarrow\{0,1\}$ in Definition 1. The above equality shows that for $u,v\in V(C_{i})$ , if $|\{v^{\prime}\in V(C_{i})\backslash\{u\}:\lVert\phi_{i}(u)-\phi_{i}(v^{\prime})\rVert_{2}\leq\lVert\phi_{i}(u)-\phi_{i}(v)\rVert_{2}\}|\leq k$ , then $\sigma_{\phi}(u,v)=1$ . It follows that $\sigma_{\phi}(u,v)=1$ for any $(u,v)\in E(C_{i})$ because $\phi_{i}$ is a quasi- $k$ NN-realization of $C_{i}$ , so we have

\sum_{(u,v)\in E(G)}\sigma_{\phi}(u,v)\geq\sum_{i=1}^{r}|E(C_{i})|=|E(G)|-|E|\geq(1-\varepsilon)\cdot|E(G)|.

Therefore, $\phi$ is a $(1-\varepsilon)$ -approximate $k$ NN-realization of $G$ in $\mathbb{R}^{d}$ . Combining the time costs for computing $E$ and the maps $\phi_{1},\dots,\phi_{r}$ , the overall time complexity of our algorithm is $O(n^{1+\alpha})$ , where $O(\cdot)$ hides a constant depending on $k$ , $d$ , $\alpha$ , and $\varepsilon$ .
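The gluing step $\phi(v)=i\vec{x}+\phi_{i}(v)$ can be sketched as follows. Two choices here are assumptions of the sketch rather than statements from the text: each $\phi_{i}$ is first translated so that one of its points sits at the origin (so that the separation depends only on $\vec{x}$), and $\vec{x}$ is taken along the first axis with length $3D+1$, where $D$ is the largest intra-component distance, as a concrete stand-in for "$\gg$". The name `glue` is hypothetical.

```python
from math import dist

def glue(component_maps):
    """Combine per-component maps (list of dicts vertex -> coordinate tuple)
    into a single map phi, placing component i at offset i * x."""
    diam = max(dist(p, q) for phi_i in component_maps
               for p in phi_i.values() for q in phi_i.values())
    shift = 3.0 * diam + 1.0  # length of x; assumed stand-in for ">>"
    phi = {}
    for i, phi_i in enumerate(component_maps, start=1):
        origin = next(iter(phi_i.values()))  # reference point of component i
        for v, p in phi_i.items():
            centered = [a - b for a, b in zip(p, origin)]  # translate to origin
            centered[0] += i * shift                       # add i * x
            phi[v] = tuple(centered)
    return phi
```

With this choice, two points in different components differ by at least $D+1$ in the first coordinate, so every vertex is closer to all vertices of its own component than to any vertex outside it, and distances within a component are unchanged.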

Theorem 6.

Let $\alpha>0$ be any fixed number. Given a $k$ -regular directed graph $G$ of $n$ vertices and a number $\varepsilon>0$ , one can compute in $f(k,d,\varepsilon)\cdot n^{1+\alpha}$ time a $(1-\varepsilon)$ -approximate $k$ NN-realization of $G$ in $\mathbb{R}^{d}$ , or conclude that $G$ is not $k$ NN-realizable in $\mathbb{R}^{d}$ , where $f$ is some computable function depending on $\alpha$ .

3 $k$ NN-Realization in $\mathbb{R}^{1}$

To the best of our knowledge, the $k$ NN-realization problem on the line has not been studied, and it is the focus of this section. A number of line embedding problems are NP-hard as mentioned earlier, including triplet constraints and betweenness problems [9]. In the former, we are given a set of triplet constraints of the form $d(a,b)<d(a,c)$ , while in the latter each constraint $(a,b,c)$ requires $b$ to lie between $a$ and $c$ in the ordering (that is, $a<b<c$ or $c<b<a$ ). Given the hardness result, the focus in these problems is on approximating the maximum number of satisfied constraints in the linear ordering [17].

Unlike these intractable embedding problems on the line, we show that the partial order constraints implied by a $k$ NN-realization problem have sufficiently rich structure to admit a polynomial time solution. Most of the difficulty is in deciding whether a $k$ -regular directed graph $G$ is $k$ NN-realizable on the line; if the answer is yes, then one can easily compute the embedding in polynomial time using linear programming. (It is also worth pointing out that the decision problem for $k$ NN-realization on the line is significantly more complicated than the special case of triplet constraints in [17] when all $\binom{n}{3}$ triples are specified. Indeed, in that case, one can guess the leftmost point, and then all the remaining points are immediately determined by the triplet constraints. This is not the case in $k$ NN-realization: fixing the leftmost point does not fix its neighbors’ order.)

The decision version of the $k$ NN-realization problem, of course, is easy if we are given a permutation of $G$ ’s vertices $(v_{1},\dots,v_{n})$ . In this case, we can easily compute a $k$ NN-realization $\phi:V(G)\rightarrow\mathbb{R}^{1}$ satisfying $\phi(v_{1})<\cdots<\phi(v_{n})$ , or decide that a feasible realization does not exist, by formulating the problem as a linear program (LP) with $n$ variables $x_{1},\dots,x_{n}$ and the following set of constraints

$\blacksquare$

$x_{i}\leq x_{j}$ for all $i,j\in[n]$ with $i\leq j$ ,
$\blacksquare$

$\Delta_{i,j}<\Delta_{i,k}$ for all distinct $i,j,k\in[n]$ such that $(v_{i},v_{j})\in E(G)$ and $(v_{i},v_{k})\notin E(G)$ , where $\Delta_{i,j}=x_{\max\{i,j\}}-x_{\min\{i,j\}}$ and $\Delta_{i,k}=x_{\max\{i,k\}}-x_{\min\{i,k\}}$ .

Clearly, if $x_{1},\dots,x_{n}$ satisfy these constraints, then setting $\phi(v_{i})=x_{i}$ gives us the desired realization. On the other hand, if $\phi$ is the realization we want, then the numbers $x_{1},\dots,x_{n}$ where $x_{i}=\phi(v_{i})$ must satisfy the constraints. Therefore, we begin with this key problem: efficiently computing an ordering of the images of the vertices on $\mathbb{R}^{1}$ .
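The constraint set above can be enumerated and checked directly. The following is a minimal sketch, not the paper's implementation: the function names `lp_constraints` and `satisfies`, and the representation of $E(G)$ as a set of index pairs along the given ordering, are assumptions made for illustration.

```python
from itertools import permutations


def lp_constraints(n, edges):
    """Yield triples (i, j, k) encoding Delta_{i,j} < Delta_{i,k}.

    Indices are 1-based positions in the given ordering (v_1, ..., v_n).
    `edges` is a set of pairs (i, j) meaning v_j is an out-neighbor of v_i.
    """
    for i, j, k in permutations(range(1, n + 1), 3):
        if (i, j) in edges and (i, k) not in edges:
            yield i, j, k


def satisfies(x, n, edges):
    """Check x_1 <= ... <= x_n and every distance constraint for x."""
    if any(x[a] > x[a + 1] for a in range(n - 1)):
        return False

    def delta(a, b):
        # For sorted x this equals x_max{a,b} - x_min{a,b}.
        return abs(x[a - 1] - x[b - 1])

    return all(delta(i, j) < delta(i, k)
               for i, j, k in lp_constraints(n, edges))


# Example with k = 1: v_1, v_2 point at each other, and so do v_3, v_4.
edges = {(1, 2), (2, 1), (3, 4), (4, 3)}
print(satisfies([0, 1, 5, 6], 4, edges))  # two well-separated pairs
print(satisfies([0, 1, 2, 3], 4, edges))  # v_2 is equidistant from v_1 and v_3
```

This brute-force enumeration inspects all ordered triples and is meant only to make the constraints concrete; it is not tuned for efficiency.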

3.1 Finding the vertex ordering

We say an ordering $(v_{1},\dots,v_{n})$ of $V(G)$ is a feasible vertex ordering of $G$ if there exists a $k$ NN-realization $\phi:V(G)\rightarrow\mathbb{R}^{1}$ of $G$ in $\mathbb{R}^{1}$ satisfying $\phi(v_{1})<\cdots<\phi(v_{n})$ . The goal of this section is to give an $O(kn)$ -time algorithm for computing a feasible vertex ordering of $G$ (assuming it exists). We may assume, without loss of generality, that $G$ is weakly-connected; otherwise we can consider each weakly-connected component of $G$ individually.

Our first observation states two key properties of a feasible vertex ordering: (1) for each $v_{i}$ its $k$ out-neighbors form a contiguous block, and for different $v_{i}$ ’s their blocks have the same linear ordering as the vertex ordering, and (2) for each $v_{i}$ its in-neighbors also form a contiguous block, which extends at most $k$ vertices to the left and at most $k$ to the right of $v_{i}$ . (Recall that both $\mathsf{out}[v]$ and $\mathsf{in}[v]$ include $v$ , for all vertices $v\in V(G)$ .) See Figure 1.

Observation 7.

A feasible vertex ordering $(v_{1},\dots,v_{n})$ of $G$ satisfies the following.

1.

There exist $p_{1},\dots,p_{n}\in[n-k]$ such that (i) $\mathsf{out}[v_{i}]=\{v_{p_{i}},\dots,v_{p_{i}+k}\}$ for all $i\in[n]$ , (ii) $p_{1}\leq\cdots\leq p_{n}$ , and (iii) $p_{i}\leq i\leq p_{i}+k$ for all $i\in[n]$ .
2.

There exist $q_{1},\dots,q_{n},q_{1}^{\prime},\dots,q_{n}^{\prime}\in[n]$ such that (i) $\mathsf{in}[v_{i}]=\{v_{q_{i}},\dots,v_{q_{i}^{\prime}}\}$ for all $i\in[n]$ , (ii) $q_{1}\leq\cdots\leq q_{n}$ , (iii) $q_{1}^{\prime}\leq\cdots\leq q_{n}^{\prime}$ , and (iv) $q_{i}\leq q_{i}^{\prime}\leq q_{i}+2k$ for all $i\in[n]$ .

Proof.

Let $\phi:V(G)\rightarrow\mathbb{R}^{1}$ be a $k$ NN-realization of $G$ in $\mathbb{R}^{1}$ satisfying $\phi(v_{1})<\cdots<\phi(v_{n})$ . Clearly, for each $i\in[n]$ , the $k+1$ points in $\{\phi(v_{1}),\dots,\phi(v_{n})\}$ closest to $\phi(v_{i})$ are $\{\phi(v_{p_{i}}),\dots,\phi(v_{p_{i}+k})\}$ for some $p_{i}\in[n-k]$ such that $p_{i}\leq i\leq p_{i}+k$ . Furthermore, we claim that $p_{1}\leq\cdots\leq p_{n}$ . Assume, for contradiction, that $p_{i}>p_{j}$ for some $i,j\in[n]$ with $i<j$ . Then we have $\phi(v_{p_{j}})<\phi(v_{p_{i}})\leq\phi(v_{i})<\phi(v_{j})\leq\phi(v_{p_{j}+k})<\phi(v_{p_{i}+k})$ . On the other hand, since $\phi(v_{p_{i}+k})\notin\{\phi(v_{p_{j}}),\dots,\phi(v_{p_{j}+k})\}$ and $\phi(v_{p_{j}})\notin\{\phi(v_{p_{i}}),\dots,\phi(v_{p_{i}+k})\}$ ,

$$\phi(v_{j})-\phi(v_{p_{j}})\leq\phi(v_{p_{i}+k})-\phi(v_{j})\leq\phi(v_{p_{i}+k})-\phi(v_{i})\leq\phi(v_{i})-\phi(v_{p_{j}}),$$

which contradicts the fact $\phi(v_{p_{j}})<\phi(v_{p_{i}})\leq\phi(v_{i})<\phi(v_{j})$ . By the definition of $k$ NN-realization, we see $\mathsf{out}[v_{i}]=\{v_{p_{i}},\dots,v_{p_{i}+k}\}$ for all $i\in[n]$ . This proves condition 1.

Condition 2 follows from condition 1. Observe that for each $i\in[n]$ , it holds that $\mathsf{in}[v_{i}]=\{v_{j}:p_{j}\leq i\leq p_{j}+k\}=\{v_{j}:i-k\leq p_{j}\leq i\}$ . Since $p_{1}\leq\cdots\leq p_{n}$ , this implies $\mathsf{in}[v_{i}]=\{v_{q_{i}},\dots,v_{q_{i}^{\prime}}\}$ where $q_{i}=\min\{j:i-k\leq p_{j}\leq i\}$ and $q_{i}^{\prime}=\max\{j:i-k\leq p_{j}\leq i\}$ . The monotonicities $q_{1}\leq\cdots\leq q_{n}$ and $q_{1}^{\prime}\leq\cdots\leq q_{n}^{\prime}$ follow directly from the monotonicity $p_{1}\leq\cdots\leq p_{n}$ . It suffices to show that $q_{i}\leq q_{i}^{\prime}\leq q_{i}+2k$ for all $i\in[n]$ . The inequality $q_{i}\leq q_{i}^{\prime}$ is trivial. To see $q_{i}^{\prime}\leq q_{i}+2k$ , assume that $q_{i}^{\prime}>q_{i}+2k$ . Then there exist $j,j^{\prime}\in[n]$ with $j^{\prime}>j+2k$ such that $i-k\leq p_{j}\leq i$ and $i-k\leq p_{j^{\prime}}\leq i$ . By condition 1, we have $j\geq p_{j}$ and $p_{j^{\prime}}\geq j^{\prime}-k>j+k$ . Therefore, $p_{j^{\prime}}>j+k\geq p_{j}+k\geq(i-k)+k=i$ , which contradicts the fact $i-k\leq p_{j^{\prime}}\leq i$ . This proves condition 2. $\hfill\blacktriangleleft$
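Observation 7 can be illustrated on a small point set. The sketch below is an assumption-laden example rather than part of the paper: the helper names `knn_windows` and `in_windows` and the sample coordinates are hypothetical, and the points are assumed to be sorted, distinct, and free of distance ties.

```python
def knn_windows(xs, k):
    """Return [p_1, ..., p_n] (1-based) for sorted, distinct, tie-free xs.

    The k+1 points closest to xs[i-1] (including itself) occupy
    positions p_i, ..., p_i + k.
    """
    n = len(xs)
    p = []
    for i in range(n):
        nearest = sorted(range(n), key=lambda j: abs(xs[j] - xs[i]))[:k + 1]
        lo, hi = min(nearest), max(nearest)
        assert hi - lo == k, "closest k+1 points must be contiguous"
        p.append(lo + 1)
    return p


def in_windows(p, k):
    """Return [(q_i, q'_i)] with in[v_i] = {v_j : p_j <= i <= p_j + k}."""
    n = len(p)
    windows = []
    for i in range(1, n + 1):
        js = [j for j in range(1, n + 1) if p[j - 1] <= i <= p[j - 1] + k]
        windows.append((js[0], js[-1]))
    return windows


k = 2
xs = [0, 1, 3, 4, 8, 9]       # hypothetical points on the line
p = knn_windows(xs, k)        # [1, 1, 2, 2, 4, 4]
qs = in_windows(p, k)         # [(1, 2), (1, 4), (1, 4), (3, 6), (5, 6), (5, 6)]

# Condition 1: monotone windows that contain their own vertex.
assert all(p[i] <= p[i + 1] for i in range(len(p) - 1))
assert all(p[i - 1] <= i <= p[i - 1] + k for i in range(1, len(p) + 1))
# Condition 2 (iv): each in-window spans at most 2k + 1 positions.
assert all(q <= q2 <= q + 2 * k for q, q2 in qs)
```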

Figure 1: A $2$ -regular graph on $6$ vertices and its realization (top). The values of $p,p^{\prime},q,q^{\prime}$ for each vertex illustrate a natural ordering of the in- and out-neighborhoods corresponding to the feasible vertex ordering $v_{1},\dots,v_{6}$ (bottom).

Consider the equivalence relation $\sim$ defined on $V(G)$ as $u\sim v$ iff $\mathsf{out}[u]=\mathsf{out}[v]$ . Let $\mathcal{C}_{G}$ be the set of equivalence classes of this relation, which is a partition of $V(G)$ into groups of vertices with the same set of out-neighbors. One can easily compute $\mathcal{C}_{G}$ in $O(k^{2}n)$ time as follows. Given a vertex $v\in V(G)$ , we check whether $\mathsf{out}[u]=\mathsf{out}[v]$ , for all $u\in\mathsf{out}[v]$ ; it suffices to consider these candidates because $u\sim v$ implies $u\in\mathsf{out}[u]=\mathsf{out}[v]$ . This takes $O(k^{2})$ time since $|\mathsf{out}[v]|=k+1$ and each equality test takes $O(k)$ time. Thus, the time for computing all classes is $O(k^{2}n)$ . Interestingly, we can use Observation 7 to compute $\mathcal{C}_{G}$ in $O(kn)$ time, if $G$ is $k$ NN-realizable in $\mathbb{R}^{1}$ , as shown in the following lemma. Due to space constraints, the proof is included in the full version.
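As a point of comparison, the classes can also be obtained by hashing closed out-neighborhoods. This is a minimal sketch under stated assumptions, not the deterministic procedure of Lemma 8 (whose proof appears in the full version): the function name `out_classes` and the adjacency-dictionary input are hypothetical, and the running time is $O(kn)$ only in expectation, relying on hashing.

```python
from collections import defaultdict


def out_classes(out):
    """Group vertices by their closed out-neighborhood out[v] (v included).

    `out` maps each vertex to an iterable of its k out-neighbors.
    Returns a list of classes, each a list of vertices.
    """
    groups = defaultdict(list)
    for v, nbrs in out.items():
        groups[frozenset(nbrs) | {v}].append(v)
    return list(groups.values())


# Hypothetical 2-regular graph on 6 vertices (open out-neighbor lists).
out = {1: [2, 3], 2: [1, 3], 3: [2, 4], 4: [2, 3], 5: [4, 6], 6: [4, 5]}
print(sorted(sorted(c) for c in out_classes(out)))
```

In this example, vertices $1$ and $2$ share the closed out-neighborhood $\{1,2,3\}$, vertices $3$ and $4$ share $\{2,3,4\}$, and vertices $5$ and $6$ share $\{4,5,6\}$.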

Lemma 8.

Let $G$ be a $k$ -regular directed graph of $n$ vertices. One can compute in $O(kn)$ time a partition $\mathcal{C}$ of $V(G)$ such that $\mathcal{C}=\mathcal{C}_{G}$ if $G$ is $k$ NN-realizable in $\mathbb{R}^{1}$ .

Write $r=|\mathcal{C}_{G}|$ . Let $(v_{1},\dots,v_{n})$ be a feasible vertex ordering of $G$ . Observation 7 implies that each class $C\in\mathcal{C}_{G}$ is a set of consecutive vertices in the sequence $(v_{1},\dots,v_{n})$ , i.e., $C=\{v_{\alpha},\dots,v_{\beta}\}$ for some $\alpha,\beta\in[n]$ . Define $\mathsf{out}[C]=\mathsf{out}[v]$ for an arbitrary vertex $v\in C$ . Therefore, there is a natural ordering $(C_{1},\dots,C_{r})$ of the classes in $\mathcal{C}_{G}$ , in which the indices of the vertices in $C_{i}$ are smaller than the indices of the vertices in $C_{j}$ for all $i,j\in[r]$ with $i<j$ . We call $(C_{1},\dots,C_{r})$ a feasible class ordering of $G$ .

Observation 9.

If $(C_{1},\dots,C_{r})$ is the feasible class ordering of $G$ corresponding to a feasible vertex ordering $(v_{1},\dots,v_{n})$ of $G$ , then there exist $c_{1},\dots,c_{r}\in[n-k]$ satisfying

$\blacksquare$

$\mathsf{out}[C_{i}]=\{v_{c_{i}},\dots,v_{c_{i}+k}\}$ for all $i\in[r]$ ,
$\blacksquare$

$c_{1}<\cdots<c_{r}$ ,
$\blacksquare$

if $G$ is weakly-connected, then $c_{i+1}\leq c_{i}+k$ for all $i\in[r-1]$ .

Proof.

Let $p_{1},\dots,p_{n}\in[n-k]$ be the indices in condition 1 of Observation 7, for the feasible vertex ordering $(v_{1},\dots,v_{n})$ . For each $i\in[r]$ , set $c_{i}=p_{s}$ where $s=1+\sum_{j=1}^{i-1}|C_{j}|$ , and we then have $\mathsf{out}[C_{i}]=\mathsf{out}[v_{s}]=\{v_{c_{i}},\dots,v_{c_{i}+k}\}$ . Since $p_{1}\leq\cdots\leq p_{n}$ , we have $c_{1}\leq\cdots\leq c_{r}$ . Furthermore, for all $i\in[r-1]$ , we have $c_{i}\neq c_{i+1}$ , simply because $\mathsf{out}[C_{i}]\neq\mathsf{out}[C_{i+1}]$ . Thus, $c_{1}<\cdots<c_{r}$ . To see the last property, assume $c_{i+1}>c_{i}+k$ for some $i\in[r-1]$ . Then $(\bigcup_{j=1}^{i}\mathsf{out}[C_{j}])\cap(\bigcup_{j=i+1}^{r}\mathsf{out}[C_{j}])=\emptyset$ . Note that $\bigcup_{j=1}^{i}C_{j}\subseteq\bigcup_{j=1}^{i}\mathsf{out}[C_{j}]$ and $\bigcup_{j=i+1}^{r}C_{j}\subseteq\bigcup_{j=i+1}^{r}\mathsf{out}[C_{j}]$ . So there is no edge between $\bigcup_{j=1}^{i}C_{j}$ and $\bigcup_{j=i+1}^{r}C_{j}$ , which implies that $G$ is not weakly connected. $\hfill\blacktriangleleft$

To compute a feasible vertex ordering of $G$ , we first compute a feasible class ordering, using the previous observation. Due to space constraints the proof of the following lemma is included in the full version.

Lemma 10.

Let $G$ be a weakly-connected $k$ -regular directed graph of $n$ vertices. Given $\mathcal{C}_{G}$ , one can compute in $O(kn)$ time an ordering of $\mathcal{C}_{G}$ , which is a feasible class ordering of $G$ if $G$ is $k$ NN-realizable in $\mathbb{R}^{1}$ .

Next, we show how to compute a corresponding feasible vertex ordering of $G$ given a feasible class ordering $(C_{1},\dots,C_{r})$ . By definition, we know that in the feasible vertex ordering, the vertices in $C_{i}$ must appear before the vertices in $C_{j}$ for all $i,j\in[r]$ with $i<j$ . So it suffices to figure out the “local” ordering of the vertices in each class $C_{i}$ . We do this by considering the in-neighbors of each vertex. As observed before, for each $v\in V(G)$ , $\mathsf{in}[v]$ is the union of several classes in $\mathcal{C}_{G}$ , and in addition, $\mathsf{in}[v]=\bigcup_{i=p}^{q}C_{i}$ for some $p,q\in[r]$ , by condition 2 of Observation 7. It turns out that we can sort the vertices in each class according to the values of $p$ and $q$ . Formally, we prove the following lemma.

Lemma 11.

Let $G$ be a weakly-connected $k$ -regular directed graph of $n$ vertices, and $r=|\mathcal{C}_{G}|$ . Given an ordering $(C_{1},\dots,C_{r})$ of $\mathcal{C}_{G}$ , one can compute in $O(kn)$ time an ordering $(v_{1},\dots,v_{n})$ of $V(G)$ such that if $(C_{1},\dots,C_{r})$ is a feasible class ordering of $G$ , then $(v_{1},\dots,v_{n})$ is a feasible vertex ordering of $G$ .

Proof.

Our algorithm for computing a feasible vertex ordering of $G$ is presented in Algorithm 2. First, for each $v\in V(G)$ , we compute a pair $R(v)=(p,q)$ where $p\in[r]$ (resp., $q\in[r]$ ) is the minimum (resp., maximum) index such that $C_{p}\subseteq\mathsf{in}[v]$ (resp., $C_{q}\subseteq\mathsf{in}[v]$ ). We define a partial order $\preceq$ on index-pairs by setting $(p,q)\preceq(p^{\prime},q^{\prime})$ if $p\leq p^{\prime}$ and $q\leq q^{\prime}$ . Then we order the vertices in each class $C_{i}$ as $(u_{1},\dots,u_{|C_{i}|})$ such that $R(u_{1})\preceq\cdots\preceq R(u_{|C_{i}|})$ . If there exist $u,v\in C_{i}$ such that $R(u)$ and $R(v)$ are incomparable under the order $\preceq$ , then we just arbitrarily order the vertices in $C_{i}$ . We will see later that in this case, $(C_{1},\dots,C_{r})$ is not a feasible class ordering of $G$ . By concatenating the orderings of $C_{1},\dots,C_{r}$ , we obtain an ordering $(v_{1},\dots,v_{n})$ of $V(G)$ .

We first show that Algorithm 2 can be implemented in $O(kn)$ time. Computing $R(v)$ for all $v\in V(G)$ can be done in $O(kn)$ time, because $|\mathsf{in}[v]|=O(k)$ by condition 2 of Observation 7. To see the time cost for ordering the vertices in each $C_{i}$ , observe that $|C_{i}|\leq k+1$ . Indeed, $v\in\mathsf{out}[v]=\mathsf{out}[C_{i}]$ for all $v\in C_{i}$ , which implies $C_{i}\subseteq\mathsf{out}[C_{i}]$ and thus $|C_{i}|\leq|\mathsf{out}[C_{i}]|=k+1$ . If we sort the vertices in $C_{i}$ directly, it takes $O(k\log k)$ time. One can improve the time cost to $O(k)$ by using $p+q$ as the sorting key for a pair $(p,q)$ and observing that for any $v\in C_{i}$ , the pair $R(v)=(p,q)$ satisfies $p\geq i-k$ and $q\leq i+k$ , by condition 2 of Observation 7, which implies $2i-2k\leq p+q\leq 2i+2k$ . In other words, in the sorting task, the keys are all in the range $\{2i-2k,\dots,2i+2k\}$ , whose size is $O(k)$ . Thus, the task can be done in $O(k)$ time using bucket sorting. It follows that Algorithm 2 can be implemented in $O(kn)$ time.

To see the correctness of our algorithm, suppose $(C_{1},\dots,C_{r})$ is a feasible class ordering of $G$ , and let $(v_{1}^{*},\dots,v_{n}^{*})$ be a corresponding feasible vertex ordering of $G$ . We do not necessarily have $(v_{1},\dots,v_{n})=(v_{1}^{*},\dots,v_{n}^{*})$ . However, as we will see, it holds that $(R(v_{1}),\dots,R(v_{n}))=(R(v_{1}^{*}),\dots,R(v_{n}^{*}))$ , which turns out to be sufficient. By condition 2 of Observation 7, for each $v\in V(G)$ with $R(v)=(p,q)$ , we have $\mathsf{in}[v]=\bigcup_{i=p}^{q}C_{i}$ , and furthermore, $R(v_{1}^{*})\preceq\cdots\preceq R(v_{n}^{*})$ . Now consider a class $C_{i}$ . Note that $C_{i}=\{v_{\alpha},\dots,v_{\beta}\}=\{v_{\alpha}^{*},\dots,v_{\beta}^{*}\}$ , where $\alpha=1+\sum_{j=1}^{i-1}|C_{j}|$ and $\beta=\sum_{j=1}^{i}|C_{j}|$ . We have $R(v_{\alpha}^{*})\preceq\cdots\preceq R(v_{\beta}^{*})$ , and our algorithm guarantees that $R(v_{\alpha})\preceq\cdots\preceq R(v_{\beta})$ . Therefore, $(R(v_{\alpha}),\dots,R(v_{\beta}))=(R(v_{\alpha}^{*}),\dots,R(v_{\beta}^{*}))$ . It then follows that $(R(v_{1}),\dots,R(v_{n}))=(R(v_{1}^{*}),\dots,R(v_{n}^{*}))$ .

To see $(v_{1},\dots,v_{n})$ is a feasible vertex ordering of $G$ , we further observe that the function $\pi:V(G)\rightarrow V(G)$ defined as $\pi(v_{i})=v_{i}^{*}$ for $i\in[n]$ is an automorphism of $G$ . Consider indices $i,j\in[n]$ . We have $\mathsf{out}[v_{i}]=\mathsf{out}[v_{i}^{*}]$ , since $v_{i}$ and $v_{i}^{*}$ belong to the same class in $\mathcal{C}_{G}$ . Thus, $(v_{i},v_{j})\in E(G)$ iff $(v_{i}^{*},v_{j})\in E(G)$ . On the other hand, because $R(v_{j})=R(v_{j}^{*})$ , we have $\mathsf{in}[v_{j}]=\mathsf{in}[v_{j}^{*}]$ and hence $(v_{i}^{*},v_{j})\in E(G)$ iff $(v_{i}^{*},v_{j}^{*})\in E(G)$ . As such, $(v_{i},v_{j})\in E(G)$ iff $(v_{i}^{*},v_{j}^{*})\in E(G)$ , which implies that $\pi$ is an automorphism of $G$ . Now consider a $k$ NN-realization $\phi:V(G)\rightarrow\mathbb{R}^{1}$ of $G$ in $\mathbb{R}^{1}$ satisfying $\phi(v_{1}^{*})<\cdots<\phi(v_{n}^{*})$ . Set $\phi^{\prime}=\phi\circ\pi$ , which is a map from $V(G)$ to $\mathbb{R}^{1}$ . As $\pi$ is an automorphism of $G$ , $\phi^{\prime}$ is also a $k$ NN-realization of $G$ . Furthermore, $\phi^{\prime}(v_{i})=\phi(\pi(v_{i}))=\phi(v_{i}^{*})$ for all $i\in[n]$ , which implies $\phi^{\prime}(v_{1})<\cdots<\phi^{\prime}(v_{n})$ . The existence of $\phi^{\prime}$ shows that $(v_{1},\dots,v_{n})$ is a feasible vertex ordering of $G$ . $\hfill\blacktriangleleft$

Algorithm 2 VertexOrder $(G,(C_{1},\dots,C_{r}))$ .
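The following Python sketch follows the description of Algorithm 2 given in the proof of Lemma 11. It is an illustration under assumptions, not the authors' pseudocode: the function name `vertex_order` and the input format are hypothetical, $R(v)$ is computed as the minimum and maximum class index over the closed in-neighborhood (which matches the definition when $\mathsf{in}[v]$ is a union of consecutive classes), and the comparability check under $\preceq$ is omitted.

```python
def vertex_order(out, classes):
    """Order vertices class by class, sorting each class by R(v) = (p, q).

    `out` maps each vertex to an iterable of its out-neighbors.
    `classes` is the class ordering (C_1, ..., C_r) as a list of lists.
    """
    cls_of = {v: i for i, C in enumerate(classes, start=1) for v in C}

    # R[v] = [p, q]: smallest and largest class index meeting in[v],
    # where in[v] includes v itself.
    R = {v: [cls_of[v], cls_of[v]] for v in out}
    for u, nbrs in out.items():
        for v in nbrs:
            R[v][0] = min(R[v][0], cls_of[u])
            R[v][1] = max(R[v][1], cls_of[u])

    order = []
    for C in classes:
        # Bucket sort on the key p + q; ties keep their input order.
        keys = [sum(R[v]) for v in C]
        lo = min(keys)
        buckets = [[] for _ in range(max(keys) - lo + 1)]
        for v, key in zip(C, keys):
            buckets[key - lo].append(v)
        for bucket in buckets:
            order.extend(bucket)
    return order


# Hypothetical 2-regular graph; classes are given in a feasible class
# ordering, with the vertices inside each class deliberately scrambled.
out = {1: [2, 3], 2: [1, 3], 3: [2, 4], 4: [2, 3], 5: [4, 6], 6: [4, 5]}
print(vertex_order(out, [[2, 1], [4, 3], [6, 5]]))  # [1, 2, 3, 4, 6, 5]
```

In this example, vertices $5$ and $6$ have identical in- and out-neighborhoods, so their relative order is immaterial; swapping them is an automorphism of the graph, in line with the argument in the proof.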

Theorem 12.

Given a $k$ -regular directed graph $G$ of $n$ vertices, one can compute in $O(kn)$ time an ordering $(v_{1},\dots,v_{n})$ of $V(G)$ , which is a feasible vertex ordering of $G$ if $G$ is $k$ NN-realizable in $\mathbb{R}^{1}$ .

Proof.

For the case where $G$ is weakly-connected, the theorem follows directly from Lemmas 8, 10 and 11. Otherwise, we compute the orderings for the weakly-connected components of $G$ individually and concatenate them; this gives us the desired ordering of $V(G)$ . $\hfill\blacktriangleleft$

3.2 Deciding the realizability and computing the realization

To decide the $k$ NN-realizability of $G$ , we first run the algorithm of Theorem 12, which returns the vertex ordering $(v_{1},\dots,v_{n})$ of $V(G)$ , and then run the linear program given at the beginning of Section 3 for this ordering. Either the LP is feasible, in which case the solution gives a $k$ NN-realization of $G$ in $\mathbb{R}^{1}$ , or the LP is infeasible, in which case we can conclude that $(v_{1},\dots,v_{n})$ is not a feasible vertex ordering and thus $G$ is not $k$ NN-realizable in $\mathbb{R}^{1}$ . Thus, the $k$ NN-realization problem in $\mathbb{R}^{1}$ can be solved in polynomial time.
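LP solvers generally work with non-strict inequalities, so the strict distance constraints need some care. One standard device, given here as an assumption about how the LP could be set up and not as the paper's own formulation, uses the fact that every constraint involves only differences of the $x_{i}$ and is therefore invariant under scaling by a positive factor. Strict constraints can then be replaced by unit-margin ones:

```latex
\begin{aligned}
\text{find}\quad & x_{1},\dots,x_{n}\in\mathbb{R} \\
\text{subject to}\quad
  & x_{i+1}-x_{i}\;\geq\;1
    && \text{for all } i\in[n-1], \\
  & \Delta_{i,k}-\Delta_{i,j}\;\geq\;1
    && \text{for all distinct } i,j,k\in[n] \text{ with }
       (v_{i},v_{j})\in E(G),\ (v_{i},v_{k})\notin E(G).
\end{aligned}
```

Any solution of this system satisfies the strict constraints. Conversely, if a realization with $\phi(v_{1})<\cdots<\phi(v_{n})$ exists, let $m>0$ be the smallest of the finitely many positive quantities $x_{i+1}-x_{i}$ and $\Delta_{i,k}-\Delta_{i,j}$ ; dividing all coordinates by $m$ yields a solution of the unit-margin system.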

Interestingly, we can show that if we are only interested in the decision problem (and not the actual embedding), then solving the LP is not necessary. Specifically, we can decide whether $(v_{1},\dots,v_{n})$ is a feasible vertex ordering by simply checking condition 1 of Observation 7. If the ordering satisfies that condition, then the LP is always feasible. The proof of the following lemma is somewhat technical and is presented in the full version.

Lemma 13.

Let $G$ be a $k$ -regular directed graph of $n$ vertices. An ordering $(v_{1},\dots,v_{n})$ of $V(G)$ is a feasible vertex ordering of $G$ iff it satisfies condition 1 of Observation 7. In particular, one can decide whether a given ordering of $V(G)$ is a feasible vertex ordering of $G$ or not in $O(kn)$ time.
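The check in Lemma 13 can be sketched as a single pass over the ordering. This is a minimal illustration, assuming an adjacency dictionary with exactly $k$ out-neighbors per vertex; the function name `satisfies_condition_1` is hypothetical and the code is not taken from the paper.

```python
def satisfies_condition_1(order, out, k):
    """Check condition 1 of Observation 7 for `order` in O(kn) time.

    `order` is a list of all vertices; `out` maps each vertex to an
    iterable of its k out-neighbors (the vertex itself excluded).
    """
    idx = {v: i for i, v in enumerate(order, start=1)}
    prev_p = 1
    for i, v in enumerate(order, start=1):
        # Positions of the closed out-neighborhood out[v].
        block = {idx[u] for u in out[v]} | {i}
        p = min(block)
        # (i) and (iii): a window of k+1 consecutive positions containing i.
        if len(block) != k + 1 or max(block) - p != k:
            return False
        # (ii): window starts are non-decreasing along the ordering.
        if p < prev_p:
            return False
        prev_p = p
    return True


# Hypothetical 2-regular graph on 6 vertices.
out = {1: [2, 3], 2: [1, 3], 3: [2, 4], 4: [2, 3], 5: [4, 6], 6: [4, 5]}
print(satisfies_condition_1([1, 2, 3, 4, 5, 6], out, 2))  # True
print(satisfies_condition_1([1, 3, 2, 4, 5, 6], out, 2))  # False
```

In the second call the window of vertex $2$ starts at position $1$ while the window of the preceding vertex $3$ starts at position $2$, which violates the monotonicity $p_{1}\leq\cdots\leq p_{n}$ .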

We can now state the main result of this section.

Theorem 14.

Given a $k$ -regular directed graph $G$ of $n$ vertices, one can decide in $O(kn)$ time whether $G$ is $k$ NN-realizable in $\mathbb{R}^{1}$ , and if so, a $k$ NN-realization of $G$ in $\mathbb{R}^{1}$ can be computed in $O(n^{2.5}\mathsf{poly}(\log n))$ time.

4 Concluding remarks and extensions

We considered the problem of realizing a directed graph $G$ as a Euclidean $k$ NN graph. Our key results are: (1) for any fixed $d$ , we can efficiently embed at least a $1-\varepsilon$ fraction of $G$ ’s edges in $\mathbb{R}^{d}$ or conclude that $G$ is not realizable, and (2) a linear-time algorithm to decide if $G$ is realizable in $\mathbb{R}^{1}$ . Our theorems extend to the case where the neighbors of each vertex in $G$ are given as a ranked list, meaning that the embedding must satisfy $||\phi(v)-\phi(u_{i})||<||\phi(v)-\phi(u_{i+1})||$ , for $i=1,\ldots,k-1$ , where $u_{i}$ is the $i$ th nearest neighbor of $v$ (except in $\mathbb{R}^{1}$ where we need to solve the LP to decide if it is feasible). Our approximation scheme also applies to other proximity graphs that meet the following conditions: (1) the graph can be partitioned into constant-sized components using a sublinear size separator, and (2) each component’s edges can be embedded independently. For example, we can approximately embed Delaunay triangulations in the plane with maximum degree $k=O(n^{\frac{1}{3}})$ .

References

[1] A. Agrawal, S. Saurabh, and M. Zehavi. A finite algorithm for the realizabilty of a delaunay triangulation. In 17th International Symposium on Parameterized and Exact Computation, volume 249, pages 1:1–1:16, 2022. doi:10.4230/LIPICS.IPEC.2022.1.
[2] N. Alon, M. Bădoiu, E. D. Demaine, M. Farach-Colton, M. Hajiaghayi, and A. Sidiropoulos. Ordinal embeddings of minimum relaxation: General properties, trees, and ultrametrics. ACM Trans. Algorithms, 4(4), 2008. doi:10.1145/1383369.1383377.
[3] Y. Bilu and N. Linial. Monotone maps, sphericity and bounded second eigenvalue. Journal of Combinatorial Theory, Series B, 95(2):283–299, November 2005. doi:10.1016/J.JCTB.2005.04.005.
[4] A. Bogomolnaia and J.-F. Laslier. Euclidean preferences. Journal of Mathematical Economics, 43(2):87–98, 2007.
[5] P. Bose, W. J. Lenhart, and G. Liotta. Characterizing proximity trees. Algorithmica, 16:83–110, 1996. doi:10.1007/BF02086609.
[6] J. Bourgain. On lipschitz embedding of finite metric spaces in hilbert space. Israel J. Math, pages 46–52, 1985.
[7] M. Bădoiu, E. D. Demaine, M. Hajiaghayi, A. Sidiropoulos, and M. Zadimoghaddam. Ordinal embedding: Approximation algorithms and dimensionality reduction. In Approximation, Randomization and Combinatorial Optimization. Algorithms and Techniques, pages 21–34, 2008.
[8] V. Chatziafratis and P. Indyk. Dimension-accuracy tradeoffs in contrastive embeddings for triplets, terminals & top-k nearest neighbors. In Symposium on Simplicity in Algorithms, pages 230–243. 2024.
[9] B. Chor and M. Sudan. A geometric approach to betweenness. SIAM J. Discret. Math., 11(4):511–523, 1998. doi:10.1137/S0895480195296221.
[10] J. Chuzhoy, Y. Gao, J. Li, D. Nanongkai, R. Peng, and T. Saranurak. A deterministic algorithm for balanced cut with applications to dynamic connectivity, flows, and beyond. In 2020 IEEE 61st Annual Symposium on Foundations of Computer Science (FOCS), pages 1158–1167. IEEE, 2020.
[11] M. de Berg, O. Cheong, M. van Kreveld, and M. H. Overmars. Computational Geometry: Algorithms and Applications. Springer, 2008.
[12] M. Dillencourt. Toughness and delaunay triangulations. In Proceedings of the Third Annual Symposium on Computational Geometry, pages 186–194, 1987.
[13] P. Eades and S. Whitesides. The realization problem for euclidean minimum spanning trees is np-hard. In Proceedings of the Tenth Annual Symposium on Computational Geometry, pages 49–56, 1994.
[14] P. Eades and S. Whitesides. The logic engine and the realization problem for nearest neighbor graphs. Theoretical Computer Science, 169(1):23–37, 1996. doi:10.1016/S0304-3975(97)84223-5.
[15] D. Eppstein, M. Paterson, and F. F. Yao. On nearest-neighbor graphs. Discret. Comput. Geom., 17:263–282, 1997. doi:10.1007/PL00009293.
[16] P. Erdős, F. Harary, and W. T. Tutte. On the dimension of a graph. Mathematika, 12:118–122, 1965.
[17] B. Fan, D. Ihara, N. Mohammadi, F. Sgherzi, A. Sidiropoulos, and M. Valizadeh. Learning Lines with Ordinal Constraints. APPROX/RANDOM, 176:45:1–45:15, 2020. doi:10.4230/LIPICS.APPROX/RANDOM.2020.45.
[18] W. Johnson and J. Lindenstrauss. Extensions of lipschitz maps into a hilbert space. Contemporary Mathematics, 26:189–206, 1984.
[19] H. Maehara. Space graphs and sphericity. Discrete Applied Mathematics, 7(1):55–64, 1984. doi:10.1016/0166-218X(84)90113-6.
[20] J. Matousek. Lectures on Discrete Geometry. Springer, 2002.
[21] G. L. Miller, S.H. Teng, W. P. Thurston, and S. A. Vavasis. Separators for sphere-packings and nearest neighbor graphs. J. ACM, 44:1–29, 1997. doi:10.1145/256292.256294.
[22] C. L. Monma and S. Suri. Transitions in geometric minimum spanning trees. Discrete & Computational Geometry, 8:265–293, 1992. doi:10.1007/BF02293049.
[23] J. Renegar. On the computational complexity and geometry of the first-order theory of the reals. Journal of Symbolic Computation, 13(3):255–299, 1992.
[24] K. Sugihara. Simpler proof of a realizability theorem on delaunay triangulations. Information Processing Letters, 50(4):173–176, 1994. doi:10.1016/0020-0190(94)00027-1.

