Constructing light spanners deterministically in near-linear time

Graph spanners are well-studied and widely used both in theory and practice. In a recent breakthrough, Chechik and Wulff-Nilsen [CW18] improved the state-of-the-art for light spanners by constructing a (2k − 1)(1 + ε)-spanner with O(n^{1+1/k}) edges and O_ε(n^{1/k}) lightness. Soon after, Filtser and Solomon [FS20] showed that the classic greedy spanner construction achieves the same bounds. The major drawback of the greedy spanner is its running time of O(mn^{1+1/k}) (which is faster than that of [CW18]). This makes the construction impractical even for graphs of moderate size. Much faster spanner constructions do exist, but they only achieve lightness Ω_ε(kn^{1/k}), even when randomization is used. The contribution of this paper is deterministic spanner constructions that are fast and achieve bounds similar to the state-of-the-art slower constructions. Our first result is an O_ε(n^{2+1/k+ε′}) time spanner construction which achieves the state-of-the-art bounds. Our second result is an O_ε(m + n log n) time construction of a spanner with (2k − 1)(1 + ε) stretch, O(log k · n^{1+1/k}) edges and O_ε(log k · n^{1/k}) lightness. This is an exponential improvement in the dependence on k compared to the previous result with such a running time. Finally, for the important special case where k = log n, for every constant ε > 0, we provide an O(m + n^{1+ε}) time construction that produces an O(log n)-spanner with O(n) edges and O(1) lightness.


Introduction
A fundamental problem in graph data structures is compressing graphs such that certain metrics are preserved as well as possible. A popular way to achieve this is through graph spanners. Graph spanners are sparse subgraphs that approximately preserve pairwise shortest path distances for all vertex pairs. Formally, we say that a subgraph H = (V, E′, w) of an edge-weighted undirected graph G = (V, E, w) is a t-spanner of G if for all u, v ∈ V we have d_H(u, v) ≤ t · d_G(u, v), where d_X is the shortest path distance function for graph X and w is the edge weight function. Under such a guarantee, we say that the spanner H has stretch t. In the following, we assume that the underlying graph G is connected; if it is not, we can consider each connected component separately when computing a spanner.
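For intuition, the stretch condition can be checked directly. Below is a minimal Python sketch (the helper names are our own, not from the paper) that verifies whether a subgraph H is a t-spanner of G by running Dijkstra on H; for a subgraph H of G it suffices to check the pairs joined by edges of G.

```python
import heapq

def dijkstra(adj, src):
    # Single-source shortest paths on an adjacency dict {u: [(v, w), ...]}.
    dist = {src: 0}
    pq = [(0, src)]
    while pq:
        d, u = heapq.heappop(pq)
        if d > dist.get(u, float('inf')):
            continue  # stale queue entry
        for v, w in adj.get(u, []):
            nd = d + w
            if nd < dist.get(v, float('inf')):
                dist[v] = nd
                heapq.heappush(pq, (nd, v))
    return dist

def is_t_spanner(vertices, edges_G, edges_H, t):
    # H is a t-spanner of G iff d_H(u, v) <= t * d_G(u, v) for all pairs;
    # when H is a subgraph of G it suffices to check every edge (u, v) of G.
    def build(edges):
        adj = {v: [] for v in vertices}
        for u, v, w in edges:
            adj[u].append((v, w))
            adj[v].append((u, w))
        return adj
    adj_H = build(edges_H)
    for u, v, w in edges_G:
        if dijkstra(adj_H, u).get(v, float('inf')) > t * w:
            return False
    return True
```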
The two main measures of the sparseness of a spanner H are its size (number of edges) and its lightness, defined as the ratio w(H)/w(MST(G)), where w(H) resp. w(MST(G)) is the total weight of the edges in H resp. a minimum spanning tree (MST) of G. It has been established that for any positive integer k, a (2k − 1)-spanner with O(n^{1+1/k}) edges exists for any n-vertex graph [Awe85]. This stretch-size tradeoff is widely believed to be optimal due to a matching lower bound implied by Erdős' girth conjecture [Erd64], and several papers are concerned with constructing spanners efficiently that get as close as possible to this lower bound [TZ05, BS07, RZ11].
Obtaining spanners with small lightness (and thus small total weight) is motivated by applications where edge weights denote, e.g., establishment costs. The best possible total weight that ensures finite stretch is the weight of an MST, which makes the definition of lightness very natural. The size lower bound of the unweighted case implies a lower bound of Ω(n^{1/k}) on the lightness under the girth conjecture, since H must have size and weight Ω(n^{1+1/k}) while the MST has size and weight n − 1. Obtaining this lightness has been the subject of an active line of work [ADD+93, CDNS92, ENS15, CW18, FS20]. Throughout this paper we say that a spanner is optimal when its bounds coincide asymptotically with those of the girth conjecture. Obtaining an efficient spanner construction with an optimal stretch-lightness trade-off remains one of the main open questions in the field of graph spanners.
Light spanners. Historically, the main approach to obtaining a spanner of bounded lightness has been through different analyses of the classic greedy spanner. Given t ≥ 1, the greedy t-spanner is constructed as follows: iterate through the edges in non-decreasing order of weight and add an edge e to the partially constructed spanner H if the shortest path distance in H between the endpoints of e is greater than t times the weight of e. The study of this spanner algorithm dates back to the early 90's with its first analysis by Althöfer et al. [ADD+93]. They showed that this simple procedure with stretch 2k − 1 obtains the optimal O(n^{1+1/k}) size and has lightness O(n/k). The algorithm was subsequently analyzed in [CDNS92, ENS15, FS20] with stretch (1 + ε)(2k − 1) for any 0 < ε < 1. Recently, a breakthrough result of Chechik and Wulff-Nilsen [CW18] showed that a significantly more involved spanner construction obtains nearly optimal stretch, size and lightness, giving the following theorem.
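The greedy procedure described above can be sketched in a few lines. This is an illustrative implementation, not the paper's construction; it exhibits exactly the expensive exact shortest-path computations that this paper seeks to avoid.

```python
import heapq

def greedy_spanner(n, edges, t):
    # Classic greedy t-spanner [ADD+93]: scan edges by non-decreasing weight
    # and keep (u, v) only if the current spanner distance exceeds t * w(u, v).
    adj = {v: [] for v in range(n)}
    spanner = []
    for u, v, w in sorted(edges, key=lambda e: e[2]):
        if _dist(adj, u, v, t * w) > t * w:
            spanner.append((u, v, w))
            adj[u].append((v, w))
            adj[v].append((u, w))
    return spanner

def _dist(adj, src, dst, cap):
    # Dijkstra truncated at distance cap; returns d(src, dst) or infinity.
    dist = {src: 0}
    pq = [(0, src)]
    while pq:
        d, u = heapq.heappop(pq)
        if u == dst:
            return d
        if d > dist.get(u, float('inf')) or d > cap:
            continue  # stale entry, or already beyond the cap
        for v, w in adj[u]:
            if d + w < dist.get(v, float('inf')):
                dist[v] = d + w
                heapq.heappush(pq, (d + w, v))
    return float('inf')
```

On a 4-cycle with unit weights and t = 3, the last edge closes a cycle whose detour has length exactly 3 ≤ t, so it is dropped and a spanning tree remains.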
Theorem 1 ([CW18]). Let G = (V, E, w) be an edge-weighted undirected n-vertex graph and let k be a positive integer. Then for any 0 < ε < 1 there exists a (1 + ε)(2k − 1)-spanner of size O(n^{1+1/k}) and lightness O_ε(n^{1/k}).

Following the result of [CW18], it was shown by Filtser and Solomon [FS20] that this bound is matched by the greedy spanner. In fact, they show that the greedy spanner is existentially optimal, meaning that if there is a t-spanner construction achieving an upper bound m(n, t) resp. l(n, t) on the size resp. lightness for any n-vertex graph, then this bound also holds for the greedy t-spanner. In particular, the bounds in Theorem 1 also hold for the greedy spanner.

Efficient spanners. A major drawback of the greedy spanner is its O(m · (n^{1+1/k} + n log n)) construction time [ADD+93]. Similarly, Chechik and Wulff-Nilsen [CW18] only state their construction time to be polynomial, but since they use the greedy spanner as a subroutine, it has the same drawback. Addressing this problem, Elkin and Solomon [ES16] considered efficient constructions of light spanners. They showed how to construct a spanner with stretch (1 + ε)(2k − 1), size O_ε(kn^{1+1/k}) and lightness O_ε(kn^{1/k}) in time O(km + min(n log n, mα(n))). Improving on this, a recent paper of Elkin and Neiman [EN19] uses similar ideas to obtain stretch (1 + ε)(2k − 1), size O(log k · n^{1+1/k}) and lightness O(kn^{1/k}) in expected time O(m + min(n log n, mα(n))).
Several papers also consider efficient constructions of sparse spanners, which are not necessarily light. Baswana and Sen [BS07] gave a (2k − 1)-spanner with O(kn^{1+1/k}) edges in O(km) expected time. This was later derandomized by Roditty et al. [RTZ05] (while keeping the same sparsity and running time). Recently, Miller et al. [MPVX15] presented a randomized algorithm with O(m + n log k) running time and O(log k · n^{1+1/k}) size, at the cost of a constant factor in the stretch, i.e., stretch O(k).
It is worth noting that for super-constant k, none of the above spanner constructions obtains the optimal O(n^{1+1/k}) size or O(n^{1/k}) lightness, even if we allow O(k) stretch. If we are satisfied with a nearly-quadratic running time, Elkin and Solomon [ES16] gave a spanner with (1 + ε)(2k − 1) stretch, O_ε(n^{1+1/k}) size and O_ε(kn^{1/k}) lightness in O(kn^{2+1/k}) time by extending a result of Roditty and Zwick [RZ11], who obtained a similar result but with unbounded lightness. However, this construction still has an additional factor of k in the lightness. Thus, the fastest known spanner construction obtaining optimal size and lightness is the classic greedy spanner, even if we allow O(k) stretch or o(kn^{1/k}) lightness.
We would like to emphasize that the case k = log n is of special interest. This is the point on the tradeoff curve allowing spanners of linear size and constant lightness. Prior to this paper, the state of the art for efficient spanner constructions with constant lightness suffered from a distortion of at least O(log² n). See the discussion after Corollary 1 for further details.
A summary of spanner algorithms can be seen in Table 1.

Our results
We present the first spanner construction obtaining the same near-optimal guarantees as the greedy spanner in significantly faster time, namely a (1 + ε)(2k − 1)-spanner with optimal size and lightness constructed in O_ε(n^{2+1/k+ε′}) time. We also present a variant of this construction, improving the running time to O(m + n log n) at the cost of a log k factor in the size and lightness. Finally, we present an optimal O_ε(log n)-spanner which can be constructed in O(m + n^{1+ε}) time. This special case is of particular interest in the literature (see e.g. [BFN19, KX16]). Furthermore, all of our constructions are deterministic, giving the first subquadratic deterministic construction without the additional dependence on k in the size of the spanner. As an important tool, we introduce a new deterministic incremental approximate distance oracle which maintains small distances approximately with near-linear total update time. We believe this result is of independent interest. More precisely, we show the following theorems.
Theorem 2. Given a weighted undirected graph G = (V, E, w) with m edges and n vertices, any positive integer k, and ε, ε′ > 0, where ε may be arbitrarily close to 0 and ε′ is a constant, one can deterministically construct a (1 + ε)(2k − 1)-spanner of G with O_ε(n^{1+1/k}) edges and O_ε(n^{1/k}) lightness in O_ε(n^{2+1/k+ε′}) time.

Table 1: Table of related spanner constructions. At the top of the table we list non-efficient spanner constructions. In the middle we list known efficient spanner constructions. At the bottom we list our contributions. Results marked * are different analyses of the greedy spanner. Results marked # are randomized. Lightness bounds marked ** are from the analysis in Section 9, and W denotes the maximum edge weight of the input graph. The bounds hold for any constant ε, ε′ > 0.
Theorem 3. Given a weighted undirected graph G = (V, E, w) with m edges and n vertices, a positive integer k ≥ 640, and ε > 0, one can deterministically construct a (2k − 1)(1 + ε)-spanner of G with O(log k · n^{1+1/k}) edges and O_ε(log k · n^{1/k}) lightness in O_ε(m + n log n) time.

Note that in Theorem 3 we require k to be at least 640. This is not a significant limitation, as for k = O(1) the construction of [ES16] is already optimal.
Our O(log n)-spanner is obtained as a corollary of the following more general result.
Theorem 4. Given a weighted undirected graph G = (V, E, w) with m edges and n vertices, any positive integer k and constant ε′ > 0, one can deterministically construct an O(k)-spanner of G with O(n^{1+1/k}) edges and O(n^{1/k}) lightness in O(m + n^{1+1/k+ε′}) time.

We note that the stretch O(k) of Theorem 4 (and Corollary 1 below) hides an exponential factor in 1/ε′, thus we only state the result for constant ε′. Bartal et al. [BFN19] showed that given a spanner construction that for every n-vertex weighted graph produces a t(n)-stretch spanner with m(n, t) edges and l(n, t) lightness in T(n, m) time, then for every parameter 0 < δ < 1 and every graph G, one can construct a t/δ-spanner with m(n, t) edges and 1 + δ · l(n, t) lightness in T(n, m) + O(m) time. Plugging in k = log n and using this reduction with parameter δ in Theorem 4, and with δ′ = δ/log log n in Theorem 3, we get

Corollary 1. Let G = (V, E, w) be a weighted undirected n-vertex graph, let ε > 0 be a constant and δ > 0 be a parameter arbitrarily close to 0. Then one can construct a spanner of G with:

• O(log n)/δ stretch, O(n) edges and 1 + δ lightness in time O(m + n^{1+ε});
• O(log n log log n)/δ stretch, O(n log log n) edges and 1 + δ lightness in time O(m + n log n).
Corollary 1 above should be compared to previous attempts to efficiently construct a spanner with constant lightness. Although not stated explicitly, the state-of-the-art algorithms of [ES16, EN19], combined with the lemma from [BFN19], provide an efficient spanner construction with 1 + δ lightness, O(n log log n) edges and only O(log² n/δ) stretch.
We emphasize that Corollary 1 gives the first sub-quadratic construction of a spanner with optimal size and lightness for any non-constant k.
In order to obtain Theorem 4, we construct the following deterministic incremental approximate distance oracle with near-linear total update time for maintaining small distances. We believe this result is of independent interest, and we discuss it in more detail in the related work section below and in Section 3.
Theorem 5. Let G be a graph that undergoes a sequence of m edge insertions. For any constant ε′ > 0 and parameter d ≥ 1 there exists a data structure which processes the m insertions in total time O(m^{1+ε′} · d) and can answer queries at any point in the sequence of the following form. Given a pair of nodes u, v, the oracle gives, in O(1) time, an estimate d̃(u, v) ≥ d_G(u, v) such that d̃(u, v) = O(d_G(u, v)) whenever d_G(u, v) ≤ d; for larger distances the oracle may report ∞.

Theorem 5 assumes that ε′ is constant; the O-notation hides a factor exponential in 1/ε′ for both the total update time and the stretch, whereas the query time bound only hides a factor of 1/ε′.
We also obtain the following sparse, but not necessarily light, spanner in linear time as a subroutine in proving Theorem 3.

Theorem 6. Given a weighted undirected graph G = (V, E, w) with m edges and n vertices, any positive integer k, any ε > 0, and any positive integer s, one can deterministically construct a sparse spanner with the guarantees stated in Section 7, in a time bound involving the iterated logarithm log^(s).

Here, the function log^(s) is log composed with itself s times. Specifically, log^(0) n = n, log^(1) n = log n, and in general, for s ≥ 1, log^(s) n = log(log^(s−1) n). log* n is the minimum index s such that log^(s) n ≤ 2.
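The iterated logarithm defined above can be computed directly; a small illustrative sketch:

```python
import math

def iterated_log(n, s):
    # log^(s) n: apply log (base 2) s times; log^(0) n = n.
    for _ in range(s):
        n = math.log2(n)
    return n

def log_star(n):
    # log* n: the minimum s such that log^(s) n <= 2.
    s = 0
    while n > 2:
        n = math.log2(n)
        s += 1
    return s
```

For example, log*(65536) = 3, since 65536 → 16 → 4 → 2.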
Note that since we may assume that k = O(log n), the time bound of Theorem 6 is linear for almost all choices of k and very close to linear for any choice of k.

Organization
In Section 4 we state the framework used in Theorems 2 to 4. Theorem 4 is proved in Section 5, and Theorem 2 in Section 5.2. Theorem 3 is proved in Section 6. Theorem 5 is proved in Section 8. The proof of Theorem 6 appears in Section 7.

Related work
Closely related to graph spanners are approximate distance oracles (ADOs). An ADO is a data structure which, after preprocessing a graph G, is able to answer distance queries approximately. Distance oracles are studied extensively in the literature (see e.g. [TZ05, Wul13, Che14, Che15]) and often use spanners as a building block. The state-of-the-art static distance oracle is due to Chechik [Che15], where a construction with space O(n^{1+1/k}), stretch 2k − 1, and query time O(1) is given. Our distance oracle of Theorem 5 should be compared to the result of Henzinger et al. [HKN16], who gave a deterministic construction for incremental (or decremental) graphs with a total update time of O_ε(mn log n), a query time of O(log log n) and stretch 1 + ε. For our particular application, we require near-linear total update time and only good stretch for short distances, which are commonly the most troublesome when constructing spanners. It should be added that Henzinger et al. give a general deterministic data structure for choosing centers, i.e., vertices which are roots of shortest path trees maintained by the data structure. While this data structure may be fast when the total number of centers is small, we need roughly n centers, and it is not clear how this number can be reduced. Having this many centers requires at least order mn time with their data structure.
To achieve our fast update time bound, we trade away the stretch guarantee for distances above the parameter d in exchange for a faster construction time. Roditty and Zwick [RZ12] gave a randomized distance oracle for this setting; however, their construction does not work against an adaptive adversary, as is required for our application, where the edges to be inserted are determined by the answers to the queries of the oracle (see Section 3 for more discussion of this). Removing the assumption of a non-adaptive adversary in dynamic graph algorithms has seen recent attention at prestigious venues, e.g. [Wul17, BHN16]. Our new incremental approximate distance oracle for short distances given in Theorem 5 is deterministic and thus robust against such an adversary, and we believe it may be of independent interest as a building block in deterministic dynamic graph algorithms.
For unweighted graphs, there is a folklore spanner construction by Halperin and Zwick [HZ96] which is optimal in all parameters. The construction time is O(m), and the resulting spanner has O(n^{1+1/k}) edges and stretch 2k − 1. In Section 6 we will use this spanner as a building block in proving Theorem 3.

Preliminaries
Consider a weighted graph G = (V, E, w); we will abuse notation and refer to E both as a set of edges and as the graph itself. d_G will denote the shortest path metric of the graph with V as its vertices, E as its edges and w as its weight function. The diameter of a vertex set V′ in a graph G′, diam_{G′}(V′) = max_{u,v∈V′} d_{G′}(u, v), is the maximal distance between two vertices of V′ under the shortest path metric induced by G′. For a set of edges A with weight function w, the aspect ratio of A is max_{e∈A} w(e)/min_{e∈A} w(e). The sparsity of A is simply its size |A|.
We will assume that k = O(log n), as the guarantees for lightness and sparsity do not improve by picking a larger k. Instead of proving a (1 + ε)(2k − 1) bound on the stretch, we will prove only a (1 + O(ε))(2k − 1) bound. This is good enough, as we can scale ε accordingly post factum. By O_ε we denote asymptotic notation which hides factors polynomial in 1/ε.

Paper overview
General framework. Theorems 2 to 4 are obtained via a general framework. The framework is fed two algorithms for spanner construction: A_1, an algorithm suitable for graphs with small aspect ratio, and A_2, an algorithm that returns a sparse spanner, but with potentially unbounded lightness. We consider a partition of the edges into groups according to their weights. For treating most of the groups we use exponentially growing clusters, partitioning the edges according to weight. Each such group has bounded aspect ratio, and thus we can use A_1. Due to the exponential growth rate, we show that the contributions of all the different groups form a converging series, so that only the first group is significant. However, with this approach we need special treatment for the edges of small weight, since the number of clusters needed to treat light edges is unbounded. Nevertheless, these edges have a small impact on the lightness, and we may thus use algorithm A_2, which ignores this property.
The main work in proving Theorems 2 to 4 is in designing the algorithms A_1 and A_2, described briefly below.

Approximate greedy spanner
The major time-consuming ingredient of the greedy spanner algorithm is its shortest path computations. By instead considering approximate shortest path computations, we significantly speed this process up. We are the first to apply this idea to general graphs, while it has previously been applied by [DN97, FS20] to particular graph families. Specifically, we consider the following algorithm: given some parameters t < t′, initialize H ← ∅ and consider the edges (u, v) ∈ E in increasing order of weight. If d_H(u, v) ≤ t · w(u, v), the algorithm is forbidden to add (u, v) to H; if d_H(u, v) > t′ · w(u, v), the algorithm is obligated to add (u, v); otherwise, the algorithm is free to include the edge or not. As a result, we get a spanner with stretch t′ which has the same lightness and sparsity guarantees as the greedy t-spanner. Note, however, that the resulting spanner is not necessarily a subgraph of any greedy spanner.
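The "approximately greedy" template can be sketched as follows. The oracle interface (insert/query) is a hypothetical stand-in for the incremental approximate distance oracle of Theorem 5; for illustration we plug in an exact (factor-1) oracle, under which the procedure degenerates to the classic greedy spanner.

```python
import heapq

class ExactOracle:
    # Illustrative stand-in for the incremental oracle of Theorem 5: it
    # answers queries exactly (approximation factor 1). With this oracle the
    # algorithm below is precisely the greedy spanner; a real approximate
    # oracle would answer faster, at the cost of an O(1) stretch factor.
    def __init__(self):
        self.adj = {}

    def insert(self, u, v, w):
        self.adj.setdefault(u, []).append((v, w))
        self.adj.setdefault(v, []).append((u, w))

    def query(self, u, v):
        dist = {u: 0}
        pq = [(0, u)]
        while pq:
            d, x = heapq.heappop(pq)
            if x == v:
                return d
            if d > dist.get(x, float('inf')):
                continue  # stale entry
            for y, w in self.adj.get(x, []):
                if d + w < dist.get(y, float('inf')):
                    dist[y] = d + w
                    heapq.heappush(pq, (d + w, y))
        return float('inf')

def approx_greedy_spanner(edges, t, oracle):
    # Add (u, v) only when the (approximate) spanner distance exceeds t * w;
    # queries go to the incrementally updated oracle, so the inserted edges
    # depend on the oracle's answers (hence the adaptive-adversary issue).
    H = []
    for u, v, w in sorted(edges, key=lambda e: e[2]):
        if oracle.query(u, v) > t * w:
            H.append((u, v, w))
            oracle.insert(u, v, w)
    return H
```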
We obtain both Theorem 2 and Theorem 4 using this approach via an incremental approximate distance oracle. It is important to note that the edges inserted into H using this approach depend on the answers to the distance queries. It is therefore not possible to use approaches that do not work against an adaptive adversary, such as the result of Roditty and Zwick [RZ12], which is based on random sampling. Furthermore, this is the case even if we allow the spanner construction itself to be randomized. In order to obtain Theorem 2, we use our previously described framework coupled with the "approximately greedy spanner" using an incremental (1 + ε)-approximate distance oracle of Henzinger et al. [HKN16]. For Theorem 4, we present a novel incremental approximate distance oracle, which is described below. This is the main technical part of the paper, and we believe that it may be of independent interest.

Deterministic distance oracle
The main technical contribution of the paper, and the key ingredient in proving Theorem 4, is our new deterministic incremental approximate distance oracle of Theorem 5. The oracle supports approximate distance queries for pairs within some distance threshold d. In particular, we may set d to be some function of the stretch of the spanner in Theorem 4. Similar to previous work on distance oracles, we have some parameter k and maintain k sets of nodes ∅ = A_{k−1} ⊆ … ⊆ A_0 = V, and for each u ∈ A_i we maintain a ball of radius r ≤ d_i. Here, d_i is a distance threshold depending on the parameter d and on which set A_i we are considering, and r is chosen such that the total degree of the nodes in the ball of radius r around u is relatively small. The implementation of each ball can be thought of as an incremental Even-Shiloach tree. The set A_{i+1} is then chosen as a maximal set of nodes with disjoint balls (see Figure 3 in Section 8.1). Here we use the fact that the vertices in A_{i+1} are centers of disjoint balls in A_i to argue that A_{i+1} is much smaller than A_i. The decrease in the size of A_{i+1} pays for an increase in the maximum ball radius d_i at each level. The ball of a node u may grow in size during edge insertions. In this case, we freeze the ball associated with u, shrink the radius r associated with u, and create a new ball with the new radius. Thus, for each A_i we end up with O(log d) different radii, for each of which we pick a maximal set of nodes with disjoint balls. For each node u_i ∈ A_i we may then associate a node u_{i+1} ∈ A_{i+1} whose ball intersects u_i's. We use these associated nodes in the query to ensure that the path distance we find is not "too far away" from the actual shortest path distance. Consider a query pair (u, v). The query algorithm iteratively finds a sequence of vertices u = u_0 ∈ A_0, u_1 ∈ A_1, …, u_i ∈ A_i; d_i is picked such that if v is not in the ball centered at u_i with radius d_i, then the shortest path distance between u and v is at least d and the algorithm outputs ∞. Otherwise, the algorithm uses the shortest path distances stored in the balls that it encounters to output the weight of a uv-path as an approximation of the shortest path distance between u and v.
Almost linear spanner. Chechik and Wulff-Nilsen [CW18] implicitly used our general framework, but used the (time-consuming) greedy spanner both as their A_2 component and as a subroutine in A_1. We show an efficient alternative to the algorithm of [CW18]. For the A_2 component we provide a novel sparse spanner construction (Theorem 6, see the paragraph below). For A_1, we perform a hierarchical clustering, while avoiding the costly exact diameter computations used in [CW18]. Finally, we replace the greedy spanner used as a subroutine of [CW18] by an efficient spanner that exploits bounded aspect ratio (see Lemma 5). This spanner can be seen as a careful adaptation of Elkin and Solomon [ES16], analyzed in the case of bounded aspect ratio. The idea here is (again) a hierarchical partitioning of the vertices into clusters of exponentially increasing size. However, here the growth rate is only (1 + ε). For each clustering we construct a super-graph with the clusters as vertices and the graph edges from the corresponding weight scale as inter-cluster edges. To decide which edges of each scale to add to our spanner, we execute the extremely efficient spanner of Halperin and Zwick [HZ96] for unweighted graphs.
Linear time sparse spanner. As mentioned above, we provide a novel sparse spanner construction as a building block in proving Theorem 3. Our construction is based on partitioning the edges into O_ε(log k) "well separated" sets E_1, E_2, …, such that the ratio between w(e) and w(e′) for edges e, e′ ∈ E_i is either a constant or at least k. This idea was previously employed by Elkin and Neiman [EN19] based on [MPVX15]. For these well-separated graphs, Elkin and Neiman used an involved clustering scheme based on growing clusters according to an exponential distribution, and showed that the expected number of inter-cluster edges, over all levels combined, is small enough. We provide a linear time deterministic algorithm with an arguably simpler clustering scheme. Our clustering is based upon the clusters defined implicitly by the spanner for unweighted graphs of Halperin and Zwick [HZ96]. In particular, we introduce a charging scheme such that each edge added to our spanner is either paid for by a large cluster with many coins, or significantly contributes to reducing the number of clusters at the following level.
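The "well separated" partition can be illustrated as follows. The constants below (base c, group count s) are illustrative choices of ours, not the paper's exact parameters: each weight is assigned a scale, and scales are grouped modulo s so that two weights in the same group are either within a constant factor of each other or at least a factor of k apart.

```python
def well_separated_partition(weights, k, c=2):
    # Sketch: each weight w >= 1 gets a scale j with c^j <= w < c^(j+1);
    # scales are grouped by j mod s, where s is the smallest integer with
    # c^(s-1) >= k. Within a group, two weights are then either within a
    # factor of c (same scale) or at least a factor of k apart (scales
    # differ by >= s). The number of groups is s = O(log k).
    s = 1
    while c ** (s - 1) < k:
        s += 1
    groups = {}
    for w in weights:
        j, x = 0, 1
        while x * c <= w:  # compute the scale of w without floating point
            x *= c
            j += 1
        groups.setdefault(j % s, []).append(w)
    return groups
```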

A framework for creating light spanners efficiently
In this section we describe a general framework for creating spanners, which we will use to prove our main results. The framework is inspired by a standard clustering approach (see e.g. [ES16] and [CW18]). The spanner framework takes as input two spanner algorithms for restricted graph classes, A_1 and A_2, and produces a spanner algorithm for general graphs. The algorithm A_1 works for graphs with unit weight MST edges and small aspect ratio, and A_2 creates a small spanner with no guarantee on the lightness. The main work in showing Theorems 2, 3, and 4 is to construct the algorithms A_1 and A_2 that go into Lemma 1 below. We do this in Sections 5 and 6. The framework is described in the following lemma.
Lemma 1. Let G = (V, E) be a weighted graph with n nodes and m edges, let k > 0 be an integer, g > 1 a fixed parameter and ε > 0. Assume that we are given two spanner construction algorithms A_1 and A_2 with the following properties: A_1 produces a spanner of stretch f_1(k) in time T_1(n, m) when given a graph with maximum weight g^k where all MST edges have weight 1, and A_2 produces a spanner of stretch f_2(k) and size s_2(k) · n^{1+1/k} on any graph. Moreover, T_1 has the property that its total over the subproblems created below is bounded by its value on the whole input. Then one can compute a spanner of stretch max((1 + ε)f_1(k), f_2(k)) whose size, lightness and construction time are inherited from A_1 and A_2.

As an example, let us assume that we have both an optimal spanner algorithm for graphs with small aspect ratio and an optimal sparse spanner algorithm for weighted graphs. Specifically, we have an algorithm A_1 that, given a graph as above, creates a (1 + ε)(2k − 1)-spanner with O_ε(n^{1+1/k}) edges and lightness O_ε(n^{1/k}) in O_ε(m + n log n) time. In addition, we have an algorithm A_2 that returns a (1 + ε)(2k − 1)-spanner with O_ε(n^{1+1/k}) edges in O_ε(m) time. Then, given a general graph, Lemma 1 provides us with a (1 + ε)(2k − 1)-spanner with O_ε(n^{1+1/k}) edges and O_ε(n^{1/k}) lightness in O_ε(m + n log n) time.

Before proving Lemma 1 we need to describe the clustering approach. The main tool needed is what we call an (i, ε)-clustering. This clustering procedure is performed on graphs where all the MST edges have unit weight. Let G, g, ε, k be as in Lemma 1; we say that an (i, ε)-clustering is a partitioning of V into clusters C_1, …, C_{n_i} such that each C_j contains at least εg^{ik} nodes and has diameter at most 4εg^{ik} (even when restricted to MST edges of G). Let G_i denote the graph obtained by contracting the clusters of such an (i, ε)-clustering of G and keeping the MST edges only. Then G_i has n_i nodes, and we can construct G_i from G_{i−1} as follows. Start at some vertex v in G_{i−1} (corresponding to an (i − 1, ε)-cluster) and iteratively grow an (i, ε)-cluster ϕ_v by joining arbitrary unclustered neighbors to ϕ_v in G_{i−1} one at a time. If the number of original vertices in ϕ_v reaches εg^{ik}, make ϕ_v into an (i, ε)-cluster, where the current vertices in ϕ_v are called its core. We argue that the diameter of the core is bounded by εg^{ik} + 4εg^{(i−1)k}. If the vertex v (from G_{i−1}) already contains εg^{ik} vertices, then ϕ_v = v and by the induction hypothesis the diameter of ϕ_v is at most 4εg^{(i−1)k}. Otherwise (|v| < εg^{ik}), consider the last vertex u ∈ G_{i−1} to join ϕ_v. As u joins ϕ_v, necessarily |ϕ_v \ u| < εg^{ik}. In particular, the diameter of ϕ_v \ u (restricted to MST edges) is at most εg^{ik} − 1. The diameter of u is at most 4εg^{(i−1)k}, and therefore the diameter of ϕ_v is indeed bounded by εg^{ik} + 4εg^{(i−1)k}.
We perform this procedure, each time starting at an unclustered vertex, until all vertices of G_{i−1} belong to some (i, ε)-cluster. In the case where ϕ_v has no unclustered neighbors but does not yet contain εg^{ik} vertices, we simply merge it into an existing (i, ε)-cluster ϕ_u via an MST edge to the core of ϕ_u. Note that the size of ϕ_v, and therefore its diameter, before the merging is at most εg^{ik} − 1 (as each cluster is connected when restricted to MST edges). To show that this gives a valid (i, ε)-clustering, consider an (i, ε)-cluster ϕ_v with core φ_v. Suppose that the "sub-clusters" φ_{u_1}, …, φ_{u_s} were merged into φ_v during this process. The diameter of ϕ_v is then bounded by diam(φ_v) + 2 + max_{j,j′}(diam(φ_{u_j}) + diam(φ_{u_{j′}})) ≤ (εg^{ik} + 4εg^{(i−1)k}) + 2εg^{ik} ≤ 4εg^{ik}.
Moreover, the size of ϕ_v is at least the size of its core, |φ_v| ≥ εg^{ik}. See Figure 1 for an illustration.
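A simplified sketch of one clustering step: grow connected clusters of at least a threshold number of vertices on a tree and merge undersized leftovers into a neighboring cluster. This only illustrates the growth-and-merge idea; the paper additionally tracks cluster cores to bound cluster diameters.

```python
from collections import deque

def cluster_tree(adj, threshold):
    # adj: adjacency dict of a tree (think: unit-weight MST edges).
    # Returns a dict mapping each vertex to a cluster id. Every cluster is
    # connected, and after merging leftovers every cluster has at least
    # `threshold` vertices (assuming the tree itself has that many).
    cluster = {}
    cid = 0
    for root in adj:
        if root in cluster:
            continue
        # Grow a cluster from root by joining unclustered neighbors.
        members = [root]
        cluster[root] = cid
        frontier = deque([root])
        while frontier and len(members) < threshold:
            u = frontier.popleft()
            for v in adj[u]:
                if v not in cluster:
                    cluster[v] = cid
                    members.append(v)
                    frontier.append(v)
                    if len(members) >= threshold:
                        break
        if len(members) >= threshold:
            cid += 1
            continue
        # Undersized leftover: merge into an adjacent existing cluster
        # (one exists, since the tree is connected).
        for u in members:
            for v in adj[u]:
                if cluster[v] != cid:
                    target = cluster[v]
                    for x in members:
                        cluster[x] = target
                    break
            else:
                continue
            break
    return cluster
```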
Note that we have n_i ≤ n/(εg^{ik}). Using the above procedure, we can construct the (i + 1, ε)-clustering from the (i, ε)-clustering in O(n_i) time. Therefore we can construct the clusters for all the levels in O(Σ_i n_i) = O(n) time (if we are given the MST). With this tool in hand, we may now prove Lemma 1.
Proof of Lemma 1. The proof constructs an algorithm consisting of two phases: the preparation phase, where A_2 is used to reduce the problem to a graph in which all MST edges have weight 1, and the bootstrapping phase, where we perform an iterative clustering of the graph to obtain several graphs with small aspect ratio, on which we can apply A_1.
Figure 1: The vertex v of G_{i−1} iteratively grows a cluster around itself until it contains εg^{ik} original vertices. This current cluster is called φ_v, also called the core of ϕ_v, the (i, ε)-cluster that we will have at the end of the process. Afterwards, the (i − 1, ε)-cluster u_1 iteratively grows a cluster around itself. When the temporary cluster is φ_{u_1}, there are no outgoing edges to unclustered vertices. However, since G_{i−1} is connected, there is some outgoing edge, and necessarily its second endpoint belongs to the core of an existing cluster (here, φ_v), as φ_v stopped growing while still having unclustered neighbors. All the vertices of φ_{u_1} join the cluster of v. In a similar manner, φ_{u_2} and φ_{u_3} are also joined into the cluster of v.

Preparation phase: Let T be an MST of G and let w̄ = Σ_{(u,v)∈T} w(u, v)/(n − 1). Define G_2 to be G with all edges of weight greater than w̄/ε removed, and let H_2 be the spanner resulting from running A_2 on G_2. Next, we construct G_1 from G as follows. First, round up the weight of each edge in G to the nearest multiple of w̄. For each edge e ∈ T, subdivide it such that each edge of the resulting MST has weight w̄. As the weight of each edge increases by at most an additive w̄, the weight of the MST increases by at most (n − 1)w̄ ≤ w(T). The new number of vertices is bounded by Σ_{(u,v)∈T}(w(u, v)/w̄ + 1) ≤ 2(n − 1) = O(n). Finally, divide the weight of each edge by w̄. This finishes the construction of G_1.
Bootstrapping phase: We will now use A_1 to make a spanner H_1 for the graph G_1 created above. We start by partitioning the edges into sets E_i, where E_i contains all edges of G_1 with weights in [g^{ik}, g^{(i+1)k}). Note that since each MST edge of G_1 has weight 1, we only need to consider edges with weight up to O(n). Next, we let T be an MST of G_1, and for all i = 0, 1, …, O(log n) we create T_i by contracting all clusters of an (i, ε)-clustering of T, where the (i, ε)-clustering is computed as described above. Note that T_i is also a tree, since each cluster is a connected subtree of T. We now construct graphs G_i by taking T_i and adding a minimum weight edge of E_i between each pair of clusters (i.e., nodes corresponding to clusters) connected by E_i. Finally, we divide the weight of each non-MST edge of G_i by g^{ik}. This gives us a graph with maximum weight g^k, where MST edges have weight 1. We call this new weight function w_i. Let H_i be the spanner obtained by running algorithm A_1 on G_i. Finally, let H_1 be the union of all the H_i, where each edge of H_i is replaced by the corresponding edge(s) from G.
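The construction of the non-MST part of a level graph G_i can be sketched as follows (an illustrative fragment with names of our choosing; the contracted MST edges of T_i are omitted): keep only the minimum-weight E_i edge between each pair of clusters and rescale by g^{ik}.

```python
def contract_level(edges, cluster, g, i, k):
    # edges: iterable of (u, v, w); cluster: vertex -> cluster id of the
    # (i, eps)-clustering. Keep, for each pair of distinct clusters, the
    # minimum-weight edge of E_i = {e : g^{ik} <= w(e) < g^{(i+1)k}}, and
    # rescale weights by g^{ik}, so results lie in [1, g^k).
    lo, hi = g ** (i * k), g ** ((i + 1) * k)
    best = {}
    for u, v, w in edges:
        if not (lo <= w < hi):
            continue  # edge belongs to a different weight scale
        cu, cv = cluster[u], cluster[v]
        if cu == cv:
            continue  # internal to a cluster; handled by the MST part
        key = (min(cu, cv), max(cu, cv))
        if key not in best or w < best[key][2]:
            best[key] = (u, v, w)
    return [(cluster[u], cluster[v], w / lo) for (u, v, w) in best.values()]
```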
Analysis: We set the final spanner to be H = H_1 ∪ H_2 together with the MST of G. To bound the stretch of H, first note that any edge of G_2 has stretch at most f_2(k) in H_2. What remains is to bound the stretch of the non-MST edges (u, v) with w(u, v) ≥ w̄/ε. First, observe that the rounding procedure used to create G_1 can increase the weight of (u, v) in G_1 by at most a factor of (1 + ε) compared to G. Now assume that (u, v) ∈ E_i for some i. Let ϕ_u and ϕ_v denote the clusters containing u, respectively v, in G_i. If ϕ_u = ϕ_v, we know that the distance between u and v using the MST is at most 4εg^{ik} and we are done. Thus, assume that ϕ_u ≠ ϕ_v. By the definition of G_i, there must be some edge e of E_i between ϕ_u and ϕ_v of weight at most w(u, v). Furthermore, the diameter of each cluster ϕ_{z_q} along the corresponding path is at most 4εg^{ik}. We now conclude (see Figure 2 for an illustration) that H_i, and thus H, contains a uv-path of the required stretch.

Figure 2: The thin black curves represent MST paths. There is an edge e between ϕ_v and ϕ_u in G_i. Therefore H_i contains a short path between ϕ_v and ϕ_u.

Next we consider the size and lightness of H. First we see that, since G_2 is a subgraph of G, the spanner H_2 has size at most O(s_2(k) · n^{1+1/k}). Furthermore, since every edge in G_2 has weight at most w̄/ε, the total weight of H_2 is at most O(s_2(k) · n^{1+1/k} · w̄/ε) = O_ε(s_2(k) · n^{1/k} · w(T)). Recall that n_i is the number of (i, ε)-clusters, and therefore also the number of nodes in T_i. We can bound the total weight of H_1 by summing the contributions of the H_i, which form a converging series over the levels, and this bound carries over to G. The size can be bounded in a similar fashion. The total running time of the algorithm is O(m + n log n) to find the MST of G and divide the edges into the sets E_i, plus the time for running A_1 and A_2.

Efficient approximate greedy spanner
In this section we show how to efficiently implement algorithms A_1 and A_2 of Lemma 1 in order to obtain Theorems 2 and 4. We do this by implementing an "approximate-greedy" spanner, which uses an incremental approximate distance oracle to determine whether an edge should be added to the spanner. We first prove Theorem 4 and then show in Section 5.2 how to modify the algorithm to give Theorem 2. We use Theorem 5 as a main building block, but defer its proof to Section 8. Our A_1 is obtained by the following lemma, giving stretch O(k), optimal size O(n^{1+1/k}), and lightness O(n^{1/k}) for small weights.
Lemma 2. Let G = (V, E, w) be an undirected graph with m = |E| and n = |V| and integer edge weights bounded from above by W. Let k be a positive integer and let ε > 0 be a constant.

Then one can deterministically construct an O(k)-spanner of G with O(n^{1+1/k}) edges and O(n^{1/k}) lightness.
We note that Lemma 2 above requires integer edge weights; we may obtain this by simply rounding up the weight of each edge, losing at most a factor of 2 in the stretch. Alternatively, we can use the approach of Lemma 4 in Section 5.2 to reduce this factor of 2 to (1 + ε).
Our A_2 is obtained by the following lemma, which is essentially a modified implementation of Lemma 2. Combining Lemma 1 of Section 4 with Lemmas 2 and 3 above immediately gives us a spanner with stretch O(k), size O(n^{1+1/k}), and lightness O(n^{1/k}) in time O(m + n^{1+1/k+ε}) for any constant ε > 0. This is true because we may assume that k ≤ γ log n for any constant γ > 0 (as the improvement in sparsity and lightness obtained by picking k > γ log n is bounded by 2^γ), and thus, by picking γ and ε accordingly, the running time given by Lemma 1 can be bounded by O(m + n^{1+1/k+ε}).

Details of the almost-greedy spanner
Set ε′ = 1/3. Our algorithm for Lemma 2 is described below as Algorithm 1. It computes a spanner of stretch c_1(1 + ε′)(2k − 1), where c_1 = O(1) is the stretch of our incremental approximate distance oracle from Theorem 5. Let t = c_1(1 + ε′)(2k − 1) throughout the section. With Algorithm 1 defined, we are now ready to prove Lemma 2.
Algorithm 1: Approximate-Greedy.

Let H be the spanner created by running Algorithm 1 on the input graph G with the input parameters.
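The pseudocode of Algorithm 1 does not survive in this extract; the following Python sketch reconstructs its structure from the surrounding analysis (edges scanned in non-decreasing weight order; an edge is kept only when the oracle's distance estimate exceeds t · w). The class `NaiveIncrementalOracle` is a stand-in of ours for the data structure of Theorem 5; since it answers queries exactly via Dijkstra, this sketch degenerates to the classic greedy spanner rather than the paper's faster approximate version.

```python
import heapq

class NaiveIncrementalOracle:
    """Stand-in for the incremental approximate distance oracle of Theorem 5:
    supports edge insertions and distance queries. This naive version answers
    exactly, so the algorithm below behaves like the classic greedy spanner."""
    def __init__(self, n):
        self.adj = [[] for _ in range(n)]

    def insert(self, u, v, w):
        self.adj[u].append((v, w))
        self.adj[v].append((u, w))

    def query(self, s, goal):
        # Plain Dijkstra; returns infinity if goal is unreachable.
        dist = {s: 0}
        pq = [(0, s)]
        while pq:
            d, u = heapq.heappop(pq)
            if u == goal:
                return d
            if d > dist.get(u, float("inf")):
                continue
            for v, w in self.adj[u]:
                nd = d + w
                if nd < dist.get(v, float("inf")):
                    dist[v] = nd
                    heapq.heappush(pq, (nd, v))
        return float("inf")

def approximate_greedy_spanner(n, edges, t):
    """Scan edges in non-decreasing weight order; keep an edge only if the
    current spanner offers no path of length <= t * w between its endpoints.
    Note that an edge joining two components is always kept (query = infinity),
    which is why the spanner contains the MST, as used in the proof below."""
    oracle = NaiveIncrementalOracle(n)
    spanner = []
    for u, v, w in sorted(edges, key=lambda e: e[2]):
        if oracle.query(u, v) > t * w:
            spanner.append((u, v, w))
            oracle.insert(u, v, w)
    return spanner

# 4-cycle with a heavy chord: the chord is covered within stretch t and dropped.
edges = [(0, 1, 1), (1, 2, 1), (2, 3, 1), (3, 0, 1), (0, 2, 2)]
H = approximate_greedy_spanner(4, edges, t=2)
```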
Stretch: We bound the stretch by showing that for any edge (u, v) ∈ E there is a path of length at most t · w(u, v) in H. Let (u, v) be any edge considered in the for-loop of Algorithm 1. If (u, v) was added to H, we are done. Thus, assume that (u, v) ∉ H. In this case the oracle's distance estimate for (u, v) is at most t · w(u, v), as otherwise (u, v) would have been added to H. The lemma now follows by noting that d_H(u, v) is at most this estimate, by Theorem 5.
Size and lightness: Next we bound the size and lightness of H. Our proof is very similar to the proof of Filtser and Solomon for the greedy spanner [FS20]. However, we need to be careful, as we are using an approximate distance oracle and do not have exact distances when inserting an edge. Let H′ be any spanner of H with stretch (1 + ε)(2k − 1). We will argue that H′ = H. To see this, let (u, v) ∈ H \ H′ be an edge contradicting this statement. Then there must be a path P in H′ connecting u and v with w(P) ≤ (1 + ε)(2k − 1) · w(u, v). Let (x, y) be the edge of P ∪ {(u, v)} that was added last to H; it follows that all the edges of (P ∪ {(u, v)}) \ {(x, y)} were already in H when (x, y) was added. These edges form a path in H connecting x and y of weight at most (1 + ε)(2k − 1) · w(x, y), using that w(u, v) ≤ w(x, y) since edges are considered in non-decreasing weight order. It follows that d(x, y) ≤ (1 + ε)(2k − 1) · w(x, y) just before (x, y) was added to H, and by Theorem 5 that the oracle estimate for (x, y) was at most t · w(x, y). Thus Algorithm 1 did not add the edge (x, y) to H, which is a contradiction. We conclude that H′ = H. Now, since H′ could be any spanner of H, we may in particular choose it to be the (1 + ε)(2k − 1)-spanner from Theorem 1. It follows immediately that H = H′ has size O(n^{1+1/k}). For the lightness, we know that H has lightness O(n^{1/k}) with respect to the MST of H. Thus, if we can show that the MST of H is the same as the MST of G, we are done. This follows by noting that Algorithm 1 adds to H exactly the MST edges of G that would have been added by Kruskal's algorithm [Kru56], since each such edge connects two disconnected components. Thus the MSTs of G and H have the same weight, which completes the proof.

Running time:
In Algorithm 1 we perform m queries to the incremental distance oracle of Theorem 5, each of which takes O(1) time. We also perform |E(H)| insertions into the incremental distance oracle. We invoke Theorem 5 with ε* picked such that 1/ε* is an integer and ε* + ε*/k ≤ ε′. Since d = O(kW), it follows from Theorem 5 and the size bound above that the running time of the for-loop of Algorithm 1 is as claimed. To achieve the non-decreasing order, we may simply first run the algorithm of Baswana and Sen [BS07] with parameter 1/ε′. This adds a factor of O(1/ε′) to the stretch, but leaves us with a graph with only O(n^{1+ε′}) edges, which we may then sort.
Next, we sketch the proof of Lemma 3 by explaining how to modify the proof of Lemma 2.
Proof of Lemma 3. Recall that c_1 is defined as the constant stretch provided by Theorem 5. We use Algorithm 1 with the following modifications: (1) we pick d = c_1(2k − 1); (2) when adding an edge to the distance oracle, we add it as an unweighted edge; (3) we add an edge if its endpoints are not already connected by a path of at most d edges according to the approximate distance oracle.
The stretch of the spanner follows by the same argument as in Lemma 2 and the fact that we consider the edges in non-decreasing order. To see that the size of the spanner is O(n^{1+1/k}), consider an edge (u, v) added to H by the modified algorithm. Since (u, v) was added to H, we know that the distance estimate was at least c_1(2k − 1). It thus follows from Theorem 5 that u and v had distance at least 2k in H, and therefore H has girth at least 2k + 1. It now follows by a standard argument that H has O(n^{1+1/k}) edges. The running time of this modified algorithm follows directly from Theorem 5.
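For completeness, the "standard argument" invoked here is the Moore-bound counting; a sketch in our own formulation, not taken from the paper:

```latex
% Claim: if girth(H) >= 2k+1 then |E(H)| = O(n^{1+1/k}).
% Repeatedly delete any vertex of degree at most n^{1/k}; this removes at most
% n \cdot n^{1/k} edges in total. If a nonempty graph remained, it would have
% minimum degree d >= n^{1/k} + 1, and a BFS of depth k from any vertex visits
% pairwise-distinct vertices (a repetition would close a cycle of length <= 2k):
\[
  n \;\ge\; 1 + d\sum_{j=0}^{k-1}(d-1)^{j} \;>\; (d-1)^{k} \;\ge\; n,
\]
% a contradiction. Hence every vertex is eventually deleted, and
% |E(H)| <= n \cdot n^{1/k} = n^{1+1/k}.
```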

Near-quadratic time implementation
The construction of the previous section used our result from Theorem 5 to construct a spanner efficiently, losing a constant factor (exponential in 1/ε) in the stretch. We may instead use the seminal result of Even and Shiloach [ES81] to obtain stretch (1 + ε)(2k − 1) at the cost of a slower running time, as detailed in Theorem 2. It is well known that the decremental data structure of [ES81] can be made to work with the same time guarantees in the incremental setting; we will make use of this result:

Theorem 7 ([ES81]). There exists a deterministic incremental APSP data structure for graphs with integer edge weights, which answers distance queries within a given threshold d in O(1) time and has total update time O(mnd).
Here, the threshold means that if the distance between two nodes is at most d, the data structure outputs the exact distance and otherwise it outputs ∞ (or some other upper bound).
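As an illustration of this threshold semantics (and not of the actual [ES81] data structure, which maintains BFS trees with O(mnd) total update time), a naive incremental single-source structure with the same interface might look as follows; the class and method names are ours:

```python
from collections import deque

class ThresholdIncrementalSSSP:
    """Maintains, for a fixed source s, exact distances up to a threshold d
    under edge insertions; distances beyond d are reported as infinity.
    A naive sketch of the interface of Theorem 7 (restricted to one source),
    not the Even-Shiloach structure itself."""
    INF = float("inf")

    def __init__(self, n, s, d):
        self.n, self.s, self.d = n, s, d
        self.adj = [[] for _ in range(n)]
        self.dist = [self.INF] * n
        self.dist[s] = 0

    def insert(self, u, v, w):
        self.adj[u].append((v, w))
        self.adj[v].append((u, w))
        # Propagate improvements, but never past the threshold d; with
        # nonnegative weights no path of length <= d has a prefix > d.
        queue = deque([u, v])
        while queue:
            x = queue.popleft()
            for y, wy in self.adj[x]:
                nd = self.dist[x] + wy
                if nd < self.dist[y] and nd <= self.d:
                    self.dist[y] = nd
                    queue.append(y)

    def query(self, v):
        return self.dist[v]  # exact if <= d, otherwise infinity

oc = ThresholdIncrementalSSSP(4, s=0, d=3)
oc.insert(0, 1, 2)
oc.insert(1, 2, 2)   # distance to 2 would be 4 > d: stays "infinite"
oc.insert(0, 2, 3)   # now distance to 2 is exactly the threshold
```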
To obtain Theorem 2 we use the framework of Section 4. For the algorithm A_2 we may simply use the deterministic spanner construction of Roditty and Zwick [RZ11], giving stretch 2k − 1 and size O(n^{1+1/k}) in time O(kn^{2+1/k}). For A_1 we will show the following lemma.
Proof sketch. The final spanner will be a union of two spanners. Theorem 7 requires integer weights, so we need to treat edges with weight less than 1/ε separately. For these edges we use the algorithm of Roditty and Zwick [RZ11] to produce a spanner with stretch 2k − 1, size O(n^{1+1/k}), and thus total weight O(n^{1+1/k}/ε).
For the remaining edges with weight at least 1/ε we round each weight up to the nearest integer, incurring a stretch factor of at most 1 + ε. We then follow the approach of Algorithm 1, using the incremental APSP data structure of Theorem 7 with an appropriate threshold in line 4. The final spanner, H, is the union of the two spanners above. The stretch, size, and lightness of the spanner follow immediately from the proof of Lemma 2. For the running time, we add the additional time to sort the edges and query the distances; here k ≤ log n, and g > 1 is a fixed parameter of our choice. By picking g such that g^{2k} ≤ n^ε we get a running time of O(n^{2+1/k+ε}) for A_1. Theorem 2 now follows from Lemma 1.

Almost Linear Spanner
Our algorithm builds on the spanner of Chechik and Wulff-Nilsen [CW18]. We first describe their algorithm and then present our modifications. Chechik and Wulff-Nilsen implicitly used our general framework, and thus provide two different algorithms A_1^CW and A_2^CW, where A_2^CW is simply the greedy spanner algorithm.
A_1^CW starts by partitioning the non-MST edges into k buckets, such that the ith bucket contains all edges with weight in [g^{i−1}, g^i). The algorithm is then split into k levels, with the ith bucket being treated in the ith level. In the ith level, the vertices are partitioned into i-clusters, where the (i − 1)-clusters refine the i-clusters. Each i-cluster has diameter O(kg^i) and contains Ω(kg^i) vertices. This is similar to the (i, ε)-clusters in Section 4, with the modification of having two types of clusters, heavy and light. A cluster is heavy if it has many incident i-level edges and light otherwise. For a light cluster, we add all the incident i-level edges to the spanner directly. For the heavy clusters, Chechik and Wulff-Nilsen [CW18] create a special auxiliary cluster graph and run the greedy spanner on it to decide which edges should be added.
To bound the lightness of the constructed spanner, they show that each time a heavy cluster is constructed, the number of clusters in the next level is reduced significantly. Then, using a clever potential function, they show that the total contribution of all the greedy spanners is bounded. It is interesting to note that, in order to bound the weight of a single greedy spanner, they use the analysis of [ENS15]. Implicitly, [ENS15] showed that on graphs with O(poly(k)) aspect ratio, the greedy (1 + ε)(2k − 1)-spanner has O_ε(n^{1/k}) lightness and O(n^{1+1/k}) edges.
There are three time-consuming parts in [CW18]: 1) the clustering procedure iteratively grows the i-clusters as unions of several (i − 1)-clusters, but uses expensive exact diameter calculations in the original graph; 2) they employ the greedy spanner several times as a subroutine during A_1^CW, on graphs with O(poly(k)) aspect ratio; 3) they use the greedy spanner as A_2^CW.
In order to handle 1) above, we grow clusters purely based on the number of nodes in the (i − 1)-clusters (in a similar manner to (i, ε)-clusters), thus making the clustering much more efficient without losing anything significant in the analysis. To handle 2), we will use the following lemma in place of the greedy spanner.
Lemma 5. Given a weighted undirected graph G = (V, E, w) with m edges and n vertices, a positive integer k, and ε > 0, such that all the weights lie in [a, a · ∆) and the MST has weight O(na), one can deterministically construct a (2k − 1)(1 + ε)-spanner of G.

The core of Lemma 5 already appears in [ES16], while here we analyze it for the special case where the aspect ratio is bounded by ∆. The main ingredient is an efficient spanner construction by Halperin and Zwick [HZ96] for unweighted graphs (Theorem 10). The description of the algorithm of Lemma 5 and its analysis can be found in Section 10. Replacing the greedy spanner by Lemma 5 is the sole reason for the additional log k factor in the lightness of Theorem 3.
Imitating the analysis of [CW18] with the modified ingredients, we are able to prove the following lemma, which we use as A_1 in our framework.

Lemma 6. Given a weighted undirected graph G = (V, E, w) with m edges and n vertices, a positive integer k ≥ 640, and ε > 0, such that all MST edges have unit weight and all weights are bounded by g^k, one can deterministically construct a (2k − 1)(1 + ε)-spanner of G.

To address the third time-consuming part, we instead use the algorithm of Theorem 6 as A_2. Replacing the greedy algorithm by Theorem 6 is the sole reason for the additional log k factor in the sparsity of Theorem 3.
Combining Lemma 6, Theorem 6 and Lemma 1 we get Theorem 3. The remainder of this section is concerned with proving Lemma 6.

Details of the construction
Algorithm 2 below contains a high-level description of the algorithm. We defer part of the exact implementation details and the analysis of the running time to Section 6.2. Using our modified clustering, we will need the following claim, which is key to the analysis. The claim is proved in Section 6.2; we refer to the definitions from Algorithm 2 in what follows.

Claim 1. For each i-level cluster C ∈ C_i produced by Algorithm 2 it holds that:
1. C has diameter at most (1/2) · kg^i (w.r.t. the current stage of the spanner E_sp).
2. The number of vertices in C is larger than its diameter and is at least (1/c) · kg^i.

Our analysis builds upon [CW18]. The bound on the stretch in Lemma 6 follows as we have only replaced the greedy spanner by alternative spanners with the same stretch (and with similar guarantees on the cluster diameters). The proof appears in Section 11.1.
To bound the sparsity and lightness we consider the two phases of Algorithm 2. During the ith level of the first phase we add at most d edges per light cluster and at most 1 edge per (i − 1)-cluster to form the heavy clusters. By Claim 1, each i-level cluster contains Ω(kg^i) vertices, and thus the total number of clusters over all levels is bounded by Σ_{i=0}^{k} O(n/(kg^i)) = O(n/k). It follows that we add at most O(n) edges during the first phase. For the lightness of these edges, note that edges added during the ith level have weight at most g^{i+1}. Hence the total weight added during the ith level is at most O((n/(kg^{i−1})) · g^{i+1}) for heavy clusters and at most O((dn/(kg^i)) · g^{i+1}) for light clusters. Summing over all k levels, the first phase contributes at most O(n) to the total weight.
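The two sums used above can be checked explicitly (a routine computation, with g > 1 and d constants):

```latex
\[
  \sum_{i=0}^{k} O\!\Big(\frac{n}{kg^{i}}\Big)
    \;\le\; O\!\Big(\frac{n}{k}\Big)\sum_{i\ge 0} g^{-i}
    \;=\; O\!\Big(\frac{n}{k}\cdot\frac{g}{g-1}\Big)
    \;=\; O(n/k),
\]
\[
  \sum_{i=0}^{k} O\!\Big(\frac{n}{kg^{i-1}}\cdot g^{i+1}\Big)
    \;=\; \sum_{i=0}^{k} O\!\Big(\frac{n g^{2}}{k}\Big)
    \;=\; O(n g^{2})
    \;=\; O(n).
\]
```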
Next consider the second phase. For reference, we reproduce the fragment of Algorithm 2 that it refers to:

19  Let S_0 be the subgraph of G containing only the edges of weight at most k/ε. Let H_0 be a (2k − 1)(1 + ε)-spanner of S_0 constructed using Lemma 5.
20  Add H_0 to E_sp.
21  for r = 1 to k/µ − 1 do: Let V_r be the set of nodes obtained by contracting each (r − 1)µ-cluster contained in some i-level heavy cluster for i ∈ [rµ, (r + 1)µ) (deleting all the other (r − 1)µ-clusters). Let E_r be all the edges used to create i-clusters (heavy or light) for i ∈ ((r − 1)µ, (r + 1)µ]. Let S_r be the graph with V_r as its vertices and E_r ∪ … as its edges.

First note that S_0 has an MST of weight n − 1 and contains only edges with weight in [1, k/ε), so Lemma 5 applies to it. Recall the definitions of V_r, S_r, and H_r: V_r is a set of vertices representing a subset of the (r − 1)µ-level clusters; S_r is a graph on the nodes V_r in which all edges have weight in [kg^{(r−1)µ}/ε, g^{(r+1)µ}] = [g^{rµ}, g^{rµ} · k/ε]; and H_r is a spanner of S_r constructed using Lemma 5. Denote by M_r the MST of S_r. The following lemma bounds its weight; a proof can be found in Section 11.2. Summing over all indices r, we can bound the number of edges added in the second phase. Using a potential function, we show that the sum of the weights Σ_r w(H_r) converges nicely. The details can be found in Section 11.2.

Lemma 8. The total weight of the spanners constructed in the second phase of Algorithm 2 is
The size and lightness bounds of Lemma 6 now follow. All that is left is to describe the exact implementation details and analyze the running time, which is done below.

Exact implementation of Algorithm 2
In this section we give a detailed description of Algorithm 2 and bound its running time.In addition we prove Claim 1.

First phase. Let m_i denote the number of edges in E_i.
To efficiently facilitate certain operations we maintain a forest T representing the hierarchy of containment between the clusters at different levels. Specifically, T has levels going from −1 to k. For simplicity, we treat each vertex v ∈ V as a (−1)-cluster. Each i-cluster ϕ is represented by an i-level node v_ϕ. The node v_ϕ has a unique outgoing edge to v_{ϕ′}, the (i + 1)-level node representing the (i + 1)-cluster ϕ′ containing ϕ. In addition, each node v_ϕ in T stores the size of the cluster it represents. Further, every (−1)-level node v in T has a link to each of its ancestors in T (i.e., the nodes representing the i-clusters, i ≥ 0, containing v).
0-clusters are constructed upon the (−1)-clusters (V). The construction of C_0 is done the same way as for i ≥ 1 (see below), except that we start right away with the construction of light clusters. i-clusters: We assume that T is up to date. Construct a graph K_i with C_{i−1} as its vertices. We add all the edges of E_i to K_i (deleting self-loops and keeping only the lightest edge between any two clusters). The construction of K_i is finished in O(m_i) time (using T).
The construction of C_i is done from K_i in two parts. In the first part we construct the heavy clusters. In the beginning all the nodes are unmarked. We go over all the nodes ϕ in K_i and consider the following case: if ϕ has at least d neighbors and both ϕ and all its neighbors are unmarked, we create a new i-level heavy cluster φ containing ϕ and all of its neighbors; we call ϕ the origin of φ. We mark all the nodes currently in φ (additional clusters might be added later). In addition, we add all the (representatives of the) edges between ϕ and its neighbors to E_sp. At the end of this procedure, each unmarked node ϕ with at least d neighbors has at least one marked neighbor. We add each such ϕ to a neighboring i-level heavy cluster (via an edge to its origin) and mark ϕ. We also add the corresponding edge to E_sp. For every heavy cluster φ created so far, we call the nodes currently in φ the core of φ (additional clusters might be added later, during the formation of light clusters).
In the second part we construct the light clusters. We start by adding all the (representatives of the) edges incident to the remaining unmarked nodes to E_sp. Let L_i be the graph with the remaining unmarked nodes as its vertex set and the MST edges going between these nodes (keeping the graph simple) as its edge set. The clustering is similar to the (i, ε)-clustering described in Section 4. Iteratively, we pick an arbitrary node ϕ ∈ L_i and grow a cluster around it by joining arbitrary neighbors one at a time. Once the cluster has size at least kg^i/c (counting actual vertices of G), we stop and make it an i-level light cluster φ. We call the nodes currently in φ the core of φ. If the cluster has size less than kg^i/c and there are no remaining neighboring nodes in L_i, we add it to an existing neighboring cluster (heavy or light) via an MST edge to its core (note that this is always possible). We continue until all nodes are part of an i-level cluster.
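A compact sketch of the heavy-cluster formation step described above, in Python; the graph K_i is given as an adjacency structure over the (i − 1)-cluster nodes, and all names are ours (the real implementation also records the E_sp edges and updates the forest T):

```python
def form_heavy_clusters(neighbors, d):
    """neighbors: dict mapping each node of K_i to the set of its neighbors.
    Returns (clusters, origin), where clusters maps a cluster id to its node
    set and origin maps a cluster id to the node the cluster was grown from.
    Follows the two scans described in the text: first seed clusters from
    unmarked nodes whose whole neighborhood is unmarked, then attach the
    remaining high-degree nodes to a neighboring cluster."""
    marked = {}                 # node -> cluster id
    clusters, origin = {}, {}
    # Scan 1: a node with >= d neighbors, all unmarked, seeds a heavy cluster.
    for v in neighbors:
        if len(neighbors[v]) >= d and v not in marked \
                and all(u not in marked for u in neighbors[v]):
            cid = len(clusters)
            clusters[cid] = {v} | set(neighbors[v])
            origin[cid] = v
            for u in clusters[cid]:
                marked[u] = cid
    # Scan 2: every remaining node with >= d neighbors now has a marked
    # neighbor (otherwise scan 1 would have clustered it); attach it there.
    for v in neighbors:
        if len(neighbors[v]) >= d and v not in marked:
            cid = next(marked[u] for u in neighbors[v] if u in marked)
            clusters[cid].add(v)
            marked[v] = cid
    return clusters, origin

# Hub 0 seeds a cluster with 1, 2, 3; hub 4 then attaches to it in scan 2.
neighbors = {0: {1, 2, 3}, 1: {0, 4}, 2: {0, 4}, 3: {0, 4}, 4: {1, 2, 3}}
clusters, origin = form_heavy_clusters(neighbors, d=3)
```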
This finishes the description of the clustering procedure.We are now ready to prove Claim 1.
Proof of Claim 1. Recall the values of our constants: g = 20, c = 24, d = 160. We also assume that k ≥ 640.
We prove the claim by induction on i. We start with i = 0. Property (2) of Claim 1 is straightforward from the construction, as we used only unit-weight edges. For property (1), note that the core of each 0-cluster has diameter at most k/c. Each additional part has diameter at most k/c − 1 and is connected via a unit-weight edge to the core. Hence the diameter of each 0-cluster is bounded by 3 · (k/c) < k/2.

Now assume that the claim holds for i − 1 and let C ∈ C_i. Assume first that C is a light cluster. From the construction, C contains at least kg^i/c vertices. The size of C is larger than its diameter by the induction hypothesis and the fact that we used only unit-weight edges to join the light cluster. For the upper bound on the diameter, observe that the diameter of C was at most kg^i/c before the last (i − 1)-cluster was added to the core of C. At this point we add the final (i − 1)-cluster, which has diameter at most kg^{i−1}/2. We conclude that the diameter of the core of C is at most kg^i/c + kg^{i−1}/2. Afterwards, we might add additional parts to C. However, each such part has diameter strictly smaller than kg^i/c and is added via a unit-weight edge to the core of C. Thus each light cluster C has diameter at most 3 · kg^i/c + kg^{i−1}/2 ≤ kg^i/2.

Next, we consider a heavy cluster C. Let C′ ⊆ C be the set of vertices that belonged to C before the construction of light clusters (i.e., the core of C). Let ϕ be the original (i − 1)-cluster that formed C. Then each (i − 1)-cluster of C′ is at distance at most 2 from ϕ in K_i. Thus, by the induction hypothesis, the diameter of C′ is at most 5 times the maximal diameter of an (i − 1)-cluster, plus the weight of the at most 4 connecting edges. During the construction of the light clusters we might add some "semi-clusters" to C, of diameter strictly smaller than kg^{i−1}/c, via unit-weight edges. We conclude that the diameter of C is at most kg^i/2, as required.

To conclude the first phase, we analyze its running time. Level-i clustering is done in O(n_{i−1} + m_i) time, and updating T takes an additional O(n) time. In total, the first phase takes O(kn + m) time.
Second phase. Recall that we pick µ = log(k/ε) and refer to Algorithm 2 for definitions and details; here we only analyze the running time. We denote m̄_r = Σ_{i=(r−1)µ}^{(r+1)µ−1} m_i. Creating S_0 (line 19) takes O(m + n log n) time, and computing H_0 takes O_ε(m + n log n) time (according to Lemma 5). Next we have a loop of k/µ steps. For fixed r, we create the vertex set V_r (line 21) in O(n_{(r−1)µ}) time, using T. From V_r, we create the graph S_r (line 23). This is done by first adding the edges of E_r and all the edges in ∪_{i=(r−1)µ}^{(r+1)µ−1} E_i. We can maintain E_r during the first phase at no additional cost, so creating S_r and modifying the weights costs O(n_{(r−1)µ} + m̄_r). Finally, we compute a spanner H_r of S_r using Lemma 5 (line 25) and add (the representatives of) the edges of H_r to E_sp (line 26) in O_ε(m̄_r + n_{(r−1)µ}) time. Thus, the total time invested in creating H_r is O_ε(m̄_r + n_{(r−1)µ} log n), and the total time of the second phase is obtained by summing these contributions over all r.

Proof of Theorem 6
We restate the theorem for convenience: Theorem 6. Given a weighted undirected graph G = (V, E, w) with m edges and n vertices, any positive integer k, any ε > 0, and any positive integer s = O(1), one can deterministically construct a (2k − 1)(1 + ε)-spanner of G with O_ε(log k · n^{1+1/k}) edges; the running time is analyzed in Section 7.4.

The basic idea in the algorithm of Theorem 6 is to partition the edge set E of G into O_ε(log k) sets E_1, E_2, ..., such that the edges within each E_i are "well separated": for every e, e′ ∈ E_i, the ratio between w(e) and w(e′) is either a constant or at least k. By a hierarchical execution of a modified version of [HZ96], with appropriate clustering, we show how to efficiently construct a spanner of size O(n^{1+1/k}) for each such "well separated" graph. Taking the union of these spanners, Theorem 6 follows.
In Section 7.1 we describe the algorithm.In Section 7.2 we bound the stretch, in Section 7.3 the sparsity, and in Section 7.4 the running time.In Section 7.5 we introduce a relaxed version of the union/find problem (called prophet union/find), and construct a data structure to solve it.The prophet union/find is used in the implementation of our algorithm.

Algorithm
The following is our main building block.The description and the proof can be found in Section 12.1.

Lemma 9 (Modified [HZ96]). Given an unweighted graph G = (V, E) and a parameter k, Algorithm 7 returns a (2k − 1)-spanner H with O(n^{1+1/k}) edges in O(m) time. Moreover, it holds that:
1. V is partitioned into sets S_1, ..., S_R, such that at iteration i of the loop, S_i was deleted from V.
2. When deleting S_i, Algorithm 7 adds less than |S_i| · n^{1/k} edges to H.
3. All these edges are either internal to S_i or going from S_i to ∪_{j>i} S_j.
4. There is an index t such that for every i ≤ t, |S_i| ≥ n^{1/k}, and for every i > t, |S_i| = 1 (these are called singletons).
For simplicity we assume that the minimal weight of an edge in E is 1; otherwise, we can scale accordingly. Let G_j be the subgraph containing the edges ∪_{i≥0} E_{j+i·c_l log k}. Note that G_0, ..., G_{c_l log k − 1} partition the edges of G. Next we build a different spanner H_j for every G_j and set the final spanner to be their union. Fix some j. Set the 0-clusters to be the vertex set V. Similarly to the previous sections, we have i-clusters, which are constructed as unions of (i − 1)-clusters. Let G_{j,i} be the unweighted graph with the i-clusters as its vertex set and E_{j+i·c_l log k} as its edges (keeping the graph simple). Let H_{j,i} be the (2k − 1)-spanner of G_{j,i} returned by the algorithm of Lemma 9. We add (the representatives of) the edges in H_{j,i} to H_j. Based on H_{j,i} we create the (i + 1)-clusters as follows. Let S_1, ..., S_t, V′ be the appropriate partition of the vertex set, where S_1, ..., S_t are non-singletons and all the singletons are in V′. Each S_a for a ≤ t becomes an (i + 1)-cluster. Next, for each connected component C in G_{j,i}[V′], we divide C into clusters of size at least k and diameter at most 3k (in the case where |C| < k, we let C itself be an (i + 1)-cluster). We then proceed to the next iteration.

Stretch
We start by bounding the diameter of the clusters.

Claim 2. Fix j, for every i-cluster ϕ of G
Proof. We show the claim by induction on i. For i = 0, the diameter is 0. For general i, in the unweighted graph G_{j,i−1} we created clusters of diameter at most 2k − 2 for the non-singletons and 3k for the singletons. Thus the diameter of ϕ in H is bounded by the sum of 3k edges in E_{j+(i−1)·c_l log k} and 3k + 1 diameters of (i − 1)-clusters. The claim now follows by the induction hypothesis, where the last inequality in the computation uses that (1 + ε)^{c_l log k} ≥ 18k.
The rest of the proof follows by similar arguments as in Equation (1).See Figure 2 for illustration.

Sparsity
Again, we fix some j ≥ 0. We bound |H_j| by O(n^{1+1/k}) using a potential function. For a graph G′ with n_{G′} vertices, set the potential P(G′) = 2 · n_{G′} · n^{1/k}. That is, we start with the graph G_{j,0} with n_0 = n vertices and potential P(G_{j,0}) = 2 · n · n^{1/k}. In step i we consider the graph G_{j,i}. Let m_i denote the number of edges added to H_j in this step. We will prove that P(G_{j,i}) − P(G_{j,i+1}) ≥ m_i, which, by telescoping, yields |H_j| ≤ P(G_{j,0}) = O(n^{1+1/k}). Let S_1, ..., S_R be the partition created by Lemma 9, where S_1, ..., S_t are the non-singletons and V′ = ∪_{r>t} S_r are the singletons. Let C_1, ..., C_R be the connected components in the induced graph G_{j,i}[V′]. We view the clustering procedure iteratively and evaluate the change in potential after each contraction. Consider first the non-singletons. Fix some r ≤ t and let X_r be the graph after we contract S_1, ..., S_r (note that X_0 = G_{j,i}). For r ≥ 0, let m̄_r be the number of edges added to H_{j,i} while creating S_r; recall that m̄_r < |S_r| · n^{1/k}. Contracting S_r removes |S_r| − 1 vertices, so P(X_{r−1}) − P(X_r) = 2(|S_r| − 1) · n^{1/k} ≥ |S_r| · n^{1/k} > m̄_r, where the first inequality follows as S_r is not a singleton.
Next we analyze the singletons. Consider some singleton {v} = S_r. Recall that once the algorithm processed S_r, it only added edges to the spanner from the connected component C_r of G_{j,i}[V′] containing v, and it added at most n^{1/k} such edges. Instead of analyzing the potential change from deleting S_r, we analyze the change from processing an entire connected component C_r. Denote by m̄_r the total number of edges added to the spanner from C_r; it holds that m̄_r ≤ |C_r| · n^{1/k}. Let Y_r be the graph G_{j,i} after contracting S_1, ..., S_t and all the clusters created from C_1, ..., C_r (note that Y_0 = X_t and Y_R = G_{j,i+1}). Suppose C_r is divided into clusters A_1, ..., A_z. We prove that P(Y_{r−1}) − P(Y_r) ≥ m̄_r by a case analysis.

Running Time
We may assume that the number of edges m is at least n^{1+1/k} log k, as otherwise we can simply return the whole graph as the spanner. Under this assumption, dividing the edges into the sets E_0, E_1, ... and creating the graphs G_0, ..., G_{c_l·log k − 1} takes O(m + n log k) = O(m) time (first create the empty graphs, then go over the edges and add each one to the appropriate graph). Fix j, and let m_j be the number of edges in G_j. The creation of H_{j,i} takes O(|E_{j+i·c_l log k}|) time (Lemma 9), which summed over all i is O(m_j). Clustering can be done while constructing H_{j,i} with a union/find data structure: queries identify the clusters containing the endpoints of edges, and union operations are used when forming clusters from sub-clusters. However, a standard union/find data structure is too slow for our purpose, since we seek linear time for almost all choices of k. In the next subsection we present a variant of the union/find problem, called prophet union/find, which suffices in our setting. With the constant s from Theorem 6, we give a data structure for prophet union/find which, for any fixed j, spends O(m_j · s + n log^{(s)} n) = O(m_j + n log^{(s)} n) time on all operations. Summed over all j, this is O(m + n log^{(s)} n log k).
We may assume that n log^{(s)} n log k > m, since otherwise the time bound simplifies to linear. Since we also assumed m > n^{1+1/k} log k, we have n^{1/k} < log^{(s)} n, i.e., k > log n/log^{(s+1)} n. We conclude that the running time is linear if k ≤ log n/log^{(s+1)} n. Now assume k > log n/log^{(s+1)} n. Then log n < k log^{(s+1)} n < k², implying that log^{(s)} n = O(log^{(s−1)}(k²)) = O(log^{(s−1)} k), and we get a time bound of O(m + n log^{(s−1)} k · log k), as desired.

Prophet Union/Find
Consider a ground set A = {x_1, ..., x_n} of n elements, partitioned into clusters C, initially consisting of all the singletons. We need to support two types of operations: a find query, where we are given an element x ∈ A and should return the cluster C ∈ C containing it; and a union operation, where we are given two elements x, y ∈ A with x ∈ C_x, y ∈ C_y (C_x, C_y ∈ C) and should delete the clusters C_x, C_y from C and add the new cluster C_x ∪ C_y to C. The problem described above is called union/find. Tarjan [Tar75] constructed a data structure that processes m union/find operations over a set of n elements in O(m · α(n)) time, where α is the very slowly growing inverse Ackermann function.
A trivial solution to the union/find problem obtains O(m + n log n) running time, which is superior to [Tar75] for m ≥ n log n. Indeed, one can simply store explicitly, for each element, the name of the current cluster containing it; given a union operation for x, y ∈ A, where x ∈ C_x, y ∈ C_y and w.l.o.g. |C_x| ≥ |C_y|, one simply changes the membership of all the elements of C_y to C_x. Each find operation takes constant time, while every element can update its cluster name at most lg n times (each time its cluster name is updated, the cluster size at least doubles). Thus, in total, the running time is bounded by O(m + n log n).
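The trivial smaller-to-larger scheme just described can be sketched as follows (a minimal Python illustration; the class and method names are ours):

```python
class NaiveUnionFind:
    """Union/find with explicit cluster labels and smaller-to-larger merging:
    find is O(1); each element is relabeled at most log2(n) times, since its
    cluster at least doubles in size whenever it is relabeled."""
    def __init__(self, n):
        self.label = list(range(n))                 # element -> cluster name
        self.members = {i: [i] for i in range(n)}   # cluster name -> elements

    def find(self, x):
        return self.label[x]

    def union(self, x, y):
        a, b = self.label[x], self.label[y]
        if a == b:
            return
        if len(self.members[a]) < len(self.members[b]):
            a, b = b, a                             # relabel the smaller side
        for e in self.members[b]:
            self.label[e] = a
        self.members[a].extend(self.members.pop(b))

uf = NaiveUnionFind(4)
uf.union(0, 1)
uf.union(2, 3)
uf.union(1, 3)
```

The prophet union/find data structure below refines exactly this scheme, using the advance knowledge of the query sequence to replace large clusters by fresh elements.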
We introduce a relaxed version of the union/find problem that we call prophet union/find. Here we are given a ground set A = {x_1, ..., x_n} of n elements and a sequence q_1, q_2, ..., q_m of find queries (elements of A) known in advance. We are then asked these queries in order, with union operations interleaved between the find operations. While the union operations are unknown in advance, they are of a restricted form: a union operation arriving after query q_j must be of the form {q_{j−1}, q_j}, that is, a union of the clusters containing the two most recent query elements. For a parameter s, we solve the prophet union/find problem in O(m · s + n log^{(s)} n) time.
Theorem 8.For any s ≤ log * n, a series of m operations in the Prophet union/find problem over a ground set of n elements can be performed in O(m • s + n log (s) n) time.
Proof. Set α_0 = n, α_s = 1, and for i ∈ {1, ..., s − 1} set α_i = log^{(i)} n. We execute a modified version of the trivial algorithm described above. Specifically, at any point in time we maintain a set A of elements partitioned into clusters C, where for each element we store the name of the cluster it currently belongs to, and for each cluster we store the number of elements it contains. Initially we are given the list q_1, ..., q_m of queries, which we store as well. Further, for each element x ∈ A, we store a linked list containing the indices of all the queries q_j such that q_j = x. This preprocessing step takes O(m) time.
Given a find query q_j, which is simply the name of an element, we return the stored cluster name in O(1) time. Given a union operation arriving after the j'th query, we know that it is between the clusters C_{j−1}, C_j containing the elements q_{j−1} and q_j, respectively. We find the clusters and their sizes in O(1) time. Assume w.l.o.g. that |C_{j−1}| ≤ |C_j|, and let α_t be the smallest threshold strictly larger than |C_j|. There are two cases. If |C_{j−1}| + |C_j| < α_t, we proceed as in the trivial algorithm: we go over the elements of C_{j−1}, update their cluster to be C_j, and update the size of the cluster C_j to be |C_{j−1}| + |C_j|. Else, we have |C_{j−1}| + |C_j| ≥ α_t; in this case, we replace all the elements of C_{j−1} ∪ C_j in A by a new element y. Specifically, we add a new element y to A that belongs to a singleton cluster C_y = {y}, whose size is set to |C_{j−1}| + |C_j|. We then make a linked list of queries for y by concatenating the linked lists of the elements in C_{j−1} ∪ C_j. Finally, we use the newly created linked list to go over all the remaining find queries among q_{j+1}, ..., q_m, and replace every appearance of an element from C_{j−1} ∪ C_j with y. Note that we now have a valid preprocessed instance of the Prophet union/find problem.
We finish with the time analysis of the execution of the algorithm. Note that every find operation takes O(1) time, as we explicitly store all the queries and their answers. There are two types of executions of the union operation above. Denote by A_t all the artificial elements created during the execution of the algorithm such that the number of ground elements they replace is in [α_t, α_{t−1}). Then |A_t| ≤ n/α_t. Each element y ∈ A_t actively participates (that is, makes any changes) in at most log α_{t−1} union operations of the first type. This is because each time this happens, the size of the cluster containing y is (at least) doubled, and once it reaches size α_{t−1}, a union operation of the second type occurs and y is deleted. Note that processing y in each such union operation takes only O(1) time (updating the name of the cluster it belongs to, and updating the size of the cluster). Thus in total, the time consumed by all the union operations of the first type is bounded by ∑_{t=1}^{s} |A_t| · log α_{t−1} ≤ ∑_{t=1}^{s} (n/α_t) · log α_{t−1} = O(n · s + n log^{(s)} n). To bound the time consumed by union operations of the second type, note that each ground element x ∈ A can go through at most s such transitions (implicitly). For every query q_j that initially was to x, we pay O(1) for each such transition (updating the query and the linked list), and thus O(s) overall. We conclude that all the changes due to the second type of union operations consume at most O(m · s) time. The theorem now follows.

Deterministic Incremental Distance Oracles for Small Distances
In this section, we present a deterministic incremental approximate distance oracle which can answer approximate distance queries between vertex pairs whose actual distance is below some threshold parameter d.This oracle will give us Theorem 5 and finish the proof of Theorem 4.
In fact, we will show the following more general result.Theorem 5 follows directly by setting k = 1/ε in the theorem below.
Theorem 9. Let G = (V, E) be an n-vertex undirected graph that undergoes a series of edge insertions, with positive integer edge weights and E = ∅ initially. Let ε > 0 and positive integers k and d be given. Then a deterministic approximate distance oracle for G can be maintained under any sequence of operations consisting of edge insertions and approximate distance queries. Its total update time is bounded in terms of m, ε, k, and d, where m is the total number of edge insertions; the value of m does not need to be specified to the oracle in advance. Given a query vertex pair (u, v), the oracle outputs an estimate of d_G(u, v).

As discussed in Section 3, a main advantage of our oracle is that, unlike, e.g., the incremental oracle of Roditty and Zwick [RZ12], it works against an adaptive adversary. Hence, the sequence of edge insertions does not need to be fixed in advance, and we allow the answer to a distance query to affect the future sequence of insertions. This is crucial for our application, since the sequence of edges inserted into our approximate greedy spanner depends on the answers to the distance queries.
We assume in the following that m ≥ n; if this is not the case, we simply extend the sequence of updates with n − m dummy updates. We will present an oracle satisfying Theorem 9, except that we require it to be given m in advance. An oracle without this requirement can be obtained as follows. Initially, an oracle is set up with m = n. Whenever the number of edge insertions exceeds m, m is doubled and a new oracle with this new value of m replaces the old oracle; the sequence of edge insertions for the old oracle is applied to the new oracle. By a geometric sums argument, the total update time for the final oracle dominates the time for all the previous oracles. Hence, presenting an oracle that knows m in advance suffices to show the theorem.
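The rebuilding scheme can be sketched as follows; `FixedCapacityOracle` is a hypothetical stand-in for a structure that must know its capacity in advance (here it merely records edges), not the actual oracle of Theorem 9:

```python
class FixedCapacityOracle:
    """Hypothetical stand-in: an incremental structure built for a known
    number m of insertions (here it just records the inserted edges)."""
    def __init__(self, m):
        self.m = m
        self.edges = []

    def insert(self, e):
        assert len(self.edges) < self.m   # capacity must never be exceeded
        self.edges.append(e)

class DoublingOracle:
    """Removes the need to know m in advance: when capacity is exhausted,
    double m, build a fresh inner oracle, and replay all past insertions.
    By a geometric sums argument, the total work is dominated by the work
    of the final inner oracle."""
    def __init__(self, n):
        self.inner = FixedCapacityOracle(n)   # initially m = n

    def insert(self, e):
        if len(self.inner.edges) == self.inner.m:
            bigger = FixedCapacityOracle(2 * self.inner.m)
            for old in self.inner.edges:      # replay the old insertion sequence
                bigger.insert(old)
            self.inner = bigger
        self.inner.insert(e)
```

Starting from capacity 2, ten insertions trigger rebuilds at capacities 4, 8, and 16, and all earlier insertions survive each rebuild.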
Before describing our oracle, we need some definitions and notation. For an edge-weighted tree T rooted at a vertex u, let d_T(v) denote the distance from u to v in T. For a vertex u in an edge-weighted graph H and a value r ≥ 0, we let B_H(u, r) denote the ball with center u and radius r in H. We use a superscript (t) to denote a dynamic object (such as a graph or edge set) or variable just after the t'th edge insertion, where t = 0 refers to the object prior to the first insertion and t = m refers to the object after the final insertion. For instance, we refer to G just after the t'th update as G^{(t)}.
In the following, let ε, k, and d be the values and let G = (V, E) be the dynamic graph of Theorem 9. For each i ∈ {0, ..., k−1}, define m_i = 2m^{(i+1)/k} and let d_i be the smallest integer power of (1+ε) of value at least (3+2ε)^i d. For each u ∈ V and each t ∈ {0, ..., m}, let d^{(t)}_i(u) be the largest integer power of (1+ε) of value at most d_i such that the total degree in G^{(t)} of the vertices of the ball B_{G^{(t)}}(u, d^{(t)}_i(u)) is at most m_i; we denote this ball by B^{(t)}_i(u) and let T^{(t)}_i(u) be a shortest path tree from u spanning it. Note that T^{(t)}_i(u) need not be uniquely defined; in the following, when we say that a tree is equal to T^{(t)}_i(u), it means that the tree is equal to some shortest path tree from u in B^{(t)}_i(u). The data structure in the following lemma will be used as a black box in our distance oracle. One of its tasks is to efficiently maintain trees T^{(t)}_i(u).
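As a small sanity check, the thresholds m_i and d_i can be computed directly from their definitions; this is an illustrative sketch (the function name and the float-based rounding guard are ours):

```python
import math

def parameters(m, d, k, eps):
    """Compute m_i = 2 * m^{(i+1)/k} and d_i = the smallest integer power of
    (1+eps) of value at least (3+2*eps)^i * d, for i = 0, ..., k-1."""
    ms, ds = [], []
    for i in range(k):
        ms.append(2 * m ** ((i + 1) / k))
        target = (3 + 2 * eps) ** i * d
        j = math.ceil(math.log(target, 1 + eps))   # candidate integer exponent
        di = (1 + eps) ** j
        if di < target:                            # guard against float rounding
            di *= (1 + eps)
        ds.append(di)
    return ms, ds
```

Note that m_{k−1} = 2m, a fact used later when arguing that Query always outputs a value.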
Lemma 10. Let U ⊆ V be a dynamic set with U^{(0)} = ∅ and let i ∈ {0, ..., k−1} be given. There is a deterministic dynamic data structure which supports any sequence of update operations, each of which is one of the following types: Insert-Vertex(u), which adds a vertex u to U, and Insert-Edge(e), which inserts an edge e into G. Let t_max denote the total number of operations and for each vertex u inserted into U, let t_u denote the update in which this happens. The data structure outputs in each update t ∈ {1, ..., t_max} a (possibly empty) set of trees T^{(t)}_i(u). At any point, the data structure supports in O(1) time a query for the value d^{(t)}_i(u), and in O(log n) time a query for the value d_{T^{(t)}_i(u)}(v) and for whether v ∈ V(T^{(t)}_i(u)), for any query vertices u ∈ U and v ∈ V.

Proof. We assume in the following that each vertex of V has been assigned a unique label from the set {0, ..., n−1}.
In the following, fix a vertex u ∈ V such that t_u exists, i.e., update t_u is the operation Insert-Vertex(u). Before proving the lemma, we describe a data structure D_u which maintains the following for each t ∈ {t_u, ..., m}: a tree T^{(t)}(u) rooted at u, a distance threshold d^{(t)}(u), and distances d_{T^{(t)}(u)}(v) for all v ∈ V(T^{(t)}(u)). We will show that D_u maintains the following two properties: 1. T^{(t)}(u) = T^{(t)}_i(u) and d^{(t)}(u) = d^{(t)}_i(u) for all t ∈ {t_u, ..., m}; 2. in each update t ∈ {t_u, ..., m} where either t > t_u and d^{(t)}_i(u) < d^{(t−1)}_i(u), or t = t_u and d^{(t_u)}_i(u) < d_i, D_u outputs the tree T^{(t)}_i(u). In all other updates, no tree is output.
After any update t, D_u supports in O(1) time a query for the value d^{(t)}(u), and in O(log n) time a query for the value d_{T^{(t)}(u)}(v) and for whether a given vertex v ∈ V belongs to V(T^{(t)}(u)). D_u maintains a tree T^{(t)}(u) rooted at u as well as a distance threshold d^{(t)}(u); to simplify notation, we shall write T^{(t)} instead of T^{(t)}(u) and d^{(t)} instead of d^{(t)}(u). Later we show that T^{(t)} = T^{(t)}_i(u) and d^{(t)} = d^{(t)}_i(u). The tree T^{(t)} is maintained by keeping a predecessor pointer for each vertex to its parent (with u having a nil pointer), and each vertex v ∈ T^{(t)} is associated with its distance d_{T^{(t)}}(v) from the root u.
Since d^{(t)} is maintained explicitly by D_u and since d^{(t)} = d^{(t)}_i(u), it follows that D_u can answer a query for the value d^{(t)}_i(u) in O(1) time. To answer the other two types of queries, D_u maintains V(T^{(t)}) as a red-black tree keyed by vertex labels; this clearly allows both types of queries to be answered in O(log n) time.
Handling the update t = t_u for D_u: For the initial update t = t_u, a tree T^{(t)}_i(u) is computed by running Dijkstra's algorithm from u in G^{(t)} with the following modifications: 1. the priority queue initially contains only u with estimate 0; all other vertices implicitly have an estimate of ∞, 2. a vertex is only added to the priority queue if a relax operation caused its distance estimate to be strictly decreased to a value of at most d_i, 3. the algorithm stops when the priority queue is empty or as soon as the total degree in G^{(t)} of vertices extracted from the priority queue exceeds m_i. If the algorithm emptied its priority queue, D_u sets T^{(t)} ← T^{(t)}_i(u) and d^{(t)} ← d_i, finishing the update. Now, assume that the algorithm did not empty its priority queue and let v_max denote the last vertex added to the tree. Then D_u obtains T^{(t)} as the subtree of the computed tree induced by the vertices at distance less than that of v_max from u, sets d^{(t)} accordingly, and outputs T^{(t)}. Handling updates t > t_u for D_u: Next, consider update t > t_u. D_u ignores updates to U, so we assume that update t is of the form Insert-Edge(e^{(t)}). Assume that D_u has obtained T^{(t−1)} and d^{(t−1)} in the previous update. To obtain T^{(t)} and d^{(t)}, D_u regards e^{(t)} as two oppositely directed edges. Note that at most one of these directed edges, say (v_1, v_2), has its tail v_1 in V(T^{(t−1)}); only such an edge can decrease a distance estimate. If no such edge exists, D_u sets T^{(t)} ← T^{(t−1)} and d^{(t)} ← d^{(t−1)}, finishing the update. Otherwise, D_u applies a variant of Dijkstra's algorithm. During initialization, this variant sets, for each vertex v ∈ V(T^{(t−1)}), the starting estimate of v to d_{T^{(t−1)}}(v) and sets its predecessor to be the parent of v in T^{(t−1)}; all other vertices implicitly have an estimate of ∞. The priority queue is initially empty. In the last part of the initialization step, the edge (v_1, v_2) is relaxed. The rest of the algorithm differs from the normal Dijkstra algorithm in the following way: 1. a vertex v is only added to the priority queue if a relax operation caused the estimate for v to be strictly decreased to a value of at most d^{(t−1)}, 2.
the algorithm stops when the priority queue is empty or as soon as the total degree in G^{(t)} of vertices belonging to the current tree found by the algorithm exceeds m_i.
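The two modifications above (the estimate cap and the degree-based stopping rule) can be sketched as follows; the function signature is ours, and the sketch omits the tree-reuse initialization of the second variant:

```python
import heapq

def truncated_dijkstra(adj, src, dist_cap, deg_cap):
    """Dijkstra from src that (1) only enqueues a vertex when a relax operation
    strictly decreases its estimate to a value of at most dist_cap, and
    (2) stops once the total degree of vertices added to the tree exceeds
    deg_cap.  Returns (dist, parent, emptied), where emptied tells whether the
    priority queue ran dry (rather than the degree cap being hit)."""
    dist = {src: 0}
    parent = {src: None}
    pq = [(0, src)]
    tree_degree = 0
    while pq:
        d, u = heapq.heappop(pq)
        if d > dist[u]:
            continue                      # stale queue entry, skip
        tree_degree += len(adj[u])
        if tree_degree > deg_cap:
            return dist, parent, False    # stopped early: the ball is too dense
        for v, w in adj[u]:
            nd = d + w
            if nd <= dist_cap and nd < dist.get(v, float('inf')):
                dist[v] = nd
                parent[v] = u
                heapq.heappush(pq, (nd, v))
    return dist, parent, True
```

On a unit-weight path 0-1-2-3 with `dist_cap = 2`, only vertices within distance 2 are reached; with a small `deg_cap` the run terminates early without emptying the queue.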

Let T^{(t)}_i(u) be the tree found by the Dijkstra variant. If the priority queue is empty at this point, D_u sets T^{(t)} ← T^{(t)}_i(u) and d^{(t)} ← d^{(t−1)}. Otherwise, D_u computes T^{(t)} and d^{(t)} in exactly the same manner as in the case above where t = t_u and where the priority queue was not emptied; finally, D_u outputs the tree T^{(t)}.

Properties of D_u: We now show the two properties of D_u mentioned earlier. We repeat them here for convenience: 1. T^{(t)} = T^{(t)}_i(u) and d^{(t)} = d^{(t)}_i(u) for all t ∈ {t_u, ..., m}; 2. in each update t ∈ {t_u, ..., m} where either t > t_u and d^{(t)}_i(u) < d^{(t−1)}_i(u), or t = t_u and d^{(t_u)}_i(u) < d_i, D_u outputs the tree T^{(t)}_i(u). In all other updates, no tree is output.
The first property is shown by induction on t ≥ t_u. This is clear when t = t_u, so assume in the following that t > t_u, that T^{(t−1)} = T^{(t−1)}_i(u) and d^{(t−1)} = d^{(t−1)}_i(u), and that update t is an operation Insert-Edge(e^{(t)}). The first property will follow if we can show that T^{(t)} = T^{(t)}_i(u) and d^{(t)} = d^{(t)}_i(u). If the Dijkstra variant was not executed, then no edge was relaxed, which implies that T^{(t)} = T^{(t−1)} = T^{(t)}_i(u), as desired. Otherwise, consider first the case where d^{(t)}_i(u) = d^{(t−1)}_i(u); then the priority queue of the Dijkstra variant must be empty at the end of update t. Combining this with the observation that any vertex v whose distance from u in G^{(t)} is smaller than in G^{(t−1)} must be on a u-to-v path containing e^{(t)}, it follows that the Dijkstra variant computes T^{(t)}_i(u), as desired. For the case where d^{(t)}_i(u) < d^{(t−1)}_i(u), the priority queue of the Dijkstra variant is not emptied; it follows by the definition of d^{(t)} that T^{(t)} = T^{(t)}_i(u) and d^{(t)} = d^{(t)}_i(u). To show that D_u satisfies the second of the properties above, consider an update t ≥ t_u. Assume first that t = t_u. If d^{(t)}_i(u) = d_i, then no tree is output; and if d^{(t)}_i(u) < d_i, then Dijkstra's algorithm stopped without emptying its priority queue, so D_u outputs the tree T^{(t)}_i(u). The case t > t_u is quite similar. We may assume that this update inserts e^{(t)} into G^{(t)}. If d^{(t)}_i(u) = d^{(t−1)}_i(u), then as shown above, no tree is output in update t. Now, assume that d^{(t)}_i(u) < d^{(t−1)}_i(u). Then the Dijkstra variant did not empty its priority queue, so it outputs the tree T^{(t)}_i(u). This shows the second of the two properties for D_u mentioned above.
Bounding update time of D_u: We now bound the update time for D_u, where we ignore the cost of updates t for which e^{(t)} is not incident to T^{(t−1)}_i(u); when we use D_u in the final data structure D below, D will ensure that Insert-Edge is only applied to edges incident to T^{(t−1)}_i(u), and we show that this suffices to ensure the two properties of D_u. Consider an update t ≥ t_u. Observe that our two Dijkstra variants (the one described in the case t = t_u and the one described in the case t > t_u) are terminated as soon as the total degree in G^{(t)} of vertices extracted from the priority queue exceeds m_i. Ignoring the cost of the initialization step of the second variant, it follows from a standard analysis of Dijkstra's algorithm that both variants run in time O(m_i log n). To bound the time for the initialization step of the second variant, note that the desired starting estimates and predecessor pointers are present in T^{(t−1)}, and this tree is available at the beginning of the update. Hence, the work done in the initialization step only involves relaxing the single edge (v_1, v_2). With this implementation, the cost of the initialization step does not dominate the total cost of the update.
The number of updates t > t_u for which d^{(t)}_i(u) < d^{(t−1)}_i(u) is at most log_{1+ε} d_i = O_ε(i + log d), and each such update runs in time O(m_i log n). Now consider a maximal range {t_1, t_1 + 1, ..., t_2} of updates in which d^{(t)}_i(u) does not change. Since the Dijkstra variant only adds a vertex of T^{(t−1)}_i(u) to the priority queue if the distance estimate of the vertex is strictly decreased, we can charge the running time cost of the Dijkstra variant in update t to the estimate decreases, paying O((deg_{G^{(t)}}(v) + 1) log n) per decrease of the estimate of a vertex v. Since edge weights are positive integers and all estimates are at most d_i, each vertex's estimate can decrease at most d_i times; summing the charges over all t ∈ {t_1 + 1, ..., t_2} and all maximal ranges {t_1, ..., t_2} thus gives a total of O(m_i d_i log n).
We conclude that D_u requires time O_ε(m_i log n (d_i + i + log d)) = O_ε(m_i d_i log n) over all updates t consisting of the insertion of an edge e^{(t)} which is incident to T^{(t−1)}_i(u). The final data structure: We have shown that D_u satisfies the two properties stated at the beginning of the proof, and that the total update time over updates t for which e^{(t)} is incident to T^{(t−1)}_i(u) is O_ε(m_i d_i log n). We are now ready to give a data structure D satisfying the lemma.
Initially, D sets U (0) = ∅.If update t is an operation of the form Insert-Vertex(u), D initializes a new structure D u .For each v ∈ V , D keeps the set U (t) (v) of those vertices u ∈ U (t)  for which v belongs to the tree T (t) i (u) maintained by D u .This set is implemented with a red-black tree keyed by vertex labels.We extend each data structure D u so that when a vertex v joins (resp.leaves) T (t) i (u), u joins (resp.leaves) U (t) (v).This can be done without affecting the update time bound obtained for D u above.
If an update t is of the form Insert-Edge(e^{(t)}) where e^{(t)} = (v_1, v_2), D identifies the set U^{(t−1)}(v_1) ∪ U^{(t−1)}(v_2) and updates D_u with the insertion of e^{(t)} for each u in this set. This suffices to correctly maintain all data structures D_u since for each u ∈ U^{(t−1)} \ (U^{(t−1)}(v_1) ∪ U^{(t−1)}(v_2)), we have T^{(t)}_i(u) = T^{(t−1)}_i(u) and d^{(t)}_i(u) = d^{(t−1)}_i(u), implying that D_u need not be updated. Hence, D handles updates in the way stated in the lemma and has total update time O(m) + O_ε(|U^{(t_max)}| m_i d_i log n), as desired. Answering a query for the values d_{T_i(u)}(v) or d_i(u), or for whether v ∈ V(T_i(u)), is done by querying D_u. Since D_u can be identified in O(1) time, the query time bounds for D match those for D_u. This completes the proof.

The distance oracle
We are now ready to present our incremental distance oracle. Pseudocode for the preprocessing step is given by the procedure Initialize(V, k) in Algorithm 3. Inserting an edge (v_1, v_2) with integer weight w > 0 is done with the procedure Insert(v_1, v_2, w(v_1, v_2)) in Algorithm 4, and a query for the approximate distance between two vertices u and v is done with the procedure Query(u, v) in Algorithm 5.
The high-level intuition of our construction is that we maintain a sequence of increasingly smaller vertex sets A_i, where A_0 is the entire vertex set V; see Figure 3. For each vertex v, we grow a ball up to a threshold size, and we let the centers of a maximal set of disjoint balls be promoted to the next level A_{i+1}, where the same procedure happens. An implication is that A_{i+1} is much smaller than A_i, and we can thus afford to grow larger balls as i grows, i.e., we let the ball threshold size grow with i.
In order to bound stretch, we need balls to have roughly the same radius.To ensure this, we partition balls centered at vertices of A i into classes such that balls in the jth class all have radius within a constant factor of (1 + ε) j .For each class, we keep a maximal set of disjoint balls as described above and A i+1 is the union of centers of these balls over all classes.In class j, each vertex v, which is the center of a ball not belonging to this maximal set points to a representative vertex n i,j (v).This representative vertex is picked in the intersection with another ball in class j centered at a vertex of A i+1 .Every vertex w in this other ball has a pointer r i,j (w) to the center.These pointers are used as navigation in the distance query algorithm when identifying a vertex u i+1 ∈ A i+1 from a vertex u i ∈ A i ; see Figure 3.The fact that the two balls centered at u i resp.u i+1 have roughly the same radius is important to ensure that the stretch only grows by at most a constant factor in each iteration of the query algorithm.
Algorithm 3: Initialize
input : V, k
1 A_0 ← V
2 Initialize D_0 as an instance of the data structure of Lemma 10
Initialize D_i as an instance of the data structure of Lemma 10
Associate with each v ∈ V uninitialized variables n_{i,j}(v) and r_{i,j}(v)

Figure 3: A high-level overview of the distance oracle construction. The vertices v_1, ..., v_5 are centers of disjoint (grey) balls and are thus promoted to A_{i+1}, while W_{i,j} is the union over the vertices of the disjoint balls. The grey balls have radius roughly (1+ε)^j, and we keep a set of balls for every j ∈ {1, ..., log_{1+ε} d_i}. A query from a center u_i of a non-disjoint ball has an associated vertex w in an intersecting grey ball, which in turn has a pointer to the ball center u_{i+1} = r_{i,j}(w).

The following lemmas are crucial when we bound update and query time as well as stretch. For i = 0, ..., k−1 and j = 1, ..., log_{1+ε} d_i, let T_{i,j} be the dynamic set of trees T_{i,j}(u) obtained so far for which the test in line 11 of Insert succeeded.

Lemma 11. After each update, the sets V(T_{i,j}(u)) for T_{i,j}(u) ∈ T_{i,j} are pairwise disjoint, and the root of each tree in T_{i,j} belongs to A_{i+1}.
Proof.For every u added to U i+1 in procedure Insert(v 1 , v 2 , w(v 1 , v 2 )), V (T i,j (u)) ∩ W i,j = ∅ just before the update in line 12 and line 12 is the only place where W i,j is updated.All vertices of U i+1 are added to A i+1 in line 5 of iteration i + 1 and this is the only place where A i+1 is updated.
Lemma 12. After each update, |A_i| = O_ε((i + log d) · m^{1−i/k}) for each i ∈ {0, ..., k−1}.

Proof. The lemma is clear for i = 0 since |A_0| = n ≤ m. Now, let i ∈ {0, ..., k−2} be given. We will bound |A_{i+1}|. Consider any j ∈ {1, ..., log_{1+ε} d_i}. Since the total degree of vertices in G is at most 2m, and since the sets V(T_{i,j}(u)) are pairwise disjoint for all T_{i,j}(u) ∈ T_{i,j} by Lemma 11, it follows from Lemma 10 that the number of roots of these trees is less than 2m/m_i. Lemma 11 then implies that |A_{i+1}| ≤ log_{1+ε}(d_i) · 2m/m_i = O_ε((i + log d) · m^{1−(i+1)/k}). This shows the lemma.
Proof. By Lemma 10, since d_i(u) < d_i, there must have been some update to D_i that output a tree T_i(u); consider the last such tree. Then d_i(u) has not changed since then, and so T_{i,j}(u) must be that tree. Since u ∉ A_{i+1}, we must have V(T_{i,j}(u)) ∩ W_{i,j} ≠ ∅, so there is a vertex w ∈ V(T_{i,j}(u)) ∩ W_{i,j}. Since w ∈ W_{i,j}, we have w ∈ V(T_{i,j}(r_{i,j}(w))). At some point, r_{i,j}(w) was added to U_{i+1} and hence to A_{i+1}. Since vertices are never removed from A_{i+1}, the lemma follows.

Bounding time and stretch
After replacing ε with ε/2, the following lemma gives the update time bound claimed in Theorem 9.
Proof. It is easy to see that each execution of lines 9 to 17 in procedure Insert can be implemented to run in time O(|V(T_i(u))|). Hence, the total time spent in all calls to Insert is dominated by the total update time of the data structures D_i, for i = 0, ..., k−1. Note that after each update, A_i is the current set of vertices added to D_i. Letting A^{(t_max)}_i be the set A_i after the last update, it follows from Lemmas 10 and 12 that D_0, ..., D_{k−1} have total update time O(km) + ∑_{i=0}^{k−1} O_ε(|A^{(t_max)}_i| · m_i d_i log n), where the last bound follows from a geometric sums argument. Finally, we bound query time and stretch with the following lemma; replacing ε with ε/2 gives the bounds of Theorem 9.
Proof.To bound the stretch, we will first show the following loop invariant: at the beginning of the ith execution of the for-loop of procedure Query(u, v), u i ∈ A i and s i ≤ ((3+2ε) i −1)d G (u, v).This is clear when i = 0 so assume that i > 0 and that the loop invariant holds at the beginning of the ith iteration.We need to show that if the beginning of the (i + 1)th iteration is reached, the loop invariant also holds at this point.
We may assume that the tests in lines 4 and 6 fail, since otherwise the (i+1)th iteration is never reached. If u_i ∈ A_{i+1}, then the invariant holds at the beginning of the (i+1)th iteration. Otherwise, since the tests in lines 4 and 6 fail, the pointers n_{i,j} and r_{i,j} lead to a vertex u_{i+1} ∈ A_{i+1} while increasing s_i by at most the claimed amount. We can now show the stretch bounds. First observe that m_{k−1} = 2m. Since at any time the total degree of vertices in G is at most 2m, it follows that d_{k−1}(u) = d_{k−1} for all u ∈ V. Hence, Query outputs a value in some iteration.
The bound on the query time is immediate. Next, we give the upper bound on the stretch. If the test in line 4 succeeds in some iteration i, it follows from the loop invariant that the output estimate is within the claimed stretch, as desired. Now, assume that the test in line 4 fails in some iteration i, i.e., assume that v ∉ V(T_i(u_i)).

A note on the lightness of other spanners
To motivate the problem of computing light spanners efficiently, we consider in this section the three celebrated spanner constructions of Baswana and Sen [BS07], Roditty and Zwick [RZ11], and Thorup and Zwick [TZ05], respectively, and show that none of them provides light spanners.
We first consider the algorithm from [RZ11]. This algorithm creates a spanner by considering the edges in non-decreasing order of weight, similarly to the greedy algorithm. It maintains an incremental distance oracle for an unweighted version of the spanner, and adds an edge (u, v) if there is no path between u and v of at most 2k − 1 edges. Consider now running this algorithm on the graph of Figure 4, consisting of a cycle of n = 2k + 1 edges where 2k of them have weight 1 and the last has an arbitrarily large weight W. In this case, the algorithm of [RZ11] adds every edge to the spanner, since u and v are only connected by a path of 2k edges when the edge (u, v) is considered (disregarding the weight of (u, v)). This gives a lightness of Ω(W/n). Since W can be arbitrarily large, it follows that no guarantee in terms of k and n can be given on the lightness.
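The behavior on this instance can be checked directly; the following is an illustrative simulation of the stated insertion rule (not the actual [RZ11] implementation, which uses an incremental distance oracle rather than BFS):

```python
from collections import deque

def hop_distance(adj, s, t):
    """BFS hop distance in the current (unweighted view of the) spanner."""
    dist = {s: 0}
    q = deque([s])
    while q:
        x = q.popleft()
        if x == t:
            return dist[x]
        for y in adj[x]:
            if y not in dist:
                dist[y] = dist[x] + 1
                q.append(y)
    return float('inf')

def rz_spanner(n, weighted_edges, k):
    """Scan edges in non-decreasing order of weight and add (u, v) unless the
    current spanner already contains a path of at most 2k-1 edges between
    u and v (hop count, ignoring weights)."""
    adj = {v: [] for v in range(n)}
    spanner = []
    for w, u, v in sorted(weighted_edges):
        if hop_distance(adj, u, v) > 2 * k - 1:
            spanner.append((w, u, v))
            adj[u].append(v)
            adj[v].append(u)
    return spanner

# Cycle of 2k+1 edges: 2k unit edges plus one arbitrarily heavy edge (u, v).
k, W = 3, 10**6
n = 2 * k + 1
edges = [(1, i, i + 1) for i in range(n - 1)] + [(W, n - 1, 0)]
assert len(rz_spanner(n, edges, k)) == n   # every edge is kept, including the heavy one
```

When the heavy edge is considered, its endpoints are only connected by the 2k-edge path of unit edges, so the heavy edge is added and the spanner weight is dominated by W.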
A key part of the algorithm of [BS07] is to arrange the vertices in layers and to cluster the vertices of each layer. Each layer is formed by randomly sampling the clusters of the previous layer with probability n^{−1/k}. Consider a vertex w and let A_i be the first layer where w is not sampled. If w is not adjacent to any cluster in A_i, then the smallest-weight edge from w to each of the clusters of A_{i−1} is added to the spanner. Thus, in the example of Figure 4, if neither u nor its neighbours are sampled, then the edge (u, v) is added to the spanner. This happens with probability at least (1 − n^{−1/k})^3, and thus we cannot even give a guarantee on the expected lightness of the spanner, as W can be arbitrarily large compared to this probability.

The spanner of [TZ05] also creates sets of vertices V = A_0 ⊇ A_1 ⊇ ... ⊇ A_{k−1}, where each A_i is formed by sampling the vertices of A_{i−1} independently with probability n^{−1/k}. For each vertex w ∈ A_i \ A_{i+1}, they define the cluster of w to be the set of all vertices in V which are closer to w than to any vertex in A_{i+1}. The spanner they construct is simply the union of the shortest path trees of the clusters, each rooted at its center w. In particular, for each vertex w ∈ A_{k−1}, the shortest path tree of the entire graph rooted at w is included. We wish to show that at least one of these shortest path trees has weight Ω(nW) with constant probability. To see this, consider the graph of Figure 5. In this graph we have a complete graph K on n/2 vertices with unit weights and a cycle C on n/2 vertices with unit weights. For each vertex u ∈ K and each vertex v ∈ C there is an edge (u, v) of large weight W. Clearly the weight of the MST of this graph is W + n − 2; however, the shortest path tree from any vertex u ∈ K has weight nW/2 + n/2 − 1 = Ω(nW). Since we expect half of the vertices of A_{k−1} to be from K, we see that the spanner has expected lightness at least Ω(|A_{k−1}| · n) = Ω(n^{1+1/k}) on this graph. We also note that no edge of the spanner can have weight larger than the weight of the MST. This follows because every edge of the spanner is part of some shortest path tree, and if its weight were larger, we could replace it in the shortest path tree by the entire MST. Thus the lightness is also bounded from above by O(kn^{1+1/k}).
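The two weights claimed for this instance (MST weight W + n − 2 versus shortest-path-tree weight nW/2 + n/2 − 1) can be verified on a small example; the helper functions below are ours:

```python
import heapq

def dijkstra_tree_weight(adj, src):
    """Weight of a shortest-path tree rooted at src: sum of the weights of the
    parent edges of all vertices reached by (lazy) Dijkstra."""
    dist = {src: 0}
    parent_w = {}                          # weight of each vertex's parent edge
    pq = [(0, src)]
    while pq:
        d, u = heapq.heappop(pq)
        if d > dist.get(u, float('inf')):
            continue                       # stale entry
        for v, w in adj[u]:
            nd = d + w
            if nd < dist.get(v, float('inf')):
                dist[v] = nd
                parent_w[v] = w
                heapq.heappush(pq, (nd, v))
    return sum(parent_w.values())

def mst_weight(n, edges):
    """Kruskal's algorithm with a simple union-find (path halving)."""
    parent = list(range(n))
    def find(x):
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x
    total = 0
    for w, u, v in sorted(edges):
        ru, rv = find(u), find(v)
        if ru != rv:
            parent[ru] = rv
            total += w
    return total

n, W = 8, 1000
half = n // 2
edges = []
for u in range(half):                      # complete graph K on 0..half-1, unit weights
    for v in range(u + 1, half):
        edges.append((1, u, v))
for i in range(half):                      # cycle C on half..n-1, unit weights
    edges.append((1, half + i, half + (i + 1) % half))
for u in range(half):                      # heavy cross edges of weight W
    for v in range(half, n):
        edges.append((W, u, v))

adj = {u: [] for u in range(n)}
for w, u, v in edges:
    adj[u].append((v, w))
    adj[v].append((u, w))

assert mst_weight(n, edges) == W + n - 2                        # = 1006
assert dijkstra_tree_weight(adj, 0) == n * W // 2 + n // 2 - 1  # = 4003
```

The shortest-path tree from a vertex of K must use a direct weight-W edge for every cycle vertex, which is what makes it Ω(nW) while the MST needs only a single heavy edge.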
Figure 5: Example of a bad input graph for the spanner of [TZ05]. K is the complete graph on n/2 vertices, where every edge has weight 1. This bad instance implies Ω(n^{1+1/k}) lightness for [TZ05].

Proof of Lemma 5
We build upon (the first variant of) the algorithm from [ES16], and we obtain an improved bound using the assumption of small aspect ratio. The basic component of the algorithm is the spanner of Halperin and Zwick (see Theorem 10). For simplicity, we assume that a = 1; the construction and proof stay the same for general a.
Fix ρ = 1 + ε. We start by computing the MST T. We divide the non-MST edges into log_ρ ∆ buckets: for j ∈ [1, log_ρ ∆], let E_j = {e ∈ E \ T | w(e) ∈ [ρ^{j−1}, ρ^j)}. We construct a separate spanner for each bucket. We use the (i, ε/4)-clustering as described in Section 4. That is, for every j, we have a set of at most n_j = 4n/(ερ^{j−1}) clusters, each with diameter bounded by ερ^{j−1} (in the MST metric). Then, for each j, we contract each cluster and construct an unweighted graph G_j with the clusters as its vertices, where there is an edge between two clusters ϕ_v, ϕ_u iff there are vertices v ∈ ϕ_v and u ∈ ϕ_u such that (u, v) ∈ E_j. Next, we construct a spanner H_j for G_j using Theorem 10. For each edge ẽ ∈ H_j, we add the edge e ∈ E_j that created ẽ to our final spanner H (if there are multiple such edges, we add an arbitrary one). Our final spanner H contains the MST edges and the representatives of all the edges in ⋃_j H_j.
Stretch. As the diameter of every j-cluster is only an ε-fraction of the weight of the edges in E_j, the bound on the stretch follows by arguments similar to those in Equation (1). See Figure 2 for an illustration.

Number of edges. This follows from Theorem 10.

Lightness. All the edges in H_j have weight at most ρ^j.

Missing proofs from the analysis

11.1 Stretch
In this section we bound the stretch of the spanner constructed in Algorithm 2 by (1 + O(ε))(2k − 1). Consider some edge (u, v) = e ∈ E. If w(e) ≤ k/ε = g^µ, then e is treated by H_0, the spanner constructed in line 19 of Algorithm 2. Otherwise, let i ≥ µ and r ≥ 1 be such that w(e) ∈ [g^i, g^{i+1}) ⊆ [g^{rµ}, g^{(r+1)µ}). For any j, let ϕ^j_v (resp. ϕ^j_u) denote the j-level cluster containing v (resp. u). If e ∈ E_sp, then d_{E_sp}(v, u) ≤ w(e) and we are done. Otherwise, if ϕ^i_v or ϕ^i_u is a light i-cluster, then during the first phase we add an edge e′ (of weight at most w(e)) between ϕ^{i−1}_v and ϕ^{i−1}_u. In particular, d_{E_sp}(v, u) ≤ diam_{E_sp}(ϕ^{i−1}_v) + w(e′) + diam_{E_sp}(ϕ^{i−1}_u) ≤ kg^{i−1}/2 + w(e) + kg^{i−1}/2 ≤ (k/g + 1)·w(e).
Finally, consider the case where ϕ^i_v and ϕ^i_u are heavy i-clusters. Recall the auxiliary graph S_r constructed during the second phase; its vertices were V_{(r−1)µ}. In particular, it contained an edge e′ from ϕ^{(r−1)µ}_v to ϕ^{(r−1)µ}_u with w(e′) ≤ w(e). Note that the diameter of each (r−1)µ-level cluster is bounded by kg^{(r−1)µ}/2 = (ε/2)·g^{rµ}, while under the modified weight function w_r, the minimal weight is g^{rµ}. Following arguments similar to those in Equation (1), there is a path in E_sp of length at most (1 + O(ε))(2k − 1) · w(e) from v to u. See Figure 2 for an illustration.

Proofs of Lemma 7 and Lemma 8
For an i-level cluster C (heavy or light), set diam(C) to be the maximum value between the diameter (in H) of the cluster C (at the time it was created) and (1/c)·kg^i. We start by proving some properties of the clusters. Claim 3. Let C be an i-level heavy cluster and let C′ be the set of the (i−1)-level clusters contained in C. Proof. By the definition of diam, and Claim 1. Proof. This is straightforward, as the cluster C was created using only MST unit-weight edges.
Claim 5. Let C be an i-level cluster and let C′ be the set of the j-level clusters contained in C, for some j < i. We are now ready to prove Lemma 7.
Proof of Lemma 7. Recall that we used the modified weights w_r(e) = max{kg^{(r−1)µ}/ε, w(e)}. The vertices of S_r are the maximal heavy clusters (i.e., heavy clusters that are not contained in any other heavy cluster up to level (r+1)µ). Note that H_r forms a partition of V_r. We will call the sets in H_r bugs. We will construct a spanning tree T of S_r; trivially, w(T) is an upper bound on w(M_r). T will consist of a spanning tree T_C for every C ∈ H_r, and in addition a set of cross-bug edges T′.
First consider C ∈ H_r. Let C_C be the set of all the (r−1)µ-level clusters contained in C. By Claim 5, there is a spanning tree T_C of weight O(∑_{C′∈C_C} diam(C′)) that connects all the clusters in C_C. Note that all the edges in T_C are contained in E_r, and thus in S_r.
Next, let T′ be a set of edges between bugs of maximal cardinality such that there are no cycles in T′ ∪ ⋃_{C∈H_r} T_C. Set T = T′ ∪ ⋃_{C∈H_r} T_C, and note that T is a spanning forest of S_r. As each C ∈ H_r is already connected, necessarily |T′| ≤ |H_r| − 1. The weight of each edge e ∈ T′ is at most g^{(r+1)µ} = kg^{rµ}/ε, while for every C ∈ H_r, diam(C) is at least of this order. By applying this to all the maximal heavy clusters, we get the claim. We are now ready to prove Lemma 8.
Proof of Lemma 8. Fix some r. Note that the minimal weight of an edge in S_r is kg^{(r−1)µ}/ε, while by Lemma 7, w_r(M_r) ≤ O(|V_r| · kg^{(r−1)µ}/ε). The lemma follows using Lemma 5.

Modified [HZ96] Spanner
Algorithm 6 picks an arbitrary vertex in line 3 and grows a ball around it. Our spanner in Theorem 6 uses Algorithm 6 as a sub-procedure. However, we will need an additional property from the spanner. Specifically, we will prefer to pick a vertex with at least n^{1/k} − 1 active neighbors. The modified algorithm is presented in Algorithm 7. We denote by deg_{G'}(v) the degree of v in G'. V is partitioned into sets S_1, ..., S_R, such that at iteration i of the loop, S_i was deleted from V.

3. When deleting S_i, Algorithm 7 adds fewer than |S_i| · n^{1/k} edges. All these edges are either internal to S_i or go from S_i to ⋃_{j>i} S_j. 4. There is an index t such that for every i ≤ t, |S_i| ≥ n^{1/k}, and for every i > t, |S_i| = 1 (such sets are called singletons).
Proof. The stretch and sparsity follow from Theorem 10, as we only specify the (previously arbitrary) order of choosing vertices in line 3. Property 2 follows as the radius chosen in line 4 is bounded by k − 1. Properties 1, 3 and 4 are straightforward from line 3 of Algorithm 7. Thus we only need to bound the running time.
It will be enough to provide an efficient way to pick vertices in line 3. We will maintain deg(v) for every vertex v, and a set A of all the vertices with degree at least n^{1/k}. The degrees are computed at the beginning of the algorithm, and all the relevant vertices are inserted into A. Then, in iteration i, after deleting S_i, we go over each deleted vertex, decrease the degree of each neighboring vertex, and update A accordingly. Using A, the decision in line 3 can be executed in constant time. The maintenance of A and the degrees can be done in O(m) total time, as we touch each edge at most a constant number of times.
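A minimal sketch of this bookkeeping (class and method names are ours); each edge is touched O(1) times overall, giving the O(m) maintenance cost:

```python
class DegreeTracker:
    """Maintains vertex degrees and the set A of vertices with degree >=
    threshold, under batch vertex deletions."""
    def __init__(self, adj, threshold):
        self.adj = {v: set(nbrs) for v, nbrs in adj.items()}
        self.threshold = threshold
        self.deg = {v: len(nbrs) for v, nbrs in self.adj.items()}
        self.high = {v for v, d in self.deg.items() if d >= threshold}

    def delete(self, S):
        """Delete the vertex set S, decreasing the degree of each neighbor
        and updating the high-degree set A accordingly."""
        for v in S:
            for u in self.adj.pop(v):
                if u in self.adj:              # u itself not deleted (yet)
                    self.adj[u].discard(v)
                    self.deg[u] -= 1
                    if self.deg[u] < self.threshold:
                        self.high.discard(u)
            self.deg.pop(v, None)
            self.high.discard(v)

    def pick(self):
        """Return a vertex with >= threshold active neighbors if one exists,
        else an arbitrary remaining vertex (the preference rule of line 3)."""
        if self.high:
            return next(iter(self.high))
        return next(iter(self.adj), None)
```

On a star with center 0 and threshold 3, the center is picked first; after deleting it, no high-degree vertex remains and an arbitrary leaf is returned.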

Figure 2: e = (u, v) is an edge (colored in blue) in E_i, such that u and v belong to the i-clusters ϕ_u, ϕ_v, respectively. The closed bold black curves represent i-clusters. The red edges represent edges in H_i. The thin black curves represent MST paths. There is an edge between ϕ_v and ϕ_u in G_i. Therefore H_i contains a short path between ϕ_v and ϕ_u.

Lemma 3.
Let G = (V, E, w) be an edge-weighted graph with m = |E| and n = |V|. Let k be a positive integer and let ε > 0 be a constant. Then one can deterministically construct an O(k)-spanner of G with size O(n^{1+1/k}) in time O(m + kn^{1+1/k+ε}).

Lemma 4.
Let G = (V, E, w) be an undirected graph with m = |E| and n = |V|, with edge weights bounded from above by W and where all MST edges have weight 1. Let k be a positive integer. Then one can deterministically construct a

Algorithm 2: A component of Theorem 3
input: parameters k, ε, and a weighted graph G = (V, E, w) where all MST edges have unit weight and max_{e∈E} w(e) ≤ g^k
output: spanner E_sp
Fix g = 20, c = 24, d = 160 and µ

Take ⋃_{i=(r−1)µ}^{(r+1)µ−1} E_i as its edges (keeping S_r simple)
Let w_r(e) = max{w(e), kg^{(r−1)µ}/ε} be the weight function of S_r
Construct a (2k − 1)(1 + ε)-spanner H_r of S_r using Lemma 5
Add H_r to E_sp
28 return E_sp

Lemma 7. The MSF M_r of S_r has weight w_r(M_r) = O(|V_r| · kg^{(r−1)µ}/ε).

… T_i^{(t)}(u), and in O(log n) time a query for the value d_{T_i^{(t)}(u)}(v) and for whether v ∈ V(T_i^{(t)}(u)), for any query vertices u ∈ U and v ∈ V.

T_i^{(t)}(u) = T_i^{(t−1)}(u) and d_i^{(t)}(u) = d_i^{(t−1)}(u), implying that D_u need not be updated. Hence, D handles updates in the way stated in the lemma and has total update time O(m) + O_ε(|U^{(t_max)}| · m_i · d_i · log n), as desired.

Figure 4: Example of a bad input graph to the algorithms of [BS07] and [RZ11]: a cycle of 2k + 1 edges with one very heavy edge. This bad instance implies Ω(W) lightness for both algorithms.
Thus the lightness is bounded by O(n^{1/k} · log ∆). Running time: Computing the MST takes O(n log n) time. Following the analysis of Section 4, the construction of the vertices for all the graphs G_1, …, G_{log_ρ ∆} takes O(Σ_{j=1}^{log_ρ ∆} n_j) = O(n · Σ_{j=0}^{log_ρ ∆} 1/ρ^j) = O_ε(n) time. Adding the edges to the graphs takes an additional O(m + n log n) time. Computing the spanners H_j (using Theorem 10) takes Σ_j O(|E_j|) = O(m) time. All in all, a total of O(m + n log n) time.
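The O_ε(n) bound on the vertex work is a geometric series. A small numeric check (assuming, purely for illustration, that level j contributes about n/ρ^j vertices; `total_vertex_work` is a hypothetical helper, not from the paper):

```python
def total_vertex_work(n, rho, levels):
    """Total number of cluster vertices over the graphs G_1, ..., G_levels,
    under the illustrative assumption that level j contributes n / rho**j
    vertices.  For any constant rho > 1 the geometric series is bounded by
    n * rho / (rho - 1) = O(n)."""
    return sum(n / rho**j for j in range(levels + 1))
```

For instance, with ρ = 1.5 the total over arbitrarily many levels stays below 3n, which is where the O_ε(n) term in the running time comes from.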

Claim 4.
Let C be an i-light cluster, and let 𝒞 be the set of (i − 1)-clusters contained in C. Then diam(C) ≤ Σ_{C′∈𝒞} diam(C′) + |𝒞| − 1. Moreover, consider the graph G[C] where we contract all the j-clusters and keep only the edges used to create clusters; then w(MST(G[C])) = O(Σ_{C′∈𝒞} diam(C′)).

Proof. Denote by 𝒞_r the set of r-level clusters contained in C. Let E_r be the set of edges used to create the clusters 𝒞_{r+1} from 𝒞_r. Note that |E_r| < |𝒞_r|, and that the weight of each e ∈ E_r is bounded by g^{r+2}. Moreover, E′ = ⋃_{r=j+1}^{i} E_r spans G[C], and thus we can bound w(MST(G[C])) by w(E′). It holds that … c · g^2 k · diam(C′). By Claim 3 and Claim 4, Σ_{C′∈𝒞_r} diam(C′) = O(Σ_{C′∈𝒞_j} diam(C′)). We conclude w(E′) ≤ O(…).

The contribution of this change to the weight of M_r is bounded by (|V_r| − 1) · kg^{(r−1)µ}/ε. Thus we can ignore it, and bound w(M_r) (the original weight) instead of w_r(M_r) (the modified weight). Denote by 𝒞_i the set of i-level clusters. Let H_r be the set of maximal heavy clusters in ⋃_{i=rµ}^{(r+1)µ−1} 𝒞_i … kg^{rµ}/c. Hence w(T) ≤ |H_r| · k · g^{µr} ≤ c · Σ_{C∈H_r} diam(C). Using Claim 3 and Claim 4, w(T′) ≤ w(T) + …

Define a potential function D_i = Σ_{ϕ∈𝒞_i} diam(ϕ) + |𝒞_i|. According to Claim 4 and Claim 3, D_i is non-increasing.

Claim 6. For every r ≥ 1, D_{(r−1)µ} − D_{(r+1)µ} = Ω(|V_r| · kg^{(r−1)µ}).

Proof. Consider some i-level heavy cluster C. Let 𝒞 be all the (i − 1)-clusters contained in C. Let D′ be the potential function on the graph induced by 𝒞. Then by Claim 3,
Retain only edges of weight in [g^i, g^{i+1}) (keeping K_i simple).
6 Let all nodes of K_i be unmarked
7 for ϕ ∈ K_i do
8   if deg(ϕ) ≥ d, ϕ is unmarked, and all of ϕ's neighbours are unmarked then
9     Create a new heavy cluster φ from ϕ and all its neighbours; mark all nodes of φ
for ϕ ∈ K_i do
  if deg(ϕ) ≥ d and ϕ is unmarked then /* ϕ must then have a marked neighbour */
    Add ϕ to the heavy cluster of a marked neighbour; mark all clustered, unmarked vertices ϕ
Add all edges used to join heavy clusters to E_sp
/* Construct i-level light clusters */
Add all edges incident to unmarked nodes to E_sp
Join remaining nodes into clusters of size (number of original vertices) ≥ kg^i/c and diameter ≤ kg^i/2 using MST edges
If a cluster cannot reach kg^i/c nodes, add it to a neighbouring heavy cluster (via an MST edge)
/* Second phase: */

… is at most log_{1+ε} d_i = O_ε(i + log d). As shown above, the time spent in each such update is O(m_i log n), which over all such updates is O_ε(m_i log n · (i + log d)) time. Now, consider a maximal range of updates {t_1, t_1 + 1, …, t_2} ⊆ {t_u, t_u + 1, …, t_max} where … and consider an update t ∈ {t_1 + 1, t_1 + 2, …, t_2}. Assuming that the Dijkstra variant is executed, it must empty its priority queue in this update. Let V_i … line 6 is thus valid. It remains to bound the query time. Consider any iteration i. By Lemma 10, performing the tests in lines 4 and 6 and computing distances in line 15 can be done in O(log n) time. Over all iterations, this is O(k log n).
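The two marking passes of the heavy-cluster phase above can be sketched as follows. This is a simplified illustration over an adjacency dict; the function name `build_heavy_clusters` and the representation of K_i are assumptions, and the light-cluster phase is omitted.

```python
def build_heavy_clusters(adj, d):
    """Sketch of the two marking passes for degree threshold d.
    Pass 1: every unmarked node of degree >= d whose neighbours are all
    unmarked founds a heavy cluster together with its neighbourhood.
    Pass 2: any remaining unmarked node of degree >= d joins the heavy
    cluster of some marked neighbour (one must exist, else pass 1 would
    have clustered it)."""
    marked = set()
    cluster_of = {}   # node -> cluster index
    clusters = []     # cluster index -> member list
    for phi in adj:   # pass 1: found new heavy clusters
        if (len(adj[phi]) >= d and phi not in marked
                and all(u not in marked for u in adj[phi])):
            cid = len(clusters)
            members = [phi] + list(adj[phi])
            clusters.append(members)
            for u in members:
                marked.add(u)
                cluster_of[u] = cid
    for phi in adj:   # pass 2: attach leftover high-degree nodes
        if len(adj[phi]) >= d and phi not in marked:
            nbr = next(u for u in adj[phi] if u in marked)
            cluster_of[phi] = cluster_of[nbr]
            clusters[cluster_of[nbr]].append(phi)
            marked.add(phi)
    return clusters, cluster_of
```

Nodes that stay unmarked after both passes are exactly those handled by the light-cluster construction, whose incident edges are added to E_sp directly.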