Abstract 1 Introduction 2 Preliminaries 3 Sourcewise Approximate Distance Oracle for a Single Edge Fault 4 A Sparser Fault-Tolerant Sourcewise Approximate Distance Oracle 5 Concluding Remarks References

Fault-Tolerant Approximate Distance Oracles with a Source Set

Dipan Dey ORCID Tata Institute of Fundamental Research, Mumbai, India Telikepalli Kavitha ORCID Tata Institute of Fundamental Research, Mumbai, India
Abstract

Our input is an undirected weighted graph G=(V,E) on n vertices along with a source set SV. The problem is to preprocess G and build a compact data structure such that upon query Qu(s,v,f) where (s,v)S×V and f is any faulty edge, we can quickly find a good estimate (i.e., within a small multiplicative stretch) of the s-v distance in Gf. We use a fault-tolerant ST-distance oracle from the work of Bilò et al. (STACS 2018) to construct an S×V approximate distance oracle or sourcewise approximate distance oracle of size O~(|S|n+n3/2) with multiplicative stretch at most 5. We construct another fault-tolerant sourcewise approximate distance oracle of size O~(|S|n+n4/3) with multiplicative stretch at most 13. Both the oracles have O(1) query answering time.

Keywords and phrases:
Weighted graphs, approximate distances, fault-tolerant data structures
Copyright and License:
[Uncaptioned image] © Dipan Dey and Telikepalli Kavitha; licensed under Creative Commons License CC-BY 4.0
2012 ACM Subject Classification:
Theory of computation Design and analysis of algorithms
Funding:
Supported by the Department of Atomic Energy, Government of India, under project no. RTI4001.
Editors:
C. Aiswarya, Ruta Mehta, and Subhajit Roy

1 Introduction

The problem of computing distances between all pairs of vertices in a given graph G=(V,E) with positive edge weights is a fundamental problem in graph algorithms. The problem here is to preprocess G and build a compact data structure (called a distance oracle) that can quickly answer distance queries for any pair of vertices. As is often the case with real-world networks like routing networks or road networks, links may fail or roads may be temporarily blocked. Thus we have to allow for the case of faulty edges. Since several links are unlikely to fail simultaneously, we consider the case of a single edge failure.

Instead of recomputing distances from scratch for all pairs of vertices after an edge has failed, the problem is to build a resilient data structure that can answer distance queries between vertices after a single edge failure. Furthermore, we assume there is a specific set SV of sources, e.g., S is a set of starting locations on a road network or S is a set of source nodes in a routing network. Thus we are interested in distances only for pairs (s,v)S×V. Suppose |S|n, say O(nϵ) for some ϵ(0,1). Then it feels wasteful to build a fault-tolerant distance oracle that maintains distances for all pairs of vertices. Another application is Vickrey pricing [21], with the objective of determining, for every (s,v)S×V where S is a given subset of V and every edge f, how much the distance from s to v increases if f were to fail.

Thus the distance oracle has to process queries of the form (s,v,f) where s is the source, v is the destination, and f is the failed edge. Upon query Qu(s,v,f), the oracle has to return the s to v distance in Gf, where Gf is the graph obtained by removing edge f from G. So our problem is the following (as described below).

  • Preprocess G and build a compact data structure that can quickly answer distance queries for any pair in S×V when an edge fails. Hence our distance oracle has to answer queries Qu(s,v,f) where sS,vV, and f is the failed edge.

This data structure is called a single edge fault-tolerant sourcewise distance oracle. As discussed below, for undirected unweighted graphs, a single edge fault-tolerant sourcewise exact distance oracle of size O~(n3/2|S|) with O~(1) query time is known [20]. Our goal is to design more compact distance oracles (for sublinear sets S) in weighted graphs. Furthermore, for the sake of space efficiency, we are ready to relax exactness. Thus the problem we consider is to design a compact fault-tolerant sourcewise approximate distance oracle.

Recall that “sourcewise” captures the fact that we are interested in distances between pairs (s,v)S×V. For any pair (s,v)S×V and fE, let svf denote the distance from s to v in Gf. A fault-tolerant approximate distance oracle is said to have multiplicative stretch α if the distance dGf(s,v) returned by the oracle on query Qu(s,v,f) is sandwiched between the actual distance and α times the actual distance, i.e., svfdGf(s,v)αsvf. We show the following result.

Theorem 1.

Let G=(V,E) be an undirected graph on n vertices with positive edge weights. For any SV, a fault-tolerant sourcewise approximate distance oracle with multiplicative stretch at most 5 and size O~(|S|n+n3/2) can be constructed in polynomial time such that Qu(s,v,f) where (s,v)S×V and fE can be answered in constant time.

Note that our oracle has size O~(n3/2) when |S|=O(n). For smaller sets S, we show a sparser fault-tolerant sourcewise approximate distance oracle at the expense of a larger stretch. Its query answering time is also O(1).

Theorem 2.

Let G=(V,E) be an undirected graph on n vertices with positive edge weights. For any SV, a fault-tolerant sourcewise approximate distance oracle with multiplicative stretch at most 13 and size O~(|S|n+n4/3) can be constructed in polynomial time such that Qu(s,v,f) where (s,v)S×V and fE can be answered in constant time.

Thus the above oracle has size O~(n4/3) when |S|=O(n1/3). The work of Bilò, Gualà, Leucci, and Proietti on multiple-edge fault-tolerant approximate shortest path trees [11] in undirected weighted graphs with a single source (so |S|=1) implies a multiple-edge fault-tolerant sourcewise111Unfortunately, we were unaware of this work at the time of paper submission. We thank Manoj Gupta for bringing this paper to our attention. (so S is any subset of V) approximate distance oracle of size O~(|S|n) with a stretch of 3. Thus their oracle is sparser than our oracles when |S| is small and it also achieves a better stretch. However, our query answering time is O(1) while theirs is O(log2n), where n is the number of vertices. Our algorithms are truly simple while their techniques are quite involved.

As mentioned above, the problem of constructing fault-tolerant sourcewise exact distance oracles in undirected unweighted graphs has been studied earlier. Also, in undirected weighted graphs, the problem of constructing fault-tolerant single source (so |S|=1) exact distance oracles has been studied. We discuss these results below.

Background.

The first fault-tolerant exact distance oracle was designed by Demetrescu and Thorup in 2002 [14] and it was for directed weighted graphs. Their oracle handles single edge failures and has size O(n2logn) with O(1) query time. After this result, there has been a long line of research on the problem of efficiently constructing single edge/vertex fault-tolerant exact distance oracles. Ignoring preprocessing time, the most space-efficient oracle is by Duan and Zhang [19] with size O(n2) and query time O(1). Thus it shaves off the logn factor from the size of the original oracle.

Fault-tolerant sourcewise distance oracles. For undirected unweighted graphs, Gupta and Singh [20] designed a single edge fault-tolerant sourcewise exact distance oracle of size O~(n3/2|S|) with O~(1) query time and source set S. In undirected graphs with edge weights in the range {1,2,,M}, Bilò, Cohen, Friedrich and Schirneck [10] designed a fault-tolerant single source exact distance oracle. This oracle handles single edge failures and has size O~(n3/2M) with query time O~(1). For undirected unweighted graphs, Dey and Gupta [15] designed a different oracle with the same space and query time bounds as in [10], but with a faster preprocessing time.

ST-distance oracles in directed graphs. In directed weighted graphs, Bilò, Choudhary, Gualà, Leucci, Parter and Proietti [9] designed a fault-tolerant ST-distance oracle, i.e., it maintains exact distances for all pairs in S×T, for given vertex subsets S and T. It handles single edge failures and has size O~((|S|+|T|)n) with O(1) query time, where n is the number of vertices. They also designed a fault-tolerant ST-distance oracle in unweighted directed graphs of size O~(n|S||T|) with query time O(|S||T|). Furthermore, they showed a fault-tolerant ST-approximate distance oracle in directed unweighted graphs that returns in constant time a distance estimate stretched by an additive term. In particular, when |S|=O(n), their oracle has size O~(n3/2) and additive stretch O~(n).

Fault-tolerant approximate distance oracles.

Approximate distance oracles that provide distances within a small multiplicative stretch for all vertex pairs have been extensively studied. Table 1 summarizes results for fault-tolerant approximate distance oracles in directed/undirected graphs. Note that the stretch here is multiplicative, except for the last row where the stretch has an additive term as well.

Table 1: A table listing the works related to fault-tolerant approximate distance oracles where D is the diameter of the graph.
Graph Faults Stretch Size Query time Ref
Undirected Weighted c1 (8k2)(c+1), k1 integer O(ckn1+1/klog(nM)), M is the max edge wt O~(c) [13]
Undirected Unweighted c=1 (2k1)(1+ϵ), k1 integer and ϵ>0 O~(k5ϵ4n1+1/k) O(1) [2]
Undirected Weighted c=o(lognloglogn) (1+ϵ) O(n2(logD/ϵ)cclogD) O(c5logD) [12]
Directed Unweighted c2 (3+ϵ) O~(n2αc+1/ϵ)(logn/ϵ)c) where α(0,1/2) and ϵ>0 O(nα/ϵ2) [6]
Undirected Weighted c=o(lognloglogn) (2k1) where k1 integer O(n1+1k+α+o(1)) where α(0,1) O(n1+1kαk(c1)) [8]
Undirected Unweighted c=o(lognloglogn) (k+1k)(1+ϵ) with additive stretch of 2, k1 integer and ϵ0 O(n2γ(k+1)(c+1)+o(1)ϵc+2) where γ(0,k+12) O(nγ/ϵ2) [7]

For single edge faults, note that Chechik, Langberg, Peleg, and Roditty [13] showed an approximate distance oracle with stretch 12 and size O~(n3/2) and another with stretch 28 and size O~(n4/3). In comparison to this, Theorem 1 shows a sourcewise approximate distance oracle with stretch 5 and size O~(|S|n+n3/2) and Theorem 2 shows a sourcewise approximate distance oracle with stretch 13 and size O~(|S|n+n4/3). Thus for small sets S, our oracles are as sparse and have smaller stretch. Note that for single faults and every k1, approximate distance oracles by Baswana and Khanna [2] are almost as sparse as the oracles in [13] and have significantly smaller stretch. However these oracles work only for unweighted graphs.

It is an open problem if our construction can be generalized to work for all integers k, in other words, to show a sourcewise approximate distance oracle of size O~(|S|n+n1+1/k) and stretch 8k3 with O(1) query answering time for k3. Our results show such a construction for k=1,2. Note that the remaining approximate distance oracles in Table 1 have superconstant query time, so our oracles cannot directly be compared with them.

Our techniques.

Our algorithms are simple to describe and use the ST-distance oracle by Bilò, Choudhary, Gualà, Leucci, Parter and Proietti [9]. Their oracle uses landmark vertices, i.e., vertices picked uniformly at random from the vertex set V (originally used by Bernstein and Karger [4]).222To the best of our knowledge, the name “landmark” vertices was first used by Dey and Gupta [16]. The oracle in Theorem 1 uses this ST-distance oracle for the given source set S and T=S, where is our landmark vertex set. The oracle in Theorem 2 is based on the same idea, however there are two levels of sampling here: so we have two landmark vertex sets 21. Theorem 1 and Theorem 2 are proved in Section 3 and Section 4, respectively. We discuss preliminaries in Section 2 and conclude in Section 5.

2 Preliminaries

This section describes the notation that will be used in the rest of the paper and also gives a sketch of the ST-distance oracle from [9]. Our input is an undirected graph G=(V,E) with positive edge weights as given by 𝗐𝗍:E+. For any path ρ in G:

  • let ρ be the length of ρ, i.e., ρ=eρ𝗐𝗍(e);

  • let |ρ| be the hop length of ρ, i.e., the number of edges in ρ.

For any (u,v)V×V, a shortest path between u and v is a path of minimum length between u and v. We assume the shortest path between any two vertices in the graph is unique. This property can be achieved by random perturbation of the given edge weights (e.g., see [22]). The property of unique shortest paths was also used in [5, 17, 18, 20, 21]. We denote the shortest path from u to v by uv. Thus uv is the distance between u and v in G and |uv| is the hop length between u and v in G.

  • Let Gf=(V,E{f}) be the graph obtained after deleting edge f from the graph G. As in G, we assume there is a unique shortest path between any pair of vertices in Gf.

  • For any (u,v)V×V and fE, let uvf be the shortest path between u and v in Gf. So uvf is the distance between u and v in Gf.

The concept of landmark vertices will be key to our distance oracles.

Definition 3 (Landmark Vertex Set, ).

Sample each vertex in G independently with probability p. The selected set (call it ) of vertices is the landmark vertex set.

The probability p in Definition 3 will be set to different values in Section 3 and Section 4. The following proposition on the landmark vertex set will be very useful to us.

Proposition 4.

With high probability, for any pair of vertices u and v, if |uv|3lnnp then there is at least one landmark vertex on uv.

Proof.

Since each vertex in G is sampled independently with probability p, for any pair of vertices u and v, the probability that there is no landmark vertex on uv is (1p)k where k=|uv|+1 is the number of vertices on uv. Because |uv|3lnnp, the probability that there is no landmark vertex on uv is at most:

(1p)3lnnp(1e)3lnn1n3.

Thus for any pair of vertices u and v with |uv|3lnnp, the probability that there is no landmark vertex on uv is at most 1/n3. Hence the probability that there is some pair (x,y)V×V with |xy|3lnnp such that there is no landmark vertex on xy is at most (n2)/n31/n. Thus with probability at least 11/n, it is the case that for every pair (u,v)V×V with |uv|3lnnp, there is at least one landmark vertex on uv.

Since each vertex in G is sampled with probability p, the expected size of is np. We will set p=nδ for some δ(0,1) in Section 3 and Section 4. Thus with high probability, we will have ||2np (by Chernoff bound).

  • If either ||>2np or there exists a pair of vertices u,v with |uv|3lnnp such that there is no vertex of on uv then we will repeat the step of sampling vertices and construct another landmark vertex set such that both these properties hold for the set obtained.

Thus we will assume that ||=O(np) and every pair of vertices u,v with |uv|3lnnp has at least one vertex of on uv. The expected number of trials to obtain a desired landmark set is O(1).

Fault-Tolerant 𝑺𝑻-Distance Oracle.

We now briefly discuss the algorithm of Bilò et al.[9] to construct a fault-tolerant exact distance oracle in G for pairs (s,t)S×T, where SV and TV are part of the input. Fix a pair (s,t)S×T and let f=(a,b) be any edge on st. Let and be the two landmark vertices on st closest to a and b on as and bt, respectively.

There are 3 cases with respect to the replacement path stf: (i) stf goes through , (ii) stf goes through , (iii) stf goes through neither nor . Their algorithm builds tables to deal with each of these cases. Figure 1 captures the main idea.

Figure 1: The replacement path stf where f=(a,b) in case (i) is s followed by the magenta path tf; in case (ii) it is the blue path sf followed by t and in case (iii) stf avoids both and - so the orange path is part of stf.

The following theorem from [9] will be used in our algorithms.

Theorem 5 ([9]).

An n-vertex directed or undirected weighted graph G for given subsets S and T of V can be preprocessed in polynomial time to compute a data structure of size O((|S|+|T|)nlogn) that given any pair (s,t)S×T and any failing edge f can report stf in constant time.

3 Sourcewise Approximate Distance Oracle for a Single Edge Fault

Our input is an undirected weighted graph G=(V,E) with a positive weight function 𝗐𝗍:E+ and a subset SV of sources. The goal is to build a compact data structure that can answer distance queries Qu(s,v,f) within a small multiplicative stretch, where sS,vV and f is the edge fault.

Landmark vertex set 𝓛.

Recall Definition 3 on landmark vertices. Let us sample each vertex independently with probability p=(3lnn)/n to obtain our landmark vertex set . The following two properties hold (see Proposition 4); otherwise we resample to obtain another landmark vertex set so that the following two properties hold.

  • ||=O(nlogn).

  • For any pair of vertices u and v: if |uv|n, then there is at least one landmark vertex on uv.

Our algorithm.

On input G=(V,E) and SV, the first step of our algorithm is to build the above landmark vertex set . We then compute shortest path trees 𝒯(u) rooted at u for all uS. Along with every vertex v, the tree 𝒯(u) also has the two attributes uv and |uv|, i.e., the length and the hop length of uv.

Our algorithm constructs the ST-exact distance oracle from [9] fixing the source set S and destination set T=S. For each vV, let tv be the vertex in T that is closest to v, where ties are broken arbitrarily. We maintain the lengths of replacement paths vtvf for each edge fvtv. Our algorithm is described below.

  1. 1.

    Obtain the landmark vertex set .

  2. 2.

    For each uS do: compute the shortest path tree 𝒯(u) rooted at u in G=(V,E).

  3. 3.

    Use Theorem 5 to construct an ST-exact distance oracle for the given source set S and target set T=S in G=(V,E).

  4. 4.

    For every vV in the graph G=(V,E) do:

    • Identify the nearest vertex to v in the target set T=S. Call this vertex tv.

    • For 1i|vtv| do:

      • Let fi be the i-th edge from tv on vtv.

      • Compute the distance vtvfi between v and tv in Gfi.

      • Set 𝖣𝗂𝗌𝗍T[v,i]=vtvfi.

Query answering algorithm.

In response to the query Qu(s,v,f), the query answering algorithm first checks if fsv. This check can be done efficiently via LCA queries. Given a rooted tree 𝒯 and a pair of vertices x,y in the tree 𝒯, recall that 𝖫𝖢𝖠𝒯(x,y) is the least common ancestor of x and y in tree 𝒯.

Observe that f=(a,b)sv if and only if the answer to the following three questions is “yes” where 𝒯(s) is the shortest path tree in G rooted at s.

  • Is 𝖫𝖢𝖠𝒯(s)(v,a) equal to a?

  • Is 𝖫𝖢𝖠𝒯(s)(v,b) equal to b?

  • Is |sa|+1=|sb| or is |sb|+1=|sa|?

A “yes” answer to the first two questions implies that both a and b are vertices on the path sv. Moreover, a and b are adjacent to each other on sv if and only if the answer to the third question is “yes”. Recall that for any vertex w, |sw| is the hop length between s and w, i.e., the number of edges in sw.

Given a tree 𝒯, there is a linear time algorithm to build an O(n) size data structure such that LCA queries on 𝒯 can be answered in O(1) time [3]. Recall that for every vertex w, the hop length |sw| is stored along with w in 𝒯(s). Thus |sa| and |sb| can be retrieved in O(1) time. Hence the query answering algorithm can determine in O(1) time if fsv or not. The query answering algorithm will return sv if fsv (see Figure 2). Recall that the distance sv is also stored along with v in 𝒯(s).

Figure 2: Here f=(a,b)sv, so the path sv is undisturbed by the edge fault f.

If fsv, then the query answering algorithm looks up the identity of tv, which is the nearest vertex in T to v. Via LCA queries on 𝒯(tv), we can determine if f=(a,b)vtv or not. If not, then vtvf=vtv. So let us assume fvtv.

Assume without loss of generality that b is closer than a to tv, i.e., |atv|=|btv|+1 (see Figure 2). Let i be the index such that a is the i-th vertex from tv on vtv. Then vtv(a,b)=𝖣𝗂𝗌𝗍T[v,i]. Recall that the attribute i=|atv| is stored along with a in 𝒯(tv).

  • The query answering algorithm returns vtvf+stvf, where vtvf=𝖣𝗂𝗌𝗍T[v,i].

Note that the distance stvf is obtained by querying the ST-distance oracle. Thus the query answering time is O(1). We show below that vtvf+stvf5svf.

Lemma 6.

For any (s,v)S×V and fE, our algorithm returns an estimate for the s-v distance in Gf with a multiplicative stretch of at most 5 in constant time.

Proof.

It follows from the discussion above that the query answering time is O(1). Since the query answering algorithm will return sv if fsv (see Figure 2), let us assume fsv. Then the distance estimate returned by the query answering algorithm in response to query Qu(s,v,f) is stvf+vtvf where tv is the nearest vertex in T to v. We now bound the sum stvf+vtvf. Consider the following four cases.

  1. 1.

    fstv and fvtv. This means the edge f belongs to the shortest path between tv and the least common ancestor of s and v in 𝒯(tv) (see Figure 2). However then fsv, contradicting our assumption that fsv.

  2. 2.

    fstv and fvtv. The query answering algorithm will determine via LCA queries that fvtv, so vtvf=vtv. Let us bound stvf. The graph Gf has an s-tv path obtained by stitching the paths svf and vtv, i.e., the path svf followed by vtv. So stvfsvf+vtv. Hence the distance returned is at most svf+2vtv.

    • Because tv is the closest vertex in T=S to v, we have vtvvs. Hence the distance returned is at most svf+2sv3svf. So the stretch is at most 3 in this case.

  3. 3.

    fstv and fvtv. Since fstv, we have stvf=stv. Let us bound vtvf. Since the graph Gf has a vtv path obtained by stitching svf and stv, we have vtvfsvf+stv (see Figure 3). Thus the distance returned by the oracle is at most svf+2stv.

    • Observe that stvsv+vtv2sv. Hence the distance returned is at most svf+4sv5svf. Thus the stretch is at most 5 in this case.

    Figure 3: The oracle returns a distance estimate 2stv+svf, so the stretch is 5.
  4. 4.

    fstv and fvtv. The query answering algorithm will return stvf+vtvf=stv+vtv in this case. We have vtvvs and we also have stvsv+vtv2sv. Thus the stretch is at most 3 in this case.

This finishes the proof of the lemma.

Data structures constructed.

Our algorithm constructs in step 3 all the data structures constructed by the ST-distance oracle algorithm. Thus we have access to stf for every (s,t)S×T and fE. Our oracle also has the table 𝖣𝗂𝗌𝗍T that stores vtvf between v and tv in Gf, for each vV and edge fvtv. Let us bound the size of our oracle.

Lemma 7.

The size of the data structures constructed by our algorithm is O~(|S|n+n3/2).

Proof.

The size of T is O~(|S|+n). So the sizes of all the shortest path trees constructed in step 2 is O~(|S|n+n3/2). Similarly, the size of the ST-oracle constructed in step 2 is O~(|S|n+n3/2) (by Theorem 5). Furthermore, the data structure used to answer LCA queries on each shortest path tree 𝒯(u) has size O(n). Since uS, these data structures also take up space O~(|S|n+n3/2).

It follows from the property of our landmark set that for any vertex v, we have min{|v|:}n. So for each vertex v, we have vtvf stored for at most n many edges f, where tv is the nearest vertex in T to v. Thus the size of 𝖣𝗂𝗌𝗍T is O(nn)=O(n3/2). This finishes the proof of the lemma.

It is easy to see that our algorithm runs in expected polynomial time. Recall that we used randomization to construct the set . By blowing up the size of by a factor of logn, the construction of can be made deterministic (see [9, Lemma 1]). Thus Theorem 1 follows. We restate it below for convenience. See 1

 Remark 8.

The query answering algorithm can return not only an approximate estimate of the distance svf, but also the corresponding approximate shortest path in a succinct form. It is known that any replacement path ρf is 2-decomposable, i.e., it is a concatenation of at most 2 shortest paths interleaved with at most 1 edge [1].

So along with any distance stvf (similarly, vtvf), we could also store the corresponding replacement paths in 2-decomposable form. Thus the query answering algorithm can return the corresponding s-v approximate shortest path as the union of two replacement paths ρ1=stvf and ρ2=vtvf, each in 2-decomposable form, say, ρ1=s,x,y,tv and ρ2=v,x,y,tv. This will mean ρ1 is the shortest path in G between s and x followed by the edge (x,y), and the shortest path in G between y and tv, similarly for ρ2.

4 A Sparser Fault-Tolerant Sourcewise Approximate Distance Oracle

In this section, we present another fault-tolerant sourcewise approximate distance oracle for single edge faults. As before, the input is an undirected weighted graph G=(V,E) with a positive weight function 𝗐𝗍:E+ and a subset SV of sources. Our goal is to build a sparser data structure that can answer distance queries Qu(s,v,f) within a small multiplicative stretch. For sets S of size o(n), the oracle in this section will be sparser than the one in Section 3.

Our algorithm.

We will now construct two sets 1 and 2 of landmark vertices. We will first run the sampling step in Section 2 with p=(3lnn)/n1/3. Let 1 be the resulting landmark set. The following properties follow from Section 2 (see Proposition 4).

  • |1|=O(n2/3logn).

  • For any pair of vertices u and v: if |uv|n1/3 then there is at least one vertex of 1 on uv.

After that, we sample each vertex of 1 with probability 1/n1/3. Let 2 be the set of selected vertices. Observe that this 2-step sampling to obtain 2 is equivalent to running the sampling step in Section 2 with p=(3lnn)/n2/3 on the entire vertex set V. Thus the following properties follow from Section 2 (see Proposition 4).

  • |2|=O(n1/3logn).

  • For any pair of vertices u and v: if |uv|n2/3 then there is at least one vertex of 2 on uv.

Rather than sampling each vertex of V with probability p=(3lnn)/n2/3 to get 2, we did this in two steps so that we have 21. Let T1=1S and let T2=2S. Our algorithm will use the following notations for any vertex v.

  • Let tv be the vertex in T1 that is nearest to v.

  • Let tv be the vertex in T2 that is nearest to v.

For each uT2, we will keep the shortest path tree 𝒯(u) in G rooted at u. However, we cannot afford to keep shortest path trees rooted at each u1, since that would exceed the desired space bound. Corresponding to each u1, let 𝖡𝖺𝗅𝗅(u)={vV:tv=u} be the set of all vertices v that regard u as their nearest vertex in T1.

  • For each v𝖡𝖺𝗅𝗅(u), we will store the path uv.

  • Thus we keep a truncated shortest path tree 𝒯^(u)=v𝖡𝖺𝗅𝗅(u)uv in G rooted at u for each u1.

Along with each vertex v𝒯^(u) where u1, we also store |uv|, i.e., the hop length of uv, and the distance uv. Similarly, as done in Section 3, along with each vertex v𝒯(u), where uT2, we store |uv| and uv.

Below we describe the steps in our algorithm.

  1. 1.

    Obtain the landmark sets 1 and 2, where 21, as described above.

  2. 2.

    For each uT2=S2 do: compute the shortest path tree 𝒯(u) rooted at u in G.

  3. 3.

    Use Theorem 5 to construct an ST-exact distance oracle for the given source set S and target set T=T2 in G=(V,E).

  4. 4.

    For each u1 do: compute the truncated shortest path tree 𝒯^(u) rooted at u in G.

  5. 5.

    For every vV do:

    1. (a)

      Let tvT1=S1 be the vertex in T1 that is nearest to v.

    2. (b)

      For 1i|vtv| do:

      • Set 𝖣𝗂𝗌𝗍1[v,i]=vtvfi where fi is the i-th edge from tv on vtv.

  6. 6.

    For every uT1 do:

    1. (a)

      Let tuT2 be the vertex in T2 that is nearest to u.

    2. (b)

      For 1j|utu| do:

      • Set 𝖣𝗂𝗌𝗍2[u,j]=utufj where fj is the j-th edge from tu on utu.

Observe that the array 𝖣𝗂𝗌𝗍1[v,i] stores for any vertex v, the distance vtvf where f is the i-th edge from tv on the path vtv. Similarly, the array 𝖣𝗂𝗌𝗍2[u,j] stores for any vertex uT1, the distance utuf where f is the j-th edge from tu on the path utu.

The query answering algorithm.

In response to the query Qu(s,v,f), the query answering algorithm first checks if fsv. As described in Section 3, this is done by checking the answers to some LCA queries in 𝒯(s). Let us assume fsv, otherwise the query answering algorithm will return sv.

Then the query answering algorithm looks up x=tv and y=tx. In more detail, (i) x is the closest vertex to v in T1 and (ii) y is the closest vertex to x in T2. The query answering algorithm needs to know if fvx or not; if so, it also needs to know the index i{1,,n1/3} such that f is the i-th edge on xv. As described in Section 3, we can decide if fvx or not via LCA queries on the truncated shortest path tree 𝒯^(x). If so, we can also obtain from 𝒯^(x) the value i such that f is the i-th edge from x on xv.

Thus the query answering algorithm knows in O(1) time whether fxv or not and if so, the index i such that f is the i-th edge from x on xv.

  • If fvx then vxf=vx; else vxf=𝖣𝗂𝗌𝗍1[v,i].

Recall that we compute 𝒯(u) for all uT2. Thus, as described in Section 3, we can efficiently check if fxy or not; if so, the algorithm also knows the index j such that f is the j-th edge from y on the path xy.

  • If fxy then xyf=xy; else xyf=𝖣𝗂𝗌𝗍2[x,j].

Since sS and yT2 (recall that T=T2), the distance ysf is obtained by querying the ST-distance oracle. Thus the query answering algorithm can obtain vxf,xyf, and ysf in O(1) time. In response to the query Qu(s,v,f), the query answering algorithm returns vxf+xyf+ysf.

We will show in Lemma 9 that our s-v distance estimate in Gf is at most 13svf.

Lemma 9.

For any (s,v)S×V and fE, our algorithm returns an s-v distance estimate with stretch 13 in Gf in O(1) time.

Proof.

Suppose the query is Qu(s,v,f). If fsv then the algorithm returns sv, thus the stretch is 1 in this case. So assume fsv. Then the query answering algorithm returns vxf+xyf+ysf, where x=tv and y=tx. Let us bound the stretch.

We need to compare the sum vxf+xyf+ysf with svf. Let us first show the following claim.

Claim 10.

We have (i) vxvs, (ii) xy2vs, and (iii) ys4vs.

Proof.

It follows from the definition of T1=1S that both x and s are in T1. Since x is the nearest vertex in T1 to v, we have vxvs. Recall that y=tx. Since y is the closest vertex in T2 to x, we have xyxtv, i.e., the x-y distance is at most the distance between x and tv (recall that tv is the nearest vertex in T2 to v). Furthermore, xtvxv+vtv.

Observe that both xv and vtv are at most sv since sT1T2, so v’s distance to its nearest vertex in T1 and also in T2 is at most sv. Thus xy2sv. So we have yssv+vx+xysv+sv+2sv=4sv.

We are now ready to bound vxf+xyf+ysf. There are 8 cases depending on the presence of edge f on various shortest paths.

  1. 1.

    fvx and fxy and fys. Then the algorithm returns vx+xy+ys.

    • It immediately follows from Claim 10 that the s-v distance estimate returned in this case is at most 7sv7svf.

  2. 2.

    fvx and fxy and fys. Then the algorithm returns vx+xy+ysf. Since the failed edge f belongs to neither vx nor xy, we have syfsvf+vx+xy. We have vx+xy3sv (by Claim 10).

    • Thus the s-v distance estimate returned in this case is at most svf+6sv7svf (by Claim 10).

  3. 3.

    fvx and fxy and fys. Then the algorithm returns vx+xyf+ys. Observe that Gf has an x-y path of length at most xv+vsf+sy. Since ys4vs (by Claim 10), this x-y path in Gf is of length at most sv+svf+4sv=5sv+svf.

    • Using Claim 10 to bound vx and ys, the s-v distance estimate returned in this case is at most sv+5sv+svf+4sv=10sv+svf11svf.

  4. 4.

    fvx and fxy and fys. Then the algorithm returns vxf+xy+ys. Observe that Gf has a v-x path of length at most vsf+sy+yx. This is of length at most svf+4sv+2sv=6sv+svf.

    • Using Claim 10 to bound xy and ys, the s-v distance estimate returned in this case is at most svf+6sv+2sv+4sv=12sv+svf13svf.

  5. 5.

    fvx and fxy and fys. Consider the shortest path tree 𝒯(x) rooted at x in G. Since fvx and fxy, the edge fwy where w=𝖫𝖢𝖠𝒯(x)(v,y). But the edge f also belongs to ys and sv – this is not possible (see Figure 4). Thus this case cannot arise.

    Figure 4: The edge f=(a,b) is supposed to be in the paths sv,xy, and ys, but not in vx. Here w=𝖫𝖢𝖠𝒯(x)(v,y) where 𝒯(x) is the shortest path rooted at x in G.
  6. 6.

    fvx and fxy and fys. Consider the shortest path tree 𝒯(s) rooted at s in G and let z=𝖫𝖢𝖠𝒯(s)(v,x). Since fsv and fvx, it follows that fzv. Hence fsx. Thus there is a v-x path in Gf of length vsf+sx. Since sxsv+vx2sv, this v-x path in Gf has length svf+2sv.

    Since fsv and fsy, the edge fsr where r=𝖫𝖢𝖠𝒯(s)(v,y). Thus the edge f does not belong to the path v-r-y in 𝒯(s). Hence there is an s-y path in Gf of length svf+vs+sysvf+vs+4sv (by Claim 10).

    • Thus Gf has an s-y path of length svf+5vs, plus a v-x path of length svf+2vs. Since xy2vs, the s-v distance estimate returned in this case is at most 2svf+9vs11svf.

  7. 7.

    fvx and fxy and fys. As seen in case 6, there is a v-x path in Gf of length svf+2vs. Moreover, since fvx and fxy, the edge fxw where w=𝖫𝖢𝖠𝒯(x)(v,y). Hence the edge f does not belong to the path v-w-y in 𝒯(x).

    Thus there is a v-y path in Gf of length at most vx+xy3sv. Hence there is an x-v-y path of length at most svf+2vs+3sv=svf+5vs.

    • So the s-v distance estimate returned in this case is at most sy+(svf+5vs)+(svf+2vs). Since sy4sv, this is at most 2svf+11sv13svf.

  8. 8.

    fvx and fxy and fys. As seen in case 7, there is a v-x path in Gf of length svf+2vs and there is an x-y path of length at most svf+5vs. It also follows from case 7 that there is a v-y path in Gf of length at most 3sv, thus there is an s-y path in Gf of length at most svf+3vs.

    • So the s-v distance estimate returned in this case is at most 3svf+10sv13svf.

Thus the stretch of our approximate distance oracle is at most 13. We have already seen that the query answering time is O(1). This finishes the proof of the lemma.

Size of the oracle.

We show below in Lemma 11 that the space taken up by the data structures constructed in all the steps of our algorithm is O~(n4/3+|S|n).

Lemma 11.

The space needed to store all the data structures constructed by our algorithm is O~(n4/3+|S|n).

Proof.

The space taken up by the truncated shortest path trees 𝒯^(u) for all u1 is O(vV|vtv|). Observe that |vtv|n1/3 (by Proposition 4). Thus O(v|vtv|)=O(n4/3). Similarly the space taken up by 𝒯(t) for all tT2 is O(n4/3logn+|S|n) since |T2|=|2|+|S| and |2| is O(n1/3logn). The size of the ST-oracle (where T=T2) is also O~(n4/3+|S|n) (by Theorem 5).

For each vertex v, we store tv and tv – these are the nearest vertices to v in T1 and T2, respectively. For all edges fvtv, we store vtvf in the data structure 𝖣𝗂𝗌𝗍1. We have |vtv|n1/3 (by Proposition 4). Thus the space taken by the data structure 𝖣𝗂𝗌𝗍1 to store the distances 𝖣𝗂𝗌𝗍1[v,i] where vV and 1in1/3 is at most n4/3.

For all edges futu, where uT1, the data structure 𝖣𝗂𝗌𝗍2 stores utuf. For any vertex u, we have |utu|n2/3 (by Proposition 4). Thus the space taken by 𝖣𝗂𝗌𝗍2 to store the distances 𝖣𝗂𝗌𝗍2[u,i] where uT1 and 1in2/3 is |T1|n2/3=O((n2/3logn+|S|)n2/3), which is O(n4/3logn+|S|n2/3). Thus the entire space taken up by all the data structures is O~(n4/3+|S|n).

It is easy to see that our algorithm runs in expected polynomial time. As mentioned at the end of Section 3, by blowing up the sizes of 1 and 2 by a factor of logn, their construction can be made deterministic as stated in [9, Lemma 1]. Moreover, we can easily ensure that L2L1. Thus Theorem 2 follows. We restate it below for convenience. See 2

As mentioned in Remark 8, along with every distance in 𝖣𝗂𝗌𝗍1 and 𝖣𝗂𝗌𝗍2, we could also store the corresponding replacement paths in 2-decomposable form. Thus along with the approximate s-v distance, the query answering algorithm can also return the approximate path between s and v as the union of 3 paths, each in 2-decomposable form.

5 Concluding Remarks

Fault-tolerant approximate distance oracles that maintain approximate distances for all pairs of vertices have been well-studied. Fault-tolerant single source and multiple source exact distance oracles have also been studied. As mentioned in [9], given a subset SV, for the problem of storing svf where (s,v)S×V and fE is allowed333We thank a reviewer for pointing out this subtlety to us. (this is interpreted the same as if no edge has failed), using standard tools, it can be shown that there are n-vertex graph families, for which any representation that allows for the return of all the S×V post-failure distances must have size Ω(n3/2|S|). This motivates the study of sparser data structures that maintain approximate distances for all pairs in S×V under the failure of any fE. Such a data structure is a fault-tolerant sourcewise approximate distance oracle.

We showed two such oracles: one of size O~(|S|n+n3/2) and stretch 5 and another of size O~(|S|n+n4/3) and stretch 13. The query time for both oracles is constant. Upon query Qu(s,v,f) where fE, it turns out that both our query answering algorithms return the original distance sv as if no edge has failed. There are several interesting open problems:

  • Are there approximate sourcewise distance oracles of size O~(|S|n+n1+1/k) and stretch 8k3 with O(1) query answering time for all integers k1? Our constructions showed such oracles for k=1,2.

  • The study of fault-tolerant exact as well as approximate distance oracles has so far considered structured subsets of V×V such as S×T. Is there a sparse fault-tolerant exact or approximate distance oracle for an arbitrary subset 𝒫 of V×V?

References

  • [1] Yehuda Afek, Anat Bremler-Barr, Haim Kaplan, Edith Cohen, and Michael Merritt. Restoration by path concatenation: fast recovery of mpls paths. Distributed Computing, 15(4):273–283, 2002. doi:10.1007/s00446-002-0080-6.
  • [2] Surender Baswana and Neelesh Khanna. Approximate shortest paths avoiding a failed vertex: Near optimal data structures for undirected unweighted graphs. Algorithmica, 66(1):18–50, 2013. doi:10.1007/S00453-012-9621-Y.
  • [3] Michael A. Bender and Martin Farach-Colton. The LCA problem revisited. In Gaston H. Gonnet, Daniel Panario, and Alfredo Viola, editors, LATIN 2000: Theoretical Informatics, 4th Latin American Symposium, Punta del Este, Uruguay, April 10-14, 2000, Proceedings, volume 1776 of Lecture Notes in Computer Science, pages 88–94. Springer, 2000. doi:10.1007/10719839_9.
  • [4] Aaron Bernstein and David R. Karger. Improved distance sensitivity oracles via random sampling. In Shang-Hua Teng, editor, Proceedings of the Nineteenth Annual ACM-SIAM Symposium on Discrete Algorithms, SODA 2008, San Francisco, California, USA, January 20-22, 2008, pages 34–43. SIAM, 2008. URL: http://dl.acm.org/citation.cfm?id=1347082.1347087.
  • [5] Aaron Bernstein and David R. Karger. A nearly optimal oracle for avoiding failed vertices and edges. In Michael Mitzenmacher, editor, Proceedings of the 41st Annual ACM Symposium on Theory of Computing, STOC 2009, Bethesda, MD, USA, May 31 - June 2, 2009, pages 101–110. ACM, 2009. doi:10.1145/1536414.1536431.
  • [6] Davide Bilò, Shiri Chechik, Keerti Choudhary, Sarel Cohen, Tobias Friedrich, Simon Krogmann, and Martin Schirneck. Approximate distance sensitivity oracles in subquadratic space. TheoretiCS, 3, 2024. doi:10.46298/THEORETICS.24.15.
  • [7] Davide Bilò, Shiri Chechik, Keerti Choudhary, Sarel Cohen, Tobias Friedrich, and Martin Schirneck. Improved distance (sensitivity) oracles with subquadratic space. In 65th IEEE Annual Symposium on Foundations of Computer Science, FOCS 2024, Chicago, IL, USA, October 27-30, 2024, pages 1550–1558. IEEE, 2024. doi:10.1109/FOCS61266.2024.00097.
  • [8] Davide Bilò, Keerti Choudhary, Sarel Cohen, Tobias Friedrich, Simon Krogmann, and Martin Schirneck. Compact distance oracles with large sensitivity and low stretch. In Pat Morin and Subhash Suri, editors, Algorithms and Data Structures - 18th International Symposium, WADS 2023, Montreal, QC, Canada, July 31 - August 2, 2023, Proceedings, volume 14079 of Lecture Notes in Computer Science, pages 149–163. Springer, 2023. doi:10.1007/978-3-031-38906-1_11.
  • [9] Davide Bilò, Keerti Choudhary, Luciano Gualà, Stefano Leucci, Merav Parter, and Guido Proietti. Efficient oracles and routing schemes for replacement paths. In Rolf Niedermeier and Brigitte Vallée, editors, 35th Symposium on Theoretical Aspects of Computer Science, STACS 2018, February 28 to March 3, 2018, Caen, France, volume 96 of LIPIcs, pages 13:1–13:15. Schloss Dagstuhl – Leibniz-Zentrum für Informatik, 2018. doi:10.4230/LIPICS.STACS.2018.13.
  • [10] Davide Bilò, Sarel Cohen, Tobias Friedrich, and Martin Schirneck. Near-optimal deterministic single-source distance sensitivity oracles. In Petra Mutzel, Rasmus Pagh, and Grzegorz Herman, editors, 29th Annual European Symposium on Algorithms, ESA 2021, September 6-8, 2021, Lisbon, Portugal (Virtual Conference), volume 204 of LIPIcs, pages 18:1–18:17. Schloss Dagstuhl – Leibniz-Zentrum für Informatik, 2021. doi:10.4230/LIPICS.ESA.2021.18.
  • [11] Davide Bilò, Luciano Gualà, Stefano Leucci, and Guido Proietti. Multiple-edge-fault-tolerant approximate shortest-path trees. Algorithmica, 84(1):37–59, 2022. URL: https://doi.org/10.1007/s00453-021-00879-8, doi:10.1007/S00453-021-00879-8.
  • [12] Shiri Chechik, Sarel Cohen, Amos Fiat, and Haim Kaplan. (1 + ϵ)-approximate f-sensitive distance oracles. In Philip N. Klein, editor, Proceedings of the Twenty-Eighth Annual ACM-SIAM Symposium on Discrete Algorithms, SODA 2017, Barcelona, Spain, Hotel Porta Fira, January 16-19, pages 1479–1496. SIAM, 2017. doi:10.1137/1.9781611974782.96.
  • [13] Shiri Chechik, Michael Langberg, David Peleg, and Liam Roditty. f-sensitivity distance oracles and routing schemes. Algorithmica, 63(4):861–882, 2012. doi:10.1007/S00453-011-9543-0.
  • [14] Camil Demetrescu and Mikkel Thorup. Oracles for distances avoiding a link-failure. In Proceedings of the 13th Annual ACM-SIAM Symposium on Discrete Algorithms (SODA), pages 838–843, 2002. URL: http://dl.acm.org/citation.cfm?id=545381.545490.
  • [15] Dipan Dey and Manoj Gupta. Near optimal algorithm for fault tolerant distance oracle and single source replacement path problem. In Shiri Chechik, Gonzalo Navarro, Eva Rotenberg, and Grzegorz Herman, editors, 30th Annual European Symposium on Algorithms, ESA 2022, September 5-9, 2022, Berlin/Potsdam, Germany, volume 244 of LIPIcs, pages 42:1–42:18. Schloss Dagstuhl – Leibniz-Zentrum für Informatik, 2022. doi:10.4230/LIPICS.ESA.2022.42.
  • [16] Dipan Dey and Manoj Gupta. Near optimal dual fault tolerant distance oracle. In Timothy M. Chan, Johannes Fischer, John Iacono, and Grzegorz Herman, editors, 32nd Annual European Symposium on Algorithms, ESA 2024, September 2-4, 2024, Royal Holloway, London, United Kingdom, volume 308 of LIPIcs, pages 45:1–45:23. Schloss Dagstuhl – Leibniz-Zentrum für Informatik, 2024. doi:10.4230/LIPICS.ESA.2024.45.
  • [17] Dipan Dey and Manoj Gupta. Nearly optimal fault tolerant distance oracle. In Bojan Mohar, Igor Shinkar, and Ryan O’Donnell, editors, Proceedings of the 56th Annual ACM Symposium on Theory of Computing, STOC 2024, Vancouver, BC, Canada, June 24-28, 2024, pages 944–955. ACM, 2024. doi:10.1145/3618260.3649697.
  • [18] Ran Duan and Hanlin Ren. Maintaining exact distances under multiple edge failures. In Stefano Leonardi and Anupam Gupta, editors, STOC ’22: 54th Annual ACM SIGACT Symposium on Theory of Computing, Rome, Italy, June 20 - 24, 2022, pages 1093–1101. ACM, 2022. doi:10.1145/3519935.3520002.
  • [19] Ran Duan and Tianyi Zhang. Improved distance sensitivity oracles via tree partitioning. In Faith Ellen, Antonina Kolokolova, and Jörg-Rüdiger Sack, editors, Algorithms and Data Structures - 15th International Symposium, WADS 2017, St. John’s, NL, Canada, July 31 - August 2, 2017, Proceedings, volume 10389 of Lecture Notes in Computer Science, pages 349–360. Springer, 2017. doi:10.1007/978-3-319-62127-2_30.
  • [20] Manoj Gupta and Aditi Singh. Generic single edge fault tolerant exact distance oracle. In Ioannis Chatzigiannakis, Christos Kaklamanis, Dániel Marx, and Donald Sannella, editors, 45th International Colloquium on Automata, Languages, and Programming, ICALP 2018, July 9-13, 2018, Prague, Czech Republic, volume 107 of LIPIcs, pages 72:1–72:15. Schloss Dagstuhl – Leibniz-Zentrum für Informatik, 2018. doi:10.4230/LIPICS.ICALP.2018.72.
  • [21] John Hershberger and Subhash Suri. Vickrey prices and shortest paths: What is an edge worth? In 42nd Annual Symposium on Foundations of Computer Science, FOCS 2001, 14-17 October 2001, Las Vegas, Nevada, USA, pages 252–259. IEEE Computer Society, 2001. doi:10.1109/SFCS.2001.959899.
  • [22] Merav Parter and David Peleg. Sparse fault-tolerant BFS trees. In Hans L. Bodlaender and Giuseppe F. Italiano, editors, Algorithms - ESA 2013 - 21st Annual European Symposium, Sophia Antipolis, France, September 2-4, 2013. Proceedings, volume 8125 of Lecture Notes in Computer Science, pages 779–790. Springer, 2013. doi:10.1007/978-3-642-40450-4_66.