How fast can we reach a target vertex in stochastic temporal graphs?

Temporal graphs are used to abstractly model real-life networks that are inherently dynamic in nature. Given a static underlying graph $G=(V,E)$, a temporal graph on $G$ is a sequence of snapshots $G_t$, one for each time step $t\geq 1$. In this paper we study stochastic temporal graphs, i.e. stochastic processes $\mathcal{G}$ whose random variables are the snapshots of a temporal graph on $G$. A natural feature observed in various real-life scenarios is a memory effect in the appearance probabilities of particular edges; i.e. the probability an edge $e\in E$ appears at time step $t$ depends on its appearance (or absence) at the previous $k$ steps. In this paper we study the hierarchy of models memory-$k$, addressing this memory effect in an edge-centric network evolution: every edge of $G$ has its own independent probability distribution for its appearance over time. Clearly, for every $k\geq 1$, memory-$(k-1)$ is a special case of memory-$k$. We make a clear distinction between the values $k=0$ ("no memory") and $k\geq 1$ ("some memory"), as in some cases these models exhibit a fundamentally different computational behavior, as our results indicate. For every $k\geq 0$ we investigate the complexity of two naturally related, but fundamentally different, temporal path (journey) problems: MINIMUM ARRIVAL and BEST POLICY. In the first problem we are looking for the expected arrival time of a foremost journey between two designated vertices $s,y$. In the second one we are looking for the arrival time of the best policy for actually choosing a particular $s$-$y$ journey. We present a detailed investigation of the computational landscape of both problems for the different values of memory $k$. Among other results we prove that, surprisingly, MINIMUM ARRIVAL is strictly harder than BEST POLICY; in fact, for $k=0$, MINIMUM ARRIVAL is #P-hard while BEST POLICY is solvable in $O(n^2)$ time.


Introduction
Dynamic network analysis, i.e. analysis of networks that change over time, is currently one of the most active topics of research in network science and theory. A common task in this field is to use our prior knowledge of the network link dynamics to answer questions about the behavior of the network over time, e.g. how quickly information can flow through it. Many modern real-life networks are dynamic in nature, in the sense that the network structure undergoes discrete changes over time [30,36]. Here we deal with the discrete-time dynamicity of the network links (edges) over a fixed set of nodes (vertices). That is, given an underlying static graph G, the network evolution over G is given by the successive appearance or absence of each edge of G at every time step t = 1, 2, .... This concept of dynamic network evolution is given by temporal graphs [26,28], which are also known by other names such as evolving graphs.
A natural feature of stochastic temporal graphs which can be observed in various real-life scenarios (and which we address in this paper) is that the appearance probability of a particular edge at a given time step t depends on the appearance (or absence) of the same edge at the previous k ≥ 1 time steps. This "memory effect" can often be observed, among others, in faulty network communication and in mobile, social, and peer-to-peer networks [14,34,37]. Several other models of temporal networks which exhibit some sort of probabilistic behavior have been considered in the past, see e.g. [24].
In this paper, we study a hierarchy of models for stochastic temporal graphs. These models concern an edge-centric network evolution, i.e. they assign to every edge of the underlying graph G a probability distribution for its appearance over time, independently of all the other edges. The first and most basic model (memoryless or memory-0) assigns independently to every edge e a probability $p_e$ such that, at every time step, e appears with probability $p_e$. In the general model (memory-k), at every time step the appearance probability of every edge is a function of the history of its appearances/absences in the last k ≥ 1 time steps. Clearly, for every k ≥ 1, the memory-(k − 1) model is a special case of the memory-k model. However, in this paper we make a clear distinction between the values k = 0 ("no memory") and k ≥ 1 ("some memory"), as in some cases these models exhibit a fundamentally different computational behavior for these values of k, as our results indicate (see Section 4).
Our memory-k model, k ≥ 1, is a direct generalization of the homogeneous version of the memory-1 model that was introduced in a seminal paper by Clementi et al. [15], in which all edges have the same probability distribution for their appearance, based on their own appearance/absence at the previous step. In this homogeneous memory-1 model, Clementi et al. gave upper bounds for the flooding time and they provided tight characterizations of the graphs on which the flooding time is constant [15]. It is worth noting here that Avin et al. [6] studied the completely opposite extreme of our edge-centric evolution; namely they considered a graph-centric evolution model where a global probability distribution assigns specific transition probabilities among different snapshots [6]. Between the two extremes of the edge-centric and the graph-centric network evolution models, there exists a whole hierarchy of locally interdependent probabilistic patterns, i.e. probability distributions where the appearance probability of one edge also depends on the appearance of other edges over time; such models remain mostly unexplored.
In both our memoryless and memory-k variations of stochastic temporal graphs, we study two fundamental temporal path (i.e. journey) problems that are defined on two designated vertices s and y. Consider a piece of information that is generated at s at time 1, which we would like to send to y via an s-y journey. The arrival time of an s-y journey in a realization of a stochastic temporal graph is the time the information reaches y using this journey. A foremost s-y journey is one with the smallest arrival time. In the first part of the paper we investigate the complexity of computing the expected arrival time of a foremost s-y journey. Basu et al. [8] and Nain et al. [31] studied a similar problem but their work is restricted to the simpler cases where the underlying graph is either a path or a grid.
In the second part of the paper we investigate the complexity of computing the arrival time of a best policy for actually choosing a particular s-y journey in the stochastic temporal graph. To illustrate this notion of a best policy, assume that some piece of information is carried by an entity, say Alice. Alice is given as input the parameters of the stochastic temporal graph (i.e. the probabilistic rules on the edges) and, at every time step, she knows the current snapshot and her current location. Based on this information, Alice has to decide at every step for her next action, while her goal is to reach y as quickly as possible on expectation, starting at time 1. In a very inspiring paper, Basu et al. [7] consider this problem in the special case of the memoryless model where all edges have the same probability of appearance at every time, and give a Dijkstra-like polynomial-time algorithm. Special cases of the memory-1 model were considered in [10].
To illustrate the difference between the two problems we study, we make the following analogy. In the first problem (Minimum Arrival) we try to transfer information from s to y using an unbounded number of messages, i.e. we "flood" the stochastic temporal graph with information. Initially the information is stored at s at time 1 and then, at every step, every informed vertex informs all its neighbors as soon as the edge between them becomes available. In the second problem (Best Policy) we try to transfer a package with a tangible good from s to y. Now, at every step we need to decide for the actual route of the package through the network: when an edge appears, should we ship the package along it or rather wait where we currently are? Best Policy is more relevant to real-life applications than Minimum Arrival, where an actual good journey needs to be found in real time.
Our contribution. In the first part of the paper, in Section 3, we provide our results for the problem Minimum Arrival, i.e. for computing the expected arrival time of a foremost s-y journey in a stochastic temporal graph. First we prove in Section 3.1 that Minimum Arrival is #P-hard even for the memoryless model (and thus also for the memory-k model, for every k ≥ 1). The reduction is done from the problem #PP2DNF which counts the number of satisfying assignments in a positive partitioned 2-DNF Boolean formula [35].
Second, we provide in Section 3.2 a non-trivial approximation scheme for Minimum Arrival, based on dynamic programming, for the memoryless model in the case where the underlying graph G is a series-parallel graph. More specifically, it turns out that this is a Fully Polynomial-Time Approximation Scheme (FPTAS) whenever the probabilities $p_e$ are lower bounded by $1/n^c$ for some c ≥ 1. Let X be the random variable that expresses the arrival time of a foremost s-y journey. For every ε ∈ (0, 1], our FPTAS produces a value µ where E(X) − ε ≤ µ ≤ E(X), and runs in time polynomial in both n and $1/\varepsilon$. Although our main result of Section 3.2 concerns series-parallel graphs, we actually present a more general FPTAS approach (see Theorem 3) which is of independent interest and could lead to FPTASs also for more general classes of underlying graphs G.
Third, we present in Section 3.3 a Fully Polynomial Randomized Approximation Scheme (FPRAS) for Minimum Arrival in the memory-k model, for every k ≥ 0, under the assumption that every edge appearance probability is lower bounded by $1/n^c$ for some c ≥ 1. Let X be the random variable that expresses the arrival time of a foremost s-y journey. For every ε ∈ (0, 1), our FPRAS gives a randomized algorithm that produces an estimate $\tilde{X}$ where $(1 − \varepsilon)E(X) \le \tilde{X} \le (1 + \varepsilon)E(X)$ with probability tending to 1 as n → ∞, and runs in time polynomial in both n and $1/\varepsilon$.
In the second part of the paper, in Section 4, we provide our results for the problem Best Policy, i.e. for computing the expected arrival time of a best policy for choosing a particular s-y journey. Initially we provide in Section 4.1 a dynamic programming algorithm for the memoryless model which runs in $O(n^2)$ time and space. In sharp contrast, we prove in Section 4.2 that Best Policy becomes #P-hard for the memory-k model, where k ≥ 3, again by providing a reduction from the problem #PP2DNF. Finally, we provide in Section 4.3 a formulation of Best Policy in the memory-k model using the general Markov Decision Process (MDP) framework, which allows us to devise in Section 4.3.2 an exact doubly exponential-time algorithm with running time $O\left(2^{(kmn + n \log n) \cdot 2^{km}}\right)$.

Preliminaries
In this paper we consider temporal graphs (see Definition 1) in which the underlying (static) graph G = (V, E) has n vertices and m edges.
For simplicity of notation we denote [n] = {1, 2, ..., n} for every n ∈ N. Furthermore, sometimes we refer to the discrete time steps t = 1, 2, ... as days. Throughout the paper we consider stochastic temporal graphs that exhibit an edge-centric evolution, i.e. every edge e of G is assigned one probability distribution for its appearance over time, independently of all other edges. We investigate the case where there is a "memory effect" that governs the probability of appearance of every edge over time. We now distinguish the cases where the memory is zero or non-zero.
Memoryless (or memory-0) model. Every edge e ∈ E evolves stochastically and independently of other edges as follows: at every time step t ∈ N, e appears in $G_t$ with probability $p_e$ and is absent with probability $1 − p_e$, independently of any other time step. The numbers $\{p_e : e \in E\}$ are given parameters of the model. We denote this (memoryless) stochastic temporal graph by $\mathcal{G}^{(0)} = (G, \{p_e : e \in E\})$.
Memory-k model. This model of temporal graphs exhibits stochastic time-dependency of the edges: we assume an initial (arbitrary) sequence of k snapshots, $G_{-k+1}, \ldots, G_{-1}, G_0 \subseteq G$. At every time step t ≥ 1, every edge e appears independently of all other edges with probability that depends only on (the edge and) the history of appearance of e in the k previous snapshots. At every time step t, this history is a k-bit binary vector, where a 0-entry (resp. 1-entry) on the i-th position denotes absence (resp. appearance) of e in $E_{t-k+i-1}$, for i = 1, ..., k. Therefore the snapshot $G_t$ is the graph that appears at time t ≥ 1 as the result of the following experiment: given the history $H^{(k)}_e$ of the appearance of edge e ∈ E in the last k snapshots, e belongs to $E_t$ independently with probability $p_e(H^{(k)}_e)$. We denote the memory-k stochastic temporal graph by $\mathcal{G}^{(k)}$.
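The experiment that generates the appearances of a single edge in the memory-k model can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function name `sample_memory_k_edge`, the dictionary `prob_of_history` (standing in for the map from a k-bit history to the appearance probability) and the tuple encoding of histories (oldest bit first) are hypothetical and do not come from the paper. The memoryless model is the special case k = 0, where the only history is the empty tuple.

```python
import random


def sample_memory_k_edge(prob_of_history, initial_history, steps, rng):
    """Simulate one edge of a memory-k stochastic temporal graph.

    prob_of_history: dict mapping a k-tuple of 0/1 bits (oldest first) to the
        probability that the edge appears at the next time step (assumed encoding).
    initial_history: k-tuple describing the edge in the k initial snapshots.
    Returns the list of 0/1 appearances of the edge at time steps 1..steps.
    """
    history = tuple(initial_history)
    appearances = []
    for _ in range(steps):
        present = 1 if rng.random() < prob_of_history[history] else 0
        appearances.append(present)
        # Slide the window: drop the oldest bit, append the newest one.
        # For k = 0 the history stays the empty tuple.
        history = history[1:] + (present,) if history else ()
    return appearances
```

Since edges evolve independently, a whole snapshot sequence would be obtained by calling such a sampler once per edge.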
In the particular case where k = 1, the memory-1 stochastic temporal graph $\mathcal{G}^{(1)}$ is the sequence $\{G_t = (V, E_t) : t \in N\}$ of snapshots such that $E_t = \{e \in E : X^e_t = 1\}$, where $\{X^e_t\}_{t \in N}$ is a Markov chain for the edge e ∈ E with states {0, 1} (corresponding to non-appearance and appearance of e, respectively) and probability transition matrix (rows and columns indexed by the states 0, 1):
$$\begin{pmatrix} 1 - p_e & p_e \\ q_e & 1 - q_e \end{pmatrix}.$$
Using this formalism, $p_e$ (resp. $q_e$) is the probability that the edge e changes its current state from absence to appearance (resp. from appearance to absence) in the next snapshot. Note here that, setting $p_e = p$ and $q_e = q$ for every edge e, we obtain exactly the well-established edge-Markovian evolving graph model introduced by Clementi et al. [15].

The problems
This work studies two main problems, each under the models of stochastic temporal graphs defined above. To describe both of these problems, let us first recall that information in temporal graphs flows via journeys, i.e. temporal paths.

Definition 3 (Time-edge).
A time-edge in a temporal graph G = {G t : t ∈ N} is a pair (e, t) such that e ∈ E t .
Definition 4 (Journey / temporal path). Let G = {G t : t ∈ N} be a temporal graph and s, y be two vertices of G. An s-y journey (or an s-y temporal path) in G is a sequence (e 1 , t 1 ), . . . , (e x , t x ) of time-edges over a path (e 1 , . . . , e x ) in G, where t 1 < t 2 < . . . < t x . The arrival time of the journey is the time t x of appearance of its last edge.
Definition 5 (Foremost Journey). A foremost s-y journey in a temporal graph G is an s-y journey with minimum arrival time amongst all s-y journeys in G.
Notice that the arrival time of a foremost s-y journey in a stochastic temporal graph is a random variable, which we henceforth denote by X(s, y). The first problem that we study here is how to compute the expected value of the latter, namely E[X(s, y)].
Problem 1 (Minimum Arrival). Given a stochastic temporal graph on an underlying graph G = (V, E) and two distinct vertices s, y ∈ V , compute the expected value of the arrival time of a foremost s-y journey, i.e. E[X(s, y)].
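For a fixed realization of the snapshots, the arrival time of a foremost journey can be found by a simple flooding pass. The sketch below is an illustration, not the authors' procedure; the function name `foremost_arrival` and the representation of a realization as a list of edge lists are assumptions made here. Minimum Arrival asks for the expectation of this quantity over the random snapshots.

```python
def foremost_arrival(snapshots, s, y):
    """Arrival time of a foremost s-y journey, or None if y is never reached.

    snapshots[t - 1] is the collection of undirected edges (u, v) present at
    time step t (assumed representation of one realization).
    """
    informed = {s}
    for t, edges in enumerate(snapshots, start=1):
        # Only vertices informed before step t may forward along an edge at t,
        # because the time labels of a journey must be strictly increasing.
        newly_informed = set()
        for u, v in edges:
            if u in informed:
                newly_informed.add(v)
            if v in informed:
                newly_informed.add(u)
        informed |= newly_informed
        if y in informed:
            return t
    return None
```

Note that two edges of the same snapshot cannot be chained: in a snapshot containing both (a, b) and (b, c), a journey from a reaches only b at that step.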
Now suppose that an individual (say Alice) is at day 0 at vertex s and would like to arrive at vertex y through a temporal path as quickly as possible. Denote by s t the vertex where she is located at time t; then s 0 = s. Every day t Alice "wakes up" in the morning and looks at which edges are available in today's snapshot; by only knowing her current position, the history of the last k snapshots, and the input parameters of the stochastic temporal graph (i.e. the probabilistic rules of edge appearance), Alice needs to decide whether: (i) to stay at the vertex s t she currently is, or (ii) to use an edge of G t to move to a neighboring vertex.
That is, $s_{t+1}$ is either equal to $s_t$ or equal to some vertex of $\Gamma_{G_t}(s_t)$.
A natural problem we can study here is to compute the expected arrival time of an s-y journey that Alice can follow, using a best possible policy, i.e. a policy (sequence of actions) that minimizes her expected arrival time at y. Notice that the arrival time of the journey suggested to Alice by the best policy is a random variable Y(s, y), whose distribution depends on the specific stochastic temporal graph. In particular, in the memoryless model, the expectation of Y(s, y) depends only on the edges' probabilities of appearance. In the memory-k model, the expectation of Y(s, y) also depends on the initial snapshots $G_{-k+1}, \ldots, G_{-1}, G_0$.
Problem 2 (Best Policy). Given a stochastic temporal graph $\mathcal{G}^{(k)}$ on an underlying graph G = (V, E) and two distinct vertices s, y ∈ V, compute $E_{\mathcal{G}^{(k)}}[Y(s, y)]$.
In particular, we will write h(s, y) for this expected value.
Difference between the two problems. Before we proceed further, we first give an example illustrating that the problems Minimum Arrival and Best Policy are different. To demonstrate this, assume the memoryless model $\mathcal{G}^{(0)}$ and consider the 4-cycle a, b, c, d, a as the underlying graph. Let s = a and y = c and assume that, at any time step, each edge appears independently with probability 1/2. Any best policy for Alice will wait until an edge incident to a appears and then cross it; if both incident edges (a, b) and (a, d) appear at the same time, then it does not matter which one she chooses. The event "some edge incident to a appears" occurs with probability 3/4; hence, the expected time until such an edge appears is 4/3. Furthermore, when Alice reaches one of the vertices b or d, an optimal policy will never suggest going back to a, so Alice will have to wait until the last edge to c appears, which takes 2 steps on expectation. Overall, the optimal policy for Alice will take h(a, c) = 4/3 + 2 = 10/3 steps on expectation. This is the solution to Best Policy (see Problem 2).
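The arithmetic of this example can be checked with exact rational numbers. The snippet below is a small sketch that only reproduces the two waiting times described above; the helper name `expected_wait` is chosen here for illustration, and the code relies on the standard fact that the waiting time for an event of per-step probability q is geometric with mean 1/q.

```python
from fractions import Fraction


def expected_wait(q):
    """Expected number of steps until an event of per-step probability q first occurs."""
    return 1 / Fraction(q)


p = Fraction(1, 2)                 # appearance probability of every edge
leave_a = 1 - (1 - p) ** 2         # some edge incident to a appears: 3/4
h_ac = expected_wait(leave_a) + expected_wait(p)
print(h_ac)  # 10/3
```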
On the other hand, Minimum Arrival (see Problem 1) asks for the expectation of the arrival time X(a, c) of a foremost a-c journey. To compute E[X(a, c)], denote by $T_b$ (resp. $T_d$) the arrival time of a journey allowed to use only edges (a, b) and (b, c) (resp. (a, d) and (d, c)), when they appear. Then,
$$E[X(a, c)] = E[\min\{T_b, T_d\}] = \sum_{k \ge 0} \Pr[\min\{T_b, T_d\} > k].$$
But the probability of the event $\{T_b > k\}$ is equal to the probability that (a, b) does not appear until (and including) step k, plus the probability that it appears for the first time at some step j ≤ k and (b, c) does not appear after that until (and including) step k. Therefore,
$$\Pr[T_b > k] = 2^{-k} + \sum_{j=1}^{k} 2^{-j} \cdot 2^{-(k-j)} = \frac{k+1}{2^k},$$
and, by independence, for any k ≥ 0:
$$\Pr[\min\{T_b, T_d\} > k] = \Pr[T_b > k] \cdot \Pr[T_d > k] = \frac{(k+1)^2}{4^k}.$$
Summing over k gives $E[X(a, c)] = \sum_{k \ge 0} (k+1)^2 / 4^k = 80/27$, which is strictly smaller than 10/3.
In fact, the gap between the solution to Minimum Arrival and the solution to Best Policy can be arbitrarily large: consider the graph consisting of vertices s and y and n − 2 internally vertex-disjoint paths of length 2 between s and y. Assume also that, under the memoryless model, every edge incident to s appears each day with probability 1 and every edge incident to y appears each day independently with probability $n^{-0.9}$. Similarly to the above example, the expected arrival time of a best policy for Alice is $h(s, y) = 1 + n^{0.9}$. On the other hand, the arrival time of the foremost journey from s to y will be equal to the first day after day 1 on which some edge incident to y appears. But the time needed for the latter to happen follows the geometric distribution with success probability $1 − (1 − n^{-0.9})^{n-2} = 1 − o(1)$. Therefore, the expected arrival time of the foremost journey will be E[X(s, y)] = 2 + o(1), i.e. much smaller than $h(s, y) = 1 + n^{0.9}$.
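As a sanity check of the 4-cycle computation, the following sketch evaluates the tail probabilities of $T_b$ directly from the verbal description of the event $\{T_b > k\}$ and sums the resulting series with exact rationals. The function name `tail_tb` and the truncation at 200 terms are choices made here for illustration. Under this reading the series converges to 80/27, which is indeed below 10/3.

```python
from fractions import Fraction

HALF = Fraction(1, 2)


def tail_tb(k):
    """Pr[T_b > k] on the 4-cycle where every edge appears with probability 1/2."""
    # (a, b) is absent during all of the steps 1..k ...
    never_ab = HALF ** k
    # ... or (a, b) first appears at step j <= k and (b, c) is absent at steps j+1..k.
    ab_then_no_bc = sum(HALF ** j * HALF ** (k - j) for j in range(1, k + 1))
    return never_ab + ab_then_no_bc


# T_b and T_d use disjoint edges, so Pr[min(T_b, T_d) > k] = Pr[T_b > k] ** 2.
partial_sum = sum(tail_tb(k) ** 2 for k in range(200))
print(float(partial_sum))  # approximately 2.96296, i.e. 80/27
```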
As a final note, the expected arrival time E[X(s, y)] of the foremost s-y journey is always upper-bounded by the minimum among the expected values of the arrival times of all s-y journeys in the temporal graph. This is actually implied by a more general and well-known lemma in Probability Theory (Fatou's lemma [16, p. 29]) which establishes that the expected value of the minimum among n random variables is upper-bounded by the minimum among all the variables' expectations.
Computing the expected minimum arrival time

Hardness of exact computation in the memoryless model
In this section we show that, even in the memoryless model, Minimum Arrival is #P-hard in both undirected graphs and directed acyclic graphs (DAGs). In the proof of the following theorem, the edges can be treated either as oriented, in which case we obtain the result for DAGs, or as non-oriented, in which case we obtain the result for undirected graphs.
Proof. To prove the theorem we will provide a reduction from the #P-complete problem #PP2DNF [35]. The latter problem is defined as follows. Let $X = \{x_1, x_2, \ldots, x_n\}$ and $Y = \{y_1, y_2, \ldots, y_m\}$ be two disjoint sets of Boolean variables. A positive, partitioned 2-DNF formula is a DNF formula of the form
$$\Phi = \bigvee_{(i,j) \in E} (x_i \wedge y_j), \quad \text{where } E \subseteq [n] \times [m].$$
Given a positive, partitioned 2-DNF formula Φ, the problem #PP2DNF asks for the number of truth assignments satisfying Φ.
Let Φ be an instance of #PP2DNF. We define G to be a graph with the vertex set {s, y} ∪ X ∪ Y and the edge set $\{(s, x_i) \mid x_i \in X\} \cup \{(x_i, y_j) \mid (i, j) \in E\} \cup \{(y_j, y) \mid y_j \in Y\}$. First we claim that the number ψ of satisfying assignments of Φ is equal to the number of spanning subgraphs of G which contain all the edges from $\{(x_i, y_j) \mid (i, j) \in E\}$ and have a simple path from s to y of length 3. To see the claim, for every subset $S \subseteq \{(s, x_i) \mid x_i \in X\} \cup \{(y_j, y) \mid y_j \in Y\}$ we define a truth assignment α that assigns $x_i = 1$ iff $(s, x_i) \in S$ and $y_j = 1$ iff $(y_j, y) \in S$. Notice that every s-y path of length 3 in G is of the form $(s, x_i, y_j, y)$ for some (i, j) ∈ E. Therefore, if the subgraph spanned by S contains a path $(s, x_i, y_j, y)$, then α assigns 1 to both $x_i$ and $y_j$, and hence α satisfies Φ. Conversely, given an assignment α satisfying Φ, we define a subgraph of G spanned by the edge set $\{(s, x_i) \mid \alpha(x_i) = 1\} \cup \{(x_i, y_j) \mid (i, j) \in E\} \cup \{(y_j, y) \mid \alpha(y_j) = 1\}$. Since α is a satisfying assignment, there exists (i, j) ∈ E such that α assigns 1 to both $x_i$ and $y_j$, and therefore the subgraph contains the s-y path $(s, x_i, y_j, y)$ of length 3.
Now we define an instance of Minimum Arrival in the memoryless model as follows. Let H be the graph obtained from G by adding three new vertices $v_1, v_2, v_3$ and four new edges $(s, v_1), (v_1, v_2), (v_2, v_3), (v_3, y)$. For every edge $e \in \{(s, x_i) \mid x_i \in X\} \cup \{(y_j, y) \mid y_j \in Y\}$ we set $p_e = 1/2$, and for any other edge e of H we set $p_e = 1$. In this stochastic temporal graph the duration of a foremost journey from s to y is either 3, if for some (i, j) ∈ E the edge $(s, x_i)$ appears in time slot 1 and the edge $(y_j, y)$ appears in time slot 3, or 4 otherwise.
In other words, the duration of a foremost s-y journey depends only on the subgraph of G spanned by the edge set $R_1 \subseteq \{(s, x_i) \mid x_i \in X\}$ that appears in slot 1, and by the edge set $R_3 \subseteq \{(y_j, y) \mid y_j \in Y\}$ that appears in slot 3. The duration is equal to 3 if and only if the subgraph of G spanned by $R_1 \cup \{(x_i, y_j) \mid (i, j) \in E\} \cup R_3$ has an s-y path of length 3. Since every edge in $R_1 \cup R_3$ appears independently with probability 1/2, it follows that the probability that this subgraph has a path of length 3 is equal to $p = \psi / 2^{n+m}$. Consequently,
$$E[X(s, y)] = 3p + 4(1 − p) = 4 − \frac{\psi}{2^{n+m}},$$
and hence $\psi = 2^{n+m}(4 − E[X(s, y)])$. Therefore, knowing the expected duration E[X(s, y)] of an s-y foremost journey, we can efficiently compute the number of satisfying assignments of Φ, which proves that the computation of E[X(s, y)] is #P-hard.
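The identity between ψ and E[X(s, y)] can be checked by brute force on small formulas. The sketch below is only an illustration: it enumerates all assignments and all choices of the slot-1 and slot-3 edge sets, so it runs in exponential time. The function names and the 0-based encoding of Φ as a list `pairs` of index pairs (i, j) are assumptions made here.

```python
from fractions import Fraction
from itertools import product


def count_satisfying(n, m, pairs):
    """Number of assignments satisfying the OR over (i, j) in pairs of (x_i AND y_j)."""
    return sum(
        any(xs[i] and ys[j] for i, j in pairs)
        for xs in product((0, 1), repeat=n)
        for ys in product((0, 1), repeat=m)
    )


def expected_arrival(n, m, pairs):
    """E[X(s, y)] in the constructed instance, enumerating the slot-1 and slot-3 edges."""
    total = Fraction(0)
    for r1 in product((0, 1), repeat=n):      # r1[i] = 1 iff (s, x_i) appears in slot 1
        for r3 in product((0, 1), repeat=m):  # r3[j] = 1 iff (y_j, y) appears in slot 3
            arrival = 3 if any(r1[i] and r3[j] for i, j in pairs) else 4
            total += Fraction(arrival, 2 ** (n + m))
    return total


pairs = [(0, 0), (1, 1)]                      # Phi = (x_1 AND y_1) OR (x_2 AND y_2)
psi = count_satisfying(2, 2, pairs)
assert psi == 2 ** 4 * (4 - expected_arrival(2, 2, pairs))
```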
Corollary 1. For every k ≥ 0, Minimum Arrival in the memory-k model is #P-hard.

The case of paths
In this section we will consider a stochastic temporal graph $\mathcal{P}^{(0)} = (P = (V, E), \{p_e\})$ with the underlying graph being a path $P = (s = v_0, v_1, \ldots, v_n = y)$.
Consider a stochastic temporal graph with a single edge e which appears every day independently with probability $p_e$, and let $X_e$ be a random variable equal to the duration of the foremost journey from one of the endpoints of e to the other. Then $X_{\mathcal{P}^{(0)}}(s, y) = \sum_{e \in E} X_e$. Notice that $X_e$ is a geometric random variable with probability mass function $\Pr[X_e = i] = (1 − p_e)^{i-1} p_e$ for i = 1, 2, 3, ..., and expectation $E[X_e] = 1/p_e$. Let us denote by µ the expectation
$$\mu = E[X_{\mathcal{P}^{(0)}}(s, y)] = \sum_{e \in E} \frac{1}{p_e} = \sum_{i=1}^{\infty} \Pr[X_{\mathcal{P}^{(0)}}(s, y) \ge i]. \quad (1)$$
In the remainder of this section we will show that the first O(µ ln µ) terms of sum (1) already give a very good approximation of µ. In our analysis we will use the following bound.
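The claim that a prefix of the tail sum already approximates µ well can be explored numerically. The sketch below computes the exact distribution of a sum of independent geometric variables by convolution and then a truncated sum of the probabilities Pr[X ≥ i]. All function names are chosen here for illustration, and the cutoff is a free parameter of the sketch rather than the specific bound proved in the paper.

```python
from fractions import Fraction


def path_expected_arrival(probs):
    """mu = sum of 1 / p_e over the edges of the path (linearity of expectation)."""
    return sum(1 / Fraction(p) for p in probs)


def geometric_pmf(p, limit):
    """[Pr[X_e = i] for i = 0..limit] for a geometric variable supported on 1, 2, ..."""
    p = Fraction(p)
    return [Fraction(0)] + [(1 - p) ** (i - 1) * p for i in range(1, limit + 1)]


def convolve(a, b, limit):
    """Distribution of the sum of two independent variables, truncated at limit."""
    out = [Fraction(0)] * (limit + 1)
    for i, ai in enumerate(a):
        for j, bj in enumerate(b):
            if i + j <= limit:
                out[i + j] += ai * bj
    return out


def truncated_tail_sum(pmf, tau):
    """sum_{i=1}^{tau} Pr[X >= i]; requires tau < len(pmf) and pmf[0] == 0."""
    total, below = Fraction(0), Fraction(0)
    for i in range(1, tau + 1):
        total += 1 - below   # below equals Pr[X < i] at this point
        below += pmf[i]
    return total
```

For a path with two edges of probability 1/2 (so µ = 4), `truncated_tail_sum(convolve(geometric_pmf(0.5, 60), geometric_pmf(0.5, 60), 60), 60)` is below 4 and differs from it by a negligible amount.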
. . , n, are independent geometric random variables with parameters p 1 , p 2 , . . . , p n ∈ (0, 1], respectively. Proof. The equality in (2) follows from (1). In the rest of the proof we show the inequality. Since where we used the inequality e x ≥ 1 + x + x 2 /2 which holds for every x ≥ 0.

A general FPTAS approach
While deriving analytically and computing efficiently the exact solution of Minimum Arrival in a path is an easy task (cf. Lemma 1), it does not seem to be trivial for a slight generalization of paths, called parallel compositions of paths. A parallel composition of paths is the graph obtained from a collection of disjoint paths $P_1, P_2, \ldots, P_\ell$ with end vertices $s_i, y_i$, $i = 1, \ldots, \ell$, respectively, by identifying the vertices $s_1, s_2, \ldots, s_\ell$ in a single vertex s, and by identifying the vertices $y_1, y_2, \ldots, y_\ell$ in a single vertex y.
It is not clear whether there exists an efficient procedure for computing the expected arrival time from s to y in a parallel composition of paths, even if the parallel paths are of equal length and all the probabilities of edge appearance are the same. In this section we present a general approach for developing ε-additive approximation algorithms for computing the expected arrival time of a foremost journey in special classes of stochastic temporal graphs. In Section 3.2.3 we apply this approach to develop an efficient ε-additive approximation algorithm for the problem on the class of stochastic temporal graphs whose underlying graphs are series-parallel; this class generalizes both parallel compositions of paths and graphs in which all simple s-y paths are of the same length.
Throughout the section we denote by $\mathcal{G}^{(0)} = (G = (V, E), \{p_e\})$ a memoryless stochastic temporal graph with n vertices and m edges, and by s, y ∈ V two distinct vertices in G. Furthermore, we denote by H = (V, E, w) the weighted graph obtained from the underlying graph G by assigning to every edge e ∈ E the weight $w(e) = 1/p_e$.
Definition 6. Let G (0) be a memoryless stochastic temporal graph, where G is the underlying graph. A stochastic temporal subgraph H (0) of G (0) is a stochastic temporal graph which has a subgraph H ⊆ G as an underlying graph and inherits all edge appearance probabilities from G (0) .

Observation 1.
Let $\mathcal{H}^{(0)}$ be a stochastic temporal subgraph of the stochastic temporal graph $\mathcal{G}^{(0)}$. Then for every natural number i we have $\Pr[X_{\mathcal{G}^{(0)}}(s, y) \ge i] \le \Pr[X_{\mathcal{H}^{(0)}}(s, y) \ge i]$.
The following lemma is a direct consequence of Observation 1 and Lemma 1.
Consequently, if $f(\ell, n, m)$ is a polynomial in variables $\ell$, n, and m, then B is an FPTAS on the instance ($\mathcal{G}^{(0)}$, s, y).
Proof. Let $P = (s = v_0, v_1, \ldots, v_r = y)$ be a minimum weight s-y path in H, and let $\mathcal{P}^{(0)}$ be the stochastic temporal subgraph of $\mathcal{G}^{(0)}$ restricted to the vertices of P. For convenience, let us denote $e_i = v_{i-1}v_i$ for every i = 1, ..., r. Then, by definition and Lemma 1, the weight $w^*$ of P is equal to
$$w^* = \sum_{i=1}^{r} \frac{1}{p_{e_i}} = E[X_{\mathcal{P}^{(0)}}(s, y)].$$
Let $\tau := w^* \ln\frac{w^*}{\varepsilon} + 1$. Then, by Observation 1 and Lemma 2, we have that $\sum_{i=1}^{\tau} \Pr[X_{\mathcal{G}^{(0)}}(s, y) \ge i]$ approximates $E[X_{\mathcal{G}^{(0)}}(s, y)]$ within the additive factor of ε. Now we define the desired algorithm B as follows:
1. Construct the graph H and compute the minimum weight $w^*$ of an s-y path in H using Dijkstra's algorithm.
2. Using algorithm A, compute the probabilities $\Pr[X_{\mathcal{G}^{(0)}}(s, y) \ge i]$ for every i = 1, ..., τ.
3. Output the sum of the computed probabilities.
The above discussion implies that algorithm B correctly computes the declared approximation of $E[X_{\mathcal{G}^{(0)}}(s, y)]$. It remains to justify the time complexity. First, Dijkstra's algorithm can be implemented to work in time O(n ln n + m) [21]. Second, the assumption on the $p_e$'s implies that $w^* = O(n^{c+1})$, and hence $\tau = w^* \ln\frac{w^*}{\varepsilon} + 1 = O\left(n^{c+1} \ln\frac{n}{\varepsilon}\right)$. Therefore the assumption of the theorem implies that the last two steps of the algorithm run in time $O\left(f\left(n^{c+1} \ln\frac{n}{\varepsilon}, n, m\right)\right)$, which in turn implies the complexity bound and completes the proof.
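The overall shape of algorithm B can be sketched as follows. This is a hedged illustration rather than the authors' implementation: the names `min_weight_path` and `truncated_expectation` are hypothetical, vertices are assumed to be numbered 0..n-1, the cutoff `tau` is passed in by the caller, and the callable `tail_probability` stands in for the class-specific algorithm A that returns Pr[X ≥ i].

```python
import heapq


def min_weight_path(n, edges, s, y):
    """Minimum weight of an s-y path in H, where w(e) = 1 / p_e (Dijkstra).

    edges: iterable of (u, v, p_e) for undirected edges on vertices 0..n-1.
    """
    adj = [[] for _ in range(n)]
    for u, v, p in edges:
        adj[u].append((v, 1.0 / p))
        adj[v].append((u, 1.0 / p))
    dist = [float("inf")] * n
    dist[s] = 0.0
    heap = [(0.0, s)]
    while heap:
        d, u = heapq.heappop(heap)
        if d > dist[u]:
            continue  # stale heap entry
        for v, w in adj[u]:
            if d + w < dist[v]:
                dist[v] = d + w
                heapq.heappush(heap, (dist[v], v))
    return dist[y]


def truncated_expectation(tau, tail_probability):
    """Sum of Pr[X >= i] for i = 1..tau; tail_probability plays the role of algorithm A."""
    return sum(tail_probability(i) for i in range(1, tau + 1))
```

On the 4-cycle with all probabilities 1/2, `min_weight_path` returns 4.0, and feeding the tail probabilities of that example into `truncated_expectation` with a cutoff of a few dozen steps gives a value very close to the exact expectation.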

The FPTAS for stochastic temporal series-parallel graphs
In the present section we use the approach from Section 3.2.2 to derive a polynomial-time approximation scheme for stochastic temporal series-parallel graphs. According to Theorem 3, the development of such an algorithm reduces to the design of an efficient procedure of computing probabilities of the form Pr[X G (0) (s, y) ≥ i], which is the main goal of this section.
Let G be a graph and s and y be two distinct vertices in G. The triple (G, s, y) is a two-terminal series-parallel graph, with terminals s and y, if G can be constructed by a sequence of the following two operations starting from a set of copies of a single-edge two-terminal series-parallel graph (K 2 , a, b).
Finally, a graph G is called series-parallel if (G, s, y) is a two-terminal series-parallel graph for some pair of distinct vertices s and y of G.
The sequence of parallel and series compositions leading to a two-terminal series-parallel graph (G = (V, E), s, y) can be conveniently represented by a decomposition tree. A binary tree T = (V T , E T ) with a labeling function σ : V T → {s,p} ∪ E × {0, 1} is called a decomposition tree of a two-terminal seriesparallel graph (G, s, y) if and only if the leaves of T are labeled with elements of E × {0, 1} such that every e ∈ E appears in exactly one label, internal nodes are labeled with p or s, and G can be generated recursively using T as follows: If T is a single node v with σ(v) = (e, α), then G consists of the single edge e with the source being the vertex with the smallest ID, if α = 0, and with the source being the vertex with the largest ID, if α = 1. Otherwise, let T 1 (resp. T 2 ) be the right (resp. left) subtree of T and (H 1 , s 1 , y 1 ) and (H 2 , s 2 , y 2 ) be two-terminal series-parallel graphs with decomposition trees T 1 and T 2 : if σ(v) = p (resp. s) then G is the parallel (resp. series) composition of (H 1 , s 1 , y 1 ) and (H 2 , s 2 , y 2 ).
We will make use of tree decompositions of series-parallel graphs in our algorithm. It is known that a tree decomposition of a series-parallel graph can be constructed in linear time. Let G (0) = (G = (V, E), {p e }) be a stochastic temporal graph with the underlying graph G being series-parallel. Let also s, y ∈ V be two distinct vertices in G such that (G, s, y) is a two-terminal seriesparallel graph. We will present a dynamic programming algorithm which, for a given natural number , computes the set of probabilities: For convenience, the algorithm will also support the set of probabilities: Notice that having computed one of the sets of probabilities, the other set can be computed in O( 2 ) time.
The algorithm is based on the following recursive equations. Since (G, s, y) is a two-terminal seriesparallel graph, it is either a single-edge graph, or can be obtained from smaller two-terminal series-parallel graphs (H 1 , s 1 , y 1 ), (H 2 , s 2 , y 2 ) via one of the two composition operations.
1. In the case of a single-edge graph we have for every i ∈ [ − 1] that: where p is the probability of appearance of the unique edge of the graph.
2. In the case of parallel composition we have for every i ∈ [ ] that: where H 3. In the case of series composition, we have for every i ∈ [ − 1] that: Algorithm 1 Compute SP probabilities if (G, s, y) is the parallel composition of (H 1 , s 1 , y 1 ) and (H 2 , s 2 , y 2 ) then 10: for i = 1 to do 11: if (G, s, y) is the series composition of (H 1 , s 1 , y 1 ) and (H 2 , s 2 , y 2 ) then 14: for i = 1 to − 1 do Proof. We start with the analysis of the correctness of the algorithm. First, if the underlying graph of the input stochastic temporal graph is a single-edge graph, then the algorithm computes the required sets of probabilities in lines 2-4 using equations (5). Second, if the underlying graph is not a single-edge graph, then, by definition, (G, s, y) is either the parallel or the series composition of two two-terminal seriesparallel graphs (H 1 , s 1 , y 1 ) and (H 2 , s 2 , y 2 ) whose decomposition trees are the subtrees T H1 and T H2 of T G rooted at the children of the root of T G . In the case of parallel composition, the algorithm computes the sets of probabilities in lines 10-12 using equations (6). In the case of series composition, the algorithm computes the sets of probabilities in lines 14-16 using equations (7). In both cases, the computation of the probabilities uses only the corresponding sets of probabilities for the stochastic temporal subgraphs H In order to analyze the complexity of Algorithm 1, we observe that for every node of the decomposition tree of the underlying graph the algorithm makes exactly one recursive call. In each of the calls, the algorithm executes either lines 2-4, or lines 10-12, or lines 14-16. It is easy to check that each of these sets of lines performs O( 2 ) operations of addition or multiplication. Since T G is a binary tree and has exactly m leaves, in total T G has 2m − 1 nodes, and therefore the total time complexity of Algorithm 1 is O(m 2 ).
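The recursion over the decomposition tree can be sketched in Python as below. This is an illustration under assumptions, not a transcription of Algorithm 1: the tree is encoded as nested tuples `("edge", p)`, `("s", T1, T2)` and `("p", T1, T2)`, the function names are hypothetical, and the formulas used are the standard ones for independent parts (a geometric distribution for a single edge, a convolution for a series composition, and a product of tail probabilities for a parallel composition), which rely on memorylessness and on the two parts having disjoint edge sets.

```python
def tails(pmf):
    """[Pr[X >= i] for i = 0..len(pmf)] computed from [Pr[X = i] for i = 0..len(pmf)-1]."""
    out, below = [], 0.0
    for q in pmf:
        out.append(1.0 - below)   # Pr[X >= i], before adding Pr[X = i]
        below += q
    out.append(1.0 - below)
    return out


def sp_pmf(tree, limit):
    """[Pr[X = i] for i = 0..limit] for the two-terminal graph described by tree."""
    if tree[0] == "edge":
        p = tree[1]
        return [0.0] + [(1 - p) ** (i - 1) * p for i in range(1, limit + 1)]
    a = sp_pmf(tree[1], limit)
    b = sp_pmf(tree[2], limit)
    if tree[0] == "s":
        # Series: cross the first part in j steps, then the second part afresh.
        return [sum(a[j] * b[i - j] for j in range(1, i)) for i in range(limit + 1)]
    # Parallel: X >= i iff both parts need at least i steps.
    tail = [x * y for x, y in zip(tails(a), tails(b))]
    return [0.0] + [tail[i] - tail[i + 1] for i in range(1, limit + 1)]
```

Each tree node costs O(limit^2) arithmetic operations in this sketch, in line with the O(mℓ²) bound stated for Algorithm 1. For example, the 4-cycle from the earlier example is the parallel composition of two two-edge paths, `("p", two, two)` with `two = ("s", ("edge", 0.5), ("edge", 0.5))`.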
Finally we present an FPTAS for the expected arrival time of a foremost s-y journey in a stochastic temporal series-parallel graph.

Algorithm 2 FPTAS for Minimum Arrival in stochastic temporal series-parallel graphs
Input: A stochastic temporal series-parallel graph $G^{(0)} = (G = (V, E), \{p_e\})$ such that $p_e \ge \frac{1}{n^c}$ for every $e \in E$, and a number $\varepsilon \in (0, 1]$.
1: Let $H = (V, E, w)$ be the weighted graph obtained from the underlying graph $G$ by assigning to every edge $e \in E$ the weight $w(e) = \frac{1}{p_e}$
2: Compute the minimum weight $w^*$ of an $s$-$y$ path in $H$
3: Let $\tau = w^* \ln \frac{w^*}{\varepsilon} + 1$
4: Compute a decomposition tree $T$ of $(G, s, y)$
5: Compute SP probabilities$(G^{(0)}, s, y, T, \tau)$

The following theorem follows from the proof of Theorem 3 and Theorem 5.

The FPRAS for general graphs in the memory-k model, k ≥ 0
In this section, we present our FPRAS for Minimum Arrival in the memory-$k$ model, for every $k \ge 0$, under the assumption that every edge appearance probability is lower bounded by $\frac{1}{n^c}$ for some $c \ge 1$.

Proof. Let $e \in E$ be an arbitrary edge in the underlying graph and let $t \in \mathbb{N}$ be an arbitrary time point. Let $X_e^t$ be the random variable equal to the shortest time required to move from one of the end vertices of $e$ to the other starting at time $t$. Then, for any $a \in \mathbb{N}$, the event $X_e^t \ge a$ requires that $e$ does not appear at the corresponding time points $t + i$, and its probability is bounded through the factors $(1 - \Pr[e \text{ appears at time point } t + i])$, where the inequality follows from our assumed lower bound on edge appearance probabilities. Now, since an edge and time point were picked arbitrarily and any $s$-$y$ foremost journey traverses at most $n - 1$ edges, we conclude (1), that is, $\Pr[X(s, y) \ge a] \le \Pr[(n - 1)Z \ge a]$ for every $a \in \mathbb{N}$. Using (1), we then establish (2).

In the following theorem we provide our FPRAS for Minimum Arrival.
Theorem 7. Let $\varepsilon \in (0, 1)$ and let $G^{(k)}$ be a memory-$k$ stochastic temporal graph with two designated vertices $s, y$. Furthermore, let every edge appearance probability be at least $\frac{1}{n^c}$ for some $c \ge 1$. Then Minimum Arrival admits an FPRAS which runs in $O\left(\frac{m n^{5c+8}}{\varepsilon^4} \cdot \log\frac{n}{\varepsilon}\right)$ time with probability of success at least $1 - \frac{2}{n}$.

Proof. Let $G^{(k)}$ be a stochastic temporal graph with two designated vertices $s, y$. Furthermore let $X$, as before, be the arrival time of a foremost $s$-$y$ journey. We will estimate the expectation $E(X)$ via an unbiased estimator approach. We perform the following experiment Exp independently $r$ times; for now we assume an arbitrary value for $r$, to be chosen precisely later.

Algorithm 3 Experiment Exp
Input: A stochastic temporal graph $G^{(k)}$ on an underlying graph $G$ with $n$ vertices and $m$ edges and two designated vertices $s, y$ of $G$
1: Starting at time $t = 0$, let $G^{(k)}$ evolve until time $t = r n^{c+2}$; the resulting temporal graph has at most $tm$ time-edges
2: Run the foremost $s$-$y$ journey algorithm of [3] in this temporal graph
3: return the arrival time of the computed foremost journey

The probability that Exp fails to connect $s$ to $y$ via a journey is equal to the probability that $s$ is not connected to $y$ until time $t$. By Lemma 4, such a failure means that the time to connect $s$ to $y$ exceeds the expectation $E(X)$ of $X$ by a multiplicative factor of at least $rn$. By Markov's inequality, this probability of failure is at most $\frac{1}{rn}$. For now, we proceed with the analysis of the algorithm assuming that all experiments succeed, and we will take the probability of failure of some experiment(s) into account later on.
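Let $X_i$ be the random variable returned by the $i$th execution of the experiment Exp, where $i = 1, 2, \ldots, r$, and let $\widehat{X} = \frac{1}{r}(X_1 + \ldots + X_r)$ be the estimator for $X$. Note that $E[\widehat{X}] = E[X]$, meaning that $\widehat{X}$ is an unbiased estimator for $X$. Thus, it follows by Chebyshev's inequality that, for every $\varepsilon \in (0, 1)$:

$$\Pr\left[\,|\widehat{X} - E[X]| \ge \varepsilon E[X]\,\right] \le \frac{\sigma^2(\widehat{X})}{\varepsilon^2 E[X]^2}. \quad (8)$$

It holds that $\sigma(\widehat{X}) = \frac{\sigma(X)}{\sqrt{r}}$ (see [40, p. 297]), hence (8) becomes:

$$\Pr\left[\,|\widehat{X} - E[X]| \ge \varepsilon E[X]\,\right] \le \frac{\sigma^2(X)}{r \varepsilon^2 E[X]^2}. \quad (9)$$

We will now show that $\sigma(X)$ is upper bounded by a polynomial in $n$. Indeed, we have:

$$\sigma^2(X) = E[X^2] - E[X]^2 \le E[X^2]. \quad (10)$$

Now, consider any $s$-$y$ journey in $G^{(k)}$ and let $\ell$ be the number of its edges. Let $Y_1$ be the number of time steps needed until we cross the 1st edge of this journey, starting at time 0. Also let $Y_i$ be the number of time steps that are needed, after we cross the $(i - 1)$th edge, until we cross the $i$th edge of this journey, $i = 2, \ldots, \ell$. Then $X \le Y_1 + \ldots + Y_\ell$, and thus

$$E[X^2] \le E\left[(Y_1 + \ldots + Y_\ell)^2\right] \le \ell \sum_{i=1}^{\ell} E[Y_i^2].$$

Note that each $Y_i$ is dominated by a geometric random variable $Z$ with success probability $\frac{1}{n^c}$. Therefore, it follows by the properties of geometric random variables that $E(Y_i^2) \le E(Z^2) = 2n^{2c} - n^c \le 2n^{2c}$. So, since $\ell \le n - 1$, (10) becomes:

$$\sigma^2(X) \le 2 n^{2c+2}.$$

Going back to (9), and using $E[X] \ge 1$, it becomes:

$$\Pr\left[\,|\widehat{X} - E[X]| \ge \varepsilon E[X]\,\right] \le \frac{2 n^{2c+2}}{r \varepsilon^2}.$$

So, for a number $r = \frac{2 n^{2c+3}}{\varepsilon^2}$ of experiments, we get:

$$\Pr\left[\,|\widehat{X} - E[X]| \ge \varepsilon E[X]\,\right] \le \frac{1}{n},$$

meaning that performing the experiment Exp $r$ independent times yields, in polynomial time, a solution $\widehat{X}$ that is within a factor $(1 \pm \varepsilon)$ of $E[X]$ with probability at least $1 - \frac{1}{n}$. Let us call the probability that $\widehat{X}$ is far from $E[X]$ the "probability of failure of the estimator". Recall that there is also a chance of failure in our algorithm if any of the $r$ experiments fails. Therefore, by the union bound, the probability of failure of our FPRAS is at most $\frac{1}{n} + r \cdot \frac{1}{rn} = \frac{2}{n}$.

We execute the experiment Exp $r = \frac{2 n^{2c+3}}{\varepsilon^2}$ times. Each execution of the experiment runs in total for $t = \frac{2 n^{3c+5}}{\varepsilon^2}$ time; during this time we get at most $tm$ time-edges, i.e. the algorithm of [3] runs in $O(tm \log(tm)) = O\left(\frac{m n^{3c+5}}{\varepsilon^2} \cdot \log\frac{n}{\varepsilon}\right)$ time. Thus the total running time is $O\left(\frac{m n^{5c+8}}{\varepsilon^4} \cdot \log\frac{n}{\varepsilon}\right)$.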
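The sampling scheme of the proof can be sketched for the memoryless case ($k = 0$) as follows. This is only an illustration under stated assumptions: snapshots are sampled with independent edge probabilities, a journey crosses at most one edge per time step, and a simple reachability sweep stands in for the foremost journey algorithm of [3]. All function names are hypothetical.

```python
import math
import random


def fpras_parameters(n, c, eps):
    """Number of experiments r and horizon t = r * n^(c+2), as chosen in the proof."""
    r = math.ceil(2 * n ** (2 * c + 3) / eps ** 2)
    return r, r * n ** (c + 2)


def experiment(edges, p, s, y, horizon, rng):
    """One run of Exp (memoryless case); returns the foremost arrival time or None."""
    reached = {s}
    for t in range(1, horizon + 1):
        # Sample snapshot G_t: each edge appears independently with probability p[e].
        present = [e for e in edges if rng.random() < p[e]]
        newly = set()
        for u, v in present:
            # Only vertices reached before step t may be extended at step t.
            if u in reached:
                newly.add(v)
            if v in reached:
                newly.add(u)
        reached |= newly
        if y in reached:
            return t
    return None


def estimate_arrival(edges, p, s, y, runs, horizon, seed=0):
    """Average the arrival times of `runs` independent experiments."""
    rng = random.Random(seed)
    times = [experiment(edges, p, s, y, horizon, rng) for _ in range(runs)]
    if any(t is None for t in times):
        raise RuntimeError("an experiment failed to connect s to y within the horizon")
    return sum(times) / runs
```

In practice `runs` and `horizon` would be taken from `fpras_parameters(n, c, eps)`; they are kept as separate arguments so that small instances can be tried with smaller values.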

Computing the expected arrival time of a best policy
In this section we investigate the computational complexity of our second problem, namely Best Policy.

A polynomial-time algorithm for the memoryless model
In this section we focus on the memoryless model and we derive a polynomial-time dynamic-programming algorithm for Best Policy. We define for every vertex $v$ the expected arrival time $h(v, y)$ of the $v$-$y$ journey suggested to Alice by a best policy (i.e. when Alice starts her journey at vertex $v$). For simplicity of presentation, throughout Section 4.1 we write $h(v) \overset{\text{def}}{=} h(v, y)$. Assume for now that for all $v \in V$ the value $h(v)$ is given; let $v_1 = y, v_2, \ldots, v_n$ be an ordering of the vertices of $V$ in non-decreasing values of $h$ (ties broken arbitrarily), namely $h(v_1) \le h(v_2) \le \cdots \le h(v_n)$. Clearly, $v_1 = y$ and $h(v_1) = h(y) = 0$.
Let $s_t$ be the vertex that Alice occupies at time $t$, and recall that $\Gamma_{G_t}(v)$ is the neighborhood of vertex $v$ in the snapshot $G_t$, for all $v \in V$ and all $t \in \mathbb{N}$. Notice that the best strategy of Alice at time $t + 1$ is to look at all neighboring vertices of $s_t$ in $G_{t+1}$ and find one with minimum $h$-value, namely a vertex $u \in \arg\min\{h(v) : v \in \Gamma_{G_{t+1}}(s_t)\}$. If $h(u) \ge h(s_t)$, then Alice has no incentive to change vertex and thus $s_{t+1} = s_t$. Otherwise, if $h(u) < h(s_t)$, then $s_{t+1} = u$.
Therefore, to find the best choice for Alice, it suffices to find the values $h(v)$, $v \in V$. In view of the above, if Alice is on vertex $v_i$ at time 0 (i.e. she is on the $i$-th best vertex in terms of closeness to $y$), she will move to the $j$-th best (with $j < i$) only if an edge appears between $v_i$ and $v_j$ in the next step, and no edge to a vertex better than $v_j$ appears (i.e. no edge between $v_i$ and $v_\ell$, $1 \le \ell \le j - 1$). This happens with probability

$$Q_{i,j} = p_{\{v_i, v_j\}} \prod_{\ell=1}^{j-1} \left(1 - p_{\{v_i, v_\ell\}}\right).$$

Moreover, with probability $\prod_{\ell=1}^{i-1} (1 - p_{\{v_i, v_\ell\}})$ no edge to a vertex better than $v_i$ will appear, in which case Alice will stay on $v_i$. Therefore $h(v_i)$ can be recurrently computed by

$$h(v_i) = \sum_{j=1}^{i-1} Q_{i,j} \left(1 + h(v_j)\right) + \left(\prod_{\ell=1}^{i-1} \left(1 - p_{\{v_i, v_\ell\}}\right)\right) \left(1 + h(v_i)\right),$$

with initial condition $h(v_1) = 0$. Indeed, the above equation follows by observing that the expected length of the foremost journey to $y$ when Alice is on $v_i$ is equal to $1 + h(v_1)$ with probability $Q_{i,1}$ (which is the probability that an edge between $v_i$ and $v_1 = y$ exists), plus $1 + h(v_2)$ with probability $Q_{i,2}$ (which is the probability that an edge between $v_i$ and the second best vertex $v_2$ exists, but there is no edge between $v_i$ and $v_1$), and so on. In general, the above recurrence states that there is no incentive to visit vertices with larger index, and that Alice will visit the smallest-index vertex $v_j$ for which the edge $\{v_i, v_j\}$ is present (otherwise, if no such edge exists, she will stay on $v_i$). Using the above recurrence, we can compute all values of $h(v_i)$ by the following bottom-up dynamic programming algorithm:

u ← arg min
Append u to L
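A minimal sketch of such a bottom-up computation is given below. It is not a transcription of the algorithm above, whose listing survives only in fragments. It assumes a Dijkstra-like order: vertices are finalized in non-decreasing $h$-value, and each candidate value is obtained by solving the recurrence for $h(v)$ with the already finalized vertices playing the role of the better vertices. The function name and the edge encoding are hypothetical.

```python
import math


def best_policy_values(vertices, p, y):
    """Return (h, order): expected arrival times and vertices sorted by h.

    `p` maps frozenset({u, v}) to the appearance probability of that edge.
    """
    h = {y: 0.0}
    order = [y]
    remaining = [v for v in vertices if v != y]
    while remaining:
        best_v, best_val = None, math.inf
        for v in remaining:
            stay = 1.0  # probability that no edge from v to a finalized vertex appears
            acc = 1.0   # 1 + sum_j Q_j * h(u_j), scanning better vertices first
            for u in order:
                q = p.get(frozenset((v, u)), 0.0)
                acc += stay * q * h[u]
                stay *= 1.0 - q
            # Solve h(v) = acc + stay * h(v) for h(v).
            val = acc / (1.0 - stay) if stay < 1.0 else math.inf
            if val < best_val:
                best_v, best_val = v, val
        if best_v is None:
            # The remaining vertices cannot reach y at all.
            for v in remaining:
                h[v] = math.inf
            break
        h[best_v] = best_val
        order.append(best_v)
        remaining.remove(best_v)
    return h, order
```

As written, the sketch recomputes every candidate from scratch and therefore takes $O(n^3)$ time; updating the quantities `acc` and `stay` incrementally whenever a vertex is finalized would bring it down to $O(n^2)$.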

Hardness of computation for the memory-k model, k ≥ 3
We now show that Best Policy is #P-hard for memory-3 stochastic temporal graphs on directed acyclic graphs, and consequently also for every memory $k \ge 3$.

In the table of appearance probabilities, 1 means "edge exists" and 0 means "edge does not exist". At the end of each row there is a pair of numbers $(p, 1 - p)$ which denotes that, with the particular history of memory 3, at time step $i$ the edge appears with probability $p$ and it does not appear with probability $1 - p$. For simplicity of notation, in the column of time step $i$, we write "0" and "1" to denote the entries $(0, 1)$ and $(1, 0)$, respectively.
To complete the description of our memory-3 instance, we specify that, in the fictitious initialization snapshots $G_{-2}, G_{-1}, G_0$, each of the edges $(s, x_i)$ and $(y_j, y)$ appears with probability 0, 0, and 1, respectively, i.e. according to the first row of the above table.
The intuition of this table for the edges $(s, x_i)$ and $(y_j, y)$ is as follows. In the snapshot $G_1$, none of these edges appears (see the first row of the table). Then, to determine whether each of these edges appears at time step 2 (see the second row of the table), we need to toss an unbiased coin which with probability $\frac{1}{2}$ outputs "appear" and with probability $\frac{1}{2}$ outputs "does not appear". Once this coin has been tossed at time step 2, the status of the edge does not change any more in any subsequent time step $i \ge 3$. That is, if one of the edges $(s, x_i)$ and $(y_j, y)$ appears (resp. does not appear) at time 2, then it appears (resp. does not appear) at all times $i \ge 3$ too. This is easily verified by inspecting rows 3-7 of the table. Note that the last row of the table is included only for the sake of completeness, as it does not affect the appearance of any edge of $H$ at any time step $i$.
Let $\ell$ be the expected $s$-$y$ arrival time of the best policy in the memory-3 model. Note that, from the above construction of the temporal graph instance, each of the edges $(s, x_i)$ and $(y_j, y)$ appears with probability $\frac{1}{2}$ at all steps $i \ge 2$, while with probability $\frac{1}{2}$ it does not appear at any step $i \ge 2$. Therefore, the probability that there exists a directed temporal path $(s, x_i, y_j, y)$ is equal to $g = \frac{\psi}{2^{n+m}}$, where $\psi$ is the number of satisfying truth assignments of the DNF formula $\Phi$. That is, with probability $1 - g$, there exists no such temporal path from $s$ to $y$ with 3 edges through some vertices $x_i$ and $y_j$. Furthermore, the expected $s$-$y$ arrival time through the edges $(s, v)$ and $(v, y)$ is equal to $\frac{M}{2} + \frac{M}{2} = M$. Therefore, since with probability $1 - g$ any policy (also the best one) needs to travel from $s$ to $y$ through vertex $v$, it follows that $\ell \ge M(1 - g)$.
We now define the following policy: at time step 1 do nothing and just wait for the outcome of the random coin tosses which occur at time step 2. Subsequently, at time step 2 do the following: if there exists a directed temporal path $(s, x_i, y_j, y)$ then follow it, starting at time step 2; otherwise follow the temporal path $(s, v, y)$, which has an expected travel time $\frac{M}{2} + \frac{M}{2} = M$. The expected arrival time of this particular policy is equal to $1 + 3g + M(1 - g)$, and thus it follows that $\ell \le 1 + 3g + M(1 - g)$. Summarizing, we have:

$$M(1 - g) \le \ell \le 1 + 3g + M(1 - g).$$

The first inequality can be written as $2^{n+m} - \frac{\ell}{5} \le \psi$, while the second one can be written as $\left(1 - \frac{3}{5 \cdot 2^{n+m}}\right)\psi \le 2^{n+m} - \frac{\ell}{5} + \frac{1}{5}$. Therefore:

$$2^{n+m} - \frac{\ell}{5} \le \psi \le \frac{2^{n+m} - \frac{\ell}{5} + \frac{1}{5}}{1 - \frac{3}{5 \cdot 2^{n+m}}},$$

and thus, knowing the expected value $\ell$ for the best policy, we can derive the exact integer value for $\psi$ in the counting problem #PP2DNF. This completes the #P-hardness reduction.
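The last step of the reduction can be made concrete with a small sketch that recovers $\psi$ from the best-policy value. It assumes $M = 5 \cdot 2^{n+m}$, a value inferred here from the two rewritten inequalities rather than quoted from the construction, and it uses exact rational arithmetic. The function name is hypothetical.

```python
from fractions import Fraction
import math


def recover_psi(ell, n, m):
    """Recover psi from the best-policy value `ell`, assuming M = 5 * 2**(n + m)."""
    ell = Fraction(ell)
    size = 2 ** (n + m)
    # Lower bound on psi from ell >= M * (1 - g).
    lower = size - ell / 5
    # Upper bound on psi from ell <= 1 + 3g + M * (1 - g).
    upper = (lower + Fraction(1, 5)) / (1 - Fraction(3, 5 * size))
    psi = math.ceil(lower)
    if psi > upper:
        raise ValueError("no integer lies between the two bounds")
    return psi
```

For example, with $n + m = 3$ and $\psi = 3$ the value $\ell$ lies between $25$ and $27.125$, and both endpoints lead back to $\psi = 3$.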

An exact algorithm for the memory-k model, k ≥ 1
In this section we present a doubly exponential-time exact algorithm for computing the best policy for Alice in the memory-k model, where k ≥ 1. We first give a Markov Decision Process (MDP) formulation of our problem under the memory-k model that will be useful for the presentation of our results within the general MDP framework.

An MDP formulation
We follow the notation from Chapter 5.4 of [32] to give an MDP formulation. Let $k \ge 1$ be a fixed integer corresponding to the memory of the model. We denote by $I = V \times (2^G)^k$ the set of states, where $2^G$ denotes the set of subgraphs of the underlying graph $G$. In particular, each state $(v, H^{(k)}) \in I$ consists of a vertex $v$, which corresponds to the vertex Alice is on, and a sequence of $k$ graphs $H^{(k)}$ corresponding to the $k$ most recent snapshots. For any $t \ge 0$, we write $v_t$ and $H_t^{(k)}$ for Alice's vertex and for the sequence of the $k$ most recent snapshots at time $t$, respectively. The set of actions for Alice is the set $A = V$. A stationary policy for Alice is a function $f : I \to A$ and determines a probability law $\Pr_f$ for a Markov chain $(X_t)_{t \ge 0}$ with values in $I$ as follows:
(i) At time 0 Alice starts from vertex $s$ and the initial sequence of $k$ snapshots is $H_0^{(k)}$, so the chain starts in state $(s, H_0^{(k)})$.
(ii) For any $t \ge 0$, the next state is determined by the action prescribed by $f$ and by the random next snapshot $G_{t+1}$.

Without loss of generality, we will assume that every policy $f$ is legitimate in the sense that the following conditions hold:
A. Alice may visit $v_{t+1}$ in the next step only if $G_{t+1}$ has an edge that connects $v_t$ (the vertex she is currently on) and $v_{t+1}$ (the vertex she wants to go to).
B. Recalling that the goal of Alice is to reach $y$, we assume that $f(y, H^{(k)}) = y$ for any $H^{(k)}$, i.e. Alice will never leave her target vertex once she reaches it.
For simplicity, we will denote by $a_t$ Alice's $t$-th action (vertex choice). In particular, $a_0 = s$ and inductively $a_{t+1} = f(a_t, H_{t+1}^{(k)})$. To complete the specification of the Markov Decision Process, we assume that a constant cost $c((v, H^{(k)}), a) = 1$ is incurred when action $a$ is chosen in state $(v, H^{(k)})$ with $v \ne y$; otherwise $c((y, H^{(k)}), a) = 0$. Therefore, to every legitimate policy $f$ we can associate an expected total cost $h_f(a_0, y, H_0^{(k)})$ starting from state $(a_0, H_0^{(k)})$, given by $h_f(y, y, H^{(k)}) = 0$ for any $H^{(k)}$ and, for any $a_0 \ne y$ and any $H_0^{(k)}$, by equations (13)-(15). To be more clear, the expectations in equation (13) are over the random variables $G_1, G_2, \ldots$, while the expectation in equation (14) is over $G_2, G_3, \ldots$. Furthermore, equation (14) follows by conditioning on $G_1$.

Observation 2. Any policy $f$ guiding Alice from $s$ to $y$ must satisfy recurrence (15), with initial condition $h_f(y, y, H^{(k)}) = 0$ for every $H^{(k)}$.
Our objective is to find a policy that minimizes the expected total cost $h_f(a_0, y, H_0^{(k)})$. In particular, this policy will have the value $h^*(a_0, y, H_0^{(k)})$, which will be equal to the expected arrival time of a journey suggested to Alice by an optimal policy. In fact, without loss of generality we will assume that the $h^*$-values of an optimal policy satisfy $h^*(a_0, y, H_0^{(k)}) \ldots$

A doubly exponential-time algorithm
We now provide our doubly exponential-time algorithm for Best Policy in the memory-$k$ model, where $k \ge 1$. In order to simplify the notation and presentation of this section, we only provide the proof of the algorithm for the special case $k = 1$; the analysis for arbitrary $k \ge 1$ then carries over easily, as we discuss at the end of the section.
Following the notation of Section 4.3.1 for memory-1, we denote by $\mu(G' | G)$ the probability that the next snapshot is $G'$, given that the current snapshot is $G$. Furthermore, for $a_0 \in V$, let $h(a_0, y, G_0)$ be the expected arrival time of a journey from $a_0$ to $y$ suggested to Alice by an optimal policy, given that the starting graph instance is equal to $G_0$.
We define the following policy $\pi$: For any time step $t \ge 0$, if at time $t$ Alice was on a vertex $a_t$ and at time $t + 1$ the graph instance is $G_{t+1}$, then at time $t + 1$ she will move to a vertex $u \in \Gamma_{G_{t+1}}[a_t]$ that has minimum $h(u, y, G_{t+1})$, that is,

$$\pi(a_t, G_{t+1}) \overset{\text{def}}{=} a_{t+1} \in \arg\min\left\{h(u, y, G_{t+1}) : u \in \Gamma_{G_{t+1}}[a_t]\right\}.$$

By part (ii) of Theorem 5.4.3 of [32], the policy $\pi$ is optimal. Notice that in the definition of $\pi$ we assumed that the $h$-values are given. Therefore, to determine $\pi$ we need to compute $h(a_0, y, G_0)$, for every $a_0 \in V$ and $G_0 \subseteq G$. We start by rewriting recurrence (15) for policy $\pi$:

$$h_\pi(a_0, y, G_0) = 1 + \sum_{G_1} \mu(G_1 | G_0)\, h_\pi(\pi(a_0, G_1), y, G_1). \quad (17)$$
Since $\pi$ is optimal, the left-hand side of the above equation is equal to $h(a_0, y, G_0)$. Furthermore, by definition of $\pi(a_0, G_1)$,

$$h_\pi(\pi(a_0, G_1), y, G_1) = \min\left\{h(u, y, G_1) : u \in \Gamma_{G_1}[a_0]\right\}.$$

Therefore, recurrence (17) becomes

$$h(a_0, y, G_0) = 1 + \sum_{G_1} \mu(G_1 | G_0) \min\left\{h(u, y, G_1) : u \in \Gamma_{G_1}[a_0]\right\}. \quad (19)$$

Suppose that we know an ordering of the triplets $(a_0, y, G_0)$, $a_0 \in V$, $G_0 \subseteq G$, in increasing values of $h(a_0, y, G_0)$, breaking ties arbitrarily. Notice that these are $n 2^m \overset{\text{def}}{=} N$ values, where $m = |E|$ is the number of edges of $G$. Then the minimum in recurrence (19) can be replaced with the corresponding $h$-value, which is completely determined by the graph $G_1$ and the vertex $a_0$. Doing this for all different vertices $a_0$ and graphs $G_0$, we get a linear system with $N$ equations coming from (19) and as many variables (the $h$-values). To this system, we then add the initial conditions $h(y, y, G_0) = 0$, for all $G_0 \subseteq G$. This can be solved in $O(N^3)$ time.
Notice, however, that for the above approach to work we need an ordering of the triplets $(a_0, y, G_0)$ in increasing values of $h(a_0, y, G_0)$. We can therefore use the following brute-force algorithm: for each of the (at most) $N!$ orderings of the triplets $(a_0, y, G_0)$, $a_0 \in V$, $G_0 \subseteq G$, solve the linear system derived from recurrence (19) as described above, assuming the ordering is "correct", namely that it corresponds to an ordering in increasing values of $h(a_0, y, G_0)$. Then check whether the ordering obtained from the solution to that system is the same as the one we assumed. If not, consider a different ordering.
By definition of $\sigma^*$, the above set of constraints has at least one solution, namely the one corresponding to the $h$-values of an optimal policy. In Theorem 11 we prove that this is the only feasible solution. For the proof, we also need the following theorem from [32], which we restate here in our notation for convenience:

Theorem 10 (Policy increment, Theorem 5.4.4, [32]). Given a stationary policy $f$, let $\theta f$ denote the policy that, for every $a_0, G_0$, minimizes $\sum_{G_1} \mu(G_1 | G_0)\, h_f((\theta f)(a_0, G_1), y, G_1)$. Then, for all $a_0, G_0$,

$$\lim_{k \to \infty} h_{\theta^k f}(a_0, y, G_0) = h(a_0, y, G_0), \quad (24)$$

provided $E_{(a_0, G_0)}\, h_f(a_n, y, G_n) \to 0$ as $n \to \infty$.
In the above, the notation $\theta^k f$ means the application of policy increment $k$ times. Furthermore, in the expectation $E_{(a_0, G_0)}\, h_f(a_n, y, G_n)$, the state $(a_n, G_n)$ is a random variable and its distribution is determined by an optimal policy, given that we start at $(a_0, G_0)$. We note that the condition of the theorem holds in our case by transience of the underlying Markov chain (i.e. once Alice reaches $y$ she does not leave and no further cost is incurred after that).
Proof. We define the following policy $\pi^*$ (similar to the definition of $\pi$ earlier, but using the $h^*$-values instead of the $h$-values): For any time step $t \ge 0$, if at time $t$ Alice was on a vertex $a_t$ and at time $t + 1$ the graph instance is $G_{t+1}$, then at time $t + 1$ she will move to a vertex $u \in \Gamma_{G_{t+1}}[a_t]$ that has minimum $h^*(u, y, G_{t+1})$, that is,

$$\pi^*(a_t, G_{t+1}) \overset{\text{def}}{=} a_{t+1} \in \arg\min\left\{h^*(u, y, G_{t+1}) : u \in \Gamma_{G_{t+1}}[a_t]\right\}.$$
Therefore, by definition, $\pi^*$ itself is a policy that minimizes the above sum, and so we can take $\theta\pi^* = \pi^*$. Consequently, no improvement by increment is possible, implying that $\pi^*$ is optimal. In particular, $(h^*(a_0, y, G_0) : a_0 \in V, G_0 \subseteq G)$ is the same as $(h(a_0, y, G_0) : a_0 \in V, G_0 \subseteq G)$, and the proof is completed.
The set of constraints (20) to (23) has $N = n 2^m$ variables, namely $\{h(a_0, y, G_0) : a_0 \in V, G_0 \subseteq G\}$. Furthermore, there are $(n - 1) 2^m$ constraints of the form (20), at most $n^2 2^m$ constraints of the form (21), and $n 2^m$ non-negativity and initialization constraints, i.e. $O(nN)$ constraints in total. Therefore, Vaidya's algorithm for linear programming [38] can find an optimum solution in $O((nN)^{2.5})$ time. Since we need to solve this set of constraints for every possible ordering of the $N$ different triplets $(a_0, y, G_0)$, our brute-force approach runs in $O(N! \cdot (nN)^{2.5}) = O(N^N)$ time.
The above analysis for the memory-1 model carries over directly to the memory-$k$ model, for any $k \ge 1$. Indeed, the correctness proof can be adapted by replacing everywhere the subgraphs $G_t$ of $G$ by the length-$k$ histories $H_t^{(k)}$. Furthermore, the running time analysis carries over to the case of an arbitrary $k \ge 1$ by replacing $N = n 2^m$ by $N = n 2^{km}$. Summarizing, we obtain the following theorem.
Theorem 12. Let $k \ge 1$ and let $G^{(k)}$ be a stochastic temporal graph, where the underlying graph $G$ has $n$ vertices and $m$ edges. Then Best Policy can be solved on $G^{(k)}$ in $O\left(2^{(kmn + n \log n) \cdot 2^{km}}\right)$ time.
Remark 1. It is easy to see that the running time of the above brute-force algorithm is dominated by the number $N!$ of different orderings, and thus we have a doubly exponential algorithm (recall that $N = n 2^{km}$). A different approach that can potentially lead to a faster algorithm is to start from an arbitrary initial policy and successively apply policy increments as in Theorem 10. Even though the convergence analysis of such an approach is non-trivial, one could use it to find the optimal ordering $\sigma^*$ fast and then use $\sigma^*$ to find the unique solution to the set of constraints (20)-(23).
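For very small instances, recurrence (19) can also be explored numerically. The sketch below uses plain value iteration, which is neither the brute-force algorithm of this section nor the policy-increment scheme of Theorem 10. It assumes the memory-1 model with edges evolving independently, and a hypothetical encoding in which `appear[e]` is the pair (probability that $e$ appears if it was absent, probability that $e$ appears if it was present). It enumerates all $2^m$ subgraphs, so it is only meant for illustration.

```python
from itertools import combinations


def all_subgraphs(edges):
    """All subsets of the edge list, each as a frozenset of edges."""
    edges = list(edges)
    return [frozenset(c) for r in range(len(edges) + 1)
            for c in combinations(edges, r)]


def transition_prob(g_next, g_cur, edges, appear):
    """mu(g_next | g_cur) when every edge evolves independently with memory 1."""
    prob = 1.0
    for e in edges:
        p = appear[e][e in g_cur]  # index 0: absent before, index 1: present before
        prob *= p if e in g_next else 1.0 - p
    return prob


def closed_neighbourhood(v, g):
    """The vertex v together with its neighbours in the snapshot g."""
    nb = {v}
    for e in g:
        if v in e:
            nb |= set(e)
    return nb


def value_iteration(vertices, edges, appear, y, sweeps=200):
    """Iterate recurrence (19) starting from h = 0; returns h[(vertex, snapshot)]."""
    graphs = all_subgraphs(edges)
    h = {(v, g): 0.0 for v in vertices for g in graphs}
    for _ in range(sweeps):
        new_h = {}
        for v in vertices:
            for g0 in graphs:
                if v == y:
                    new_h[(v, g0)] = 0.0  # initial condition h(y, y, G_0) = 0
                    continue
                total = 1.0
                for g1 in graphs:
                    mu = transition_prob(g1, g0, edges, appear)
                    total += mu * min(h[(u, g1)]
                                      for u in closed_neighbourhood(v, g1))
                new_h[(v, g0)] = total
        h = new_h
    return h
```

Edges are encoded as `frozenset({u, v})`. The number of sweeps is a fixed parameter here rather than a convergence test, so the returned values are approximations of the $h$-values.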