Dynamic Relative Compression, Dynamic Partial Sums, and Substring Concatenation

Given a static reference string $R$ and a source string $S$, a relative compression of $S$ with respect to $R$ is an encoding of $S$ as a sequence of references to substrings of $R$. Relative compression schemes are a classic model of compression and have recently proved very successful for compressing highly-repetitive massive data sets such as genomes and web-data. We initiate the study of relative compression in a dynamic setting where the compressed source string $S$ is subject to edit operations. The goal is to maintain the compressed representation compactly, while supporting edits and allowing efficient random access to the (uncompressed) source string. We present new data structures that achieve optimal time for updates and queries while using space linear in the size of the optimal relative compression, for nearly all combinations of parameters. We also present solutions for restricted and extended sets of updates. To achieve these results, we revisit the dynamic partial sums problem and the substring concatenation problem. We present new optimal or near optimal bounds for these problems. Plugging in our new results we also immediately obtain new bounds for the string indexing for patterns with wildcards problem and the dynamic text and static pattern matching problem.


Introduction
Given a static reference string R and a source string S, a relative compression of S with respect to R is an encoding of S as a sequence of references to substrings of R. Relative compression (or external macro compression) is a classic model of compression defined by Storer and Szymanski [38,39] in 1978 and has since been used in a wide range of compression scenarios [5,9,21,26,27,29,30]. To compress massive highly-repetitive data sets, such as biological sequences and web collections, relative compression has been shown to be very practical [21,26,27].
Relative compression is often applied to compress multiple similar source strings. In such settings relative compression is superior to compressing the source strings individually. For instance, human genomes are 99% similar and hence relative compression might be used to compress a large collection of sequenced genomes using, e.g., the human reference genome as the static reference string. We focus on the case of compressing a single source string, but our results trivially generalize to compressing multiple source strings.
In this paper we initiate the study of relative compression in a dynamic setting, where the compressed source string S is subject to edit operations (insertions, deletions, and replacements of single characters). The goal is to maintain the compressed representation compactly, while supporting edits and allowing efficient random access to the (uncompressed) source string. Efficient data structures supporting these operations allow us to avoid costly recompression of massive data sets after updates.
We provide the first non-trivial bounds for this problem. We present new data structures that achieve optimal time for updates and queries while using space linear in the size of the optimal relative compression, for nearly all combinations of parameters. We also present solutions for restricted and extended sets of updates.
To achieve these results, we revisit the dynamic partial sums problem and the substring concatenation problem. We present new optimal or near optimal bounds for both of these problems (see detailed discussion below). Furthermore, plugging in our new results immediately leads to new bounds for the string indexing for patterns with wildcards problem [4,28] and the dynamic text and static pattern matching problem [2].

Dynamic Relative Compression
Given a reference string $R$ and a source string $S$, a relative compression of $S$ with respect to $R$ is a sequence $C = (i_1, j_1), \ldots, (i_{|C|}, j_{|C|})$ such that $S = R[i_1, j_1] \cdots R[i_{|C|}, j_{|C|}]$. We call $C$ a substring cover for $S$. The substring cover is optimal if $|C|$ is minimum over all relative compressions of $S$ with respect to $R$. The dynamic relative compression problem is to maintain a relative compression of $S$ under the following operations. Let $i$ be a position in $S$ and $\alpha$ be a character.
access($i$): Return the character $S[i]$.
replace($i, \alpha$): Change $S[i]$ to the character $\alpha$.
insert($i, \alpha$): Insert the character $\alpha$ before position $i$ in $S$.
delete($i$): Delete the character at position $i$ in $S$.
Note that the operations insert and delete change the length of $S$ by a single character. In all bounds below, the access($i$) operation extends to decompressing an arbitrary substring of length $\ell$ using only $O(\ell)$ additional time.
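As a small illustration of the definition, the following sketch expands a substring cover back into the string it encodes. The function name `decode` and the example strings are assumptions made for illustration only; blocks are written as 1-indexed, inclusive pairs to match the notation in the text.

```python
def decode(R, cover):
    """Expand a substring cover into the source string it encodes.

    Each block (i, j) is 1-indexed and inclusive, i.e. it denotes R[i..j].
    """
    return "".join(R[i - 1:j] for i, j in cover)


# Hypothetical example: R[1,4] = "ACGT" and R[6,8] = "CGA".
R = "ACGTACGA"
cover = [(1, 4), (6, 8)]
S = decode(R, cover)
```

Decompressing a substring of length $\ell$ amounts to the same concatenation restricted to the blocks that overlap the requested range.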
Our Results
Throughout the paper, let $r$ be the length of the reference string $R$, $N$ be the length of the (uncompressed) string $S$, and $n$ be the size of an optimal relative compression of $S$ with respect to $R$. All of the bounds mentioned below and presented in this paper hold for a standard unit-cost RAM with $w$-bit words with standard arithmetic and logical operations on a word. This means that the algorithms can be implemented directly in standard imperative programming languages such as C [25] or C++ [40]. An index into $R$ or $S$ can be stored in a single word and hence $w \ge \log(n + r)$.
Theorem 1. Let $R$ and $S$ be a reference and source string of lengths $r$ and $N$, respectively, and let $n$ be the length of the optimal substring cover of $S$ by $R$. Then, we can solve the dynamic relative compression problem supporting access, replace, insert, and delete (i) in $O(n + r)$ space and $O(\log n / \log\log n + \log\log r)$ time per operation, or (ii) in $O(n + r \log^{\epsilon} r)$ space and $O(\log n / \log\log n)$ time per operation, for any constant $\epsilon > 0$.
These are the first non-trivial bounds for the problem. Together, the bounds are optimal for most natural parameter combinations. In particular, any data structure for a string of length $N$ supporting access, insert, and delete must use $\Omega(\log N / \log\log N)$ time in the worst case regardless of the space [14] (this is called the list representation problem). Since $n \le N$, we can view $O(\log n / \log\log n)$ as a compressed version of the optimal time bound that is always $O(\log N / \log\log N)$ and better when $S$ is compressible. Hence, Theorem 1(i) provides a linear-space solution that achieves the compressed time bound except for an $O(\log\log r)$ additive term. Note that whenever $n \ge (\log r)^{\log^{\epsilon} \log r}$, for any $\epsilon > 0$, the $\log n / \log\log n$ term dominates the query time and we match the compressed time bound. Hence, Theorem 1(i) is only suboptimal in the special case when $n$ is almost exponentially smaller than $r$. In this case, we can use Theorem 1(ii) which always provides a solution achieving the compressed time bound at the cost of increasing the space to $O(n + r \log^{\epsilon} r)$.
We note that dynamic compression under different models of compression has been studied extensively [11-13, 18, 24, 32, 37]. However, all of these results require space dependent on the size of the original string and hence cannot take full advantage of highly-repetitive data.

Dynamic Partial Sums
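The partial sums problem is to maintain an array $Z[1..s]$ under the following operations.
sum($i$): Return $\sum_{j=1}^{i} Z[j]$.
update($i, \Delta$): Set $Z[i] = Z[i] + \Delta$.
search($t$): Return the smallest index $i$ such that sum($i$) $\ge t$.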
To ensure well-defined answers, we require that Z[i] ≥ 0 for all i.
The partial sums problem is a classic and well-studied problem [8,10,14,20,22,23,34,36]. In our context, we consider the problem in the word RAM model, where each array entry stores a $w$-bit integer and the elements of the array can be changed by $\delta$-bit integers, i.e., the argument $\Delta$ can be stored in $\delta$ bits. In this setting, Pȃtraşcu and Demaine [34] gave a linear-space data structure with $\Theta(\log s / \log(w/\delta))$ time per operation. They also gave a matching lower bound. We consider the following generalization supporting dynamic changes to the array. The dynamic partial sums problem is to additionally support the following operations.
insert($i, \Delta$): Insert a new entry with value $\Delta$ before $Z[i]$.
delete($i$): Delete the entry $Z[i]$.
merge($i$): Replace the entries $Z[i]$ and $Z[i+1]$ by a single entry with value $Z[i] + Z[i+1]$.
divide($i, t$): Replace the entry $Z[i]$ by two consecutive entries with values $t$ and $Z[i] - t$, where $0 \le t \le Z[i]$.
Hon et al. [20] and Navarro and Sadakane [33] presented optimal solutions for this problem in the case where the entries in $Z$ are at most polylogarithmic in $s$ (they did not explicitly consider the merge and divide operations).
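To make the intended semantics of the seven operations concrete, here is a deliberately naive reference sketch in which every operation takes $O(s)$ time. It is not the data structure of this paper. The class name `NaivePartialSums` is hypothetical, and the exact conventions (search returns the smallest $i$ with sum($i$) $\ge t$, insert places the new entry before $Z[i]$) are assumptions chosen to agree with the successor-based description of search used later in the text.

```python
class NaivePartialSums:
    """Reference semantics only: every operation takes O(s) time."""

    def __init__(self, values):
        self.Z = list(values)          # Z[1..s], stored 0-based internally

    def sum(self, i):
        return sum(self.Z[:i])         # Z[1] + ... + Z[i]

    def update(self, i, delta):
        self.Z[i - 1] += delta

    def search(self, t):
        # Smallest i with sum(i) >= t, or None if t exceeds the total.
        total = 0
        for i, z in enumerate(self.Z, start=1):
            total += z
            if total >= t:
                return i
        return None

    def insert(self, i, delta):
        self.Z.insert(i - 1, delta)    # new entry placed before Z[i]

    def delete(self, i):
        del self.Z[i - 1]

    def merge(self, i):
        # Replace Z[i], Z[i+1] by their sum.
        self.Z[i - 1:i + 1] = [self.Z[i - 1] + self.Z[i]]

    def divide(self, i, t):
        # Replace Z[i] by the two entries t and Z[i] - t.
        z = self.Z[i - 1]
        assert 0 <= t <= z
        self.Z[i - 1:i] = [t, z - t]
```

Note that merge and divide leave all prefix sums outside the affected entries unchanged, which is the property the later sections exploit.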

Our Results
We show the following improved result.
Theorem 2. Given an array of length $s$ storing $w$-bit integers and a parameter $\delta$, such that $\Delta < 2^{\delta}$, we can solve the dynamic partial sums problem supporting sum, update, search, insert, delete, merge, and divide in linear space and $O(\log s / \log(w/\delta))$ time per operation.
Note that this bound simultaneously matches the optimal time bound for the standard partial sums problem and supports storing arbitrary w-bit values in the entries of the array, i.e., the values we can handle in optimal time are exponentially larger than in the previous results.
To achieve our bounds we extend the static solution by Pȃtraşcu and Demaine [34]. Their solution is based on storing a sampled subset of representative elements of the array and difference-encoding the remaining elements. They pack multiple difference-encoded elements into words and then apply word-level parallelism to speed up the operations. To support insert and delete the main challenge is to maintain the representative elements that now dynamically move within the array. We show how to efficiently do this by combining a new representation of representative elements with a recent result by Pȃtraşcu and Thorup [35]. Along the way we also slightly simplify the original construction by Pȃtraşcu and Demaine [34].

Substring Concatenation
Let R be a string of length r. A substring concatenation query on R takes two pairs of indices (i, j) and (i ′ , j ′ ) and returns the start position in R of an occurrence of R[i, j]R[i ′ , j ′ ], or NO if the string is not a substring of R. The substring concatenation problem is to preprocess R into a data structure that supports substring concatenation queries.
Amir et al. [2] gave a solution using $O(r\sqrt{\log r})$ space with query time $O(\log\log r)$, and recently Gawrychowski et al. [16] showed how to solve the problem in $O(r \log r)$ space and $O(1)$ time.
Our Results
We give the following improved bounds.
Theorem 3. Given a string $R$ of length $r$, the substring concatenation problem can be solved in either (i) $O(r \log^{\epsilon} r)$ space and $O(1)$ time, for any constant $\epsilon > 0$, or (ii) $O(r)$ space and $O(\log\log r)$ time.
To achieve the bound in (i), the main idea is a new construction that efficiently combines a compact data structure for 1D range reporting [3] with the recent constant-time weighted level ancestor data structure for suffix trees [16]. The bound in (ii) follows as a simple implication of another recent result for unrooted LCP queries [4] by some of the authors.
The substring concatenation problem is a key component in several solutions to the string indexing for patterns with wildcards problem [4,6,28], where the goal is to preprocess a string $T$ to support pattern matching queries for patterns with wildcards. Plugging in Theorem 3(i) we immediately obtain the following new bound for the problem.
Corollary 1. Let $T$ be a string of length $t$. For any pattern string $P$ of length $p$ with $k$ wildcards, we can support pattern matching queries on $T$ using $O(t \log^{\epsilon} t)$ space and $O(p + \sigma^{k})$ time for any constant $\epsilon > 0$.
This improves the running time of the fastest linear-space solution by a factor $\log\log t$ at the cost of increasing the space slightly by a factor $\log^{\epsilon} t$. See [28] for a detailed overview of the known results.

Extensions
Finally, we present two extensions of the dynamic relative compression problem. The proofs of these extensions are included in the appendix.

Dynamic Relative Compression with Access and Replace
If we restrict the operations to access and replace we obtain the following improved bound.
Theorem 4. Let R and S be a reference and source string of lengths r and N , respectively, and let n be the length of the optimal substring cover of S by R. Then, we can solve the dynamic relative compression problem supporting access and replace in O(n + r) space and O(log log N ) expected time.
This version of dynamic relative compression is a key component in the dynamic text and static pattern matching problem, where the goal is to efficiently maintain a set of occurrences of a pattern $P$ in a text $T$ that is dynamically updated by changing individual characters. Let $p$ and $t$ denote the lengths of $P$ and $T$, respectively. Amir et al. [2] gave a data structure using $O(t + p\sqrt{\log p})$ space which supports updates in $O(\log\log p)$ time. The computational bottleneck in the update operation is to update a substring cover of size $O(p)$. Plugging in the bounds from Theorem 4, we immediately obtain the following improved bound.
Corollary 2. Given a pattern P and text T of lengths p and t, respectively, we can solve the dynamic text and static pattern matching problem in O(t + p) space and O(log log p) expected time per update.
Hence, we match the previous time bound while improving the space to linear.

Dynamic Relative Compression with Split and Concatenate
We also consider maintaining a set of compressed strings under split and concatenate operations (as in Alstrup et al. [1]). Let $R$ be a reference string and let $\mathcal{S} = \{S_1, \ldots, S_k\}$ be a set of strings compressed relative to $R$. In addition to access, replace, insert and delete we also define the following operations.
split($i, j$): Remove $S_i$ from $\mathcal{S}$ and add the two strings obtained by cutting $S_i$ at position $j$.
concat($i, j$): Add string $S_i \cdot S_j$ to $\mathcal{S}$ and remove $S_i$ and $S_j$.
We obtain the following bounds.
Theorem 5. Let $R$ be a reference string of length $r$, let $\mathcal{S} = \{S_1, \ldots, S_k\}$ be a set of source strings of total length $N$, and let $n$ be the total length of the optimal substring covers of the strings in $\mathcal{S}$. Then, we can solve the dynamic relative compression problem supporting access, replace, insert, delete, split, and concat,
Hence, compared to the bounds in Theorem 1 we only increase the time bounds by an additional $\log\log n$ factor.

Dynamic Relative Compression
In this section we show how Theorems 2 and 3 lead to Theorem 1. The proofs of Theorems 2 and 3 appear in Section 3 and Section 4, respectively.
Let $C = ((i_1, j_1), \ldots, (i_{|C|}, j_{|C|}))$ be the compressed representation of $S$. From now on, we refer to $C$ as the cover of $S$, and call each element $(i_l, j_l)$ in $C$ a block. Recall that a block $(i_l, j_l)$ refers to a substring $R[i_l, j_l]$ of $R$. A cover $C$ is maximal if concatenating any two consecutive blocks $(i_l, j_l), (i_{l+1}, j_{l+1})$ in $C$ yields a string that does not occur in $R$, i.e., the string $R[i_l, j_l] R[i_{l+1}, j_{l+1}]$ is not a substring of $R$. We need the following lemma.
Lemma 1. If $C_{\max}$ is a maximal cover and $C$ is an arbitrary cover of $S$, then $|C_{\max}| \le 2|C| - 1$.
Proof. In each block $b$ of $C$ there can start at most two blocks in $C_{\max}$, because otherwise two adjacent blocks in $C_{\max}$ would be entirely contained in the block $b$, contradicting the maximality of $C_{\max}$. Since the last blocks of both $C$ and $C_{\max}$ end at the last position of $S$, a contradiction of the maximality is already obtained when more than one block of $C_{\max}$ starts in the last block of $C$. Hence, $|C_{\max}| \le 2|C| - 1$.
Recall that $n$ is the size of an optimal cover of $S$ with respect to $R$. The lemma implies that we can maintain a compression of size at most $2n - 1$ by maintaining a maximal cover of $S$. The remainder of this section describes our data structure for maintaining and accessing such a cover.
Initially, we can use the suffix tree of R to construct a maximal cover of S in O(N + r) time by greedily matching the maximal prefix of the remaining part of S with any suffix of R. This guarantees that the blocks constitute a maximal cover of S.
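The greedy construction can be sketched as follows. This is a minimal illustration, not the $O(N + r)$ algorithm: it replaces the suffix tree with Python's `str.find`, so it is far slower, and the function name `greedy_cover` is hypothetical. Each block is the longest prefix of the remaining part of $S$ that occurs in $R$, which is what makes the resulting cover maximal.

```python
def greedy_cover(R, S):
    """Build a maximal cover of S over R by greedy longest-prefix matching.

    Blocks are returned as 1-indexed inclusive pairs (i, j) into R.
    Naive stand-in: str.find replaces the suffix tree traversal.
    """
    cover = []
    pos = 0
    while pos < len(S):
        start = R.find(S[pos])
        if start < 0:
            raise ValueError("S contains a character that does not occur in R")
        length = 1
        # Extend the match while the longer prefix still occurs somewhere in R.
        while pos + length < len(S):
            longer = R.find(S[pos:pos + length + 1])
            if longer < 0:
                break
            start, length = longer, length + 1
        cover.append((start + 1, start + length))
        pos += length
    return cover
```

If two consecutive blocks could be concatenated into a substring of $R$, the first block could have been extended by at least one character, contradicting the greedy choice.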

Data Structure
The high-level idea for supporting the operations on $S$ is to store the sequence of block lengths $j_1 - i_1 + 1, \ldots, j_{|C|} - i_{|C|} + 1$ in a dynamic partial sums data structure. This allows us, for example, to identify the block that encodes the $k$th character in $S$ by performing a search($k$) query.
Updates to S are implemented by splitting a block in C. This may break the maximality property so we use substring concatenation queries on R to detect if blocks can be merged. We only need a constant number of substring concatenation queries to restore maximality. To maintain the correct sequence of block lengths we use update, divide and merge operations on the dynamic partial sums data structure.
Our data structure consists of the string $R$, a substring concatenation data structure of Theorem 3 for $R$, a maximal cover $C$ for $S$ stored in a doubly linked list, and the dynamic partial sums data structure of Theorem 2 storing the block lengths of $C$. We also store auxiliary links between a block in the doubly linked list and the corresponding block length in the partial sums data structure, and a list of alphabet symbols in $R$ with the location of an occurrence for each symbol. By Lemma 1 and since $C$ is maximal we have $|C| \le 2n - 1 = O(n)$. Hence, the total space for $C$ and the partial sums data structure is $O(n)$. The space for $R$ is $O(r)$ and the space for the substring concatenation data structure is either $O(r)$ or $O(r \log^{\epsilon} r)$ depending on the choice in Theorem 3. Hence, in total we use either $O(n + r)$ or $O(n + r \log^{\epsilon} r)$ space.

Answering Queries
To answer access($i$) queries we first compute search($i$) in the dynamic partial sums structure to identify the block $b_l = (i_l, j_l)$ containing position $i$ in $S$. The local index in $R[i_l, j_l]$ of the $i$th character in $S$ is $\ell = i - \mathrm{sum}(l - 1)$, and thus the answer to the query is the character $R[i_l + \ell - 1]$.
We perform replace and delete by first identifying $b_l = (i_l, j_l)$ and $\ell$ as above. Then we partition $b_l$ into three new blocks $b_l^1 = (i_l, i_l + \ell - 2)$, $b_l^2 = (i_l + \ell - 1, i_l + \ell - 1)$, and $b_l^3 = (i_l + \ell, j_l)$, where $b_l^2$ is the single character block for index $i$ in $S$ that we must change. In replace we change $b_l^2$ to an index of an occurrence in $R$ of the new character (which we can find from the list of alphabet symbols), while we remove $b_l^2$ in delete. The new blocks and their neighbors, that is, $b_{l-1}$, $b_l^1$, $b_l^2$, $b_l^3$, and $b_{l+1}$, may now be non-maximal. To restore maximality we perform substring concatenation queries on each consecutive pair of these 5 blocks, and replace non-maximal blocks with merged maximal blocks. All other blocks are still maximal, since the strings obtained by concatenating $b_{l'}$ with $b_{l'+1}$, for all $l' < l - 1$ and all $l' > l$, were not present in $R$ before the change and are not present afterwards. A similar idea is used by Amir et al. [2]. We perform update, divide and merge operations to maintain the corresponding lengths in the dynamic partial sums data structure.
The insert operation is similar, but inserts a new single character block between two parts of $b_l$ before restoring maximality. Observe that using $\delta = O(1)$ bits in update is sufficient to maintain the correct block lengths.
In total, each operation requires a constant number of substring concatenation queries and dynamic partial sums operations; the latter having time complexity O(log n/ log(w/δ)) = O(log n/ log log n) as w ≥ log n and δ = O(1). Hence, the total time for each access, replace, insert, and delete operation is either O(log n/ log log n+log log r) or O(log n/ log log n) depending on the substring concatenation data structure used. In summary, this proves Theorem 1.
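The following sketch mirrors the update procedure with naive stand-ins, under loud assumptions: a Python list replaces the doubly linked list, a linear scan replaces the partial sums queries search and sum, and `str.find` replaces both the substring concatenation structure and the list of alphabet symbols. The class `NaiveDRC` and its helper names are hypothetical; only the control flow (isolate a single-character block, edit it, then re-merge a constant-size window of neighbours) follows the text.

```python
class NaiveDRC:
    """Maximal cover of S over R; blocks are 1-indexed inclusive pairs (i, j)."""

    def __init__(self, R, cover):
        self.R = R
        self.blocks = list(cover)

    def _str(self, blk):
        return self.R[blk[0] - 1:blk[1]]

    def text(self):
        return "".join(self._str(b) for b in self.blocks)

    def _locate(self, i):
        # Linear scan standing in for search(i) and sum(l - 1).
        before = 0
        for l, (a, b) in enumerate(self.blocks):
            if i <= before + (b - a + 1):
                return l, i - before          # block index, local index ell
            before += b - a + 1
        raise IndexError(i)

    def _symbol(self, c):
        p = self.R.find(c) + 1                # stand-in for the alphabet list
        if p == 0:
            raise ValueError("character does not occur in R")
        return (p, p)

    def _cut(self, l, p):
        # Split block l so that a block starts at R-position p; return its index.
        a, b = self.blocks[l]
        if p == a:
            return l
        self.blocks[l:l + 1] = [(a, p - 1), (p, b)]
        return l + 1

    def _isolate(self, i):
        # Make S[i] a single-character block (b^2 in the text); return its index.
        l, ell = self._locate(i)
        m = self._cut(l, self.blocks[l][0] + ell - 1)
        a, b = self.blocks[m]
        if a < b:
            self.blocks[m:m + 1] = [(a, a), (a + 1, b)]
        return m

    def _restore(self, lo, hi):
        # Merge neighbours in the window [lo, hi] whose concatenation occurs in R.
        k, hi = max(lo, 0), min(hi, len(self.blocks) - 1)
        while k < hi:
            joined = self._str(self.blocks[k]) + self._str(self.blocks[k + 1])
            at = self.R.find(joined)          # naive substring concatenation query
            if at >= 0:
                self.blocks[k:k + 2] = [(at + 1, at + len(joined))]
                hi -= 1
            else:
                k += 1

    def access(self, i):
        l, ell = self._locate(i)
        return self.R[self.blocks[l][0] + ell - 2]   # R[i_l + ell - 1], 1-indexed

    def replace(self, i, c):
        blk = self._symbol(c)
        m = self._isolate(i)
        self.blocks[m] = blk
        self._restore(m - 2, m + 2)

    def delete(self, i):
        m = self._isolate(i)
        del self.blocks[m]
        self._restore(m - 2, m + 1)

    def insert(self, i, c):
        blk = self._symbol(c)
        if i == len(self.text()) + 1:         # append after the last character
            m = len(self.blocks)
        else:
            l, ell = self._locate(i)
            m = self._cut(l, self.blocks[l][0] + ell - 1)
        self.blocks.insert(m, blk)
        self._restore(m - 2, m + 2)
```

Empty pieces of the three-way partition are simply not created here, which is why the restore window may occasionally include one extra neighbour; checking an already maximal pair is harmless. Only a constant number of pairs is examined per update, matching the constant number of substring concatenation queries in the analysis.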

Dynamic Partial Sums
In this section we prove Theorem 2. We support the operations insert($i, \Delta$) and delete($i$) on a sequence of $w$-bit integer keys by implementing them using update and a divide or merge operation, respectively. This means that we support inserting or deleting keys with value at most $2^{\delta}$.
We first solve the problem for small sequences. The general solution uses a standard reduction, storing Z at the leaves of a B-tree of large outdegree. We use the solution for small sequences to navigate in the internal nodes of the B-tree.
Dynamic Integer Sets
We need the following recent result due to Pȃtraşcu and Thorup [35] on maintaining a set of integer keys $X$ under insertions and deletions. The queries are as follows, where $q$ is an integer. The membership query member($q$) returns true if $q \in X$, predecessor $\mathrm{pred}_X(q)$ returns the largest key $x \in X$ where $x < q$, and successor $\mathrm{succ}_X(q)$ returns the smallest key $x \in X$ where $x \ge q$. The rank $\mathrm{rank}_X(q)$ returns the number of keys in $X$ smaller than $q$, and select($i$) returns the $i$th smallest key in $X$.
Lemma 2 (Pȃtraşcu and Thorup [35]). There is a data structure for maintaining a dynamic set of $w^{O(1)}$ $w$-bit integers that supports insert, delete, membership, predecessor, successor, rank and select in constant time per operation.
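The query semantics above can be pinned down with a simple sorted-list sketch. This is only an interface illustration with logarithmic searches and linear-time list updates, not the constant-time structure of Lemma 2; the class name `SmallIntSet` is hypothetical.

```python
import bisect


class SmallIntSet:
    """Sorted-list stand-in for the dynamic integer set interface."""

    def __init__(self):
        self.keys = []

    def insert(self, x):
        k = bisect.bisect_left(self.keys, x)
        if k == len(self.keys) or self.keys[k] != x:
            self.keys.insert(k, x)

    def delete(self, x):
        k = bisect.bisect_left(self.keys, x)
        if k < len(self.keys) and self.keys[k] == x:
            del self.keys[k]

    def member(self, q):
        k = bisect.bisect_left(self.keys, q)
        return k < len(self.keys) and self.keys[k] == q

    def pred(self, q):
        # Largest key strictly smaller than q, or None.
        k = bisect.bisect_left(self.keys, q)
        return self.keys[k - 1] if k > 0 else None

    def succ(self, q):
        # Smallest key greater than or equal to q, or None.
        k = bisect.bisect_left(self.keys, q)
        return self.keys[k] if k < len(self.keys) else None

    def rank(self, q):
        # Number of keys strictly smaller than q.
        return bisect.bisect_left(self.keys, q)

    def select(self, i):
        # i-th smallest key, 1-indexed.
        return self.keys[i - 1]
```

Note the asymmetry in the definitions: pred is strict while succ is not, so `succ(q)` returns `q` itself when `q` is in the set.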

Dynamic Partial Sums for Small Sequences
Let $Z$ be a sequence of at most $B \le w^{O(1)}$ integer keys. We will show how to store $Z$ in linear space such that all dynamic partial sums operations can be performed in constant time. We let $Y$ be the sequence of prefix sums of $Z$, defined such that each key $Y[i]$ is the sum of the first $i$ keys in $Z$, i.e., $Y[i] = \sum_{j=1}^{i} Z[j]$. Then sum($i$) $= Y[i]$, and search($t$) is the index of the successor of $t$ in $Y$. Our goal is to store and maintain a representation of $Y$ subject to the dynamic operations update, divide and merge in constant time per operation.

The Scheme by Pȃtraşcu and Demaine
We first review the solution to the static partial sums problem by Pȃtraşcu and Demaine [34], slightly simplified due to Lemma 2. Our dynamic solution builds on this.
The entire data structure is rebuilt every $B$ operations as follows. We first partition $Y$ greedily into runs. Two adjacent elements in $Y$ are in the same run if their difference is at most $B 2^{\delta}$, and we call the first element of each run a representative for all elements in the run. We use $R$ to denote the sequence of representative values in $Y$ and rep($i$) to be the index of the representative for element $Y[i]$ among the elements in $R$.
We store Y by splitting representatives and other elements into separate data structures: I and R store the representatives at the time of the last rebuild, while U stores each element in Y as an offset to its representative value as well as updates since the last rebuild. We ensure Y [i] = R[rep(i)] + U [i] for any i and can thus reconstruct the values of Y .
The representatives are stored as follows. $I$ is the sequence of indices in $Y$ of the representatives and $R$ is the sequence of representative values in $Y$. Both $I$ and $R$ are stored using the data structure of Lemma 2. We can then define $\mathrm{rep}(i) = \mathrm{rank}_I(\mathrm{pred}_I(i))$ as the index of the representative for $i$ among all representatives, and use $R[\mathrm{rep}(i)] = \mathrm{select}_R(\mathrm{rep}(i))$ to get the value of the representative for $i$.
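The decomposition performed at a rebuild can be sketched as follows. This is an illustration under assumptions: plain Python lists are used instead of the structures of Lemma 2 and instead of word packing, the function name `build` is hypothetical, and the list `reps` plays the role of the sequence $R$ of representative values (renamed to avoid confusion with the reference string).

```python
from itertools import accumulate


def build(Z, B, delta):
    """Split the prefix sums of Z into runs and difference-encode them.

    Returns (Y, I, reps, U): prefix sums, 1-based indices of the
    representatives, representative values, and per-element offsets.
    """
    Y = list(accumulate(Z))
    gap = B * 2 ** delta
    I, reps, U = [], [], []
    for idx, y in enumerate(Y):
        # A new run starts when the difference to the previous element exceeds B * 2^delta.
        if idx == 0 or y - Y[idx - 1] > gap:
            I.append(idx + 1)
            reps.append(y)
        U.append(y - reps[-1])     # offset to the representative of the current run
    return Y, I, reps, U
```

For example, with the hypothetical input `Z = [5, 1, 2, 40, 3, 1, 50, 2]`, `B = 8` and `delta = 1`, the threshold is 16 and the prefix sums split into three runs, each element satisfying $Y[i] = R[\mathrm{rep}(i)] + U[i]$.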
We store in $U$ the current difference from each element to its representative (i.e., updates between rebuilds are applied to $U$). The idea is to pack $U$ into a single word of $B$ elements. Observe that update($i, \Delta$) adds value $\Delta$ to all elements in $Y$ with index at least $i$. We can support this operation in constant time by adding to $U$ a word that encodes $\Delta$ for those elements. Since each difference between adjacent elements in a run is at most $B 2^{\delta}$ and $|Y| = O(B)$, the maximum value in $U$ after a rebuild is $O(B^2 2^{\delta})$. As $B$ updates of size $2^{\delta}$ may be applied before a rebuild, the changed value at each element due to updates is $O(B 2^{\delta})$. So each element in $U$ requires $O(\log B + \delta)$ bits (including an overflow bit per element). Thus, $U$ requires $O(B(\log B + \delta))$ bits in total and can be packed in a single word for $B = O(\min\{w/\log w, w/\delta\})$.
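The word-level update can be illustrated with Python integers simulating a machine word. The helper names `pack`, `unpack` and `suffix_add`, the field width `f`, and the restriction to non-negative $\Delta$ without field overflow are assumptions of this sketch; on a word RAM the all-ones pattern would be a precomputed constant rather than the result of a division.

```python
def pack(values, f):
    """Pack non-negative integers into one integer using f-bit fields."""
    word = 0
    for k, v in enumerate(values):
        assert 0 <= v < (1 << f)
        word |= v << (k * f)
    return word


def unpack(word, count, f):
    mask = (1 << f) - 1
    return [(word >> (k * f)) & mask for k in range(count)]


def suffix_add(word, count, f, i, delta):
    """Add delta to fields i..count (1-based) with one multiply and one add.

    Assumes delta >= 0 and that no field overflows its f bits.
    """
    all_ones = ((1 << (count * f)) - 1) // ((1 << f) - 1)    # a 1 in the lowest bit of every field
    ones = (all_ones >> ((i - 1) * f)) << ((i - 1) * f)      # keep only fields i..count
    return word + ones * delta
```

Multiplying the pattern of ones by $\Delta$ places a copy of $\Delta$ in every selected field, so a single addition updates all affected offsets at once.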
Between rebuilds the stored representatives are potentially outdated because updates may have changed their values. However, observe that the values of two consecutive representatives differ by more than $B 2^{\delta}$ at the time of a rebuild, so the gap between two representatives cannot be closed by $B$ updates of $\delta$ bits each (before the structure is rebuilt again). Hence, an answer to search($t$) cannot drift much from the values stored by the representatives; it can only be in a constant number of runs, namely those with a representative value $\mathrm{succ}_R(t)$ and its two neighboring runs. In a run with representative value $v$, we find the smallest $j$ (inside the run) such that $U[j] + v - t \ge 0$, i.e., the first element that is not smaller than $t$. The smallest $j$ found in all three runs is the answer to the search($t$) query. Thus, by rebuilding periodically, we only need to check a constant number of runs when answering a search($t$) query.
On this structure, Pȃtraşcu and Demaine [34] show that the operations sum, search and update can be supported in constant time each: sum($i$) returns $R[\mathrm{rep}(i)] + U[i]$, search($t$) examines the constant number of candidate runs described above, and update($i, \Delta$) adds to $U$ a word encoding $\Delta$ for all elements with index at least $i$. To count the number of negative elements or find the least significant bit in a word in constant time, we use the technique by Fredman and Willard [15]. Notice that rebuilding the data structure every $B$ operations takes $O(B)$ time, resulting in amortized constant time per operation. We can instead do this incrementally by a standard approach by Dietz [8], reducing the time per operation to worst-case constant. The idea is to construct the new replacement data structure incrementally while using the old and complete data structure.

Efficient Support for divide and merge
We now show how to maintain the structure described above while supporting operations divide(i, t) and merge(i). An example supporting the following explanation is provided in Figure 1.
Observe that the operations are only local: Splitting Z[i] into two parts or merging Z[i] and Z[i + 1] does not influence the precomputed values in Y (besides adding/removing values for the divided/merged elements). We must update I, R and U to reflect these local changes accordingly. Because a divide or merge operation may create new representatives between rebuilds with values that do not fit in U , we change I, R and U to reflect these new representatives by rebuilding the data structure locally. This is done as follows.
Consider the run representatives. Both divide($i, t$) and merge($i$) may require us to create a new run, combine two existing runs or remove a run. In any case, we can find a replacement representative for each run affected. As the operations are only local, the replacement is either a divided or merged element, or one of the neighbors of the replaced representative. Replacing representatives may cause both indices and values for the stored representatives to change. We use insertions and deletions on $R$ to update representative values.
Figure 1. (a) The initial data structure constructed from $Z$. (b) The result of divide(8, 3) on the structure of (a). Representative value 30 was removed from $R$. We shifted and updated $U$, $B$ and $C$ to remove the old representative and accommodate a new element with value 2. (c) The result of merge(12) on the structure of (b).
Since the new operations change the indices of the elements, these changes must also be reflected in $I$. For example, a merge($i$) operation decrements the indices of all elements with index larger than $i$ compared to the indices stored at the time of the last rebuild. We should in principle adjust the $O(B)$ changed indices stored in $I$. The cost of adjusting the indices accordingly when using Lemma 2 to store $I$ is $O(B)$. Instead, to get our desired constant time bounds, we represent $I$ using a resizable data structure with the same number of elements as $Y$ that supports this kind of update. We must support $\mathrm{select}_I(i)$, $\mathrm{rank}_I(q)$, and $\mathrm{pred}_I(q)$ as well as inserting and deleting elements in constant time. Because $I$ has few and small elements, we can support the operations in constant time by representing it using a bitstring $B$ and a structure $C$ which is the prefix sum over $B$ as follows.
Let $B$ be a bitstring of length $|Y| \le B$, where $B[i] = 1$ iff there is a representative at index $i$. $C$ has $|Y|$ elements, where $C[i]$ is the prefix sum of $B$ including element $i$. Since $C$ requires $O(B \log B)$ bits in total we can pack it in a single word. We answer queries as follows: $\mathrm{rank}_I(q)$ equals $C[q - 1]$, we answer $\mathrm{select}_I(i)$ by subtracting $i$ from all elements in $C$ and returning one plus the number of elements smaller than 0 (as done in $U$ when answering search), and we find $\mathrm{pred}_I(q)$ as the index of the least significant bit in $B$ after having masked all indices larger than $q$. Updates are performed as follows. Using mask, shift and concatenate operations, we can ensure that $B$ and $C$ have the same size as $Y$ at all times (we extend and shrink them when performing divide and merge operations). Inserting or deleting a representative amounts to setting a bit in $B$, and to keep $C$ up to date, we employ the same $\pm 1$ update operation as used in $U$.
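A minimal sketch of the bitstring representation follows, assuming a Python integer as the bitstring with bit $i - 1$ standing for index $i$. With this bit order the nearest representative at or before $q$ is the most significant remaining bit after masking, which corresponds to the least significant bit under the opposite layout. The class name `RepIndex` is hypothetical, and the prefix counts $C$ are recomputed on demand instead of being kept in a packed word, so the operations here are not constant time.

```python
class RepIndex:
    """Indices of representatives among positions 1..size, kept as a bitmask."""

    def __init__(self, size, reps):
        self.size = size
        self.bits = 0
        for i in reps:
            self.bits |= 1 << (i - 1)          # bit i-1 corresponds to index i

    def C(self, j):
        # Number of representatives among indices 1..j (prefix sum over B).
        return bin(self.bits & ((1 << j) - 1)).count("1")

    def rank(self, q):
        return self.C(q - 1)                   # representatives with index < q

    def select(self, i):
        # One plus the number of positions j with C[j] - i < 0.
        return 1 + sum(1 for j in range(1, self.size + 1) if self.C(j) < i)

    def pred(self, q):
        # Mask away all indices larger than q, then take the highest remaining bit.
        kept = self.bits & ((1 << q) - 1)
        return kept.bit_length() or None

    def remove_index(self, i):
        # Delete position i; higher positions shift down by one (as after merge).
        low = self.bits & ((1 << (i - 1)) - 1)
        self.bits = low | ((self.bits >> i) << (i - 1))
        self.size -= 1

    def insert_index(self, i, is_rep=False):
        # Open a new position i; positions i and above shift up by one (as after divide).
        low = self.bits & ((1 << (i - 1)) - 1)
        high = self.bits >> (i - 1)
        self.bits = low | (int(is_rep) << (i - 1)) | (high << i)
        self.size += 1
```

The point of the representation is visible in `remove_index` and `insert_index`: a single shift renumbers all later representatives at once, which would cost $O(B)$ separate key updates in a set of explicit indices.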
We finally need to adjust the relative offsets of all elements with a changed representative in $U$ (since they now belong to a representative with a different value). In particular, if the representative for $U[j]$ changed value from $v$ to $v'$, we must subtract $v' - v$ from $U[j]$. This can be done for all affected elements belonging to a single representative simultaneously in $U$ by a single addition with an appropriate bitmask (update a range of $U$). Note that we know the range of elements to update from the representative indices. Finally, we may need to insert or delete an element in $U$, which can be done easily by mask, shift and concatenate operations on the word $U$. This leads to Theorem 6.
Theorem 6. There is a linear space data structure for dynamic partial sums supporting each operation search, sum, update, insert, delete, divide, and merge on a sequence of length $O(\min\{w/\log w, w/\delta\})$ in worst-case constant time.

Dynamic Partial Sums for Large Sequences
Willard [43] (and implicitly Dietz [8]) showed that a leaf-oriented B-tree with out-degree B of height h can be maintained in O(h) worst-case time if: 1) searches, insertions and deletions take O(1) time per node when no splits or merges occur, and 2) merging or splitting a node of size B requires O(B) time.
We use this as follows, where $Z$ is our integer sequence of length $s$. Create a leaf-oriented B-tree of degree $B = \Theta(\min\{w/\log w, w/\delta\})$ storing $Z$ in the leaves, with height $h = O(\log_B s) = O(\log s / \log(w/\delta))$. Each node $v$ uses Theorem 6 to store the $O(B)$ sums of leaves in each of the subtrees of its children. Searching for $t$ in a node corresponds to finding the successor $Y[i]$ of $t$ among these sums. Dividing or merging elements in $Z$ corresponds to inserting or deleting a leaf. This concludes the proof of Theorem 2.
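The claimed height bound can be checked by a short calculation, sketched here under the assumptions $1 \le \delta \le w$, $B \ge 2$, and $w \ge 16$ (so that $\log w \ge 2 \log\log w$):

```latex
\begin{align*}
B &= \Theta\bigl(\min\{w/\log w,\; w/\delta\}\bigr), \\
\text{if } \delta \ge \log w:&\quad \log B = \Theta(\log(w/\delta)), \\
\text{if } \delta < \log w:&\quad \log B = \Theta(\log w - \log\log w) = \Theta(\log w) = \Omega(\log(w/\delta)), \\
h &= O(\log_B s) = O\!\left(\frac{\log s}{\log B}\right) = O\!\left(\frac{\log s}{\log(w/\delta)}\right).
\end{align*}
```

In both cases $\log B = \Omega(\log(w/\delta))$, so spending constant time per node gives the time bound of Theorem 2.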

Substring Concatenation
In this section we prove Theorem 3. Recall that we must store a string $R$ subject to substring concatenation queries: given two strings $x$ and $y$, return the location of an occurrence of $xy$ in $R$ or NO if no such occurrence exists.
To prove (i) we need the following definitions. For a substring $x$ of $R$, let $S(x)$ denote the suffixes of $R$ that have $x$ as a prefix, and let $S'(x) = \{i + |x| \mid i \in S(x) \wedge i + |x| \le r\}$, i.e., $S'(x)$ are the suffixes of $R$ that are immediately preceded by $x$. Hence for two substrings $x$ and $y$, the suffixes that have $y$ as a prefix and are immediately preceded by $x$ are exactly $S'(x) \cap S(y)$, and each of them corresponds to an occurrence of $xy$ in $R$. We can reduce this intersection problem to a 1D range emptiness problem as follows.
Let rank($i$) be the position of suffix $R[i..r]$ in the lexicographic ordering of all suffixes of $R$, and let $\mathrm{rank}(A) = \{\mathrm{rank}(i) \mid i \in A\}$ for $A \subseteq \{1..r\}$. Then $xy$ is a substring of $R$ if and only if $\mathrm{rank}(S'(x)) \cap \mathrm{rank}(S(y)) \ne \emptyset$. Note that $\mathrm{rank}(S(y))$ is a range $[a, b] \subseteq [1, r]$, and we can determine this range in constant time for any substring $y$ using a constant-time weighted ancestor query on the suffix tree of $R$ [16]. Consequently, we can decide if $xy$ is a substring of $R$ by a 1D range emptiness query on the set $\mathrm{rank}(S'(x))$.
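The reduction can be tried out with a brute-force sketch. Everything here is a naive stand-in and labeled as an assumption: the suffix ranks come from sorting all suffixes, the sets $S(x)$ and $S(y)$ are found by scanning $R$, and the range emptiness query is a binary search over a sorted list. The function name `concat_occurs` is hypothetical, and both arguments are assumed to be non-empty.

```python
import bisect


def concat_occurs(R, x, y):
    """Return a 1-based start position of an occurrence of xy in R, or None."""
    r = len(R)
    order = sorted(range(1, r + 1), key=lambda i: R[i - 1:])   # suffix starts in lexicographic order
    rank = {i: k + 1 for k, i in enumerate(order)}

    # rank(S'(x)): ranks of the suffixes immediately preceded by x.
    S_x = [i for i in range(1, r + 1) if R.startswith(x, i - 1)]
    ranks_after_x = sorted(rank[i + len(x)] for i in S_x if i + len(x) <= r)

    # rank(S(y)) is a contiguous range [a, b] of ranks.
    ranks_y = [rank[i] for i in range(1, r + 1) if R.startswith(y, i - 1)]
    if not ranks_y:
        return None
    a, b = min(ranks_y), max(ranks_y)

    # 1D range emptiness: is some element of rank(S'(x)) inside [a, b]?
    k = bisect.bisect_left(ranks_after_x, a)
    if k < len(ranks_after_x) and ranks_after_x[k] <= b:
        start_of_y = order[ranks_after_x[k] - 1]
        return start_of_y - len(x)
    return None
```

For instance, with the hypothetical string `R = "ACGTACGA"`, the query for `x = "AC"` and `y = "GA"` finds the occurrence of `"ACGA"` starting at position 5, while `x = "CG"` and `y = "C"` yields no occurrence.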
Belazzougui et al. [3] (see also [17]) recently gave a 1D range emptiness data structure for a set $A \subseteq [1, r]$ using $O(|A| \log^{\epsilon} r)$ bits of space, for any constant $\epsilon > 0$, and answering queries in constant time. We will build this data structure for $\mathrm{rank}(S'(x))$, but doing so for all substrings would require space $\Omega(r^2)$.
To arrive at the space bound of $O(r \log^{\epsilon} r)$ (words), we employ a heavy path decomposition [19] on the suffix tree of $R$, and only build the data structure for substrings of $R$ that correspond to the top of a heavy path. In this way, each suffix will appear in at most $\log r$ such data structures, leading to the claimed $O(r \log^{\epsilon} r)$ space bound (in words). In addition, we build an $O(r)$-space nearest common ancestor data structure [19] for the suffix tree of $R$. Constant-time nearest common ancestor queries will allow us to also answer longest common prefix queries on $R$ in constant time.
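The space accounting behind this bound can be spelled out as follows, where $A_v$ denotes the set stored for the top node $v$ of a heavy path (a name introduced here for illustration) and we use $w \ge \log r$:

```latex
\begin{align*}
\sum_{v} |A_v| &\le r \log r
  && \text{(each of the $r$ suffixes appears in at most $\log r$ sets)}, \\
\sum_{v} O\bigl(|A_v| \log^{\epsilon} r\bigr) &= O\bigl(r \log^{1+\epsilon} r\bigr) \text{ bits}
  && \text{(space of the range emptiness structures)}, \\
\frac{O\bigl(r \log^{1+\epsilon} r\bigr)}{w} &= O\bigl(r \log^{\epsilon} r\bigr) \text{ words}
  && \text{(since $w \ge \log r$)}.
\end{align*}
```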
To answer a substring concatenation query with substrings $x$ and $y$, we first determine how far $y$ follows the heavy path in the suffix tree from the location where $x$ stops. This can be done in $O(1)$ time by a constant-time longest common prefix query between two suffixes of $R$. We then proceed to the top of the next heavy path, where we query the 1D range reporting data structure with the range $\mathrm{rank}(S(y'))$ where $y'$ is the remaining unmatched suffix of $y$. This completes the query, and the proof of (i).
The second solution (ii) follows from a result by Bille et al. [4]. Given the suffix tree ST_R of R, an unrooted longest common prefix query [6] takes a suffix y and a location ℓ in ST_R (either a node or a position on an edge) and returns the location in ST_R that is reached after matching y starting from location ℓ. A substring concatenation query is straightforward to implement using two unrooted longest common prefix queries, the first one starting at the root, and the second starting from the location returned by the first query. It follows from Bille et al. [4] that we can build a linear space data structure that supports unrooted longest common prefix queries in time O(log log r), thus completing the proof of (ii).

Conclusion
We have shown how to compress a text relative to a reference string while supporting access to the text and a range of dynamic operations under strong guarantees for the space usage and the query times. There is, however, room for improvement.
Our solution to DRC is built on data structures for the partial sums problem and the substring concatenation problem. Our partial sums solution is optimal, but in order to get the desired constant query time for substring concatenation, our data structure uses O(r log^ε r) space. In contrast, our linear space solution leads to O(log log r) query time. We leave as an open problem whether it is possible to get O(1) time substring concatenation queries using O(r) space, which would also carry over to a stronger result for the DRC problem.
Moreover, the size of the cover that is maintained by our DRC data structure is also an interesting parameter. Currently we maintain a 2-approximation of the optimal cover. It would be useful to know if a better approximation ratio can be maintained under the same (or better) time and space bounds that we give.

A.1 Dynamic Relative Compression with Access and Replace
In this setting we constrain the operations on S to access(i) and replace(i, α). Then, instead of maintaining a dynamic partial sums data structure over the lengths of the substrings in C, we only need a dynamic predecessor data structure over the prefix sums. The operations are implemented as before, except that for access(i) we obtain block b_j by computing the predecessor of i in the predecessor data structure, which also immediately gives us access to the local index in b_j. For replace(i, α), a constant number of updates to the predecessor data structure is needed to reflect the changes. We use substring concatenation queries to restore maximality as described in Section 2. The prefix sums of the subsequent blocks in C are preserved since |b_j| = |b_j^1| + |b_j^2| + |b_j^3|. With a linear space implementation of the van Emde Boas data structure [31,41,42] we can support the predecessor queries and updates in O(log log N) expected time. For substring concatenation we apply Theorem 3(ii) using O(r) space and O(log log r) time. Since the length of the source string does not change, we can always assume that r < N, and the total time becomes O(log log N + log log r) = O(log log N). In summary, this proves Theorem 4.
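As a minimal Python sketch of this scheme, the class below keeps the block start positions (the prefix sums) in a sorted list and finds the block containing position i by a predecessor search. Several simplifications are assumed: binary search on a list replaces the van Emde Boas structure, so the bounds above do not apply; maximality is not restored after replace; and the class and method names are hypothetical. Blocks are (start, length) pairs into R, 0-based.

```python
from bisect import bisect_right

class AccessReplaceCover:
    """Cover of S by blocks (start, length) of R with a predecessor search."""

    def __init__(self, R, blocks):
        self.R = R
        self.blocks = list(blocks)
        self.starts = []          # starts[j] = position in S where block j begins
        total = 0
        for _, length in self.blocks:
            self.starts.append(total)
            total += length
        self.N = total            # |S|, unchanged by replace

    def _locate(self, i):
        if not 0 <= i < self.N:
            raise IndexError(i)
        j = bisect_right(self.starts, i) - 1   # predecessor of i among the starts
        return j, i - self.starts[j]           # block index and local index

    def access(self, i):
        j, off = self._locate(i)
        return self.R[self.blocks[j][0] + off]

    def replace(self, i, alpha):
        j, off = self._locate(i)
        start, length = self.blocks[j]
        pos = self.R.index(alpha)              # raises ValueError if alpha is not in R
        # Split b_j into at most three parts whose lengths sum to |b_j|,
        # so the starts of all subsequent blocks stay valid.
        parts = [(start, off), (pos, 1), (start + off + 1, length - off - 1)]
        parts = [p for p in parts if p[1] > 0]
        new_starts, s = [], self.starts[j]
        for _, part_len in parts:
            new_starts.append(s)
            s += part_len
        self.blocks[j:j + 1] = parts
        self.starts[j:j + 1] = new_starts

    def text(self):
        return "".join(self.access(i) for i in range(self.N))
```

For example, with R = "abcde" and blocks [(0, 3), (2, 3)] the cover represents "abccde", and replace(1, "e") changes it to "aeccde" while touching only the entries of the first block.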

A.2 Dynamic Relative Compression with Split and Concatenate
Consider the variant of the dynamic relative compression problem where we want to maintain a relative compression of a set of strings S_1, ..., S_k. Each string S_i has a cover C_i and all strings are compressed relative to the same string R. In this setting n = Σ_{i=1}^{k} |C_i|. In addition to the operations access, replace, insert, and delete, we also want to support split and concatenation of strings. Note that the semantics of the operations change to indicate the string(s) to perform a given operation on.
We build a leaf-oriented height-balanced binary tree T_i (e.g. an AVL tree or red-black tree) over the blocks C_i[1], ..., C_i[|C_i|] for each string S_i. In each internal node v, we store the sum of the block sizes represented by its leaves. Since the total number of blocks is n, the trees use O(n) space. All operations rely on the standard procedures for searching, inserting, deleting, splitting and joining height-balanced binary trees. All of these run in O(log n) time for a tree of size n. See for example [7] for details on how red-black trees achieve this.
The answer to an access(i, j) query is found by doing a top-down search in T_i using the sums of block sizes to navigate. Since the tree is balanced and the size of the cover is at most n, this takes O(log n) time. The operations replace(i, j, α), insert(i, j, α), and delete(i, j) all initially require that we use access(i, j) to locate the block containing the j-th character of S_i. To reflect possible changes to the blocks of the cover, we need to modify the corresponding tree to contain more leaves and restore the balancing property. Since the number of nodes added to the tree is constant, these operations each take O(log n) time. The concat(i, j) operation requires that we join two trees in the standard way and restore the balancing property of the resulting tree. For the split(i, j) operation we first split the block that contains position j such that the j-th character is the trailing character of a block. We then split the tree into two trees separated by the new block. This takes O(log n) time for a height-balanced tree.
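The top-down search for access can be sketched in Python as follows. The sketch builds a static balanced leaf-oriented tree for a single string; the rebalancing, split, and join procedures of AVL or red-black trees are omitted, and the names Node, build, and access are hypothetical. Blocks are (start, length) pairs into R, and positions are 0-based.

```python
class Node:
    """Leaf (block is set) or internal node (left and right are set)."""

    def __init__(self, left=None, right=None, block=None):
        self.left, self.right, self.block = left, right, block
        # size = total length of the blocks stored in the leaves below this node
        self.size = block[1] if block is not None else left.size + right.size

def build(blocks):
    """Build a balanced leaf-oriented tree over a non-empty list of blocks."""
    if len(blocks) == 1:
        return Node(block=blocks[0])
    mid = len(blocks) // 2
    return Node(left=build(blocks[:mid]), right=build(blocks[mid:]))

def access(root, R, j):
    """Return the j-th character of the string represented by the tree."""
    if not 0 <= j < root.size:
        raise IndexError(j)
    v = root
    while v.block is None:
        if j < v.left.size:       # position lies in the left subtree
            v = v.left
        else:                     # skip the left subtree and adjust the index
            j -= v.left.size
            v = v.right
    return R[v.block[0] + j]      # j is now the local index inside the block
```

With R = "abcde" and blocks [(0, 3), (2, 3), (1, 2)], the tree represents "abccdebc", and each access follows a single root-to-leaf path.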
To finalize the implementation of the operations, we must restore the maximality property of the affected covers as described in Section 2. At most a constant number of blocks are nonmaximal as a result of any of the operations. If two blocks can be combined to one, we delete the leaf that represents the rightmost block, update the leftmost block to reflect the change, and restore the property that the tree is balanced. If the tree subsequently contains an internal