Streaming Pattern Matching with d Wildcards

In the pattern matching with $d$ wildcards problem one is given a text $T$ of length $n$ and a pattern $P$ of length $m$ that contains $d$ wildcard characters, each denoted by the special symbol $`?'$. A wildcard character matches any single character. The goal is to establish, for each $m$-length substring of $T$, whether it matches $P$. In the streaming model variant of the pattern matching with $d$ wildcards problem the text $T$ arrives one character at a time and the goal is to report, before the next character arrives, whether the last $m$ characters match $P$, while using only $o(m)$ words of space. In this paper we introduce two new algorithms for the $d$ wildcard pattern matching problem in the streaming model. The first is a randomized Monte Carlo algorithm that is parameterized by a constant $0\leq \delta \leq 1$. This algorithm uses $\tilde{O}(d^{1-\delta})$ amortized time per character and $\tilde{O}(d^{1+\delta})$ words of space. The second algorithm, which is used as a black box in the first algorithm, is a randomized Monte Carlo algorithm which uses $O(d+\log m)$ worst-case time per character and $O(d\log m)$ words of space.


Introduction
We investigate the pattern matching with d wildcards problem (PMDW) in the streaming model. Let Σ be an alphabet and let ? ∉ Σ be a special character, called the wildcard character, which matches any character in Σ. The PMDW problem is defined as follows: given a text string T = t_0 t_1 … t_{n−1} over Σ and a pattern string P = p_0 p_1 … p_{m−1} over the alphabet Σ ∪ {?} such that P contains exactly d wildcard characters, report all of the occurrences of P in T. Pattern matching under this definition of a match is one of the most well-studied problems in the field [22,35,26,28,19,10].
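As a point of reference, the offline version of this definition can be checked directly in O(nm) time. The sketch below is our own illustration (the function names are not from the paper):

```python
def matches_with_wildcards(window, pattern):
    # A pattern character '?' matches any single text character.
    return all(p == '?' or p == t for p, t in zip(pattern, window))

def pmdw_offline(text, pattern):
    # Report every starting index c such that t_c .. t_{c+m-1} matches P.
    m = len(pattern)
    return [i for i in range(len(text) - m + 1)
            if matches_with_wildcards(text[i:i + m], pattern)]
```

The streaming algorithms in this paper answer the same question per arriving character, but without storing the last m text characters.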
The streaming model. The advances in technology over the last decade and the massive amount of data passing through the internet have intrigued and challenged computer scientists, as the old models of computation used before this era are now less relevant or too slow. To this end, new computational models have been suggested to allow computer scientists to tackle these technological advances. One prime example of such a model is the streaming model [1,25,34,29]. Pattern matching problems in the streaming model are allowed to preprocess P into a data structure that uses space that is sublinear in m (notice that space usage during the preprocessing phase itself is not restricted). Then, the text T is given online, one character at a time, and the goal is to report, for every integer α ≥ m − 1, whether t_{α−m+1} … t_α matches P. This reporting must take place before t_{α+1} arrives. Throughout this paper we let α denote the index of the last text character that has arrived.
Following the breakthrough result of Porat and Porat [36], recently there has been a rising interest in solving pattern matching problems in the streaming model [7,20,33,8,27,14,15]. However, this is the first paper to directly consider the important wildcard variant.
Related work. Notice that one way of solving PMDW (not necessarily in the streaming model) is to treat ? as a regular character, and then run an algorithm that finds all occurrences of P (treated as containing no wildcards) in T with up to k = d mismatches. This is known as the k-mismatch problem [32,37,2,13,12,17,15]. The most recent result by Clifford et al. [15] for the k-mismatch problem in the streaming model implies a solution for PMDW in the streaming model that uses O(d² polylog m) words of space and O(√d log d + polylog m) time per character. Notice that Clifford et al. [15] focused on solving the more general k-mismatch problem.
We mention that while our work is in the streaming model, in the closely related online model (see [18,16]), which is the same as the streaming model without the constraint of using sublinear space, Clifford et al. [11] presented an algorithm, known as the black box algorithm, which solves several pattern matching problems. When applied to PMDW, the black box algorithm uses O(m) words of space and O(log² m) time per arriving text character. In the offline model the most efficient algorithms for PMDW take O(n log m) time and were introduced by Cole and Hariharan [19] and by Clifford and Clifford [10].

New results
We improve upon the work of Clifford et al. [15], for the special case that applies to PMDW, by introducing the following algorithms (the Õ notation hides logarithmic factors). Notice that Theorem 2 improves upon the results of Clifford et al. [15] whenever δ > 1/2. We also emphasize that our proof of Theorem 2 makes use of Theorem 1.

Algorithmic Overview
Our algorithms make use of the notion of a candidate, which is a location in the last m indices of the current text that is currently considered as a possible occurrence of P . As more characters arrive, it becomes clear if this candidate is an actual occurrence or not. In general, an index continues to be a candidate until the algorithm encounters proof that the candidate is not a valid occurrence (or until it is reported as a match). The algorithm of Theorem 1 works by obtaining such proofs efficiently.
Overview of the algorithm for Theorem 1. For the streaming pattern matching problem without wildcards, the algorithms of Porat and Porat [36] and Breslauer and Galil [7] have three major components. (The algorithms of [36] and [7] are not presented in this way in the original papers; however, we find that this way of presenting our algorithm, and theirs, does a better job of explaining what is going on.) The first component is a partitioning of the interval [0, m − 1] into pattern intervals of exponentially increasing lengths. Each pattern interval [i, j] corresponds to a text interval [α − j + 1, α − i + 1], where α is the index of the last text character that arrived. (The first pattern interval starts at 0, and so the last text interval ends at location α + 1, which is the location of a text character that has yet to arrive. To understand why this convention is appropriate, notice that initially every text location should be considered as a candidate, but in order to save space we only address such candidates a moment before their corresponding character arrives, since this is the first time the algorithm can obtain proof that the candidate is not a match.) Notice that when a new text character arrives, the text intervals are shifted by one location. The second component maintains all of the candidates in a given text interval. This component leverages periodicity properties of strings in order to guarantee that the candidates in a given text interval form an arithmetic progression, and thus can be maintained with constant space. The third component is a fingerprint mechanism for testing whether a candidate is still valid. Whenever the border of a text interval passes through a candidate, that candidate is tested.

The main challenge in applying the above framework to patterns with wildcards comes from the lack of a good notion of periodicity which can guarantee that the candidates in a text interval form an arithmetic progression. To tackle this challenge, we design a new method for partitioning the pattern into intervals, which, combined with new fundamental combinatorial properties, leads to an efficient way of maintaining the candidates in small space. In particular, we prove that with our new partitioning there are, over all text intervals, at most O(d log m) candidates that are not part of any arithmetic progression. Remarkably, the proof bounding the number of such candidates uses a more global perspective of the pattern, as opposed to the techniques used in non-wildcard results.

Overview of the algorithm for Theorem 2. The algorithm of Theorem 2 uses the algorithm of Theorem 1 (with a minor adaptation) combined with a new combinatorial perspective on periodicity that applies to strings with wildcards. The notion of periodicity in strings (without wildcards) and its usefulness are well studied [21,31,36,7,24,23]. However, extending the usefulness of periodicity to strings with wildcards runs into difficulties, since the existing notions are either too inclusive or too exclusive (see [5,4,6,9,38]). Thus, we introduce a new definition of periodicity, called the wildcard-period length, that captures, for a given pattern with wildcards, the smallest possible average distance between occurrences of the pattern in any text. See Definition 6. For a string S with wildcards, we denote the wildcard-period length of S by π_S.
Let P* be the longest prefix of P such that π_{P*} ≤ d^δ. The algorithm of Theorem 2 has two main components, depending on whether P* = P or not. In the case where P* = P, the algorithm takes advantage of the wildcard-period length of P being small, which, together with techniques from number theory and new combinatorial properties of strings with wildcards, allows the algorithm to spend only Õ(1) time per character while using Õ(d^{1+δ}) words of space. This is summarized in Theorem 17. Of particular interest is Lemma 16, which combines number theory with combinatorial string properties in a new way. We expect these ideas to be useful in other applications.
If P* ≠ P, then we use the algorithm of Theorem 17 to locate occurrences of P*, and by the maximality of P*, occurrences of prefixes of P that are longer than P* must appear far apart (on average). These occurrences are given as input, in the form of candidates, to a minor adaptation of the algorithm of Theorem 1. Utilizing the large average distance between candidates, we obtain an Õ(d^{1−δ}) amortized time cost per character.

Periods
We assume without loss of generality that the alphabet is Σ = {1, 2, …, n}. For a string S = s_0 s_1 … s_{ℓ−1} of length ℓ over Σ and an integer 1 ≤ k ≤ ℓ, the substring s_0 s_1 … s_{k−1} is called a prefix of S and s_{ℓ−k} … s_{ℓ−1} is called a suffix of S.
A prefix of S of length i ≥ 1 is a period of S if and only if s_j = s_{j+i} for every 0 ≤ j ≤ ℓ − i − 1. The shortest period of S is called the principal period of S, and its length is denoted by ρ_S. If ρ_S ≤ |S|/2 we say that S is periodic. The following lemma is due to Breslauer and Galil [7].
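The principal period length can be computed in linear time via the classic KMP failure (border) function, using the standard fact that ρ_S = |S| minus the length of the longest proper border of S. A short sketch (our own illustration, not part of the paper):

```python
def principal_period_length(s):
    # border[i] = length of the longest proper border of s[0:i]
    # (the KMP failure function).
    n = len(s)
    border = [0] * (n + 1)
    k = 0
    for i in range(1, n):
        while k > 0 and s[i] != s[k]:
            k = border[k]
        if s[i] == s[k]:
            k += 1
        border[i + 1] = k
    # The principal period of s has length |s| minus its longest border.
    return n - border[n]
```

For example, S is periodic exactly when 2 * principal_period_length(S) <= len(S).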
Lemma 3 ([7, Lemma 3.1]). Let u and v be strings such that u contains at least three occurrences of v. Let t_1 < t_2 < ⋯ < t_h be the locations of all occurrences of v in u. Assume that h ≥ 3 and that for i = 1, …, h − 2 we have t_{i+2} − t_i ≤ |v|. Then the sequence (t_1, t_2, …, t_h) forms an arithmetic progression with difference ρ_v.
The following lemmas follow from Lemma 3.

Lemma 4. Let v be a string of length ℓ and let u be a string of length at most 2ℓ. If u contains at least three occurrences of v then the distance between any two occurrences of v in u is a multiple of ρ_v, and v is a periodic string.
Proof. Let c_1 < c_2 < ⋯ < c_h, with h ≥ 3, be the locations of all occurrences of v in u. Since |u| ≤ 2ℓ and |v| = ℓ, we have c_h − c_1 ≤ |u| − |v| ≤ ℓ = |v|, and so c_3 − c_1 ≤ ℓ. Therefore, by Lemma 3, all the occurrences of v in u form an arithmetic progression with common difference ρ_v. In particular, the distance between any two occurrences of v in u is a multiple of ρ_v. Hence, ρ_v = c_2 − c_1 ≤ (c_3 − c_1)/2 ≤ ℓ/2 = |v|/2. Thus, by definition, v is a periodic string.
Lemma 5. Let u be a periodic string over Σ with principal period length ρ_u. If v is a substring of u of length at least 2ρ_u then ρ_u = ρ_v.
Proof. Since v is a substring of u, we have by definition that ρ_u is a period length of v, and thus ρ_v ≤ ρ_u by the minimality of ρ_v.
It remains to prove that ρ_u ≤ ρ_v, which we do by showing that ρ_v is a period length of u. Denote u = u_0 u_1 … u_{|u|−1}.
Let 0 ≤ i < |u| − ρ_v be an index in u; we have to prove that u_i = u_{i+ρ_v}. Let a be an index such that v occurs in u at position a; thus u_a u_{a+1} … u_{a+2ρ_u−1} is a substring of both u and v. Since ρ_u is a period length of u, we have u_i = u_{i+z·ρ_u} for any z ∈ ℤ such that 0 ≤ i + z·ρ_u < |u|.
In particular, for z = ⌈(a − i)/ρ_u⌉ we have that u_i = u_{i+z·ρ_u}. Let b = i + z·ρ_u. Notice that a ≤ b < a + ρ_u and a ≤ b + ρ_v < a + 2ρ_u. Therefore, b and b + ρ_v are both indices of characters inside the occurrence of v, and thus u_b = u_{b+ρ_v}. Hence, we have that u_i = u_{i+z·ρ_u} = u_{i+z·ρ_u+ρ_v} = u_{i+ρ_v}, where the last equality is again based on the fact that ρ_u is a period length of u.
Periods and wildcards. For a string u with no wildcards, there is an inverse relationship between the maximum number of occurrences of u in a text of a given length and the principal period length of u. Next, we define the wildcard-period length of a string over Σ ∪ {?}, which captures a similar type of relationship for strings with wildcards. The usefulness of this definition for our needs is discussed in more detail in Section 6. Let occ(S′, S) denote the number of occurrences of a string S in a string S′.

Fingerprints
For the following, let u, v ∈ ⋃_{i=0}^{n} Σ^i be two strings of length at most n. Porat and Porat [36] and Breslauer and Galil [7] proved the existence of a sliding fingerprint function φ : ⋃_{i=0}^{n} Σ^i → [n^c], for some constant c > 0, with the following properties:
1. If |u| = |v| and u ≠ v then φ(u) ≠ φ(v) with high probability (at least 1 − 1/n^{c−1}).
2. The sliding property: let w = uv be the concatenation of u and v. If |w| ≤ n then, given the lengths and the fingerprints of any two of the strings u, v and w, one can compute the fingerprint of the third string in constant time.
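A concrete family with this flavor is the polynomial (Karp-Rabin-style) fingerprint φ(s) = Σ_i s_i · R^i mod P. The sketch below is our own illustration of the sliding property, not the construction of [36] or [7]; the modulus, base, and function names are our choices:

```python
MOD = (1 << 61) - 1     # a large prime modulus (illustrative choice)
BASE = 1_234_567_891    # a fixed base in [0, MOD) (illustrative choice)

def fp(s):
    # phi(s) = sum_i ord(s[i]) * BASE^i  (mod MOD), via Horner's rule.
    h = 0
    for ch in reversed(s):
        h = (h * BASE + ord(ch)) % MOD
    return h

def concat_fp(fu, len_u, fv):
    # Sliding property, one direction: phi(uv) = phi(u) + BASE^{|u|} * phi(v).
    return (fu + pow(BASE, len_u, MOD) * fv) % MOD

def suffix_fp(fw, fu, len_u):
    # Sliding property, inverted: recover phi(v) from phi(w = uv) and phi(u).
    return (fw - fu) * pow(BASE, -len_u, MOD) % MOD
```

Each direction takes constant time (modular exponentiation of a fixed base can be precomputed incrementally in a streaming pass).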

A Generic Algorithm
We start with a generic algorithm (pseudo-code is given in Figure 1) for solving pattern matching problems in the streaming model. With proper implementations of the algorithm's components, the algorithm solves the PMDW problem. The generic algorithm makes use of the notion of a candidate. Initially, every text index c is considered as a candidate for a pattern occurrence from the moment t_{c−1} arrives. An index continues to be a candidate until the algorithm encounters proof that the candidate is not a valid occurrence (or until it is reported as a match). The generic algorithm is composed of three conceptual parts that affect the complexities of the algorithm. An example of an execution of the generic algorithm appears in Figure 2.

[Figure 1: Generic Algorithm. The pseudo-code ends with: if c exists and c is valid, report c as a match; else Q_{h+1}.Enqueue(c); finally Q_0.Enqueue(α + 1). The purpose of the initialization is to consider location 0 as a candidate before any character has arrived.]

• Pattern intervals. The first conceptual part is a partitioning I of the interval [0, m − 1] into pattern intervals. For each pattern interval I = [i, j] ∈ I we define a corresponding text interval, text interval(I, α) = [α − j + 1, α − i + 1]. When character t_α arrives, a text location c ∈ text interval(I, α) is a candidate if and only if t_c … t_{c+i−1} matches p_0 … p_{i−1}. The candidate set C(I, α) is the set of text positions in text interval(I, α) which are candidates right after the arrival of t_α.
• Candidate queues. The second conceptual part of the generic algorithm is an implementation of a candidate-queue data structure. For each interval I ∈ I, the algorithm maintains a candidate queue Q_I. At any time α (that is, right after t_α arrives, but before t_{α+1} arrives), Q_I stores a (possibly implicit) representation of C(I, α). Thus, the operations of the data structure are time-dependent. Candidate-queues support the following operations.
Definition 7. A candidate-queue for an interval [i, j] = I ∈ I supports the following operations at time α, where t_α is the last text character that arrived.
1. Enqueue(c): insert the candidate c = α − i + 1 into the queue.
2. Dequeue(): remove and return the candidate c = α − j, if such a candidate exists.
Since there is a bijection between pattern intervals and text intervals we say that a candidate-queue that is associated with pattern interval I is also associated with the corresponding text interval text interval(I, α).
• Assassinating candidates. The third conceptual part addresses the following. When a new text character arrives, all the text intervals move one position ahead, and some candidates leave some text intervals and their corresponding candidate sets. The third conceptual part is a mechanism for testing if a candidate is valid after that candidate leaves a candidate set. This mechanism is used in order to determine if the candidate should enter the candidate-queue of the next text interval, or be reported as a match if there are no more text intervals.
The implementation of each of the three components controls the complexities of the algorithm. Minimizing the number of intervals reduces the number of candidates leaving text intervals at a given time. Efficient implementations of the candidate-queue operations and of the candidate validity tests control both the space usage and the amount of time spent on each candidate that leaves an interval. Notice that the implementations of these components may depend on each other, which is also the case in our solution.

[Figure 2: An example execution of the generic algorithm, with pattern intervals including [4,7] and [8,9]. In each row a new text character arrives. The bold borders illustrate the text intervals. Each blue cell is the position of a candidate and the green cell corresponds to a match. When t_52 arrives, the candidate c_1 = 45 is tested, since it exits a text interval. The candidate c_1 remains alive because abababaa is a prefix of the pattern. Notice that at this time the candidate c_2 = 47 is not a valid occurrence of the pattern; however, the algorithm does not remove c_2 until c_2 reaches the end of its text interval. When t_54 arrives, the candidates c_1 = 45 and c_2 = 47 are tested, as they have reached the ends of their text intervals. At this time, c_2 is removed, since the text ababaaab is not a prefix of the pattern. The candidate c_1 remains alive and is reported as a match, since c_1 reached the end of the last text interval.]
A naïve implementation. The following naïve implementation of the generic algorithm is helpful for gaining intuition as to how the algorithm works. Let I_naïve = ([0, 0], [1, 1], …, [m − 1, m − 1]). The implementation of candidate queue Q_I explicitly stores the set C(I, α) at time α.
Notice that C(I, α) contains at most one candidate. The task of verifying that a candidate c is valid in between text intervals is a straightforward comparison of p_{α−c} with t_α. Each such comparison costs O(1) time. The runtime of the algorithm is Θ(m) time per character in the worst case, and the space usage is also Θ(m) words. We refer to this algorithm as the naïve algorithm.
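The naïve algorithm can be sketched directly as a per-character loop over alive candidates (a toy illustration of the candidate lifecycle, with wildcard support included; the names are ours):

```python
def naive_stream_match(pattern, stream):
    # Naive streaming matcher: Theta(m) space and Theta(m) worst-case time
    # per character. Each alive candidate c is checked against exactly one
    # pattern position whenever a new text character arrives.
    m = len(pattern)
    alive = []          # candidate start positions still consistent with P
    occurrences = []
    for alpha, t in enumerate(stream):
        alive.append(alpha)                  # t_alpha may start a new match
        survivors = []
        for c in alive:
            p = pattern[alpha - c]
            if p == '?' or p == t:
                if alpha - c == m - 1:
                    occurrences.append(c)    # candidate survived all m tests
                else:
                    survivors.append(c)
        alive = survivors                    # assassinated candidates drop out
    return occurrences
```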
Using fingerprints. If there are no wildcards in P, then one can use the following fingerprint-based algorithm, which verifies the validity of a candidate c only once all of the characters t_c, t_{c+1}, …, t_{c+m−1} have arrived. This algorithm is closely related to the Karp and Rabin [30] algorithm. The algorithm uses a partitioning of [0, m − 1] into only one interval containing all of [0, m − 1]. The algorithm maintains the text fingerprint, which is the fingerprint of the text from its beginning up to the last arriving character. For each text index c, just before t_c arrives the algorithm creates a candidate for the index c and stores the text fingerprint φ(t_0 t_1 … t_{c−1}) as satellite information of the candidate c. Then, c (together with its satellite information) is added to the candidate-queue via the Enqueue() operation. When the character t_{c+m−1} arrives, the text fingerprint is φ(t_0 … t_{c+m−1}). At this time, the algorithm uses the Dequeue() operation to extract c together with φ(t_0 t_1 … t_{c−1}) from the candidate-queue. Then, the algorithm tests whether c is valid by computing φ(t_c … t_{c+m−1}) from the current text fingerprint φ(t_0 t_1 … t_{c+m−1}) and the fingerprint φ(t_0 t_1 … t_{c−1}) (using the sliding property of the fingerprint function), and then testing whether φ(t_c … t_{c+m−1}) equals φ(p_0 … p_{m−1}). The fingerprint algorithm spends only constant time per text character but, like the naïve algorithm, uses Θ(m) words of space to store the candidate-queue.
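A self-contained sketch of this fingerprint algorithm, using an illustrative polynomial fingerprint φ(s) = Σ_i s_i · R^i mod P in place of the fingerprint function of [36, 7] (the modulus, base, and names are our choices):

```python
from collections import deque

MOD = (1 << 61) - 1     # prime modulus (illustrative choice)
BASE = 1_234_567_891    # fixed base (illustrative choice)

def fp(s):
    # phi(s) = sum_i ord(s[i]) * BASE^i (mod MOD).
    h = 0
    for ch in reversed(s):
        h = (h * BASE + ord(ch)) % MOD
    return h

def kr_stream_match(pattern, stream):
    # Streaming Karp-Rabin-style matcher (no wildcards). Each candidate c
    # stores phi(t_0..t_{c-1}); when t_{c+m-1} arrives, the window
    # fingerprint is derived via the sliding property and compared with
    # phi(pattern). O(1) time per character, Theta(m) words for the queue.
    m, fpat = len(pattern), fp(pattern)
    text_fp = 0             # phi(t_0 .. t_alpha)
    rpow = 1                # BASE^(number of characters seen so far)
    queue = deque()         # pairs (candidate start c, phi(t_0..t_{c-1}))
    occurrences = []
    for alpha, t in enumerate(stream):
        queue.append((alpha, text_fp))       # candidate starting at alpha
        text_fp = (text_fp + ord(t) * rpow) % MOD
        rpow = rpow * BASE % MOD
        c, before = queue[0]
        if alpha - c == m - 1:               # window t_c..t_{c+m-1} complete
            queue.popleft()
            # Sliding property: phi(t_c..t_{c+m-1}) from the two prefixes.
            window = (text_fp - before) * pow(BASE, -c, MOD) % MOD
            if window == fpat:
                occurrences.append(c)
    return occurrences
```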

Fingerprints with Wildcards
Using fingerprints together with wildcards seems to be a difficult task, since for any string S with x wildcards there are |Σ|^x different strings over Σ that match the string S. Each one of these strings may have a different fingerprint, and therefore there are Θ(|Σ|^x) fingerprints to store, which is not feasible. In order to still use fingerprints for solving PMDW, we use a special partitioning of [0, m − 1], which is described in Section 4. The partitioning in Section 4 is based on the following preliminary partitioning.
The preliminary partitioning. We use a representation of P as P = P_0 ? P_1 ? … ? P_d, where each subpattern P_i contains only characters from Σ (and may also be an empty string). Let W = (w_1, w_2, …, w_d) be the indices of the wildcards in P, such that w_i < w_{i+1} for all 1 ≤ i < d. The interval [0, m − 1] is partitioned into pattern intervals as follows: each wildcard index w_i defines the pattern interval [w_i, w_i], and each maximal stretch of non-wildcard characters defines one additional pattern interval. Since some of the pattern intervals in this partitioning could be empty, we discard such intervals. We denote the resulting partitioning by J. The pattern intervals of the form [w_i, w_i] are called wildcard intervals and the other pattern intervals are called regular intervals. Notice that for a text index c, the substring t_c … t_{c+m−1} matches P if and only if for each regular interval [i, j] ∈ J the substring t_{c+i} … t_{c+j} equals p_i … p_j.

A preliminary algorithm. Given the preliminary partitioning J, one could use the following algorithm for testing the validity of a candidate c whenever it leaves a text interval. During the initialization of the algorithm we precompute and store the fingerprints of all of the subpatterns corresponding to regular intervals. Each time a candidate c is added to a candidate-queue for an interval [i, j] ∈ J via the Enqueue() operation, the algorithm stores the current text fingerprint φ(t_0 … t_{c+i−1}) together with the candidate c. When the character t_{c+j} arrives, the text fingerprint is φ(t_0 … t_{c+j}). At this time, the algorithm uses the Dequeue() operation to extract c together with φ(t_0 t_1 … t_{c+i−1}) from the candidate-queue of the interval [i, j]. If [i, j] is a regular interval, then the algorithm tests whether c is valid, and removes (assassinates) c if it is not. This validity test is executed by applying the sliding property of the fingerprint function to compute φ(t_{c+i} … t_{c+j}) from the current text fingerprint φ(t_0 t_1 … t_{c+j}) and the fingerprint φ(t_0 t_1 … t_{c+i−1}), and then comparing the result with the precomputed fingerprint of p_i … p_j. If [i, j] is a wildcard interval then c stays alive without any testing.
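The preliminary partitioning itself can be computed in a single pass over the pattern. The following sketch is our own (the (start, end, kind) tuple convention is not from the paper):

```python
def preliminary_partition(pattern):
    # Split [0, m-1] into maximal wildcard-free "regular" intervals and
    # single-position "wildcard" intervals; empty intervals are discarded.
    intervals = []
    start = 0
    for i, ch in enumerate(pattern):
        if ch == '?':
            if start < i:                       # non-empty regular stretch
                intervals.append((start, i - 1, 'regular'))
            intervals.append((i, i, 'wildcard'))
            start = i + 1
    if start < len(pattern):                    # trailing regular stretch
        intervals.append((start, len(pattern) - 1, 'regular'))
    return intervals
```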
A naïve implementation of the candidate queues provides an algorithm that costs O(d) time per character, but uses Θ(m) words of space. To overcome this space usage we employ a more complicated partitioning, which, together with a modification of the requirements from the candidate-queues, allows us to design a data structure that uses much less space. However, this space efficiency comes at the expense of a slight increase in the time per character.

The Partitioning
The key idea of the new partitioning is to use the partitioning of Section 3.1 as a preliminary partitioning, and then perform a secondary partitioning of the regular pattern intervals, thereby creating even more regular intervals. As mentioned, the intervals are partitioned in a special way which allows us to implement candidate-queues in a compact manner (see Section 5).
The following definition is useful in the next lemma. The following lemma provides a partitioning which is used to improve upon the preliminary partitioning algorithm. The properties of the partitioning that are described in the statement of the lemma are essential for our new algorithm. The most essential property is property 3, since it guarantees that for each pattern interval I = [i, j] there exists a substring of P that precedes p_i, contains no wildcards, and has length |I|. If this substring is not periodic, then for any α, C(I, α) contains at most two candidates. If this substring is periodic, then we show how to utilize the periodicity of the string in order to efficiently maintain all of the candidates in C(I, α) for any α (see Section 5). In the proof of the lemma we introduce a specific partitioning which has all of the stated properties.
Lemma 9. Given a pattern P of length m with d wildcards, there exists a partitioning of the interval [0, m − 1] into subintervals I = (I_0, I_1, …, I_k) which has the following properties:
1. The number of subintervals is O(d + log m).
2. If [i, j] ∈ I is a pattern interval then p_i … p_j either corresponds to exactly one wildcard from P (and so j = i) or it is a substring that does not contain any wildcards.
3. For each regular pattern interval I = [i, j] with |I| > 1, the length-i prefix of P contains a consecutive sequence of |I| non-wildcard characters.

[Figure 4: The general case: for each J_h ∈ J we first create two intervals of length δ_h, and then we iteratively create pattern intervals where the length of each pattern interval is double the length of the previous pattern interval.]
Proof. We introduce a secondary partitioning of the preliminary partitioning described in Section 3.1, and prove that the secondary partitioning has all of the required properties; see Figure 4. For each regular preliminary pattern interval J_h = [i, j] ∈ J, we first create two pattern intervals of length δ_h, and then, for as long as there is enough room in the remaining part of J_h (between the position right after the end of the last secondary pattern interval that was just created and j), we iteratively create pattern intervals where the length of each pattern interval is double the length of the previous pattern interval. Once there is no more room left in J_h, let ℓ be the length of the last pattern interval we created. If the remaining part of the preliminary pattern interval is of length at most ℓ, then we create one pattern interval covering the entire remaining part. Otherwise, we create two pattern intervals: the first of length ℓ, and the second covering the remaining part of J_h.
The secondary partitioning implies all of the desired properties. For property 2: since the secondary partitioning is a sub-partitioning of the preliminary partitioning, and the preliminary partitioning already had this property, the secondary partitioning has this property as well. For property 3, let I = [i, j] be a regular pattern interval with |I| > 1. If some earlier regular pattern interval has length at least |I|, then that interval provides the required consecutive sequence of non-wildcard characters. If there is no such pattern interval, it must be the case that the length of I is twice the length of the pattern interval preceding I, and I is contained in a preliminary pattern interval J_h for some h. Let the length of the first pattern interval created in J_h be denoted by δ_h. Let I_{h,1}, I_{h,2}, …, I_{h,r} be the first r pattern intervals created in J_h, such that I_{h,r} = I. The length of any pattern interval I_{h,r′} for 1 < r′ ≤ r is 2^{r′−2}·δ_h (since |I_{h,1}| = |I_{h,2}| = δ_h, and for 2 < r′ ≤ r we have |I_{h,r′}| = 2|I_{h,r′−1}|), and in particular the length of I is 2^{r−2}·δ_h. Recall that I = [i, j]. The length of the prefix of J_h up to the index i is the sum of the lengths of all the pattern intervals I_{h,r′} for r′ < r. These lengths sum up to (1 + Σ_{r′=2}^{r−1} 2^{r′−2})·δ_h = 2^{r−2}·δ_h = |I|. Since J_h contains no wildcards, this prefix fulfills the requirement.
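One plausible reading of this construction, for a single regular preliminary interval with a given initial length δ_h, can be sketched as follows (the helper and its argument names are ours, and the handling of the final pieces follows the description above):

```python
def doubling_partition(start, end, delta):
    # Secondary partitioning of the interval [start, end]: two intervals of
    # length delta, then doubling lengths, then at most two final pieces,
    # each no longer than the last doubled interval.
    length = end - start + 1
    assert 2 * delta <= length, "delta must leave room for two initial intervals"
    out = [(start, start + delta - 1), (start + delta, start + 2 * delta - 1)]
    pos, last = start + 2 * delta, delta
    while 2 * last <= end - pos + 1:         # room for a doubled interval
        out.append((pos, pos + 2 * last - 1))
        pos, last = pos + 2 * last, 2 * last
    rem = end - pos + 1
    if 0 < rem <= last:                      # one final piece suffices
        out.append((pos, end))
    elif rem > last:                         # split the leftover into two
        out.append((pos, pos + last - 1))
        out.append((pos + last, end))
    return out
```

For example, on [0, 9] with δ_h = 1 this produces the intervals [0,0], [1,1], [2,3], [4,7], [8,9], matching the pattern intervals shown in the example of Figure 2.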

The Candidate-fingerprint-queue
The algorithm of Theorem 1 is obtained via an implementation of the candidate-queues that uses O(d log m) words of space, at the expense of having O(d + log m) intervals in the partitioning. Such space usage implies that we do not store all candidates explicitly. This is obtained by utilizing properties of periodicity in strings. Since candidates are not stored explicitly, we cannot store explicit information per candidate, and in particular we cannot explicitly store fingerprints. On the other hand, we are still interested in using fingerprints in order to perform assassinations.
To tackle this, we strengthen our requirements from the candidate-queue data structure: it must return not just the candidate but also the fingerprint information that is needed to test whether the candidate is still valid. For our purposes, this data structure cannot explicitly maintain all of the fingerprint information. Thus, we extend the definition of a candidate-queue to a candidate-fingerprint-queue as follows. In order to reduce clutter, in the rest of this section we refer to the candidate-fingerprint-queue simply as the queue.

Implementation
Our implementation of the queue assumes that we use a partitioning that has the properties stated in Lemma 9. Let I = [i, j] be a pattern interval in the partitioning and let c be a candidate from C(I, α). The entrance prefix of c is the substring t_c … t_{c+i−1}, and the entrance fingerprint is φ(t_c … t_{c+i−1}). By definition, since c ∈ C(I, α), the entrance prefix of c matches p_0 … p_{i−1} (which may contain wildcards). Recall that a candidate c is inserted into Q_I together with φ(t_0 … t_{c−1}), which we call the candidate fingerprint of c.
Satellite information. The implementation associates each candidate c with satellite information (SI), which includes the candidate fingerprint and the entrance fingerprint of the candidate. The SI of a candidate, combined with the sliding property of fingerprints, is crucial for the implementation of the queue. When c is added to Q_I, for some I = [i, j], we compute the entrance fingerprint of c from the candidate fingerprint and from φ(t_0 … t_{c+i−1}), which is the text fingerprint at that time. When c is removed from Q_I, we compute φ(t_0 … t_{c+i−1}) in constant time from the SI of c. See Figure 7.
Arithmetic progressions and entrance prefixes. In order to implement the queue using a small amount of space, we distinguish between two types of candidates for each interval I = [i, j] ∈ I. The first type are candidates that share a specific entrance prefix, u_I, which is defined solely by p_0 … p_{i−1} and is chosen such that if there are more than two candidates in C(I, α) with the same entrance prefix, then this entrance prefix must be u_I (see Lemma 11). In Lemma 12 we prove that all of the candidates in C(I, α) that have entrance prefix u_I form an arithmetic progression. This leads to Lemma 13, where we show that all of these candidates and their SI can be stored implicitly using O(1) words of space. The second type of candidates are the rest of the candidates, and these candidates are stored explicitly together with their SI. We prove in Lemma 14 that the total number of such candidates is O(d log m), thereby obtaining our claimed space usage.
Lemma 11. Suppose I is a partitioning that satisfies the properties of Lemma 9. For a pattern interval I = [i, j] ∈ I, there exists a string u_I such that for any text T and time α ≥ 0, the set C(I, α) does not contain three candidates with the same entrance prefix u ≠ u_I.
Proof. Let c 1 < c 2 < c 3 be three different candidates in C(I, α) with the same entrance prefix u. By property 3 of Lemma 9 there is a string v of length |I| = j − i + 1 containing only non-wildcard characters that is a substring of the length i prefix of P .
Let r be an arbitrary location of v in p 0 . . . p i−1 (since v could appear several times in the prefix). The three candidates imply that after a shift of r characters from the candidates' locations, there are three occurrences of v in the text. These occurrences are within a substring of the text of length at most 2|v|, since all three candidates are in C(I, α) and so the distance between the first and last occurrence is at most |I| − 1 = |v| − 1 (the 2 factor accommodates the full occurrence of the third v). Thus, by Lemma 4, v must be periodic, and |v| ≥ 2ρ v .
Since c_1, c_2, and c_3 are all occurrences of u, both c_2 − c_1 and c_3 − c_2 are period lengths of u. Thus, ρ_u ≤ (c_3 − c_1)/2 ≤ (|v| − 1)/2 < |v|/2, and so u is periodic and v is a substring of u of length at least 2ρ_u. Therefore, by Lemma 5, ρ_u = ρ_v. Similarly, let α′ > α and suppose there are three candidates c_4, c_5, c_6 in C(I, α′). Notice that it is possible that c_1, c_2 and c_3 are not in C(I, α′), since enough time may have passed for them to leave. Suppose c_4, c_5 and c_6 share the same entrance prefix u′. Then ρ_{u′} = ρ_v = ρ_u.
Assume by contradiction that u′ ≠ u. Notice that the only possible locations of mismatches between u and u′ are the positions of wildcards in the length-i prefix of P, since both u and u′ match this prefix. In particular, v occurs at the r'th location of both u and u′. Let k be an index of a mismatch between u and u′; let the k'th character of u be x, and the k'th character of u′ be x′ ≠ x. Let γ be an integer (possibly negative) such that location k + γ·ρ_v in u is within the occurrence of v in u (and so also within the occurrence of v in u′). Notice that such a γ must exist since |v| ≥ 2ρ_v. Since ρ_u = ρ_v = ρ_{u′}, the character at location k + γ·ρ_v in u must be x, while the character at location k + γ·ρ_v in u′ must be x′. But u and u′ match at all of the locations corresponding to v. Thus we have obtained a contradiction, and so u′ = u; hence the entrance prefix u_I = u is unique, as required.
Lemma 12. Suppose I is a partitioning that satisfies the properties guaranteed by Lemma 9. For a pattern interval I = [i, j] ∈ I and time α ≥ 0, if there are h ≥ 3 candidates c 1 < c 2 < · · · < c h in C(I, α) that have u I as their entrance prefix, then the sequence c 1 , c 2 , . . . , c h forms an arithmetic progression whose difference is ρ u I .
Proof. The distance between any two candidates in C(I, α) is at most |I|, and |I| ≤ i by Property 3 of Lemma 9. Hence, by Lemma 3, all of the occurrences of u I in T that begin in text interval(I, α) form an arithmetic progression with difference ρ u I . Each of these occurrences matches the i length prefix of P , and therefore is a candidate in C(I, α). Hence, all the candidates of C(I, α) with u I as their entrance prefix form an arithmetic progression with difference ρ u I .

Implementation details. For any pattern interval I = [i, j] and time α we split the set of candidates C(I, α) into two disjoint sets. The set C ap (I, α) = {c ∈ C(I, α) | t c . . . t c+i−1 = u I } contains all the candidates whose entrance prefix is u I , and the set C̄ ap (I, α) = C(I, α) \ C ap (I, α) contains all the other candidates of C(I, α). We use a linked list L Q I to store all of the candidates of C̄ ap (I, α) together with their SI. Adding and removing a candidate that belongs in L Q I together with its SI is straightforward. The candidates of C ap (I, α) are maintained using a separate data structure that leverages Lemmas 11 and 12. Thus, during a Dequeue() operation, the queue checks whether the candidate to be returned is in L Q I or in the separate data structure for the C ap (I, α) candidates. Finally, for each pattern interval I the data structure stores the fingerprint of the principal period of u I .

Lemma 13. There exists an implementation of candidate-fingerprint-queues such that the queue Q I at time α > 0 maintains all the candidates of C ap (I, α) and their SI using O(1) words of space.
Proof. If |C ap (I, α)| ≤ 2 then Q I stores the candidates of C ap (I, α) explicitly in O(1) words of space. Otherwise, by Lemma 12, all the candidates of C ap (I, α) form an arithmetic progression. An arithmetic progression of arbitrary length can be represented using O(1) words of space. However, Q I also needs access to the SI for the candidates in this progression. To do this, Q I explicitly stores the first candidate (min C ap (I, α)) together with its SI, the common difference of the progression (ρ u I ), the length of the current progression, and the fingerprint of the principal period of u I . When a new candidate c with entrance fingerprint φ(u I ) enters Q I , c becomes the largest element in C ap (I, α), and so we first increment the length of the arithmetic progression, and if c is currently the only candidate in the arithmetic progression, then Q I stores c and its SI (since then c is the first candidate in the progression). When a Dequeue() operation needs to remove the first candidate c in the progression, then Q I removes c, which is stored explicitly together with its SI, decrements the length of the progression, and if there are remaining candidates in the progression then Q I computes the information for the new first remaining candidate in order to store its information explicitly. To do this, Q I first computes the location of the new first candidate from ρ u I and the location of c. The SI of the new first candidate is computed in constant time (via the sliding property) from the fingerprint of the principal period of u I and the candidate fingerprint of c.

We focus on intervals for which |C̄ ap (I, α)| ≥ 3, since if |C̄ ap (I, α)| ≤ 2 the bound is straightforward.
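As an illustration of the representation in the proof of Lemma 13, the following sketch stores an arithmetic progression of candidates in O(1) words; the class and field names are ours, and the SI/fingerprint bookkeeping described in the proof is reduced to comments:

```python
class ProgressionQueue:
    """Stores candidates first, first+diff, first+2*diff, ... in O(1) words.

    Simplified sketch: the real queue also stores the first candidate's SI and
    the fingerprint of the principal period of u_I, so that the SI of the next
    candidate can be recomputed in O(1) time on each dequeue.
    """

    def __init__(self, diff):
        self.first = None   # smallest candidate, stored explicitly
        self.diff = diff    # common difference of the progression (rho of u_I)
        self.length = 0     # number of candidates currently represented

    def enqueue(self, c):
        # New candidates always enter as the largest element of the progression.
        if self.length == 0:
            self.first = c  # c is now the explicitly stored first candidate
        else:
            assert c == self.first + self.length * self.diff
        self.length += 1

    def dequeue(self):
        c = self.first
        self.length -= 1
        # Promote the next candidate to be the explicitly stored one.
        self.first = c + self.diff if self.length > 0 else None
        return c
```

A usage sketch: with diff 3, enqueueing 10, 13, 16 and dequeueing twice returns 10 and then 13, while only the triple (first, diff, length) is ever stored.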
Let [i * , j * ] be the leftmost interval in I ′ . By definition of I ′ , we have j * − i * + 1 = ℓ, and so by Property 3 of Lemma 9, there exists a string v of length ℓ containing only non-wildcard characters that is a substring of the length i * prefix of P . Let r be an arbitrary location of v in p 0 . . . p i * −1 (since v could appear several times in the prefix). For any [i ′ , j ′ ] = I ∈ I ′ the entrance prefix (which does not contain wildcards) of each candidate in C(I, α) matches the i ′ length prefix of P (which can contain wildcards), and in particular, the location which is r positions to the left of any candidate in C(I, α) is the location of an occurrence of v in the text.
Since we focus on intervals I ∈ I ′ for which |C̄ ap (I, α)| ≥ 3, there exist three occurrences of v in the text in positions corresponding to a shift of r characters from locations of I's candidates. These occurrences are within a substring of the text of length at most 2|v|, since all three candidates are in C(I, α) and so the distance between the first and the last candidates is at most |I| − 1 ≤ ℓ − 1 = |v| − 1. Thus, by Lemma 4, v must be periodic, and the distance between any two candidates in C(I, α) must be a multiple of ρ v . Let ĉ = max ∪ I∈I ′ C(I, α) be the rightmost (largest index) candidate in the text intervals corresponding to pattern intervals in I ′ . Since ĉ is a candidate in some C(I, α) for I ∈ I ′ , there is an occurrence of v at location ĉ + r. Thus, t ĉ+r . . . t ĉ+r+ℓ−1 = v. We extend this occurrence of v to the left and to the right in T for as long as the length of the period does not increase. Let the resulting substring be t L+1 . . . t R−1 .

Claim 1. For any I ∈ I ′ and any candidate c ∈ C(I, α), we have c ≤ ĉ + r ≤ e c .

Proof. Let c be a candidate in C(I, α) for some I ∈ I ′ . By definition, since I ∈ I ′ , we have that |I| ≤ ℓ and so e c ≥ α − ℓ + 1. Since t ĉ+r . . . t ĉ+r+ℓ−1 = v, it must be that ĉ + r + ℓ − 1 ≤ α. Thus, ĉ + r ≤ α − ℓ + 1 ≤ e c . By the maximality of ĉ, it is obvious that c ≤ ĉ ≤ ĉ + r. Hence, we have that c ≤ ĉ + r ≤ e c .

Claim 2. For any c ∈ C̄ ap (I, α) with I ∈ I ′ , the entrance interval [c, e c ] of c contains either L or R.

Proof. For c ∈ C̄ ap (I, α) let u = u 0 . . . u i−1 be the entrance prefix of c. Recall that L < ĉ + r < R. By Claim 1 it must be that c ≤ ĉ + r ≤ e c and so we cannot have both L, R < c or both L, R > e c .
Assume by contradiction that L < c ≤ e c < R. We claim that there exists a text input T such that if we execute the algorithm with T as the text, then there exists some time β where C(I, β) contains three candidates with u as their entrance prefix. Then, by Lemma 11 we deduce that u = u I , in contradiction to the definition of C̄ ap (I, α).
Recall that the principal period length of t L+1 . . . t R−1 is ρ v . Since u = t c . . . t ec is a substring of t L+1 . . . t R−1 , it must be that ρ u ≤ ρ v . Recall that C̄ ap (I, α) contains at least three candidates. Let c 1 , c 2 , and c 3 be three distinct candidates in C̄ ap (I, α). Since c 1 , c 2 , and c 3 are all occurrences of u, both c 3 − c 2 and c 2 − c 1 are period lengths of u, and each is at most |I| − 1 ≤ |v| − 1. Therefore, by Lemma 5, ρ u = ρ v . Thus, ρ u ≤ (j − i)/2, implying that i + 2ρ u − j ≤ 0. Consider a long enough (of length at least i + 2ρ u ) text T which is composed of repeated concatenation of u 0 . . . u ρu−1 . Notice that the substrings of T of length i starting at locations 0, ρ u and 2ρ u are all exactly the string u, which matches p 0 . . . p i−1 . Consider an execution of the algorithm with T as the input text, and at time β = i + 2ρ u − 1 consider the set C(I, β). Being that i + 2ρ u − j ≤ 0, the interval [0, 2ρ u ] is a subinterval of text interval(I, β), and so 0, ρ u and 2ρ u are all within this interval. Thus, these locations are candidates in C(I, β) with u as their entrance prefix. Thus, by Lemma 11, it must be that u = u I , which contradicts c ∈ C̄ ap (I, α).
Let C̄ left ap (I, α) be the set of candidates in C̄ ap (I, α) whose entrance interval contains L, and let C̄ right ap (I, α) be the set of candidates in C̄ ap (I, α) whose entrance interval contains R. The sets C̄ left ap (I, α) and C̄ right ap (I, α) are not necessarily disjoint. Notice that by Claim 2, C̄ left ap (I, α) ∪ C̄ right ap (I, α) contains all the candidates of C̄ ap (I, α).

Claim 3. Σ I∈I ′ |C̄ left ap (I, α)| = O(|I ′ | + d).

Proof. Let I ∈ I ′ and let ≈ denote the match relation between symbols in Σ ∪ {?}.
Notice that the contribution to Σ I∈I ′ |C̄ left ap (I, α)| from all sets C̄ left ap (I, α) that have less than two candidates is at most O(|I ′ |). Thus, we will prove that for any set C̄ left ap (I, α) with at least two candidates, every candidate c ∈ C̄ left ap (I, α), except for possibly one candidate, is such that p L−c is a wildcard.

The Algorithm of Theorem 2
The algorithm of Theorem 1 for PMDW uses Õ(d) time per character and Õ(d) words of space. In this section we introduce the algorithm of Theorem 2, which extends this result, for a parameter 0 ≤ δ ≤ 1, to an algorithm that uses Õ(d 1−δ ) time per character and Õ(d 1+δ ) words of space.
An overview of a slightly modified version (for the sake of intuition) of the tradeoff algorithm is as follows. Let P * be the longest prefix of P such that π P * ≤ d δ . The tradeoff algorithm first finds all the occurrences of P * in T using a specialized algorithm for patterns with bounded wildcard-period length. If P * = P then this completes the tradeoff algorithm. Otherwise, the algorithm of Theorem 1 is executed on top of these occurrences, using a partitioning for which the properties defined in Lemma 9 are still satisfied. Let I * = [i * = |P * |, j * ] be the interval immediately following [i, |P * | − 1]. Each occurrence of P * in the text is inserted into the algorithm of Theorem 1 as a candidate directly into Q I * . Thus, the entrance prefixes of candidates in the queues match prefixes of P that are longer than P * and, by maximality of P * , these prefixes of P have large wildcard-period length. This implies that the average distance between two consecutive candidates that are occurrences of P * is at least d δ , and so, combined with a carefully designed scheduling approach for verifying candidates, we are able to obtain an Õ(d 1−δ ) amortized time cost per character.

Figure 9: Example of the matrix representation for pattern P = abcab?abcabcabcabcabc and q = 5; panel (c) shows the column pattern P q . Each color represents a unique offset pattern. The offset patterns P 5,1 and P 5,4 are equal and therefore they have the same id (column color). Since P 5,3 contains a wildcard, it is not associated with any id.
Overview. In Section 6.1 we describe the specialized algorithm for dealing with patterns whose wildcard-period length is at most τ , for some parameter τ > 1. In Section 6.2 we complete the proof of Theorem 2 by describing the missing details for the tradeoff algorithm. In particular, the proof of Theorem 2 uses the algorithm of Section 6.1 with τ = d δ .

Patterns with Small Wildcard-period Length
Let P be a pattern of length m with d wildcards such that π P < τ . Let q be an integer, which for simplicity is assumed to divide m (see Appendix A.1 where we discuss how to get rid of this assumption). Consider the conceptual matrix M q = {m q x,y } of size (m/q) × q where m q x,y = p (x−1)·q+y−1 . An example is given in Figure 9. For any integer 0 ≤ r < q the r'th column of M q corresponds to an offset pattern P q,r = p r p r+q p r+2q . . . p m−q+r . Notice that some offset patterns might be equal. Let Γ q = {P q,r | 0 ≤ r < q and ? ∉ P q,r } be the set of all the offset patterns that do not contain any wildcards. Each offset pattern in Γ q is given a unique id. The set of unique ids is denoted by ID q . We say that index i in P is covered by q if the column containing p i does not contain a wildcard, and so P q,i mod q ∈ Γ q . The columns of M q define a column pattern P q of length q, where the j'th character is the id of the P q,j column, or ? if P q,j ∉ Γ q (since P q,j contains wildcards). We partition T into q offset texts, where for every 0 ≤ r < q we define T q,r = t r t r+q t r+2q . . . . Using the dictionary matching streaming (DMS) algorithm of Clifford et al. [14] we look for occurrences of offset patterns from Γ q in each of the offset texts. We emphasize that we do not only find occurrences of P q,r in T q,r , since we cannot guarantee that the offset of T synchronizes with an occurrence of P . When the character t α arrives, the algorithm passes t α to the DMS algorithm for T q,α mod q . We also create a streaming column text T q whose characters correspond to the ids of offset patterns as follows. If one of the offset patterns is found in T q,α mod q , then its id is the α'th character in T q . Otherwise, we use a dummy character for the α'th character in T q .
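The decomposition into offset patterns can be sketched offline as follows; the helper name and id scheme are ours, and the streaming machinery (DMS instances, offset texts) is omitted:

```python
def offset_patterns(P, q):
    """Split pattern P into q offset patterns P_{q,r} = p_r p_{r+q} p_{r+2q} ...

    Returns (gamma, column_pattern), where gamma maps each wildcard-free offset
    pattern to a unique id (equal columns share an id), and column_pattern is
    the length-q column pattern: the id of each column, or '?' for columns
    that contain a wildcard.
    """
    assert len(P) % q == 0  # simplifying assumption, as in the text
    cols = ["".join(P[r::q]) for r in range(q)]
    gamma = {}
    column_pattern = []
    for col in cols:
        if "?" in col:
            column_pattern.append("?")  # column not in Gamma_q
        else:
            gamma.setdefault(col, len(gamma))  # equal columns get the same id
            column_pattern.append(gamma[col])
    return gamma, column_pattern
```

For instance, for the (made-up) pattern "abab?bab" and q = 4 the columns are "a?", "bb", "aa", "bb": the first contains a wildcard, and the two copies of "bb" share one id, so the column pattern is ['?', 0, 1, 0]. Index i of P is covered by q exactly when column_pattern[i % q] is not '?'.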
Full cover. Notice that an occurrence of P in T necessarily creates an occurrence of P q in T q . Such occurrences are found via the black box algorithm of Clifford et al. [11]. However, an occurrence of P q in T q does not necessarily mean there was an occurrence of P in T , since some characters in P are not covered by q. In order to avoid such false positives we run the process in parallel with several choices of q, while guaranteeing that each non-wildcard character in P is covered by at least one of those choices. Thus, if there is an occurrence of P q at location i in T q for all the choices of q, then it must be that P appears in T at location i. The choices of q are given by the following lemma.
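For intuition, here is a naive offline rendering of the full-cover idea: a position is reported only when every q ∈ Q accepts it using q's wildcard-free columns, and with a covering set Q this coincides with a true wildcard match. The function names are ours, and nothing here is streaming:

```python
def matches_with_wildcards(T, P, i):
    """Naive check: does P (with '?' wildcards) occur in T at position i?"""
    return all(p == "?" or p == T[i + k] for k, p in enumerate(P))

def covered_match(T, P, q, i):
    """Check position i using only the columns of P covered by q
    (i.e. the columns without wildcards), mimicking one q-instance."""
    covered = [r for r in range(q) if "?" not in P[r::q]]
    return all(P[k] == T[i + k] for k in range(len(P)) if k % q in covered)

def report(T, P, Q):
    """Report a position only if every q in Q accepts it."""
    n, m = len(T), len(P)
    return [i for i in range(n - m + 1)
            if all(covered_match(T, P, q, i) for q in Q)]
```

For the toy inputs T = "abcabdab", P = "ab?", the single choice q = 3 already covers both non-wildcard indices of P, and report returns exactly the true wildcard matches [0, 3].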
Lemma 15. There exists a set Q of O(log d) prime numbers such that any index of a non-wildcard character in P is covered by at least one prime number q ∈ Q, and each number in Q is at most Õ(d).
Proof. The proof uses the probabilistic method: we show that a randomly constructed set Q has the desired properties with probability strictly larger than 0. Since our proof is constructive, it also provides a randomized construction of Q.
It is well known that for a prime number q, every integer 0 ≤ z < q defines a congruence class which contains all integers i such that i mod q = z. For any two distinct natural numbers x, y ∈ N, let D x,y be the set of prime numbers q such that x and y are in the same congruence class modulo q (i.e. x mod q = y mod q). Notice that, in the interpretation of the pattern columns in the conceptual matrix, if q ∈ D x,y then p x and p y are in the same column of the conceptual matrix M q . Recall that W is the set of occurrences of wildcards in P . Thus, if 0 ≤ j < m is an index such that j ∉ W and w ∈ W is such that q ∈ D j,w , then j is surely not covered by q. By the Chinese remainder theorem, |D j,w | < log m (otherwise, for γ = ∏ q∈D j,w q we would have γ ≥ ∏ q∈D j,w 2 = 2 |D j,w | ≥ m, and so j mod γ = w mod γ would imply that j = w).
For any 0 ≤ j < m such that j ∉ W , let D j = ∪ w∈W D j,w , so |D j | ≤ Σ w∈W |D j,w | < |W | log m = d log m. If 2d ≥ m/log 2 m then the proof is trivialized by choosing Q to contain only the smallest prime number which is at least m: such a prime is smaller than 2m = Õ(d), and since it is larger than every index of P it covers all of the indices. If 2d < m/log 2 m then, by Corollary 1 in [3], there are at least 2d log m prime numbers whose values are upper bounded by 2d log 2 m. Let Q̃ be the set of those prime numbers. For a random q ∈ Q̃, the probability that a specific non-wildcard pattern index j is not covered by q is at most |D j |/|Q̃| < (d log m)/(2d log m) = 1/2. Let Q be a set of 2 log m randomly chosen prime numbers from Q̃. The probability that a specific non-wildcard pattern index j is not covered by any of the prime numbers in Q is less than (1/2) 2 log m ≤ 1/m 2 . Thus, the probability that there exists a non-wildcard pattern index j which is not covered by any of the prime numbers in Q is less than (m − d)/m 2 ≤ 1/m. Therefore, there must exist a set Q that covers all of the indices of non-wildcard characters from P .
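The proof is constructive, and the following sketch mirrors it: sample a few primes from the range suggested by the density bound and retry until every non-wildcard index is covered, falling back to a single prime ≥ m (which trivially covers all indices). Names and constants are ours and chosen loosely:

```python
import math
import random

def primes_up_to(n):
    """Sieve of Eratosthenes."""
    sieve = [True] * (n + 1)
    sieve[0] = sieve[1] = False
    for p in range(2, int(n ** 0.5) + 1):
        if sieve[p]:
            sieve[p * p :: p] = [False] * ((n - p * p) // p + 1)
    return [p for p, ok in enumerate(sieve) if ok]

def is_covered(j, wildcards, q):
    # Index j is covered by q iff no wildcard shares j's congruence class mod q.
    return all(j % q != w % q for w in wildcards)

def build_cover(P, attempts=50, rng=random):
    """Randomized construction of a covering prime set, mirroring Lemma 15."""
    m = len(P)
    wild = [i for i, ch in enumerate(P) if ch == "?"]
    nonwild = [i for i in range(m) if P[i] != "?"]
    d = max(len(wild), 1)
    pool = primes_up_to(max(4, int(2 * d * math.log2(max(m, 2)) ** 2)))
    k = min(len(pool), max(1, 2 * int(math.log2(max(m, 2)))))
    for _ in range(attempts):
        Q = rng.sample(pool, k)
        if all(any(is_covered(j, wild, q) for q in Q) for j in nonwild):
            return Q
    # Trivial fallback: a single prime >= m covers every index.
    return [next(p for p in primes_up_to(2 * m) if p >= m)]
```

By construction the returned set always covers every non-wildcard index: either a sampled set passes the explicit check, or the fallback prime exceeds all indices, so no two of them share a congruence class.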
From a space usage perspective, we need |Γ q | to be small, since this directly affects the space usage of the DMS algorithm, which uses Õ(k) space, where k is the number of patterns in the dictionary. In our case k = |Γ q |. In order to bound the size of Γ q we use the following lemma.
Lemma 16. If π P < τ then |Γ q | = O(τ ).

Proof. Since π P < τ , there exists a string S = s 0 s 1 . . . s 2m−2 of length 2m − 1 that contains at least m/τ occurrences of P . For each id in ID q we pick an index of a representative column in M q that has this id, and denote this set by R q . Let r 1 be the minimum index in R q . For every index 0 ≤ i < m let S i = s i . . . s i+m−1 (see Figure 10). For every 0 ≤ r < q let S i,q,r = s i+r s i+r+q . . . s i+m−q+r , and so for any integer 0 ≤ ∆ < q − r we have S i,q,r+∆ = S i+∆,q,r . Notice that if S i matches P then P q,r = S i,q,r for each r ∈ R q .
Let i be an index of an occurrence of P in S. For any distinct r, r ′ ∈ R q , it must be that S i,q,r = P q,r ≠ P q,r ′ = S i,q,r ′ , since representative columns have distinct ids. In particular, for any r ∈ R q such that r > r 1 , we have P q,r 1 ≠ P q,r = S i,q,r = S i+r−r 1 ,q,r 1 . This implies that i + r − r 1 cannot be an occurrence of P . Hence, every occurrence of P in S eliminates |R q | − 1 locations in S from being an occurrence of P . We now show that the sets of eliminated locations defined by distinct occurrences are disjoint. Assume without loss of generality that S contains at least two occurrences. Let i 1 and i 2 be two distinct occurrences of P in S, and assume by contradiction that an index j is eliminated by both of these occurrences. Since s i 1 . . . s i 1 +m−1 matches P , we have that S i 1 ,q,j−i 1 = P q,j−i 1 and j − i 1 ∈ R q . Similarly, we have that S i 2 ,q,j−i 2 = P q,j−i 2 and j − i 2 ∈ R q . Being that S i 1 ,q,j−i 1 = S i 2 ,q,j−i 2 we have that P q,j−i 2 = P q,j−i 1 , contradicting the definition of R q . Therefore, the maximum number of occurrences of P in S is at most |S|/|R q | = (2m − 1)/|R q |. Since S contains at least m/τ occurrences of P , it must be that m/τ ≤ (2m − 1)/|R q |, which implies that |Γ q | = |R q | ≤ 2τ = O(τ ).
Complexities. For a single q ∈ Q, the algorithm creates q = Õ(d) offset patterns and texts. For each such offset text the algorithm applies an instance of the DMS algorithm with a dictionary of O(τ ) strings (by Lemma 16). Since each instance of the DMS algorithm uses Õ(τ ) words of space [14], the total space usage for all instances of the DMS algorithm is Õ(dτ ) words. Moreover, the time per character in each DMS algorithm is Õ(1), and each arriving character is injected into only one of the DMS algorithms (for this specific q). In addition, the algorithm uses an instance of the black box algorithm for T q , with a pattern of length q. This uses another O(q) = Õ(d) words of space and another Õ(1) time per character [11]. Thus the total space usage due to one element in Q is Õ(dτ ) words. Since |Q| = O(log d), the total space usage for all elements in Q is Õ(dτ ) words, and the total time per arriving character is Õ(1). Thus we have proven the following.
Theorem 17. For any τ ≥ 1, there exists a randomized Monte Carlo algorithm for PMDW on patterns P with π P < τ in the streaming model, which succeeds with probability 1 − 1/poly(n), uses Õ(dτ ) words of space and spends Õ(1) time per arriving text character.

Proof of Theorem 2
In this section we combine the algorithm of Theorem 1 with the algorithm of Theorem 17 and introduce an algorithm for patterns with general wildcard-period length, thereby proving Theorem 2.
Prior to Section 6.1 we presented an almost accurate description of the algorithm. Only two parts of the description require elaboration: how to insert occurrences of P * into the appropriate candidate-fingerprint-queue efficiently, and how to schedule validations of candidates so that the amortized cost is low. We first focus on how to insert candidates and later we discuss the scheduling.
Direct insertion of candidates. The challenge with inserting occurrences of P * into Q I * is that the candidate-fingerprint-queue data structure uses the SI of candidates, and so the straightforward ways for providing this information together with the new candidates (which are occurrences of P * ) cost either too much time or too much space. In order to meet our desired complexities, we first investigate the purposes of different parts of SI.
The SI for a candidate c in C(I = [i, j], α) consists of the candidate fingerprint, φ(t 0 . . . t c−1 ), and the entrance fingerprint, φ(t c . . . t c+i−1 ). The SI has two purposes. The first is to validate a candidate after a Dequeue() operation, in which case the algorithm makes use of both parts of the SI in order to compute φ(t c+i . . . t c+j ) by combining the SI with the text fingerprint. The second purpose is to compute the next entrance fingerprints of candidates in order to distinguish between candidates that are stored as part of an arithmetic progression and candidates that are not. The entrance fingerprint is obtained, via the sliding property, from the candidate fingerprint in the SI and the current text fingerprint.
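The fingerprints in question are Karp-Rabin style, and the sliding property referred to above allows fingerprints of prefixes to be combined and subtracted in O(1) time. A toy offline version follows (fixed base and modulus, chosen here only for illustration; a real implementation picks them randomly):

```python
MOD = (1 << 61) - 1  # a Mersenne prime modulus (illustrative choice)
BASE = 131           # a real implementation picks the base at random

def prefix_fingerprints(text):
    """phi[k] is the Karp-Rabin fingerprint of the prefix text[:k]."""
    phi = [0]
    for ch in text:
        phi.append((phi[-1] * BASE + ord(ch)) % MOD)
    return phi

def substring_fp(phi, a, b):
    """Fingerprint of text[a:b] from two prefix fingerprints, in O(1) time
    (the sliding property): subtract a shifted phi[a] from phi[b]."""
    return (phi[b] - phi[a] * pow(BASE, b - a, MOD)) % MOD
```

In the notation above, validating a candidate c of interval I = [i, j] amounts to computing the fingerprint of t c+i . . . t c+j from the fingerprint of t 0 . . . t c+i−1 and the current text fingerprint, which is exactly one such subtraction.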
Notice that in order to validate c, the algorithm only needs the fingerprint φ(t 0 . . . t c+i−1 ). Also notice that entrance prefixes are only used for candidates that are at some point part of a stored arithmetic progression. Thus, for a specially chosen subset of strings Ψ ⊆ Σ |P * | we precompute all of the fingerprints of strings in Ψ. The set Ψ is chosen so that for any occurrence of P * that is injected as a candidate c, where c is at some point part of a stored arithmetic progression, the occurrence of P * at location c is in Ψ. We use the DMS algorithm [14] to locate strings from Ψ in the text, and whenever such a string appears, we compute the SI for the corresponding candidate in constant time from the stored fingerprint and the current text fingerprint. We emphasize that not all of the candidates that correspond to strings in Ψ necessarily need to be part of an arithmetic progression at some point. However, in order to reduce the space usage, we require that Ψ is not too large, and in particular |Ψ| = O(d + log m). For a candidate c that does not correspond to a string in Ψ, instead of maintaining the SI of c, we explicitly maintain the fingerprint φ(t 0 . . . t c+i−1 ), where c ∈ C(I = [i, j], α). Notice that whenever such a candidate enters a new text interval, the text fingerprint at that time is exactly the information which we need to store.
Creating Ψ. Consider all pattern intervals I = [i, j] ∈ I with i ≥ i * . Notice that there are at most O(d + log m) such pattern intervals. For each such interval I, let ψ I be the prefix of u I of length |P * |. Since, by Lemma 11, a candidate c ∈ C ap (I, α) implies an occurrence of u I at location c, ψ I also appears at location c. Thus, we define Ψ to be the set containing ψ I for all such pattern intervals I. Since any candidate in an arithmetic progression at time α must be in C ap (I, α) for some interval I, it is guaranteed that when such a candidate c was injected as an occurrence of P * , that occurrence must have been an occurrence of ψ I , and so Ψ has the required properties.
Scheduling validations. Since the only bound we have proven on the number of pattern intervals I = [i, j] ∈ I with i ≥ i * is O(d + log m), if each time a new text character arrives we perform a Dequeue() operation for each one of the pattern intervals, then the time cost can be as large as O(d + log m) which is too much. The solution for reducing this time cost is to only perform a Dequeue() operation on Q I when a candidate c actually leaves text interval(I, α) and needs to be validated. This is implemented by maintaining a priority queue on top of the pattern intervals, where the keys that are used are the next time a candidate exits the corresponding text interval. Each time a candidate leaves a text interval, the key for the queue of that interval is updated to the time the next candidate leaves (if such a candidate exists). When a candidate entering a text interval is the only candidate of that text interval, then the key for the queue of this text interval is also updated.
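The scheduling idea can be sketched with a standard lazy-deletion binary heap; the class is ours and models only the priority-queue layer, not the candidate queues themselves:

```python
import heapq

class ValidationScheduler:
    """Wakes up a pattern interval only when its earliest candidate is about
    to leave the interval's text interval. Key updates are handled by lazy
    deletion: stale heap entries are skipped when popped."""

    def __init__(self):
        self.heap = []       # entries (exit_time, interval_id)
        self.next_exit = {}  # interval_id -> currently valid exit time

    def schedule(self, interval_id, exit_time):
        # Called when a candidate becomes the earliest one of its text interval.
        self.next_exit[interval_id] = exit_time
        heapq.heappush(self.heap, (exit_time, interval_id))

    def due(self, now):
        """Yield interval ids whose earliest candidate exits at time <= now."""
        while self.heap and self.heap[0][0] <= now:
            exit_time, iid = heapq.heappop(self.heap)
            if self.next_exit.get(iid) == exit_time:  # skip stale entries
                del self.next_exit[iid]
                yield iid
```

With this layer, a Dequeue() on Q I is triggered only for intervals returned by due(α), matching the description above: work is charged to candidates actually leaving their text intervals rather than to every interval on every character.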
Complexities. Recall that I * = [i * , j * ] is a pattern interval such that i * = |P * |, and that each time the algorithm finds an occurrence of P * , the corresponding candidate is inserted into Q I * . Let P ′ be the prefix of P of length j * + 1. By maximality of P * , it must be that π P ′ > d δ . We partition the time usage of the algorithm into three parts. The first is the amount of time spent on finding occurrences of P * using the algorithm of Theorem 17, which is Õ(1) per character. The second is the amount of time spent performing Enqueue() and Dequeue() operations on Q I * , which is also Õ(1) since we perform O(1) operations on this queue per arriving character. The third is the amount of time spent on Enqueue() and Dequeue() operations on Q I for I = [i, j] with i > j * . These operations only apply to candidates that are occurrences of P ′ . For this part we use amortized analysis.
By definition of wildcard-period length, for any string S of size 2|P ′ | − 1, we have d δ < π P ′ ≤ |P ′ |/occ(S, P ′ ). Being that occ(S, P ′ ) ≤ |P ′ |, we have d δ < 2|P ′ |/occ(S, P ′ ). Notice that for a text T of size n ≥ |P ′ |, we must have occ(T, P ′ ) < 2n/d δ . This is because otherwise, if n ≥ 2|P ′ | − 1 then there exists a substring of T of length 2|P ′ | − 1 with at least 2|P ′ |/d δ occurrences of P ′ , and if n < 2|P ′ | − 1 then we can pad T to create such a string. In both cases we contradict d δ < 2|P ′ |/occ(S, P ′ ) for any string S of length 2|P ′ | − 1.
The total amount of time spent on each occurrence of P ′ is Õ(d), and so the total cost for processing T on candidates that are also occurrences of P ′ is at most Õ(occ(T, P ′ ) · d) = Õ((2n/d δ ) · d) = Õ(n · d 1−δ ). Thus, the amortized cost per character is Õ(d 1−δ ). For the space complexity, the most expensive part is the use of the algorithm of Theorem 17, which takes Õ(d · d δ ) = Õ(d 1+δ ) words of space. This completes the proof of Theorem 2.