Give Me Some Slack: Efficient Network Measurements

Many networking applications require timely access to recent network measurements, which can be captured using a sliding window model. Maintaining such measurements is a challenging task due to the fast line speed and scarcity of fast memory in routers. In this work, we study the impact of allowing slack in the window size on the asymptotic requirements of sliding window problems. That is, the algorithm can dynamically adjust the window size between W and W(1+τ), where τ is a small positive parameter. We demonstrate this model's attractiveness by showing that it enables efficient algorithms for problems such as Maximum and General-Summing that require Ω(W) bits even for constant factor approximations in the exact sliding window model. Additionally, for problems that admit sub-linear approximation algorithms, such as Basic-Summing and Count-Distinct, the slack model enables a further asymptotic improvement. The main focus of the paper is on the widely studied Basic-Summing problem of computing the sum of the last W integers from {0, 1, . . . , R} in a stream. While it is known that Ω(W log R) bits are needed in the exact window model, we show that approximate windows allow an exponential space reduction for constant τ. Specifically, for τ = Θ(1), we present a space lower bound of Ω(log(RW)) bits. Additionally, we show an Ω(log(W/ε)) lower bound for εRW additive approximations and an Ω(log(W/ε) + log log R) bits lower bound for (1+ε) multiplicative approximations. Our work is the first to study this problem in the exact and additive approximation settings. For all settings, we provide memory optimal algorithms that operate in worst case constant time. This strictly improves on the work of [14] for (1+ε)-multiplicative approximation, which requires O(ε⁻¹ log(RW) log log(RW)) space and performs updates in O(log(RW)) worst case time. Finally, we show asymptotic improvements for the Count-Distinct, General-Summing and Maximum problems.
2012 ACM Subject Classification Theory of computation → Streaming, sublinear and near linear time algorithms


Introduction
Network algorithms in diverse areas such as traffic engineering, load balancing and quality of service [2,9,21,24,31] rely on timely link measurements. In such applications, recent data is often more relevant than older data, motivating the notions of aging and sliding windows [6,11,15,25,27]. For example, a sudden decrease in the average packet size on a link may indicate a SYN attack [26]. Additionally, a load balancer may benefit from knowing the current utilization of a link to avoid congestion [2].
While conceptually simple, conveying the necessary information to network algorithms is a difficult challenge due to current memory technology limitations. Specifically, DRAM memory is abundant but too slow to cope with the line rate, while SRAM memory is fast enough but has a limited capacity [10,13,29]. Online decisions are therefore realized through space efficient data structures [4,7,8,16,17,23,28,30] that store measurement statistics in a concise manner. For example, [16,28] utilize probabilistic counters that only require O(log log N) bits to approximately represent numbers up to N. Others conserve space using variable sized counter encoding [17,23] or by monitoring only the frequent elements [6].
Basic-Summing is one of the most basic textbook examples of such approximated sliding window stream processing problems [14]. In this problem, one is required to keep track of the sum of the last W elements, when all elements are non-negative integers in the range {0, 1, . . . , R}. The work in [14] provides a (1+ε)-multiplicative approximation of this problem using O(ε⁻¹(log² W + log R · (log W + log log R))) bits. The amortized time complexity is O(log R / log W) and the worst case is O(log W + log R). In contrast, we previously showed an εRW-additive approximation with Θ(ε⁻¹ + log W) bits [3].
Sliding window counters (approximated or accurate) require asymptotically more space than plain stream counters. Such window counters are prohibitively large for networking devices, which already optimize the space consumption of plain counters. This paper explores the concept of slack, or approximated sliding windows, bridging this gap. Figure 1 illustrates a "window" in this model. Here, each query may select a τ-slack window whose size is between W (the green elements) and W(1+τ) (the green plus yellow elements). The goal is to compute the sum with respect to this chosen window. Slack windows were also considered in previous works [14,27], and we call the problem of maintaining the sum over a slack window Slack Summing. Datar et al. [14] showed that constant slack reduces the required memory from O(ε⁻¹(log² W + log R · (log W + log log R))) to O(ε⁻¹ log(RW) log log(RW)). For τ-slack windows, they provide a (1+ε)-multiplicative approximation using O(ε⁻¹ log(RW)(log log(RW) + log τ⁻¹)) bits.

Our Contributions
This paper studies the space and time complexity reductions that can be attained by allowing slack, i.e., an error in the window size. Our results demonstrate exponentially smaller and asymptotically faster data structures compared to various problems over exact windows. We start by deriving lower bounds for three variants of the Basic-Summing problem: computing an exact sum over a slack window, or combining it with an additive or a multiplicative error in the sum. We present algorithms that are based on dividing the stream into Wτ-sized blocks. Our algorithms sum the elements within each block and represent each block's sum in a cyclic array of size τ⁻¹. We use multiple compression techniques during different stages to drive down the space complexity. The resulting algorithms are space optimal, substantially simpler than previous work, and reduce the update time to O(1).
Next, we introduce algorithms for the Slack Summing problem, which asymptotically reduce the required memory compared to the sliding window model. For the exact and additive error versions of the problem, we provide memory optimal algorithms. In the multiplicative error setting, we provide an O(τ⁻¹(log ε⁻¹ + log log(RWτ)) + log(RW)) space algorithm. This is asymptotically optimal when τ = Ω(1/log W) and R = poly(W). It also asymptotically improves [14] when τ⁻¹ = o(ε⁻¹ log(RW)). We further provide an asymptotically optimal solution for constant τ, even when R = W^ω(1). All our algorithms are deterministic and operate in worst case constant time. In contrast, the algorithm of [14] works in O(log(RW)) worst case time.
To exemplify our results, consider monitoring the average bandwidth (in bytes per second) passed through a router in a 24 hour window, i.e., W = 86,400 seconds. Assuming we use a 100GbE fiber transceiver, our stream values are bounded by R ≈ 2³⁴ bytes. If we are willing to withstand an error of ε = 2⁻²⁰ (i.e., about 16KBps), the work of [3] provides an additive approximation over the sliding window and requires about 120KB. In contrast, using a 10 minute slack (τ = 1/144), our algorithm for exact Slack Summing requires only 800 bytes, 99% less than approximate summing over an exact sliding window. For the same slack size, the algorithm of [14] requires more space than our exact algorithm even for a large 3% error. Further, if we also allow the same additive error (ε = 2⁻²⁰), we provide an algorithm that requires only 240 bytes, a reduction of more than 99.8%! Table 1 compares our results for the important case of constant slack with [14]. As depicted, our exact algorithm is faster and more space efficient than the multiplicative approximation of [14]. Comparing our multiplicative approximation algorithm to that of [14], we present exponential space reductions in the dependencies on ε⁻¹ and R, with an asymptotic reduction in W as well. We also improve the update time from O(log(RW)) to O(1).
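As a sanity check, the sketch below (our own helper, not taken from the paper) evaluates the space bound of the exact algorithm of Section 4.1, (τ⁻¹ + 1)⌈log₂(RWτ + 1)⌉ bits plus lower-order bookkeeping, on the parameters above.

```python
import math

def exact_slack_summing_bits(R, W, tau):
    """Space of the exact slack-summing algorithm: tau^-1 block counters
    plus one running counter, each of ceil(log2(R*W*tau + 1)) bits,
    plus lower-order bookkeeping terms."""
    counter_bits = math.ceil(math.log2(R * W * tau + 1))
    blocks = round(1 / tau)  # tau^-1 block counters in the cyclic buffer
    return (blocks + 1) * counter_bits + math.ceil(math.log2(R * W ** 2)) + 4

W = 86400       # 24-hour window, one value per second
R = 2 ** 34     # bytes per second on a 100GbE link
tau = 1 / 144   # a 10-minute slack
print(exact_slack_summing_bits(R, W, tau) / 8, "bytes")
```

Running it reproduces the roughly 800-byte figure quoted above.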

Finally, we apply the slack window approach to multiple streaming problems, including Maximum, General-Summing, Count-Distinct and Standard-Deviation. We show that, while some of these problems cannot be approximated on an exact window in sub-linear space (e.g., Maximum and General-Summing), we can easily do so for slack windows. In the Count-Distinct problem, a constant slack yields an asymptotic space reduction over [11,19].

Lower Bounds
In this section, we analyze the space required for solving the Slack Summing problems. Intuitively, our bounds are derived by constructing a set of inputs that any algorithm must distinguish between to meet the required guarantees. There are two tricks that we frequently use in these lower bounds. The first is setting the input such that the slack consists only of zeros, so that the algorithm must return the desired approximation of the remaining window. The second is a "cycle argument": consider two inputs x and x · y for x, y ∈ {0, 1, . . . , R}*.
If both lead to the same memory configuration, then so does x · y^k for any k ∈ N. Thus, if there is a k such that no single answer approximates both x and x · y^k well, then x and x · y must have led to separate memory configurations in the first place.

(W, τ)-Exact Summing
We start by proving lower bounds on the memory required for exact Slack Summing.

Proof. Consider the following language.
That is, L_E1 contains the word consisting of W + Wτ consecutive zeros, and the rest of the words in L_E1 are composed of these components in this order. Our lower bound stems from the observation that every word in L_E1 must lead to a different state. Therefore, the number of required bits is at least log |L_E1| > log(RW²) − 1. Further, this number is an integer, and therefore at least ⌈log(RW²)⌉ bits are required.
First, notice that the word composed of W + Wτ zeros requires a unique configuration, as A must return 0 after processing that word. In contrast, it must not return 0 after processing any other word, as there is at least a single R within the last W elements.
Let w_1, w_2 ∈ L_E1 be two different words that are not all-zeros. We need to show that w_1 and w_2 require different memory configurations.
By the definition of L_E1, w_1 and w_2 are parameterized by (i_1, σ_1, j_1) and (i_2, σ_2, j_2) respectively, and both are preceded by at least Wτ zeros. If i_1 ≠ i_2 or σ_1 ≠ σ_2, the windows' sums differ, and thus A cannot return the same count for both, regardless of the slack, as it is all zeros in both w_1 and w_2.
Next, assume that i_1 = i_2, σ_1 = σ_2 and, without loss of generality, j_1 < j_2. This means that both w_1 and w_2 have the same count. Assume by contradiction that after processing w_1 and w_2, A reaches the same memory configuration. Since A is deterministic, this means that it must reach the same configuration after seeing w_1 · 0^{z(j_2−j_1)} for any integer z. By choosing z = W(1+τ), we get that the algorithm reaches this configuration once again while the entire window consists of zeros. This is a contradiction, since σ_1, σ_2 ≠ 0, and the algorithm cannot answer both w_1 and w_1 · 0^{z(j_2−j_1)} correctly.
We now use Lemma 1 to show the following lower bound on (W, τ)-Exact Summing algorithms:

Theorem 2. Any deterministic algorithm A that solves the (W, τ)-Exact Summing problem must use at least max{⌈log(RW²)⌉, ⌈τ⁻¹/2⌉ · log(RWτ + 1)} bits.
Proof. Lemma 1 shows a ⌈log(RW²)⌉ bound. We proceed with showing a lower bound of ⌈τ⁻¹/2⌉ · log(RWτ + 1) bits. Consider the following languages. Each of the words in L_E2 has a distinct sum of literals, and each number in {0, 1, . . . , RWτ} is the sum of a word. We show that each input leads to a distinct memory configuration. Consider two inputs S_1 and S_2 that differ in their χ'th words, and note that the preceding Wτ elements of both are all zeros. An illustration of the setting appears in Figure 2. By our choice of χ, the sum of the last W elements of S_1* and S_2* is different, and since the slack is all zeros, no answer is correct for both. Finally, note that this implies that S_1 and S_2 had to reach different configurations, as otherwise A would reach the same configuration after processing the additional 2Wτ(χ − 1/2) zeros.
(W, τ, ε)-Additive Summing

Before we prove Theorem 3, we start with a simpler lower bound.
First, notice that |L_A1| = ε⁻¹/4 and that all words in L_A1 have length at most W/2. We now show that every word in L_A1, padded with trailing zeros, must have a dedicated memory configuration, thereby implying a ⌈log(Wε⁻¹/8)⌉ bits bound. Let w_1 and w_2 be two such words. If their most recent W elements differ by more than 2εRW, there is no output that is correct for both. Note that the slack of both w_1 and w_2 is all zeros. Hence, w_1 and w_2 require different memory configurations.
Assume that x_1 = x_2 and, by contradiction, that both w_1 and w_2 reached the same memory configuration. Since w_1 ≠ w_2 and x_1 = x_2, we have q_1 ≠ q_2; without loss of generality, q_1 < q_2. This implies that w_1 is a prefix of w_2, so that w_2 = w_1 · 0^{q_2−q_1}. Thus, A enters the shared configuration after reading w_1 and revisits it after reading 0^{q_2−q_1}. A is a deterministic algorithm, and therefore it reaches the same configuration also for the following word: w_1 · 0^{(W+Wτ)(q_2−q_1)}. In that word, the last W + Wτ elements are all zeros, while the sum of the last W elements in w_1 is at least 2εRW. Hence, there is no return value that is correct for both w_1 and w_1 · 0^{(W+Wτ)(q_2−q_1)}.
Proof. Lemma 4 shows that A must use at least log(W/ε) − O(1) bits. Given x ∈ [RWτ], we denote by rep(x) a word whose sum is x. We consider the following languages. Our goal is to show that no two words in L_A2 lead to the same memory configuration.
Next, consider the following sequences. Additionally, the Wτ-element slack in both S_1* and S_2* is all zeros. Now, since the sums of w_{χ,1} and w_{χ,2} must differ by at least 2εRW, no number can approximate both with less than εRW error.

(W, τ, ε)-Multiplicative Summing
In this section, we show lower bounds for multiplicative approximations of Slack Summing. We start with Lemma 5, whose proof appears in the full version of this paper [5].
To extend our multiplicative lower bound, we use the following fact (Fact 1): a sequence whose n'th element is of the form x · c^{n−1} + y can be represented using a closed form. Using this fact, we show the following lemma.

Proof. To apply Fact 1, we define an upper bounding sequence {b_{i,k}}, i = 1, 2, . . .. We can then rewrite the n'th element of the sequence in closed form and use this representation to derive an upper bound on b_{n,k}. Finally, since a_{n,k} ≤ b_{n,k} for any n and k, the lemma follows.

We now define the integer set I_k.

Proof. Clearly, the cardinality of I_k is the largest n for which a_{n,k} ≤ ψ_k. According to Lemma 6, we have that a_{n,k} ≤ 4ε⁻¹(1+ε)^{n+1} ψ_{k−1}, and the bound follows. We proceed with a stronger lower bound for non-constant τ values.
Lemma 8. Consider the following language L_{M,2}. That is, every word in the L_{M,2} language consists of a concatenation of words w_1, . . . , w_{⌈τ⁻¹/2⌉}, such that every w_i starts with Wτ zeros followed by a string representing an integer in I_{⌈τ⁻¹/2⌉+1−i}, which is defined above. According to Lemma 7, we can bound |L_{M,2}| from below. Next, we show that every two words in L_{M,2} must reach different memory configurations, thereby implying an Ω(log |L_{M,2}|) bits lower bound. Let S_1 and S_2 be two such words, and assume by contradiction that they lead A to the same memory configuration. Let χ ∈ {1, . . . , ⌈τ⁻¹/2⌉} be such that w_{χ,1} ≠ w_{χ,2}. Since A reaches an identical configuration after reading S_1 and S_2, and as it is deterministic, A must reach the same configuration when processing S_1 · 0^{2Wτ(χ−1/2)} and S_2 · 0^{2Wτ(χ−1/2)}. Next, observe that for every k ∈ {1, . . . , ⌈τ⁻¹/2⌉}, the representation length of any of its words is bounded by ⌈ψ_k/R⌉. Now, since every word w_{i,j} starts with a sequence of Wτ zeros, the slack size chosen by the algorithm is irrelevant, and the sums the algorithm must estimate are s(w_{χ,1}) and s(w_{χ,2}), where s(w_{i,j}) is simply the sum of the symbols in w_{i,j}. Note that s(w_{χ,1}) and s(w_{χ,2}) are integers in I_{⌈τ⁻¹/2⌉+1−χ}. We assume without loss of generality that s(w_{χ,1}) < s(w_{χ,2}). It follows that no returned value is correct for both, where the last inequality follows from the definition of I_{⌈τ⁻¹/2⌉+1−χ}. Finally, we combine Lemma 5 and Lemma 8 to obtain the following lower bound:

Upper Bounds
In this section, we introduce solutions for the Slack Summing problems. In general, all our algorithms share a structure that consists of a subset of the following steps, where "rounding" has a different meaning for the exact, additive and multiplicative variants:
1. Round the arriving item.
2. Add the item into a counter y and round the counter. If a Wτ-sized block ends, store a compressed representation of y. Sometimes we propagate the compression error to the following block; otherwise, we zero y.
3. Use the block values and y to construct an estimation for the sum.
A key idea in our additive and multiplicative algorithms is to introduce rounding errors but maintain the accountability trail, so that they do not snowball and exceed the desired guarantees. In the additive algorithm, our double rounding technique asymptotically improves over running 1/τ separate plain stream (insertion only) algorithm instances.

(W, τ)-Exact Summing
We divide the stream into Wτ-sized blocks and sum the arriving elements in each block with a ⌈log(RWτ + 1)⌉-bit counter. We maintain the sum of the current block in a variable called y; c maintains the number of elements within the current block, and i is the current block number. The variable b is a cyclic buffer of τ⁻¹ blocks. Every Wτ steps, we assign the value of y to the oldest block (b_i) and increment i. Intuitively, we "forget" b_i when its block is no longer part of the window. To satisfy queries in constant time, we also maintain the sum of all active counters in a ⌈log(RW(1+τ) + 1)⌉-bit variable named B. Algorithm 1 provides pseudocode for the described algorithm. We now analyze the memory consumption of Algorithm 1.

Proof. y takes ⌈log(RWτ + 1)⌉ bits; B requires ⌈log(RW + 1)⌉ bits; i adds ⌈log τ⁻¹⌉ bits, while c needs ⌈log Wτ⌉ bits. Finally, b is a τ⁻¹-sized array of counters, each allocated with ⌈log(RWτ + 1)⌉ bits. Overall, it uses (τ⁻¹ + 1)⌈log(RWτ + 1)⌉ + ⌈log(RW²)⌉ + 4 bits.
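The scheme above can be sketched in a few lines; the following is a simplified Python rendition under our own naming (Algorithm 1 in the paper is the authoritative pseudocode):

```python
class ExactSlackSum:
    """Exact sum over a slack window of size in [W, W(1+tau)):
    tau^-1 block counters in a cyclic buffer, plus a running sum B."""
    def __init__(self, W, tau):
        self.block_size = int(W * tau)       # W*tau, assumed integral
        self.b = [0] * int(1 / tau)          # cyclic buffer of block sums
        self.y = 0                           # sum of the current block
        self.c = 0                           # offset within current block
        self.i = 0                           # index of the oldest block
        self.B = 0                           # sum of all stored blocks

    def update(self, x):
        self.y += x
        self.c += 1
        if self.c == self.block_size:          # the block ended:
            self.B += self.y - self.b[self.i]  # "forget" the oldest block
            self.b[self.i] = self.y
            self.i = (self.i + 1) % len(self.b)
            self.y = 0
            self.c = 0

    def output(self):
        # sum of the last W + c elements, with slack 0 <= c < W*tau
        return self.B + self.y, self.c
```

For instance, after exactly W updates, output() returns the sum of all W elements with c = 0; c then grows until the next block boundary.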

We conclude that Algorithm 1 is asymptotically optimal.
Theorem 11 shows that Algorithm 1 is only 4× larger than the lower bound. In the full version of this paper [5], we show that in some cases we can get considerably closer to the lower bound.
Finally, in the full version of this paper [5] we show that Algorithm 1 is correct.

(W, τ, ε)-Additive Summing
We now show that additional memory savings can be obtained by combining slackness with an additive error. First, we consider the case where τ ≤ 2ε. In [3], we proposed an algorithm that sums over an (exact) W-element window using the optimal Θ(ε⁻¹ + log W) bits, with an additive error of εRW. Next, notice that if an algorithm solves (W, τ, ε)-Additive Summing, it also solves (W, τ, τ/2)-Additive Summing; hence, we can apply Theorem 3 to conclude that it requires Ω(τ⁻¹ + log W) = Ω(ε⁻¹ + log W) bits. Thus, we can run the algorithm from [3] and remain asymptotically memory optimal with no slack at all! Henceforth, we assume that τ > 2ε; we present an algorithm for the problem using a 2-stage rounding technique. When a new item arrives, we scale it by R and then round the result to O(log ε⁻¹) bits. As in Section 4.1, we break the stream into non-overlapping blocks of size Wτ and compute the sum of each block separately. However, we now sum the rounded values rather than the exact input, with an O(log Wτ)-bit counter denoted y. Once the block is completed, we round its sum such that it is represented with O(log(τ/ε)) bits. Note that this second rounding is done for the entire block's sum, while we still have the "exact" sum of rounded fractions. Thus, we propagate the second rounding error to the following block. An illustration of our algorithm appears in Figure 3. Here, Round_υ(z) refers to rounding a fractional number z ∈ [0, 1] to the closest number z' such that 2^υ · z' ∈ N. Algorithm 2 provides pseudocode for the algorithm, which uses the following variables:
1. y: a fixed point variable that uses ⌈log Wτ⌉ + 1 bits to store its integral part and additional υ₁ ≜ ⌈log ε⁻¹⌉ + 1 bits for storing the fractional part.
2. b: a cyclic array that contains τ⁻¹ elements, each of which takes υ₂ ≜ ⌈log(τ/ε)⌉ bits.
3. B: keeps the sum of the elements in b and is represented using ⌈log τ⁻¹⌉ + υ₂ + 1 bits.
4. i: the index variable used for tracking the oldest block in b.
5. c: a variable that keeps the offset within the Wτ-sized block.
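The two-stage rounding can be illustrated with the following simplified, float-based sketch (the actual algorithm uses fixed point counters with the bit widths υ₁ and υ₂ above; the grid constants here are our own choices):

```python
import math

class AdditiveSlackSum:
    """eps*R*W-additive sum over a slack window via 2-stage rounding:
    items are scaled to [0,1] and rounded to a fine grid; each finished
    block's sum is re-rounded to a coarse grid, and the block rounding
    error is carried into the next block."""
    def __init__(self, W, tau, eps, R):
        self.R = R
        self.block_size = int(W * tau)
        self.fine = 2.0 ** -(math.ceil(math.log2(1 / eps)) + 1)  # item grid
        self.coarse = eps * W / 4                                # block grid
        self.b = [0.0] * int(1 / tau)   # coarsely rounded block sums
        self.y = 0.0                    # current block sum plus carry
        self.c = 0
        self.i = 0
        self.B = 0.0                    # sum of stored block values

    def _round(self, z, grid):
        return round(z / grid) * grid

    def update(self, x):
        self.y += self._round(x / self.R, self.fine)   # stage 1: item rounding
        self.c += 1
        if self.c == self.block_size:
            stored = self._round(self.y, self.coarse)  # stage 2: block rounding
            self.B += stored - self.b[self.i]
            self.b[self.i] = stored
            self.i = (self.i + 1) % len(self.b)
            self.y -= stored    # carry the block rounding error forward
            self.c = 0

    def output(self):
        # estimate of the sum of the last W + c elements, 0 <= c < W*tau
        return (self.B + self.y) * self.R, self.c
```

Because the residual y − stored is carried forward rather than discarded, the second rounding never accumulates across blocks.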
We now analyze the memory consumption of Algorithm 2.
Finally, Theorem 14 shows that Algorithm 2 is correct. The proof is deferred to the full version of this paper [5].

(W, τ, ε)-Multiplicative Summing
In this section, we present Algorithm 3, which provides a (1+ε) multiplicative approximation for the Slack Summing problem. Compared to Algorithm 1, we achieve a space reduction by representing each sum of Wτ elements using O(log log(RWτ) + log ε⁻¹) bits. Specifically, when a block ends, if its sum was y, we store ρ = ⌈log_{1+ε/2} y⌉ (we allow a value of −∞ for ρ if y = 0). To achieve O(1) Output, we also store an approximate window sum B, which is now a fixed point fractional variable with O(log RW) bits for its integral part and additional O(log ε⁻¹) bits for storing a fraction. To update B's value for a new ρ, we round down the value of (1+ε/2)^ρ. Specifically, for a real number x, we denote (x)↓ ≜ ⌊x · k⌋/k, for k ≜ ⌈4ε⁻¹⌉. Our pseudocode appears in Algorithm 3. The algorithm requires O(τ⁻¹(log log(RWτ) + log ε⁻¹) + log RW) bits of space and is memory optimal when R = W^O(1) and τ = Ω(1/log RW). The full analysis of Algorithm 3 is deferred to the full version of this paper [5]. Next, we present an alternative (W, τ, ε)-Multiplicative Summing algorithm that achieves optimal space consumption for τ = Θ(1), regardless of the value of R.
Intuitively, we shave the Ω(log R) bits from the space requirement of Algorithm 3 by using an approximate representation for our y variable and by not keeping the B variable that allowed O(1) time queries regardless of the value of τ. To avoid using Ω(log R) bits in y, we use a fixed point representation in which O(log ε⁻¹ + log log(RWτ)) bits are allocated for its integral part and another O(log Wτ) for the fractional part. The goal of y is still to approximate the sum of the elements within a block, but now we aim for the sum to be approximately (1+ε/3)^y. Whenever a block ends, we store only the integral part of y in our cyclic array b to save space. When queried, we compute an estimate for the sum using all of the values in b, which makes our query procedure take O(log τ⁻¹) time. To use the fixed point structure of y, we use the operator (·)⇓ that rounds a real number x into (x)⇓ ≜ ⌊x · Wτ⌋/(Wτ). In the full version of this paper [5], we prove the following theorem:

Theorem 15. For τ = Θ(1), Algorithm 4 processes elements and answers queries in O(1) time, uses O(log(W/ε) + log log R) bits, and is asymptotically optimal.
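Both algorithms rest on storing block sums on a logarithmic scale. A minimal sketch of this encoding, using the ρ = ⌈log_{1+ε/2} y⌉ representation of Algorithm 3 (function names are ours; rounding-direction details follow the full version [5]):

```python
import math

def encode_block(y, eps):
    """Store a block sum y as rho = ceil(log_{1+eps/2}(y));
    rho fits in O(log log y + log(1/eps)) bits. None encodes y = 0."""
    if y == 0:
        return None
    return math.ceil(math.log(y, 1 + eps / 2))

def decode_block(rho, eps):
    """(1+eps/2)**rho over-approximates y within a (1+eps/2) factor."""
    if rho is None:
        return 0.0
    return (1 + eps / 2) ** rho
```

For example, with ε = 0.1 a block sum of 12345 is stored as a small integer ρ, and decoding yields a value between y and y(1+ε/2).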

The Mean of a Slack Window
For some applications, there is value in knowing the mean of a slack window. For example, a load balancer may be interested in the transmission throughput. In exact windows, the sum and the mean can be derived from each other, as the window size is constant. In slack windows, the window size changes, but our algorithms also return the current slack offset 0 ≤ c < Wτ. That is, by dividing S by W + c we get an estimation of the mean (we assume that the stream size is larger than W). Specifically, Algorithm 1 provides the exact mean; Algorithm 2 approximates it with εR additive error, while Algorithm 3 yields a (1+ε) multiplicative approximation.
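In code, deriving the mean from a summing algorithm's output pair ⟨S, c⟩ is a one-liner (a hypothetical helper for illustration):

```python
def slack_mean(S, c, W):
    """Mean over the slack window: the summing algorithm returned the
    pair (S, c), and the window it covered has exactly W + c elements."""
    return S / (W + c)
```

The same division applies verbatim to the exact, additive, and multiplicative summing outputs.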

Other Measurements over Slack Windows
We now explore the benefits of the slack model for other problems.
Maximum. While maintaining the maximum of a sliding window can be useful for applications such as anomaly detection [21,26], tracking it over an exact window is often infeasible. Specifically, any algorithm for the maximum over an (exact) window must use Ω(W log(R/W)) bits [14]. The following theorem shows that we can get a much more efficient algorithm for slack windows. The proof appears in the full version of this paper [5]. Observe that the following bounds match for τ values that are not too small (τ = R^{Ω(1)−1}).

Theorem 16.
Tracking the maximum over a slack window deterministically requires O(τ⁻¹ log R) and Ω(τ⁻¹ log(Rτ)) bits.
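One natural way to meet the O(τ⁻¹ log R) upper bound is to keep a single maximum per Wτ-sized block in a cyclic buffer, in the spirit of our summing algorithms (our own simplified rendition; the formal construction appears in the full version [5]):

```python
class SlackMax:
    """Maximum over a slack window: keep the max of each W*tau-sized
    block in a cyclic buffer of tau^-1 entries; a query returns the
    max over the stored blocks and the current partial block, covering
    between W and W + c elements with 0 <= c < W*tau."""
    def __init__(self, W, tau):
        self.block_size = int(W * tau)
        self.blocks = [None] * int(1 / tau)  # maxima of past blocks
        self.cur = None                      # max of the current block
        self.c = 0
        self.i = 0                           # index of the oldest block

    def update(self, x):
        self.cur = x if self.cur is None else max(self.cur, x)
        self.c += 1
        if self.c == self.block_size:
            self.blocks[self.i] = self.cur   # overwrite the oldest block
            self.i = (self.i + 1) % len(self.blocks)
            self.cur = None
            self.c = 0

    def query(self):
        vals = [v for v in self.blocks if v is not None]
        if self.cur is not None:
            vals.append(self.cur)
        return max(vals)
```

Each stored maximum takes ⌈log(R+1)⌉ bits, giving O(τ⁻¹ log R) bits overall.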

Standard-Deviation.
Building on the ability of our summing algorithms to provide the size of the slack window that they approximate, we can compute standard deviations over slack windows. Intuitively, the standard deviation of the window can be expressed as σ_W = √(|W|⁻¹ Σ_{x∈W} x² − m_W²),

where W is the slack window and m_W is its mean. We can then use two slack summing instances to track Σ_{x∈W} x² and m_W = |W|⁻¹ Σ_{x∈W} x. This gives us an algorithm that computes the exact standard deviation over slack windows using O(τ⁻¹ log(RWτ)) space. Similarly, by using approximate rather than exact summing solutions, we can compute a (1+ε) multiplicative approximation of the standard deviation using O(τ⁻¹(log ε⁻¹ + log log(RWτ)) + log W) bits, or an εR-additive approximation using O(τ⁻¹ log(τ/ε) + log W) space. We expand on this further in the full version of this paper [5].
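Given the outputs of the two summing instances, the final computation is straightforward (the helper name is ours):

```python
import math

def slack_std(sum_sq, sum_x, c, W):
    """Standard deviation over a slack window of size n = W + c, given
    the two slack-summing outputs: the sum of squares and the plain sum.
    Uses the identity  var = E[x^2] - (E[x])^2  over the window."""
    n = W + c
    mean = sum_x / n
    return math.sqrt(sum_sq / n - mean ** 2)
```

For a window [2, 4, 2, 4] (n = 4), the inputs are sum_sq = 40 and sum_x = 12, yielding a standard deviation of 1.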
General-Summing. General-Summing is similar to Basic-Summing, except that the integers can be in the range {−R, . . . , R}. That is, we now allow negative elements as well. Datar et al. [14] proved that General-Summing requires Ω(W) bits, even for R = 1 and constant factor approximations. In contrast, our exact summing algorithm from Section 4.1 trivially generalizes to General-Summing and allows an exact solution over slack windows.
Count-Distinct. Estimating the number of distinct elements in a stream is another useful metric. In networking, the packet header is used to identify different flows, and it is useful to know how many distinct flows are currently active. A sudden spike in the number of active flows is often an indication of a threat to the network. It may indicate the propagation of a worm or virus, port scans that are used to detect vulnerabilities in the system, and even Distributed Denial of Service (DDoS) attacks [12,18,20].
Here, we have studied the memory reduction that can be obtained by following a similar flow to our summing algorithms: we break the stream into Wτ-sized blocks and run a state of the art approximation algorithm on each block separately. Luckily, count-distinct algorithms are mergeable [1]. That is, we can merge the summaries of the blocks to obtain an estimation of the number of distinct items in their union. In the full version of this paper [5], we show that this approach yields an algorithm with superior space and query time compared to the state of the art algorithms for counting distinct elements over sliding windows [11,19]. Formally, we prove the following theorem.
Theorem 17. For τ = Θ(1) and any fixed m > 0, there exists an algorithm that uses O(m) space, performs updates in constant time and answers queries in O(m) time, such that the result approximates a window whose size is in [W, W(1+τ)]; the resulting estimation is asymptotically unbiased and has a standard deviation of σ = O(1/√m). State of the art approaches for exact windows [11,19] require O(m log(W/m)) space and O(m log(W/m)) time per query for a similar standard deviation.
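To illustrate the mergeability-based approach, here is a simplified sketch that keeps one KMV (k minimum values) summary per block; KMV stands in for the state of the art estimator used in the paper, and all names and parameters are our own:

```python
import hashlib

def _hash(x):
    """Map an element to a pseudo-uniform value in (0, 1]."""
    h = hashlib.blake2b(str(x).encode(), digest_size=8).digest()
    return (int.from_bytes(h, "big") + 1) / 2.0 ** 64

class SlackDistinct:
    """Count-distinct over a slack window: one mergeable KMV sketch
    (the k smallest hash values) per W*tau-sized block; queries merge
    the tau^-1 stored block sketches with the current one."""
    def __init__(self, W, tau, k=256):
        self.k = k
        self.block_size = int(W * tau)
        self.blocks = [set() for _ in range(int(1 / tau))]
        self.cur = set()
        self.c = 0
        self.i = 0

    def _add(self, sketch, x):
        sketch.add(_hash(x))
        if len(sketch) > self.k:
            sketch.discard(max(sketch))   # keep only the k smallest

    def update(self, x):
        self._add(self.cur, x)
        self.c += 1
        if self.c == self.block_size:
            self.blocks[self.i] = self.cur
            self.i = (self.i + 1) % len(self.blocks)
            self.cur = set()
            self.c = 0

    def query(self):
        # merging KMV sketches = union, trimmed back to the k smallest
        merged = sorted(set().union(self.cur, *self.blocks))[: self.k]
        if len(merged) < self.k:
            return len(merged)            # fewer than k distinct: exact
        return (self.k - 1) / merged[-1]  # standard KMV estimator
```

Merging is register-free set union here, mirroring the block-merge step; with k = Θ(m), the relative error is O(1/√m), in line with the theorem.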

Discussion
In this work we have explored the slack window model for multiple streaming problems.
We have shown that it enables asymptotic space and time improvements. Particularly, introducing slack enables logarithmic space exact algorithms for certain problems, such as Maximum and General-Summing. In contrast, these problems do not admit sub-linear space approximations in the exact window model. Even for problems that do have sub-linear space approximations, such as Standard-Deviation and Count-Distinct, adding slack asymptotically improves the space requirement and allows for constant time updates. Much of our work has focused on the classic Basic-Summing problem. Based on our findings, we argue that allowing slack in the window size is an attractive approximation axis, as it enables greater space reductions compared to an error in the sum. As an example, for a fixed ε value, computing a (1+ε)-multiplicative approximation requires Ω(log(RW) log W) space [14]. Conversely, a (1+τ) multiplicative error in the window size, for a constant τ, allows summing using Θ(log(RW)) bits, the same as summing W elements without sliding windows! Given that for exact windows randomized algorithms have the same asymptotic complexity as deterministic ones [3,14], we expect randomization to have limited benefits for slack windows as well.

Figure 3 An illustration of our 2-stage rounding technique. Arriving elements are rounded to ⌈log ε⁻¹⌉ + 1 bits. We then sum the rounded fractions of each block and round the resulting sum into ⌈log(τ/ε)⌉ bits. The second rounding error is propagated to the next block.

Table 1
Comparison of Basic-Summing algorithms. Our contributions are in bold. All algorithms process elements in constant time, except for the rightmost column where both update in O(log(RW)) time. We present matching lower bounds for all our algorithms.

The stream S = x_1, x_2, . . . , x_t, where at each step a new element x_i ∈ [R] is added to S. A W-sized window contains only the last W elements: x_{t−W+1} . . . x_t. We say that F is a τ-slack W-sized window if there exists c ∈ [Wτ − 1] such that F = x_{t−(W+c)+1} . . . x_t. For simplicity, we assume that τ⁻¹ and Wτ are integers. Unless explicitly specified, the base of all logs is 2. Algorithms for the Slack Summing problem are required to support two operations:
1. Update(x_t): process a new element x_t ∈ [R].
2. Output(): return a pair ⟨S, c⟩ such that c ∈ N is the slack size and S is an estimation of the sum of the last W + c elements, i.e., of Ŝ ≜ Σ_{i=t−(W+c)+1}^{t} x_i.

1. Exact algorithms: an algorithm A solves (W, τ)-Exact Summing if its Output returns ⟨S, c⟩ that satisfies 0 ≤ c < Wτ and S = Ŝ.
2. Additive algorithms: we say that A solves (W, τ, ε)-Additive Summing if its Output function returns ⟨S, c⟩ that satisfies 0 ≤ c < Wτ and |S − Ŝ| < εRW.