Search Results

Documents authored by Kuszmaul, William


Document
Track A: Algorithms, Complexity and Games
Towards an Analysis of Quadratic Probing

Authors: William Kuszmaul and Zoe Xi

Published in: LIPIcs, Volume 297, 51st International Colloquium on Automata, Languages, and Programming (ICALP 2024)


Abstract
Since 1968, one of the simplest open questions in the theory of hash tables has been to prove anything nontrivial about the correctness of quadratic probing. We make the first tangible progress towards this goal, showing that there exists a positive-constant load factor at which quadratic probing is a constant-expected-time hash table. Our analysis applies more generally to any fixed-offset open-addressing hash table, and extends to higher load factors in the case where the hash table examines blocks of some size B = ω(1).

Cite as

William Kuszmaul and Zoe Xi. Towards an Analysis of Quadratic Probing. In 51st International Colloquium on Automata, Languages, and Programming (ICALP 2024). Leibniz International Proceedings in Informatics (LIPIcs), Volume 297, pp. 103:1-103:19, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2024)


Copy BibTex To Clipboard

@InProceedings{kuszmaul_et_al:LIPIcs.ICALP.2024.103,
  author =	{Kuszmaul, William and Xi, Zoe},
  title =	{{Towards an Analysis of Quadratic Probing}},
  booktitle =	{51st International Colloquium on Automata, Languages, and Programming (ICALP 2024)},
  pages =	{103:1--103:19},
  series =	{Leibniz International Proceedings in Informatics (LIPIcs)},
  ISBN =	{978-3-95977-322-5},
  ISSN =	{1868-8969},
  year =	{2024},
  volume =	{297},
  editor =	{Bringmann, Karl and Grohe, Martin and Puppis, Gabriele and Svensson, Ola},
  publisher =	{Schloss Dagstuhl -- Leibniz-Zentrum f{\"u}r Informatik},
  address =	{Dagstuhl, Germany},
  URL =		{https://drops.dagstuhl.de/entities/document/10.4230/LIPIcs.ICALP.2024.103},
  URN =		{urn:nbn:de:0030-drops-202463},
  doi =		{10.4230/LIPIcs.ICALP.2024.103},
  annote =	{Keywords: quadratic probing, hashing, open addressing, witness trees}
}
Document
Approximating Dynamic Time Warping Distance Between Run-Length Encoded Strings

Authors: Zoe Xi and William Kuszmaul

Published in: LIPIcs, Volume 244, 30th Annual European Symposium on Algorithms (ESA 2022)


Abstract
Dynamic Time Warping (DTW) is a widely used similarity measure for comparing strings that encode time series data, with applications to areas including bioinformatics, signature verification, and speech recognition. The standard dynamic-programming algorithm for DTW takes O(n²) time, and there are conditional lower bounds showing that no algorithm can do substantially better. In many applications, however, the strings x and y may contain long runs of repeated letters, meaning that they can be compressed using run-length encoding. A natural question is whether the DTW-distance between these compressed strings can be computed efficiently in terms of the lengths k and 𝓁 of the compressed strings. Recent work has shown how to achieve O(k𝓁² + 𝓁 k²) time, leaving open the question of whether a near-quadratic Õ(k𝓁)-time algorithm might exist. We show that, if a small approximation loss is permitted, then a near-quadratic time algorithm is indeed possible: our algorithm computes a (1 + ε)-approximation for DTW(x, y) in Õ(k𝓁 / ε³) time, where k and 𝓁 are the number of runs in x and y. Our algorithm allows for DTW to be computed over any metric space (Σ, δ) in which distances are O(log n)-bit integers. Surprisingly, the algorithm also works even if δ does not induce a metric space on Σ (e.g., δ need not satisfy the triangle inequality).

Cite as

Zoe Xi and William Kuszmaul. Approximating Dynamic Time Warping Distance Between Run-Length Encoded Strings. In 30th Annual European Symposium on Algorithms (ESA 2022). Leibniz International Proceedings in Informatics (LIPIcs), Volume 244, pp. 90:1-90:19, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2022)


Copy BibTex To Clipboard

@InProceedings{xi_et_al:LIPIcs.ESA.2022.90,
  author =	{Xi, Zoe and Kuszmaul, William},
  title =	{{Approximating Dynamic Time Warping Distance Between Run-Length Encoded Strings}},
  booktitle =	{30th Annual European Symposium on Algorithms (ESA 2022)},
  pages =	{90:1--90:19},
  series =	{Leibniz International Proceedings in Informatics (LIPIcs)},
  ISBN =	{978-3-95977-247-1},
  ISSN =	{1868-8969},
  year =	{2022},
  volume =	{244},
  editor =	{Chechik, Shiri and Navarro, Gonzalo and Rotenberg, Eva and Herman, Grzegorz},
  publisher =	{Schloss Dagstuhl -- Leibniz-Zentrum f{\"u}r Informatik},
  address =	{Dagstuhl, Germany},
  URL =		{https://drops.dagstuhl.de/entities/document/10.4230/LIPIcs.ESA.2022.90},
  URN =		{urn:nbn:de:0030-drops-170281},
  doi =		{10.4230/LIPIcs.ESA.2022.90},
  annote =	{Keywords: Dynamic time warping distance, approximation algorithms, run-length encodings, computational geometry}
}
Document
Track A: Algorithms, Complexity and Games
Memoryless Worker-Task Assignment with Polylogarithmic Switching Cost

Authors: Aaron Berger, William Kuszmaul, Adam Polak, Jonathan Tidor, and Nicole Wein

Published in: LIPIcs, Volume 229, 49th International Colloquium on Automata, Languages, and Programming (ICALP 2022)


Abstract
We study the basic problem of assigning memoryless workers to tasks with dynamically changing demands. Given a set of w workers and a multiset T ⊆ [t] of |T| = w tasks, a memoryless worker-task assignment function is any function ϕ that assigns the workers [w] to the tasks T based only on the current value of T. The assignment function ϕ is said to have switching cost at most k if, for every task multiset T, changing the contents of T by one task changes ϕ(T) by at most k worker assignments. The goal of memoryless worker task assignment is to construct an assignment function with the smallest possible switching cost. In past work, the problem of determining the optimal switching cost has been posed as an open question. There are no known sub-linear upper bounds, and after considerable effort, the best known lower bound remains 4 (ICALP 2020). We show that it is possible to achieve polylogarithmic switching cost. We give a construction via the probabilistic method that achieves switching cost O(log w log (wt)) and an explicit construction that achieves switching cost polylog (wt). We also prove a super-constant lower bound on switching cost: we show that for any value of w, there exists a value of t for which the optimal switching cost is w. Thus it is not possible to achieve a switching cost that is sublinear strictly as a function of w. Finally, we present an application of the worker-task assignment problem to a metric embeddings problem. In particular, we use our results to give the first low-distortion embedding from sparse binary vectors into low-dimensional Hamming space.

Cite as

Aaron Berger, William Kuszmaul, Adam Polak, Jonathan Tidor, and Nicole Wein. Memoryless Worker-Task Assignment with Polylogarithmic Switching Cost. In 49th International Colloquium on Automata, Languages, and Programming (ICALP 2022). Leibniz International Proceedings in Informatics (LIPIcs), Volume 229, pp. 19:1-19:19, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2022)


Copy BibTex To Clipboard

@InProceedings{berger_et_al:LIPIcs.ICALP.2022.19,
  author =	{Berger, Aaron and Kuszmaul, William and Polak, Adam and Tidor, Jonathan and Wein, Nicole},
  title =	{{Memoryless Worker-Task Assignment with Polylogarithmic Switching Cost}},
  booktitle =	{49th International Colloquium on Automata, Languages, and Programming (ICALP 2022)},
  pages =	{19:1--19:19},
  series =	{Leibniz International Proceedings in Informatics (LIPIcs)},
  ISBN =	{978-3-95977-235-8},
  ISSN =	{1868-8969},
  year =	{2022},
  volume =	{229},
  editor =	{Boja\'{n}czyk, Miko{\l}aj and Merelli, Emanuela and Woodruff, David P.},
  publisher =	{Schloss Dagstuhl -- Leibniz-Zentrum f{\"u}r Informatik},
  address =	{Dagstuhl, Germany},
  URL =		{https://drops.dagstuhl.de/entities/document/10.4230/LIPIcs.ICALP.2022.19},
  URN =		{urn:nbn:de:0030-drops-163608},
  doi =		{10.4230/LIPIcs.ICALP.2022.19},
  annote =	{Keywords: Distributed Task Allocation, Metric Embeddings, Probabilistic Method}
}
Document
Track A: Algorithms, Complexity and Games
Optimal Time-Backlog Tradeoffs for the Variable-Processor Cup Game

Authors: William Kuszmaul and Shyam Narayanan

Published in: LIPIcs, Volume 229, 49th International Colloquium on Automata, Languages, and Programming (ICALP 2022)


Abstract
The p-processor cup game is a classic and widely studied scheduling problem that captures the setting in which a p-processor machine must assign tasks to processors over time in order to ensure that no individual task ever falls too far behind. The problem is formalized as a multi-round game in which two players, a filler (who assigns work to tasks) and an emptier (who schedules tasks) compete. The emptier’s goal is to minimize backlog, which is the maximum amount of outstanding work for any task. Recently, Kuszmaul and Westover (ITCS, 2021) proposed the variable-processor cup game, which considers the same problem, except that the amount of resources available to the players (i.e., the number p of processors) fluctuates between rounds of the game. They showed that this seemingly small modification fundamentally changes the dynamics of the game: whereas the optimal backlog in the fixed p-processor game is Θ(log n), independent of p, the optimal backlog in the variable-processor game is Θ(n). The latter result was only known to apply to games with exponentially many rounds, however, and it has remained an open question what the optimal tradeoff between time and backlog is for shorter games. This paper establishes a tight trade-off curve between time and backlog in the variable-processor cup game. We show that, for a game consisting of t rounds, the optimal backlog is Θ (b (t)) where b(t) = t (if t ≤ log n) t^{1/3} log^{2/3} ({n^3}/t + 1) (if log n < t ≤ n^3) n (if n ^ 3 < t). An important consequence is that the optimal backlog is Θ(n) if and only if t ≥ Ω(n³). Our techniques also allow for us to resolve several other open questions concerning how the variable-processor cup game behaves in beyond-worst-case-analysis settings.

Cite as

William Kuszmaul and Shyam Narayanan. Optimal Time-Backlog Tradeoffs for the Variable-Processor Cup Game. In 49th International Colloquium on Automata, Languages, and Programming (ICALP 2022). Leibniz International Proceedings in Informatics (LIPIcs), Volume 229, pp. 85:1-85:20, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2022)


Copy BibTex To Clipboard

@InProceedings{kuszmaul_et_al:LIPIcs.ICALP.2022.85,
  author =	{Kuszmaul, William and Narayanan, Shyam},
  title =	{{Optimal Time-Backlog Tradeoffs for the Variable-Processor Cup Game}},
  booktitle =	{49th International Colloquium on Automata, Languages, and Programming (ICALP 2022)},
  pages =	{85:1--85:20},
  series =	{Leibniz International Proceedings in Informatics (LIPIcs)},
  ISBN =	{978-3-95977-235-8},
  ISSN =	{1868-8969},
  year =	{2022},
  volume =	{229},
  editor =	{Boja\'{n}czyk, Miko{\l}aj and Merelli, Emanuela and Woodruff, David P.},
  publisher =	{Schloss Dagstuhl -- Leibniz-Zentrum f{\"u}r Informatik},
  address =	{Dagstuhl, Germany},
  URL =		{https://drops.dagstuhl.de/entities/document/10.4230/LIPIcs.ICALP.2022.85},
  URN =		{urn:nbn:de:0030-drops-164263},
  doi =		{10.4230/LIPIcs.ICALP.2022.85},
  annote =	{Keywords: Cup Games, Potential Functions, Greedy}
}
Document
What Does Dynamic Optimality Mean in External Memory?

Authors: Michael A. Bender, Martín Farach-Colton, and William Kuszmaul

Published in: LIPIcs, Volume 215, 13th Innovations in Theoretical Computer Science Conference (ITCS 2022)


Abstract
A data structure A is said to be dynamically optimal over a class of data structures 𝒞 if A is constant-competitive with every data structure C ∈ 𝒞. Much of the research on binary search trees in the past forty years has focused on studying dynamic optimality over the class of binary search trees that are modified via rotations (and indeed, the question of whether splay trees are dynamically optimal has gained notoriety as the so-called dynamic-optimality conjecture). Recently, researchers have extended this to consider dynamic optimality over certain classes of external-memory search trees. In particular, Demaine, Iacono, Koumoutsos, and Langerman propose a class of external-memory trees that support a notion of tree rotations, and then give an elegant data structure, called the Belga B-tree, that is within an O(log log N)-factor of being dynamically optimal over this class. In this paper, we revisit the question of how dynamic optimality should be defined in external memory. A defining characteristic of external-memory data structures is that there is a stark asymmetry between queries and inserts/updates/deletes: by making the former slightly asymptotically slower, one can make the latter significantly asymptotically faster (even allowing for operations with sub-constant amortized I/Os). This asymmetry makes it so that rotation-based search trees are not optimal (or even close to optimal) in insert/update/delete-heavy external-memory workloads. To study dynamic optimality for such workloads, one must consider a different class of data structures. The natural class of data structures to consider are what we call buffered-propagation trees. Such trees can adapt dynamically to the locality properties of an input sequence in order to optimize the interactions between different inserts/updates/deletes and queries. We also present a new form of beyond-worst-case analysis that allows for us to formally study a continuum between static and dynamic optimality. Finally, we give a novel data structure, called the Jεllo Tree, that is statically optimal and that achieves dynamic optimality for a large natural class of inputs defined by our beyond-worst-case analysis.

Cite as

Michael A. Bender, Martín Farach-Colton, and William Kuszmaul. What Does Dynamic Optimality Mean in External Memory?. In 13th Innovations in Theoretical Computer Science Conference (ITCS 2022). Leibniz International Proceedings in Informatics (LIPIcs), Volume 215, pp. 18:1-18:23, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2022)


Copy BibTex To Clipboard

@InProceedings{bender_et_al:LIPIcs.ITCS.2022.18,
  author =	{Bender, Michael A. and Farach-Colton, Mart{\'\i}n and Kuszmaul, William},
  title =	{{What Does Dynamic Optimality Mean in External Memory?}},
  booktitle =	{13th Innovations in Theoretical Computer Science Conference (ITCS 2022)},
  pages =	{18:1--18:23},
  series =	{Leibniz International Proceedings in Informatics (LIPIcs)},
  ISBN =	{978-3-95977-217-4},
  ISSN =	{1868-8969},
  year =	{2022},
  volume =	{215},
  editor =	{Braverman, Mark},
  publisher =	{Schloss Dagstuhl -- Leibniz-Zentrum f{\"u}r Informatik},
  address =	{Dagstuhl, Germany},
  URL =		{https://drops.dagstuhl.de/entities/document/10.4230/LIPIcs.ITCS.2022.18},
  URN =		{urn:nbn:de:0030-drops-156145},
  doi =		{10.4230/LIPIcs.ITCS.2022.18},
  annote =	{Keywords: Dynamic optimality, external memory, buffer propagation, search trees}
}
Document
Incremental Edge Orientation in Forests

Authors: Michael A. Bender, Tsvi Kopelowitz, William Kuszmaul, Ely Porat, and Clifford Stein

Published in: LIPIcs, Volume 204, 29th Annual European Symposium on Algorithms (ESA 2021)


Abstract
For any forest G = (V, E) it is possible to orient the edges E so that no vertex in V has out-degree greater than 1. This paper considers the incremental edge-orientation problem, in which the edges E arrive over time and the algorithm must maintain a low-out-degree edge orientation at all times. We give an algorithm that maintains a maximum out-degree of 3 while flipping at most O(log log n) edge orientations per edge insertion, with high probability in n. The algorithm requires worst-case time O(log n log log n) per insertion, and takes amortized time O(1). The previous state of the art required up to O(log n / log log n) edge flips per insertion. We then apply our edge-orientation results to the problem of dynamic Cuckoo hashing. The problem of designing simple families ℋ of hash functions that are compatible with Cuckoo hashing has received extensive attention. These families ℋ are known to satisfy static guarantees, but do not come typically with dynamic guarantees for the running time of inserts and deletes. We show how to transform static guarantees (for 1-associativity) into near-state-of-the-art dynamic guarantees (for O(1)-associativity) in a black-box fashion. Rather than relying on the family ℋ to supply randomness, as in past work, we instead rely on randomness within our table-maintenance algorithm.

Cite as

Michael A. Bender, Tsvi Kopelowitz, William Kuszmaul, Ely Porat, and Clifford Stein. Incremental Edge Orientation in Forests. In 29th Annual European Symposium on Algorithms (ESA 2021). Leibniz International Proceedings in Informatics (LIPIcs), Volume 204, pp. 12:1-12:18, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2021)


Copy BibTex To Clipboard

@InProceedings{bender_et_al:LIPIcs.ESA.2021.12,
  author =	{Bender, Michael A. and Kopelowitz, Tsvi and Kuszmaul, William and Porat, Ely and Stein, Clifford},
  title =	{{Incremental Edge Orientation in Forests}},
  booktitle =	{29th Annual European Symposium on Algorithms (ESA 2021)},
  pages =	{12:1--12:18},
  series =	{Leibniz International Proceedings in Informatics (LIPIcs)},
  ISBN =	{978-3-95977-204-4},
  ISSN =	{1868-8969},
  year =	{2021},
  volume =	{204},
  editor =	{Mutzel, Petra and Pagh, Rasmus and Herman, Grzegorz},
  publisher =	{Schloss Dagstuhl -- Leibniz-Zentrum f{\"u}r Informatik},
  address =	{Dagstuhl, Germany},
  URL =		{https://drops.dagstuhl.de/entities/document/10.4230/LIPIcs.ESA.2021.12},
  URN =		{urn:nbn:de:0030-drops-145933},
  doi =		{10.4230/LIPIcs.ESA.2021.12},
  annote =	{Keywords: edge orientation, graph algorithms, Cuckoo hashing, hash functions}
}
Document
The Variable-Processor Cup Game

Authors: William Kuszmaul and Alek Westover

Published in: LIPIcs, Volume 185, 12th Innovations in Theoretical Computer Science Conference (ITCS 2021)


Abstract
The problem of scheduling tasks on p processors so that no task ever gets too far behind is often described as a game with cups and water. In the p-processor cup game on n cups, there are two players, a filler and an emptier, that take turns adding and removing water from a set of n cups. In each turn, the filler adds p units of water to the cups, placing at most 1 unit of water in each cup, and then the emptier selects p cups to remove up to 1 unit of water from. The emptier’s goal is to minimize the backlog, which is the height of the fullest cup. The p-processor cup game has been studied in many different settings, dating back to the late 1960’s. All of the past work shares one common assumption: that p is fixed. This paper initiates the study of what happens when the number of available processors p varies over time, resulting in what we call the variable-processor cup game. Remarkably, the optimal bounds for the variable-processor cup game differ dramatically from its classical counterpart. Whereas the p-processor cup has optimal backlog Θ(log n), the variable-processor game has optimal backlog Θ(n). Moreover, there is an efficient filling strategy that yields backlog Ω(n^{1 - ε}) in quasi-polynomial time against any deterministic emptying strategy. We additionally show that straightforward uses of randomization cannot be used to help the emptier. In particular, for any positive constant Δ, and any Δ-greedy-like randomized emptying algorithm 𝒜, there is a filling strategy that achieves backlog Ω(n^{1 - ε}) against 𝒜 in quasi-polynomial time.

Cite as

William Kuszmaul and Alek Westover. The Variable-Processor Cup Game. In 12th Innovations in Theoretical Computer Science Conference (ITCS 2021). Leibniz International Proceedings in Informatics (LIPIcs), Volume 185, pp. 16:1-16:20, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2021)


Copy BibTex To Clipboard

@InProceedings{kuszmaul_et_al:LIPIcs.ITCS.2021.16,
  author =	{Kuszmaul, William and Westover, Alek},
  title =	{{The Variable-Processor Cup Game}},
  booktitle =	{12th Innovations in Theoretical Computer Science Conference (ITCS 2021)},
  pages =	{16:1--16:20},
  series =	{Leibniz International Proceedings in Informatics (LIPIcs)},
  ISBN =	{978-3-95977-177-1},
  ISSN =	{1868-8969},
  year =	{2021},
  volume =	{185},
  editor =	{Lee, James R.},
  publisher =	{Schloss Dagstuhl -- Leibniz-Zentrum f{\"u}r Informatik},
  address =	{Dagstuhl, Germany},
  URL =		{https://drops.dagstuhl.de/entities/document/10.4230/LIPIcs.ITCS.2021.16},
  URN =		{urn:nbn:de:0030-drops-135559},
  doi =		{10.4230/LIPIcs.ITCS.2021.16},
  annote =	{Keywords: scheduling, cup games, online algorithms, lower bounds}
}
Document
Train Tracks with Gaps

Authors: William Kuszmaul

Published in: LIPIcs, Volume 157, 10th International Conference on Fun with Algorithms (FUN 2021) (2020)


Abstract
We identify a tradeoff curve between the number of wheels on a train car, and the amount of track that must be installed in order to ensure that the train car is supported by the track at all times. The goal is to build an elevated track that covers some large distance 𝓁, but that consists primarily of gaps, so that the total amount of feet of train track that is actually installed is only a small fraction of 𝓁. In order so that the train track can support the train at all points, the requirement is that as the train drives across the track, at least one set of wheels from the rear quarter and at least one set of wheels from the front quarter of the train must be touching the track at all times. We show that, if a train car has n sets of wheels evenly spaced apart in its rear and n sets of wheels evenly spaced apart in its front, then it is possible to build a train track that supports the train car but uses only Θ(𝓁 / n) feet of track. We then consider what happens if the wheels on the train car are not evenly spaced (and may even be configured adversarially). We show that for any configuration of the train car, with n wheels in each of the front and rear quarters of the car, it is possible to build a track that supports the car for distance 𝓁 and uses only O((𝓁 log n)/n) feet of track. Additionally, we show that there exist configurations of the train car for which this tradeoff curve is asymptotically optimal. Both the upper and lower bounds are achieved via applications of the probabilistic method. The algorithms and lower bounds in this paper provide simple illustrative examples of many of the core techniques in probabilistic combinatorics and randomized algorithms. These include the probabilistic method with alterations, the use of McDiarmid’s inequality within the probabilistic method, the algorithmic Lovász Local Lemma, the min-hash technique, and the method of conditional probabilities.

Cite as

William Kuszmaul. Train Tracks with Gaps. In 10th International Conference on Fun with Algorithms (FUN 2021). Leibniz International Proceedings in Informatics (LIPIcs), Volume 157, pp. 19:1-19:11, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2020)


Copy BibTex To Clipboard

@InProceedings{kuszmaul:LIPIcs.FUN.2021.19,
  author =	{Kuszmaul, William},
  title =	{{Train Tracks with Gaps}},
  booktitle =	{10th International Conference on Fun with Algorithms (FUN 2021)},
  pages =	{19:1--19:11},
  series =	{Leibniz International Proceedings in Informatics (LIPIcs)},
  ISBN =	{978-3-95977-145-0},
  ISSN =	{1868-8969},
  year =	{2020},
  volume =	{157},
  editor =	{Farach-Colton, Martin and Prencipe, Giuseppe and Uehara, Ryuhei},
  publisher =	{Schloss Dagstuhl -- Leibniz-Zentrum f{\"u}r Informatik},
  address =	{Dagstuhl, Germany},
  URL =		{https://drops.dagstuhl.de/entities/document/10.4230/LIPIcs.FUN.2021.19},
  URN =		{urn:nbn:de:0030-drops-127800},
  doi =		{10.4230/LIPIcs.FUN.2021.19},
  annote =	{Keywords: probabilistic method, algorithms, trains, Lov\'{a}sz Local Lemma, McDiarmid’s Inequality}
}
Document
Track A: Algorithms, Complexity and Games
Dynamic Time Warping in Strongly Subquadratic Time: Algorithms for the Low-Distance Regime and Approximate Evaluation

Authors: William Kuszmaul

Published in: LIPIcs, Volume 132, 46th International Colloquium on Automata, Languages, and Programming (ICALP 2019)


Abstract
Dynamic time warping distance (DTW) is a widely used distance measure between time series, with applications in areas such as speech recognition and bioinformatics. The best known algorithms for computing DTW run in near quadratic time, and conditional lower bounds prohibit the existence of significantly faster algorithms. The lower bounds do not prevent a faster algorithm for the important special case in which the DTW is small, however. For an arbitrary metric space Sigma with distances normalized so that the smallest non-zero distance is one, we present an algorithm which computes dtw(x, y) for two strings x and y over Sigma in time O(n * dtw(x, y)). When dtw(x, y) is small, this represents a significant speedup over the standard quadratic-time algorithm. Using our low-distance regime algorithm as a building block, we also present an approximation algorithm which computes dtw(x, y) within a factor of O(n^epsilon) in time O~(n^{2 - epsilon}) for 0 < epsilon < 1. The algorithm allows for the strings x and y to be taken over an arbitrary well-separated tree metric with logarithmic depth and at most exponential aspect ratio. Notably, any polynomial-size metric space can be efficiently embedded into such a tree metric with logarithmic expected distortion. Extending our techniques further, we also obtain the first approximation algorithm for edit distance to work with characters taken from an arbitrary metric space, providing an n^epsilon-approximation in time O~(n^{2 - epsilon}), with high probability. Finally, we turn our attention to the relationship between edit distance and dynamic time warping distance. We prove a reduction from computing edit distance over an arbitrary metric space to computing DTW over the same metric space, except with an added null character (whose distance to a letter l is defined to be the edit-distance insertion cost of l). Applying our reduction to a conditional lower bound of Bringmann and Künnemann pertaining to edit distance over {0, 1}, we obtain a conditional lower bound for computing DTW over a three letter alphabet (with distances of zero and one). This improves on a previous result of Abboud, Backurs, and Williams, who gave a conditional lower bound for DTW over an alphabet of size five. With a similar approach, we also prove a reduction from computing edit distance (over generalized Hamming Space) to computing longest-common-subsequence length (LCS) over an alphabet with an added null character. Surprisingly, this means that one can recover conditional lower bounds for LCS directly from those for edit distance, which was not previously thought to be the case.

Cite as

William Kuszmaul. Dynamic Time Warping in Strongly Subquadratic Time: Algorithms for the Low-Distance Regime and Approximate Evaluation. In 46th International Colloquium on Automata, Languages, and Programming (ICALP 2019). Leibniz International Proceedings in Informatics (LIPIcs), Volume 132, pp. 80:1-80:15, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2019)


Copy BibTex To Clipboard

@InProceedings{kuszmaul:LIPIcs.ICALP.2019.80,
  author =	{Kuszmaul, William},
  title =	{{Dynamic Time Warping in Strongly Subquadratic Time: Algorithms for the Low-Distance Regime and Approximate Evaluation}},
  booktitle =	{46th International Colloquium on Automata, Languages, and Programming (ICALP 2019)},
  pages =	{80:1--80:15},
  series =	{Leibniz International Proceedings in Informatics (LIPIcs)},
  ISBN =	{978-3-95977-109-2},
  ISSN =	{1868-8969},
  year =	{2019},
  volume =	{132},
  editor =	{Baier, Christel and Chatzigiannakis, Ioannis and Flocchini, Paola and Leonardi, Stefano},
  publisher =	{Schloss Dagstuhl -- Leibniz-Zentrum f{\"u}r Informatik},
  address =	{Dagstuhl, Germany},
  URL =		{https://drops.dagstuhl.de/entities/document/10.4230/LIPIcs.ICALP.2019.80},
  URN =		{urn:nbn:de:0030-drops-106568},
  doi =		{10.4230/LIPIcs.ICALP.2019.80},
  annote =	{Keywords: dynamic time warping, edit distance, approximation algorithm, tree metrics}
}
Document
The One-Way Communication Complexity of Dynamic Time Warping Distance

Authors: Vladimir Braverman, Moses Charikar, William Kuszmaul, David P. Woodruff, and Lin F. Yang

Published in: LIPIcs, Volume 129, 35th International Symposium on Computational Geometry (SoCG 2019)


Abstract
We resolve the randomized one-way communication complexity of Dynamic Time Warping (DTW) distance. We show that there is an efficient one-way communication protocol using O~(n/alpha) bits for the problem of computing an alpha-approximation for DTW between strings x and y of length n, and we prove a lower bound of Omega(n / alpha) bits for the same problem. Our communication protocol works for strings over an arbitrary metric of polynomial size and aspect ratio, and we optimize the logarithmic factors depending on properties of the underlying metric, such as when the points are low-dimensional integer vectors equipped with various metrics or have bounded doubling dimension. We also consider linear sketches of DTW, showing that such sketches must have size Omega(n).

Cite as

Vladimir Braverman, Moses Charikar, William Kuszmaul, David P. Woodruff, and Lin F. Yang. The One-Way Communication Complexity of Dynamic Time Warping Distance. In 35th International Symposium on Computational Geometry (SoCG 2019). Leibniz International Proceedings in Informatics (LIPIcs), Volume 129, pp. 16:1-16:15, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2019)


Copy BibTex To Clipboard

@InProceedings{braverman_et_al:LIPIcs.SoCG.2019.16,
  author =	{Braverman, Vladimir and Charikar, Moses and Kuszmaul, William and Woodruff, David P. and Yang, Lin F.},
  title =	{{The One-Way Communication Complexity of Dynamic Time Warping Distance}},
  booktitle =	{35th International Symposium on Computational Geometry (SoCG 2019)},
  pages =	{16:1--16:15},
  series =	{Leibniz International Proceedings in Informatics (LIPIcs)},
  ISBN =	{978-3-95977-104-7},
  ISSN =	{1868-8969},
  year =	{2019},
  volume =	{129},
  editor =	{Barequet, Gill and Wang, Yusu},
  publisher =	{Schloss Dagstuhl -- Leibniz-Zentrum f{\"u}r Informatik},
  address =	{Dagstuhl, Germany},
  URL =		{https://drops.dagstuhl.de/entities/document/10.4230/LIPIcs.SoCG.2019.16},
  URN =		{urn:nbn:de:0030-drops-104203},
  doi =		{10.4230/LIPIcs.SoCG.2019.16},
  annote =	{Keywords: dynamic time warping, one-way communication complexity, tree metrics}
}
Document
On Estimating Edit Distance: Alignment, Dimension Reduction, and Embeddings

Authors: Moses Charikar, Ofir Geri, Michael P. Kim, and William Kuszmaul

Published in: LIPIcs, Volume 107, 45th International Colloquium on Automata, Languages, and Programming (ICALP 2018)


Abstract
Edit distance is a fundamental measure of distance between strings and has been widely studied in computer science. While the problem of estimating edit distance has been studied extensively, the equally important question of actually producing an alignment (i.e., the sequence of edits) has received far less attention. Somewhat surprisingly, we show that any algorithm to estimate edit distance can be used in a black-box fashion to produce an approximate alignment of strings, with modest loss in approximation factor and small loss in run time. Plugging in the result of Andoni, Krauthgamer, and Onak, we obtain an alignment that is a (log n)^{O(1/epsilon^2)} approximation in time O~(n^{1 + epsilon}). Closely related to the study of approximation algorithms is the study of metric embeddings for edit distance. We show that min-hash techniques can be useful in designing edit distance embeddings through three results: (1) An embedding from Ulam distance (edit distance over permutations) to Hamming space that matches the best known distortion of O(log n) and also implicitly encodes a sequence of edits between the strings; (2) In the case where the edit distance between the input strings is known to have an upper bound K, we show that embeddings of edit distance into Hamming space with distortion f(n) can be modified in a black-box fashion to give distortion O(f(poly(K))) for a class of periodic-free strings; (3) A randomized dimension-reduction map with contraction c and asymptotically optimal expected distortion O(c), improving on the previous O~(c^{1 + 2 / log log log n}) distortion result of Batu, Ergun, and Sahinalp.

Cite as

Moses Charikar, Ofir Geri, Michael P. Kim, and William Kuszmaul. On Estimating Edit Distance: Alignment, Dimension Reduction, and Embeddings. In 45th International Colloquium on Automata, Languages, and Programming (ICALP 2018). Leibniz International Proceedings in Informatics (LIPIcs), Volume 107, pp. 34:1-34:14, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2018)


Copy BibTex To Clipboard

@InProceedings{charikar_et_al:LIPIcs.ICALP.2018.34,
  author =	{Charikar, Moses and Geri, Ofir and Kim, Michael P. and Kuszmaul, William},
  title =	{{On Estimating Edit Distance: Alignment, Dimension Reduction, and Embeddings}},
  booktitle =	{45th International Colloquium on Automata, Languages, and Programming (ICALP 2018)},
  pages =	{34:1--34:14},
  series =	{Leibniz International Proceedings in Informatics (LIPIcs)},
  ISBN =	{978-3-95977-076-7},
  ISSN =	{1868-8969},
  year =	{2018},
  volume =	{107},
  editor =	{Chatzigiannakis, Ioannis and Kaklamanis, Christos and Marx, D\'{a}niel and Sannella, Donald},
  publisher =	{Schloss Dagstuhl -- Leibniz-Zentrum f{\"u}r Informatik},
  address =	{Dagstuhl, Germany},
  URL =		{https://drops.dagstuhl.de/entities/document/10.4230/LIPIcs.ICALP.2018.34},
  URN =		{urn:nbn:de:0030-drops-90383},
  doi =		{10.4230/LIPIcs.ICALP.2018.34},
  annote =	{Keywords: edit distance, alignment, approximation algorithms, embedding, dimension reduction}
}
Questions / Remarks / Feedback
X

Feedback for Dagstuhl Publishing


Thanks for your feedback!

Feedback submitted

Could not send message

Please try again later or send an E-mail