Farach-Colton, Martin

Snake in Optimal Space and Time

Authors: Philip Bille, Martín Farach-Colton, Inge Li Gørtz, and Ivor van der Hoog

Published in: LIPIcs, Volume 291, 12th International Conference on Fun with Algorithms (FUN 2024)

We revisit the classic game of Snake and ask the basic data structural question: how many bits does it take to represent the state of a snake game so that it can be updated in constant time? Our main result is a data structure that uses optimal space (within constant factors). To achieve our results, we introduce several interesting data structural techniques, including a decomposition technique for the problem, a tabulation scheme for encoding small subproblems, and a dynamic memory allocation scheme.
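To make the question in the abstract concrete, the following is a naive baseline, not the data structure from the paper. It assumes a simple rule set (one cell per step, the game ends when the snake leaves the grid or hits itself); the names NaiveSnake, step, and grow are hypothetical. It achieves expected constant-time updates, but stores every body cell explicitly, which is far from a compact encoding.

```python
from collections import deque
from typing import Deque, Set, Tuple

Cell = Tuple[int, int]


class NaiveSnake:
    """Baseline snake state: a deque of body cells plus a set for collision checks."""

    def __init__(self, width: int, height: int, start: Cell) -> None:
        self.width = width
        self.height = height
        self.body: Deque[Cell] = deque([start])  # tail at the left, head at the right
        self.occupied: Set[Cell] = {start}

    def step(self, dx: int, dy: int, grow: bool = False) -> bool:
        """Move the head by (dx, dy). Returns False if the move ends the game.

        After a False result the state is no longer meaningful (the game is over).
        """
        hx, hy = self.body[-1]
        new_head = (hx + dx, hy + dy)
        if not (0 <= new_head[0] < self.width and 0 <= new_head[1] < self.height):
            return False  # left the grid
        if not grow:
            # The tail cell is vacated before the head moves in.
            tail = self.body.popleft()
            self.occupied.discard(tail)
        if new_head in self.occupied:
            return False  # ran into its own body
        self.body.append(new_head)
        self.occupied.add(new_head)
        return True


snake = NaiveSnake(4, 4, (0, 0))
moves = [(1, 0), (1, 0), (0, 1), (-1, 0)]
results = [snake.step(dx, dy, grow=True) for dx, dy in moves]
collided = not snake.step(0, -1, grow=True)  # would re-enter cell (1, 0)
```

Each update touches only the two ends of the deque and one set entry, which is the constant-time behaviour the abstract asks for; the space cost of this baseline is what a compact representation would aim to reduce.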

Philip Bille, Martín Farach-Colton, Inge Li Gørtz, and Ivor van der Hoog. Snake in Optimal Space and Time. In 12th International Conference on Fun with Algorithms (FUN 2024). Leibniz International Proceedings in Informatics (LIPIcs), Volume 291, pp. 3:1-3:15, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2024)

From Big Data Theory to Big Data Practice (Dagstuhl Seminar 23071)

Authors: Martin Farach-Colton, Fabian Daniel Kuhn, Ronitt Rubinfeld, and Przemysław Uznański

Published in: Dagstuhl Reports, Volume 13, Issue 2 (2023)

This report documents the program and the outcomes of Dagstuhl Seminar 23071 "From Big Data Theory to Big Data Practice". Some recent advances in the theory of algorithms for big data - sublinear/local algorithms, streaming algorithms and external memory algorithms - have translated into impressive improvements in practice, whereas others have remained stubbornly resistant to useful implementations. This seminar aimed to glean lessons from those aspects of these algorithms that have led to practical implementations, to see if the lessons learned can both improve the implementations of other theoretical ideas and help guide the next generation of theoretical advances.

Martin Farach-Colton, Fabian Daniel Kuhn, Ronitt Rubinfeld, and Przemysław Uznański. From Big Data Theory to Big Data Practice (Dagstuhl Seminar 23071). In Dagstuhl Reports, Volume 13, Issue 2, pp. 33-46, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2023)

LIPIcs, Volume 274, ESA 2023, Complete Volume

Authors: Inge Li Gørtz, Martin Farach-Colton, Simon J. Puglisi, and Grzegorz Herman

Published in: LIPIcs, Volume 274, 31st Annual European Symposium on Algorithms (ESA 2023)

31st Annual European Symposium on Algorithms (ESA 2023). Leibniz International Proceedings in Informatics (LIPIcs), Volume 274, pp. 1-1700, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2023)

Front Matter, Table of Contents, Preface, Conference Organization

Authors: Inge Li Gørtz, Martin Farach-Colton, Simon J. Puglisi, and Grzegorz Herman

Published in: LIPIcs, Volume 274, 31st Annual European Symposium on Algorithms (ESA 2023)

31st Annual European Symposium on Algorithms (ESA 2023). Leibniz International Proceedings in Informatics (LIPIcs), Volume 274, pp. 0:i-0:xxii, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2023)

What Does Dynamic Optimality Mean in External Memory?

Authors: Michael A. Bender, Martín Farach-Colton, and William Kuszmaul

Published in: LIPIcs, Volume 215, 13th Innovations in Theoretical Computer Science Conference (ITCS 2022)

A data structure A is said to be dynamically optimal over a class of data structures 𝒞 if A is constant-competitive with every data structure C ∈ 𝒞. Much of the research on binary search trees in the past forty years has focused on studying dynamic optimality over the class of binary search trees that are modified via rotations (and indeed, the question of whether splay trees are dynamically optimal has gained notoriety as the so-called dynamic-optimality conjecture). Recently, researchers have extended this to consider dynamic optimality over certain classes of external-memory search trees. In particular, Demaine, Iacono, Koumoutsos, and Langerman propose a class of external-memory trees that support a notion of tree rotations, and then give an elegant data structure, called the Belga B-tree, that is within an O(log log N)-factor of being dynamically optimal over this class. In this paper, we revisit the question of how dynamic optimality should be defined in external memory. A defining characteristic of external-memory data structures is that there is a stark asymmetry between queries and inserts/updates/deletes: by making the former slightly asymptotically slower, one can make the latter significantly asymptotically faster (even allowing for operations with sub-constant amortized I/Os). Because of this asymmetry, rotation-based search trees are not optimal (or even close to optimal) in insert/update/delete-heavy external-memory workloads. To study dynamic optimality for such workloads, one must consider a different class of data structures. The natural class of data structures to consider is what we call buffered-propagation trees. Such trees can adapt dynamically to the locality properties of an input sequence in order to optimize the interactions between different inserts/updates/deletes and queries. We also present a new form of beyond-worst-case analysis that allows us to formally study a continuum between static and dynamic optimality.
Finally, we give a novel data structure, called the Jεllo Tree, that is statically optimal and that achieves dynamic optimality for a large natural class of inputs defined by our beyond-worst-case analysis.

Michael A. Bender, Martín Farach-Colton, and William Kuszmaul. What Does Dynamic Optimality Mean in External Memory? In 13th Innovations in Theoretical Computer Science Conference (ITCS 2022). Leibniz International Proceedings in Informatics (LIPIcs), Volume 215, pp. 18:1-18:23, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2022)

LIPIcs, Volume 157, FUN 2021, Complete Volume

Authors: Martin Farach-Colton, Giuseppe Prencipe, and Ryuhei Uehara

Published in: LIPIcs, Volume 157, 10th International Conference on Fun with Algorithms (FUN 2021) (2020)

10th International Conference on Fun with Algorithms (FUN 2021). Leibniz International Proceedings in Informatics (LIPIcs), Volume 157, pp. 1-416, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2020)

Front Matter, Table of Contents, Preface, Conference Organization

Authors: Martin Farach-Colton, Giuseppe Prencipe, and Ryuhei Uehara

Published in: LIPIcs, Volume 157, 10th International Conference on Fun with Algorithms (FUN 2021) (2020)

10th International Conference on Fun with Algorithms (FUN 2021). Leibniz International Proceedings in Informatics (LIPIcs), Volume 157, pp. 0:i-0:xvi, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2020)

Streaming Complexity of Spanning Tree Computation

Authors: Yi-Jun Chang, Martín Farach-Colton, Tsan-Sheng Hsu, and Meng-Tsung Tsai

Published in: LIPIcs, Volume 154, 37th International Symposium on Theoretical Aspects of Computer Science (STACS 2020)

The semi-streaming model is a variant of the streaming model frequently used for the computation of graph problems. It allows the edges of an n-node input graph to be read sequentially in p passes using Õ(n) space. If the list of edges includes deletions, then the model is called the turnstile model; otherwise it is called the insertion-only model. In both models, some graph problems, such as spanning trees, k-connectivity, densest subgraph, degeneracy, cut-sparsifier, and (Δ+1)-coloring, can be exactly solved or (1+ε)-approximated in a single pass; while other graph problems, such as triangle detection and unweighted all-pairs shortest paths, are known to require Ω̃(n) passes to compute. For many fundamental graph problems, the tractability in these models is open. In this paper, we study the tractability of computing some standard spanning trees, including BFS, DFS, and maximum-leaf spanning trees. Our results, in both the insertion-only and the turnstile models, are as follows.
- Maximum-Leaf Spanning Trees: This problem is known to be APX-complete with inapproximability constant ρ ∈ [245/244, 2). By constructing an ε-MLST sparsifier, we show that for every constant ε > 0, MLST can be approximated in a single pass to within a factor of 1+ε w.h.p. (albeit in super-polynomial time for ε ≤ ρ-1 assuming P ≠ NP) and can be approximated in polynomial time in a single pass to within a factor of ρ_n+ε w.h.p., where ρ_n is the supremum constant that MLST cannot be approximated to within using polynomial time and Õ(n) space. In the insertion-only model, these algorithms can be deterministic.
- BFS Trees: It is known that BFS trees require ω(1) passes to compute, but the naïve approach needs O(n) passes. We devise a new randomized algorithm that reduces the pass complexity to O(√n), and it offers a smooth tradeoff between pass complexity and space usage. This gives a polynomial separation between single-source and all-pairs shortest paths for unweighted graphs.
- DFS Trees: It is unknown whether DFS trees require more than one pass. The current best algorithm by Khan and Mehta [STACS 2019] takes Õ(h) passes, where h is the height of computed DFS trees. Note that h can be as large as Ω(m/n) for n-node m-edge graphs. Our contribution is twofold. First, we provide a simple alternative proof of this result, via a new connection to sparse certificates for k-node-connectivity. Second, we present a randomized algorithm that reduces the pass complexity to O(√n), and it also offers a smooth tradeoff between pass complexity and space usage.
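As background for the abstract's remark that spanning trees can be computed in a single pass, here is a minimal sketch of the standard union-find approach for the insertion-only model. It is not one of the paper's algorithms: it returns an arbitrary spanning forest rather than a BFS, DFS, or maximum-leaf tree, and the function name streaming_spanning_forest is hypothetical. It keeps O(n) words of state regardless of how many edges the stream contains.

```python
from typing import Iterable, List, Tuple

Edge = Tuple[int, int]


def streaming_spanning_forest(n: int, edge_stream: Iterable[Edge]) -> List[Edge]:
    """Read each edge once and keep it only if it joins two components."""
    parent = list(range(n))  # union-find forest over the n nodes

    def find(x: int) -> int:
        # Path halving: point every other node on the path at its grandparent.
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    forest: List[Edge] = []
    for u, v in edge_stream:  # a single sequential pass over the stream
        ru, rv = find(u), find(v)
        if ru != rv:
            parent[ru] = rv
            forest.append((u, v))  # at most n - 1 edges are ever stored
    return forest


# The stream is consumed as an iterator, so edges are never stored in bulk.
edges = iter([(0, 1), (1, 2), (0, 2), (3, 4), (2, 3)])
forest = streaming_spanning_forest(5, edges)
print(forest)
```

The edge (0, 2) is discarded because nodes 0 and 2 are already connected when it arrives. This approach does not extend to the turnstile model as written, since a deleted edge may already be in the forest.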

Yi-Jun Chang, Martín Farach-Colton, Tsan-Sheng Hsu, and Meng-Tsung Tsai. Streaming Complexity of Spanning Tree Computation. In 37th International Symposium on Theoretical Aspects of Computer Science (STACS 2020). Leibniz International Proceedings in Informatics (LIPIcs), Volume 154, pp. 34:1-34:19, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2020)

Syntactic Separation of Subset Satisfiability Problems

Authors: Eric Allender, Martín Farach-Colton, and Meng-Tsung Tsai

Published in: LIPIcs, Volume 145, Approximation, Randomization, and Combinatorial Optimization. Algorithms and Techniques (APPROX/RANDOM 2019)

Variants of the Exponential Time Hypothesis (ETH) have been used to derive lower bounds on the time complexity for certain problems, so that the hardness results match long-standing algorithmic results. In this paper, we consider a syntactically defined class of problems, and give conditions for when problems in this class require strongly exponential time to approximate to within a factor of (1-epsilon) for some constant epsilon > 0, assuming the Gap Exponential Time Hypothesis (Gap-ETH), versus when they admit a PTAS. Our class includes a rich set of problems from additive combinatorics, computational geometry, and graph theory. Our hardness results also match the best known algorithmic results for these problems.

Eric Allender, Martín Farach-Colton, and Meng-Tsung Tsai. Syntactic Separation of Subset Satisfiability Problems. In Approximation, Randomization, and Combinatorial Optimization. Algorithms and Techniques (APPROX/RANDOM 2019). Leibniz International Proceedings in Informatics (LIPIcs), Volume 145, pp. 16:1-16:23, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2019)

Theoretical Foundations of Storage Systems (Dagstuhl Seminar 19111)

Authors: Martin Farach-Colton, Inge Li Gørtz, Rob Johnson, and Donald E. Porter

Published in: Dagstuhl Reports, Volume 9, Issue 3 (2019)

This report documents the program and the outcomes of Dagstuhl Seminar 19111 "Theoretical Foundations of Storage Systems." This seminar brought together researchers from two distinct communities - algorithms researchers with an interest in external memory and systems researchers with an interest in storage - with the objective of improving the design of future storage systems.

Martin Farach-Colton, Inge Li Gørtz, Rob Johnson, and Donald E. Porter. Theoretical Foundations of Storage Systems (Dagstuhl Seminar 19111). In Dagstuhl Reports, Volume 9, Issue 3, pp. 39-51, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2019)

Streaming Algorithms for Planar Convex Hulls

Authors: Martín Farach-Colton, Meng Li, and Meng-Tsung Tsai

Published in: LIPIcs, Volume 123, 29th International Symposium on Algorithms and Computation (ISAAC 2018)

Many classical algorithms are known for computing the convex hull of a set of n points in R^2 using O(n) space. For large point sets, whose size exceeds the size of the working space, these algorithms cannot be directly used. The current best streaming algorithm for computing the convex hull is computationally expensive, because it needs to solve a set of linear programs. In this paper, we propose simpler and faster streaming and W-stream algorithms for computing the convex hull. Our streaming algorithm has small pass complexity, which is roughly a square root of the current best bound, and it is simpler in the sense that our algorithm mainly relies on computing the convex hulls of smaller point sets. Our W-stream algorithms, one of which is deterministic and the other of which is randomized, have nearly-optimal tradeoff between the pass complexity and space usage, as we established by a new unconditional lower bound.
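For reference, one of the classical O(n)-space algorithms the abstract alludes to is Andrew's monotone chain, sketched below. This is the in-memory baseline, not the paper's streaming or W-stream algorithm; the function names cross and convex_hull are chosen here for illustration. It needs the whole point set in memory to sort it, which is exactly the limitation the abstract describes.

```python
from typing import List, Sequence, Tuple

Point = Tuple[int, int]


def cross(o: Point, a: Point, b: Point) -> int:
    """Cross product of vectors o->a and o->b; positive for a left turn."""
    return (a[0] - o[0]) * (b[1] - o[1]) - (a[1] - o[1]) * (b[0] - o[0])


def convex_hull(points: Sequence[Point]) -> List[Point]:
    """Andrew's monotone chain: hull vertices in counterclockwise order."""
    pts = sorted(set(points))  # requires all points in memory at once
    if len(pts) <= 2:
        return pts

    lower: List[Point] = []
    for p in pts:
        # Drop points that would make a right turn or lie on the same line.
        while len(lower) >= 2 and cross(lower[-2], lower[-1], p) <= 0:
            lower.pop()
        lower.append(p)

    upper: List[Point] = []
    for p in reversed(pts):
        while len(upper) >= 2 and cross(upper[-2], upper[-1], p) <= 0:
            upper.pop()
        upper.append(p)

    # The last point of each chain is the first point of the other.
    return lower[:-1] + upper[:-1]


hull = convex_hull([(0, 0), (2, 0), (2, 2), (0, 2), (1, 1), (1, 0)])
print(hull)
```

On this input the interior point (1, 1) and the collinear boundary point (1, 0) are both discarded, leaving the four corners of the square.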

Martín Farach-Colton, Meng Li, and Meng-Tsung Tsai. Streaming Algorithms for Planar Convex Hulls. In 29th International Symposium on Algorithms and Computation (ISAAC 2018). Leibniz International Proceedings in Informatics (LIPIcs), Volume 123, pp. 47:1-47:13, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2018)

Optimal Hashing in External Memory

Authors: Alex Conway, Martín Farach-Colton, and Philip Shilane

Published in: LIPIcs, Volume 107, 45th International Colloquium on Automata, Languages, and Programming (ICALP 2018)

Hash tables are a ubiquitous class of dictionary data structures. However, standard hash table implementations do not translate well into the external memory model, because they do not incorporate locality for insertions. Iacono and Pătraşcu established an update/query tradeoff curve for external-hash tables: a hash table that performs insertions in O(lambda/B) amortized IOs requires Omega(log_lambda N) expected IOs for queries, where N is the number of items that can be stored in the data structure, B is the size of a memory transfer, M is the size of memory, and lambda is a tuning parameter. They provide a complicated hashing data structure, which we call the IP hash table, that meets this curve for lambda that is Omega(log log M + log_M N). In this paper, we present a simpler external-memory hash table, the Bundle of Arrays Hash Table (BOA), that is optimal for a narrower range of lambda. The simplicity of BOAs allows them to be readily modified to achieve the following results:
- A new external-memory data structure, the Bundle of Trees Hash Table (BOT), that matches the performance of the IP hash table, while retaining some of the simplicity of the BOAs.
- The Cache-Oblivious Bundle of Trees Hash Table (COBOT), the first cache-oblivious hash table. This data structure matches the optimality of BOTs and IP hash tables over the same range of lambda.

Alex Conway, Martín Farach-Colton, and Philip Shilane. Optimal Hashing in External Memory. In 45th International Colloquium on Automata, Languages, and Programming (ICALP 2018). Leibniz International Proceedings in Informatics (LIPIcs), Volume 107, pp. 39:1-39:14, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2018)

Mind the Gap (Invited Paper)

Authors: Martín Farach-Colton

Published in: LIPIcs, Volume 100, 9th International Conference on Fun with Algorithms (FUN 2018)

As a New Yorker, I'm painfully aware of space. There is, after all, nothing more luxurious than empty space! So when it comes to algorithms, I'm all in favor of leaving holes in my data structures. In this talk, I'll explore the advantages of pampering algorithms with some much needed breathing room.

Martín Farach-Colton. Mind the Gap (Invited Paper). In 9th International Conference on Fun with Algorithms (FUN 2018). Leibniz International Proceedings in Informatics (LIPIcs), Volume 100, p. 1:1, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2018)

Dictionaries Revisited

Authors: Martin Farach-Colton

Published in: LIPIcs, Volume 75, 16th International Symposium on Experimental Algorithms (SEA 2017)

Dictionaries are probably the most well studied class of data structures. A dictionary supports insertions, deletions, membership queries, and usually successor, predecessor, and extract-min. Given their centrality to both the theory and practice of data structures, surprisingly basic questions about them remain unsolved and sometimes even unposed. This talk focuses on questions that arise from the disparity between the way large-scale dictionaries are analyzed and the way they are used in practice.

Martin Farach-Colton. Dictionaries Revisited. In 16th International Symposium on Experimental Algorithms (SEA 2017). Leibniz International Proceedings in Informatics (LIPIcs), Volume 75, p. 2:1, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2017)

