LIPIcs, Volume 301

22nd International Symposium on Experimental Algorithms (SEA 2024)




Event

SEA 2024, July 23-26, 2024, Vienna, Austria

Editor

Leo Liberti
  • LIX CNRS, Ecole Polytechnique, Institut Polytechnique de Paris, 91128 Palaiseau, France

Publication Details

  • published at: 2024-07-11
  • Publisher: Schloss Dagstuhl – Leibniz-Zentrum für Informatik
  • ISBN: 978-3-95977-325-6
  • DBLP: db/conf/wea/sea2024

Documents
Document
Complete Volume
LIPIcs, Volume 301, SEA 2024, Complete Volume

Authors: Leo Liberti


Abstract
LIPIcs, Volume 301, SEA 2024, Complete Volume

Cite as

22nd International Symposium on Experimental Algorithms (SEA 2024). Leibniz International Proceedings in Informatics (LIPIcs), Volume 301, pp. 1-548, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2024)


BibTeX

@Proceedings{liberti:LIPIcs.SEA.2024,
  title =	{{LIPIcs, Volume 301, SEA 2024, Complete Volume}},
  booktitle =	{22nd International Symposium on Experimental Algorithms (SEA 2024)},
  pages =	{1--548},
  series =	{Leibniz International Proceedings in Informatics (LIPIcs)},
  ISBN =	{978-3-95977-325-6},
  ISSN =	{1868-8969},
  year =	{2024},
  volume =	{301},
  editor =	{Liberti, Leo},
  publisher =	{Schloss Dagstuhl -- Leibniz-Zentrum f{\"u}r Informatik},
  address =	{Dagstuhl, Germany},
  URL =		{https://drops.dagstuhl.de/entities/document/10.4230/LIPIcs.SEA.2024},
  URN =		{urn:nbn:de:0030-drops-203649},
  doi =		{10.4230/LIPIcs.SEA.2024},
  annote =	{Keywords: LIPIcs, Volume 301, SEA 2024, Complete Volume}
}
Document
Front Matter
Front Matter, Table of Contents, Preface, Conference Organization

Authors: Leo Liberti


Abstract
Front Matter, Table of Contents, Preface, Conference Organization

Cite as

22nd International Symposium on Experimental Algorithms (SEA 2024). Leibniz International Proceedings in Informatics (LIPIcs), Volume 301, pp. 0:i-0:xviii, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2024)


BibTeX

@InProceedings{liberti:LIPIcs.SEA.2024.0,
  author =	{Liberti, Leo},
  title =	{{Front Matter, Table of Contents, Preface, Conference Organization}},
  booktitle =	{22nd International Symposium on Experimental Algorithms (SEA 2024)},
  pages =	{0:i--0:xviii},
  series =	{Leibniz International Proceedings in Informatics (LIPIcs)},
  ISBN =	{978-3-95977-325-6},
  ISSN =	{1868-8969},
  year =	{2024},
  volume =	{301},
  editor =	{Liberti, Leo},
  publisher =	{Schloss Dagstuhl -- Leibniz-Zentrum f{\"u}r Informatik},
  address =	{Dagstuhl, Germany},
  URL =		{https://drops.dagstuhl.de/entities/document/10.4230/LIPIcs.SEA.2024.0},
  URN =		{urn:nbn:de:0030-drops-203658},
  doi =		{10.4230/LIPIcs.SEA.2024.0},
  annote =	{Keywords: Front Matter, Table of Contents, Preface, Conference Organization}
}
Document
Move-r: Optimizing the r-index

Authors: Nico Bertram, Johannes Fischer, and Lukas Nalbach


Abstract
We present a static text index called Move-r, which is a highly optimized version of the r-index (Gagie et al., 2020) that incorporates recent theoretical developments of the move data structure (Nishimoto and Tabei, 2021). The r-index is the method of choice for indexing highly repetitive texts, such as different versions of a text document or DNA from the same species, as it exploits the compressibility of the underlying data. With Move-r, we can answer count and locate queries 2-35 (typically 15) times as fast as with any other r-index supporting locate queries while being 0.8-2.5 (typically 2) times as large. A Move-r index can be constructed 0.9-2 (typically 2) times as fast while using 1/3-1 (typically 1/2) times as much space.
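
A note on mechanics for readers unfamiliar with BWT-based indexing: the sketch below (Python, purely illustrative, with a naive construction and O(n) rank scans) shows how count queries are classically answered by backward search over the Burrows-Wheeler transform. It is not the authors' Move-r; their contribution is replacing exactly this rank/LF machinery with the move data structure.

def bwt_from_suffix_array(text):
    # text must end with a unique sentinel such as "$"
    sa = sorted(range(len(text)), key=lambda i: text[i:])
    return "".join(text[i - 1] for i in sa)  # text[-1] wraps to the sentinel

def count_occurrences(bwt, pattern):
    # C[c] = number of characters in the text strictly smaller than c
    C, total = {}, 0
    for c in sorted(set(bwt)):
        C[c] = total
        total += bwt.count(c)

    def rank(c, i):  # occurrences of c in bwt[:i], naive O(i) scan
        return bwt[:i].count(c)

    lo, hi = 0, len(bwt)  # suffix-array interval [lo, hi) of matches
    for c in reversed(pattern):  # backward search, one step per character
        if c not in C:
            return 0
        lo, hi = C[c] + rank(c, lo), C[c] + rank(c, hi)
        if lo >= hi:
            return 0
    return hi - lo

bwt = bwt_from_suffix_array("mississippi$")
print(count_occurrences(bwt, "ssi"))  # 2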

Cite as

Nico Bertram, Johannes Fischer, and Lukas Nalbach. Move-r: Optimizing the r-index. In 22nd International Symposium on Experimental Algorithms (SEA 2024). Leibniz International Proceedings in Informatics (LIPIcs), Volume 301, pp. 1:1-1:19, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2024)


BibTeX

@InProceedings{bertram_et_al:LIPIcs.SEA.2024.1,
  author =	{Bertram, Nico and Fischer, Johannes and Nalbach, Lukas},
  title =	{{Move-r: Optimizing the r-index}},
  booktitle =	{22nd International Symposium on Experimental Algorithms (SEA 2024)},
  pages =	{1:1--1:19},
  series =	{Leibniz International Proceedings in Informatics (LIPIcs)},
  ISBN =	{978-3-95977-325-6},
  ISSN =	{1868-8969},
  year =	{2024},
  volume =	{301},
  editor =	{Liberti, Leo},
  publisher =	{Schloss Dagstuhl -- Leibniz-Zentrum f{\"u}r Informatik},
  address =	{Dagstuhl, Germany},
  URL =		{https://drops.dagstuhl.de/entities/document/10.4230/LIPIcs.SEA.2024.1},
  URN =		{urn:nbn:de:0030-drops-203662},
  doi =		{10.4230/LIPIcs.SEA.2024.1},
  annote =	{Keywords: Compressed Text Index, Burrows-Wheeler Transform}
}
Document
Engineering Zuffix Arrays

Authors: Paolo Boldi, Stefano Marchini, and Sebastiano Vigna


Abstract
Searching patterns in long strings is a classical algorithmic problem with countless practical applications. Suffix trees and suffix arrays (and their variants) are a long-established solution that yields linear-time search (in the size of the pattern). In [Paolo Boldi and Sebastiano Vigna, 2018] it is shown that a z-map gadget can be attached to (enhanced) suffix arrays to improve their theoretical query time, obtaining a data structure called zuffix array. The main contribution of this paper is to show that a carefully engineered implementation of the z-map gadget does provide significant speedups with respect to enhanced suffix arrays on real-world datasets, albeit doubling the required space. In particular, for large alphabets we observe a sevenfold improvement in query time with respect to enhanced suffix arrays; even in the worst case (small alphabets), the query time is almost halved. Thus, zuffix arrays provide a very interesting new point in the space-time tradeoff spectrum.
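
For contrast, the baseline that both enhanced suffix arrays and zuffix arrays accelerate is the textbook binary search over a plain suffix array; the illustrative Python below uses a quadratic construction and no LCP information, so it shows only the query logic, not the engineering discussed in the paper.

def suffix_array(text):
    # O(n^2 log n) construction; fine for illustration only
    return sorted(range(len(text)), key=lambda i: text[i:])

def find_occurrences(text, sa, pattern):
    m = len(pattern)
    lo, hi = 0, len(sa)  # leftmost suffix whose length-m prefix is >= pattern
    while lo < hi:
        mid = (lo + hi) // 2
        if text[sa[mid]:sa[mid] + m] < pattern:
            lo = mid + 1
        else:
            hi = mid
    start, hi = lo, len(sa)  # now find the end of the matching interval
    while lo < hi:
        mid = (lo + hi) // 2
        if text[sa[mid]:sa[mid] + m] <= pattern:
            lo = mid + 1
        else:
            hi = mid
    return sorted(sa[start:lo])

text = "abracadabra"
print(find_occurrences(text, suffix_array(text), "abra"))  # [0, 7]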

Cite as

Paolo Boldi, Stefano Marchini, and Sebastiano Vigna. Engineering Zuffix Arrays. In 22nd International Symposium on Experimental Algorithms (SEA 2024). Leibniz International Proceedings in Informatics (LIPIcs), Volume 301, pp. 2:1-2:18, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2024)


BibTeX

@InProceedings{boldi_et_al:LIPIcs.SEA.2024.2,
  author =	{Boldi, Paolo and Marchini, Stefano and Vigna, Sebastiano},
  title =	{{Engineering Zuffix Arrays}},
  booktitle =	{22nd International Symposium on Experimental Algorithms (SEA 2024)},
  pages =	{2:1--2:18},
  series =	{Leibniz International Proceedings in Informatics (LIPIcs)},
  ISBN =	{978-3-95977-325-6},
  ISSN =	{1868-8969},
  year =	{2024},
  volume =	{301},
  editor =	{Liberti, Leo},
  publisher =	{Schloss Dagstuhl -- Leibniz-Zentrum f{\"u}r Informatik},
  address =	{Dagstuhl, Germany},
  URL =		{https://drops.dagstuhl.de/entities/document/10.4230/LIPIcs.SEA.2024.2},
  URN =		{urn:nbn:de:0030-drops-203677},
  doi =		{10.4230/LIPIcs.SEA.2024.2},
  annote =	{Keywords: Suffix trees, suffix arrays, z-fast tries}
}
Document
Practical Minimum Path Cover

Authors: Manuel Cáceres, Brendan Mumey, Santeri Toivonen, and Alexandru I. Tomescu


Abstract
Computing a minimum path cover (MPC) of a directed acyclic graph (DAG) is a fundamental problem with a myriad of applications, including reachability. Although it is known how to solve the problem by a simple reduction to minimum flow, recent theoretical advances exploit this idea to obtain algorithms parameterized by the number of paths of an MPC, known as the width. These advances yield fast [Mäkinen et al., TALG 2019] and even linear-time [Cáceres et al., SODA 2022] algorithms in the small-width regime. In this paper, we present the first publicly available high-performance implementation of state-of-the-art MPC algorithms, including the parameterized approaches. Our experiments on random DAGs show that parameterized algorithms are orders of magnitude faster on dense graphs. Additionally, we present new fast pre-processing heuristics based on transitive edge sparsification. We show that our heuristics improve MPC solvers by orders of magnitude.
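
The flavour of the reduction can be seen in the classical special case of vertex-disjoint path covers, which reduces to bipartite matching (Dilworth-style). The Python sketch below handles only that simplified case; the MPC variant studied in the paper lets paths share vertices and is solved via minimum flow.

def min_disjoint_path_cover(n, edges):
    # Vertex-disjoint path cover of a DAG: n minus a maximum matching in
    # the bipartite "split" graph (Kuhn's augmenting-path algorithm).
    adj = [[] for _ in range(n)]
    for u, v in edges:  # directed edge u -> v
        adj[u].append(v)
    match_right = [-1] * n  # match_right[v] = left vertex matched to v

    def augment(u, seen):
        for v in adj[u]:
            if not seen[v]:
                seen[v] = True
                if match_right[v] == -1 or augment(match_right[v], seen):
                    match_right[v] = u
                    return True
        return False

    matching = sum(augment(u, [False] * n) for u in range(n))
    return n - matching  # number of paths in a minimum disjoint cover

# Diamond DAG 0->1, 0->2, 1->3, 2->3: two disjoint paths are needed.
print(min_disjoint_path_cover(4, [(0, 1), (0, 2), (1, 3), (2, 3)]))  # 2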

Cite as

Manuel Cáceres, Brendan Mumey, Santeri Toivonen, and Alexandru I. Tomescu. Practical Minimum Path Cover. In 22nd International Symposium on Experimental Algorithms (SEA 2024). Leibniz International Proceedings in Informatics (LIPIcs), Volume 301, pp. 3:1-3:19, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2024)


BibTeX

@InProceedings{caceres_et_al:LIPIcs.SEA.2024.3,
  author =	{C\'{a}ceres, Manuel and Mumey, Brendan and Toivonen, Santeri and Tomescu, Alexandru I.},
  title =	{{Practical Minimum Path Cover}},
  booktitle =	{22nd International Symposium on Experimental Algorithms (SEA 2024)},
  pages =	{3:1--3:19},
  series =	{Leibniz International Proceedings in Informatics (LIPIcs)},
  ISBN =	{978-3-95977-325-6},
  ISSN =	{1868-8969},
  year =	{2024},
  volume =	{301},
  editor =	{Liberti, Leo},
  publisher =	{Schloss Dagstuhl -- Leibniz-Zentrum f{\"u}r Informatik},
  address =	{Dagstuhl, Germany},
  URL =		{https://drops.dagstuhl.de/entities/document/10.4230/LIPIcs.SEA.2024.3},
  URN =		{urn:nbn:de:0030-drops-203687},
  doi =		{10.4230/LIPIcs.SEA.2024.3},
  annote =	{Keywords: minimum path cover, directed acyclic graph, maximum flow, parameterized algorithms, edge sparsification, algorithm engineering}
}
Document
Separator Based Data Reduction for the Maximum Cut Problem

Authors: Jonas Charfreitag, Christine Dahn, Michael Kaibel, Philip Mayer, Petra Mutzel, and Lukas Schürmann


Abstract
Preprocessing is an important ingredient for solving the maximum cut problem to optimality on real-world graphs. In our work, we derive a new framework for data reduction rules based on vertex separators. Vertex separators are sets of vertices, whose removal increases the number of connected components of a graph. Certain small separators can be found in linear time, allowing for an efficient combination of our framework with existing data reduction rules. Additionally, we complement known data reduction rules for triangles with a new one. In our computational experiments on established benchmark instances, we clearly show the effectiveness and efficiency of our proposed data reduction techniques. The resulting graphs are significantly smaller than in earlier studies and sometimes no vertex is left, so preprocessing has fully solved the instance to optimality. The introduced techniques are also shown to offer significant speedup potential for an exact state-of-the-art solver and to help a state-of-the-art heuristic to produce solutions of higher quality.

Cite as

Jonas Charfreitag, Christine Dahn, Michael Kaibel, Philip Mayer, Petra Mutzel, and Lukas Schürmann. Separator Based Data Reduction for the Maximum Cut Problem. In 22nd International Symposium on Experimental Algorithms (SEA 2024). Leibniz International Proceedings in Informatics (LIPIcs), Volume 301, pp. 4:1-4:21, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2024)


BibTeX

@InProceedings{charfreitag_et_al:LIPIcs.SEA.2024.4,
  author =	{Charfreitag, Jonas and Dahn, Christine and Kaibel, Michael and Mayer, Philip and Mutzel, Petra and Sch\"{u}rmann, Lukas},
  title =	{{Separator Based Data Reduction for the Maximum Cut Problem}},
  booktitle =	{22nd International Symposium on Experimental Algorithms (SEA 2024)},
  pages =	{4:1--4:21},
  series =	{Leibniz International Proceedings in Informatics (LIPIcs)},
  ISBN =	{978-3-95977-325-6},
  ISSN =	{1868-8969},
  year =	{2024},
  volume =	{301},
  editor =	{Liberti, Leo},
  publisher =	{Schloss Dagstuhl -- Leibniz-Zentrum f{\"u}r Informatik},
  address =	{Dagstuhl, Germany},
  URL =		{https://drops.dagstuhl.de/entities/document/10.4230/LIPIcs.SEA.2024.4},
  URN =		{urn:nbn:de:0030-drops-203698},
  doi =		{10.4230/LIPIcs.SEA.2024.4},
  annote =	{Keywords: Data Reduction, Maximum Cut, Vertex Separators}
}
Document
Buffered Streaming Edge Partitioning

Authors: Adil Chhabra, Marcelo Fonseca Faraj, Christian Schulz, and Daniel Seemaier


Abstract
Addressing the challenges of processing massive graphs, which are prevalent in diverse fields such as social, biological, and technical networks, we introduce HeiStreamE and FreightE, two innovative (buffered) streaming algorithms designed for efficient edge partitioning of large-scale graphs. HeiStreamE utilizes an adapted Split-and-Connect graph model and a Fennel-based multilevel partitioning scheme, while FreightE partitions a hypergraph representation of the input graph. Besides ensuring superior solution quality, these approaches also overcome the limitations of existing algorithms by maintaining linear dependency on the graph size in both time and memory complexity, with no dependence on the number of blocks of the partition. Our comprehensive experimental analysis demonstrates that HeiStreamE outperforms current streaming algorithms and the re-streaming algorithm 2PS in partitioning quality (replication factor), and is more memory-efficient for real-world networks where the number of edges is far greater than the number of vertices. Further, FreightE is shown to produce fast and efficient partitions, particularly for higher numbers of partition blocks.

Cite as

Adil Chhabra, Marcelo Fonseca Faraj, Christian Schulz, and Daniel Seemaier. Buffered Streaming Edge Partitioning. In 22nd International Symposium on Experimental Algorithms (SEA 2024). Leibniz International Proceedings in Informatics (LIPIcs), Volume 301, pp. 5:1-5:21, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2024)


BibTeX

@InProceedings{chhabra_et_al:LIPIcs.SEA.2024.5,
  author =	{Chhabra, Adil and Fonseca Faraj, Marcelo and Schulz, Christian and Seemaier, Daniel},
  title =	{{Buffered Streaming Edge Partitioning}},
  booktitle =	{22nd International Symposium on Experimental Algorithms (SEA 2024)},
  pages =	{5:1--5:21},
  series =	{Leibniz International Proceedings in Informatics (LIPIcs)},
  ISBN =	{978-3-95977-325-6},
  ISSN =	{1868-8969},
  year =	{2024},
  volume =	{301},
  editor =	{Liberti, Leo},
  publisher =	{Schloss Dagstuhl -- Leibniz-Zentrum f{\"u}r Informatik},
  address =	{Dagstuhl, Germany},
  URL =		{https://drops.dagstuhl.de/entities/document/10.4230/LIPIcs.SEA.2024.5},
  URN =		{urn:nbn:de:0030-drops-203701},
  doi =		{10.4230/LIPIcs.SEA.2024.5},
  annote =	{Keywords: graph partitioning, edge partitioning, streaming, online, buffered partitioning}
}
Document
Faster Treewidth-Based Approximations for Wiener Index

Authors: Giovanna Kobus Conrado, Amir Kafshdar Goharshady, Pavel Hudec, Pingjiang Li, and Harshit Jitendra Motwani


Abstract
The Wiener index of a graph G is the sum of distances between all pairs of its vertices. It is a widely used graph property in chemistry, initially introduced to examine the link between boiling points and structural properties of alkanes, which later found notable applications in drug design. Thus, computing or approximating the Wiener index of molecular graphs, i.e., graphs in which every vertex models an atom of a molecule and every edge models a bond, is of significant interest to the computational chemistry community. In this work, we build upon the observation that molecular graphs are sparse and tree-like and focus on developing efficient algorithms parameterized by treewidth to approximate the Wiener index. We present a new randomized approximation algorithm using a combination of tree decompositions and centroid decompositions. Our algorithm approximates the Wiener index within any desired multiplicative factor (1 ± ε) in time O(n ⋅ log n ⋅ k³ + √n ⋅ k/ε²), where n is the number of vertices of the graph and k is the treewidth. This time bound is almost linear in n. Finally, we provide experimental results over standard benchmark molecules from PubChem and the Protein Data Bank, showcasing the applicability and scalability of our approach on real-world chemical graphs and comparing it with previous methods.
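
For concreteness, the definition can be evaluated exactly with one BFS per vertex on an unweighted graph; this O(n·m) Python baseline is precisely the cost that the paper's almost-linear treewidth-parameterized approximation avoids on large inputs.

from collections import deque

def wiener_index(adj):
    # Sum of shortest-path distances over unordered vertex pairs,
    # computed by one BFS from every vertex (connected graph assumed).
    n, total = len(adj), 0
    for s in range(n):
        dist = [-1] * n
        dist[s] = 0
        queue = deque([s])
        while queue:
            u = queue.popleft()
            for v in adj[u]:
                if dist[v] == -1:
                    dist[v] = dist[u] + 1
                    queue.append(v)
        total += sum(dist)
    return total // 2  # each pair was counted twice

# Path graph 0-1-2-3: W = (1+2+3) + (1+2) + 1 = 10
print(wiener_index([[1], [0, 2], [1, 3], [2]]))  # 10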

Cite as

Giovanna Kobus Conrado, Amir Kafshdar Goharshady, Pavel Hudec, Pingjiang Li, and Harshit Jitendra Motwani. Faster Treewidth-Based Approximations for Wiener Index. In 22nd International Symposium on Experimental Algorithms (SEA 2024). Leibniz International Proceedings in Informatics (LIPIcs), Volume 301, pp. 6:1-6:19, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2024)


BibTeX

@InProceedings{conrado_et_al:LIPIcs.SEA.2024.6,
  author =	{Conrado, Giovanna Kobus and Goharshady, Amir Kafshdar and Hudec, Pavel and Li, Pingjiang and Motwani, Harshit Jitendra},
  title =	{{Faster Treewidth-Based Approximations for Wiener Index}},
  booktitle =	{22nd International Symposium on Experimental Algorithms (SEA 2024)},
  pages =	{6:1--6:19},
  series =	{Leibniz International Proceedings in Informatics (LIPIcs)},
  ISBN =	{978-3-95977-325-6},
  ISSN =	{1868-8969},
  year =	{2024},
  volume =	{301},
  editor =	{Liberti, Leo},
  publisher =	{Schloss Dagstuhl -- Leibniz-Zentrum f{\"u}r Informatik},
  address =	{Dagstuhl, Germany},
  URL =		{https://drops.dagstuhl.de/entities/document/10.4230/LIPIcs.SEA.2024.6},
  URN =		{urn:nbn:de:0030-drops-203718},
  doi =		{10.4230/LIPIcs.SEA.2024.6},
  annote =	{Keywords: Computational Chemistry, Treewidth, Wiener Index}
}
Document
Local Search k-means++ with Foresight

Authors: Theo Conrads, Lukas Drexler, Joshua Könen, Daniel R. Schmidt, and Melanie Schmidt


Abstract
Since its introduction in 1957, Lloyd’s algorithm for k-means clustering has been extensively studied and has undergone several improvements. While in its original form it does not guarantee any approximation factor at all, Arthur and Vassilvitskii (SODA 2007) proposed k-means++, which enhances Lloyd’s algorithm by a seeding method that guarantees a 𝒪(log k)-approximation in expectation. More recently, Lattanzi and Sohler (ICML 2019) proposed LS++, which further improves the solution quality of k-means++ by local search techniques to obtain a 𝒪(1)-approximation. On the practical side, the greedy variant of k-means++ is often used, although its worst-case behaviour is provably worse than that of the standard k-means++ variant. We investigate how to improve LS++ further in practice. We study two options for improving the practical performance: (a) combining LS++ with greedy k-means++ instead of standard k-means++, and (b) improving LS++ by better entangling it with Lloyd’s algorithm. Option (a) worsens the theoretical guarantees of k-means++ but, as we confirm in our experiments, improves practical quality, also in combination with LS++. Option (b) is our new algorithm, Foresight LS++ (FLS++). We experimentally show that FLS++ improves upon the solution quality of LS++ while retaining its asymptotic runtime and worst-case approximation bounds.
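
For reference, the D² seeding at the heart of k-means++ (the starting point that LS++ and FLS++ then improve by local search) fits in a few lines of Python; the greedy variant mentioned above would instead draw several candidate centers per round and keep the one that decreases the cost the most.

import random

def kmeans_pp_seeding(points, k, rng=random.Random(0)):
    # Standard k-means++ seeding: each new center is drawn with
    # probability proportional to the squared distance to the nearest
    # center chosen so far.
    def d2(p, q):
        return sum((a - b) ** 2 for a, b in zip(p, q))

    centers = [rng.choice(points)]
    while len(centers) < k:
        weights = [min(d2(p, c) for c in centers) for p in points]
        centers.append(rng.choices(points, weights=weights, k=1)[0])
    return centers

pts = [(0, 0), (0, 1), (10, 10), (10, 11), (20, 0)]
print(kmeans_pp_seeding(pts, 3))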

Cite as

Theo Conrads, Lukas Drexler, Joshua Könen, Daniel R. Schmidt, and Melanie Schmidt. Local Search k-means++ with Foresight. In 22nd International Symposium on Experimental Algorithms (SEA 2024). Leibniz International Proceedings in Informatics (LIPIcs), Volume 301, pp. 7:1-7:20, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2024)


BibTeX

@InProceedings{conrads_et_al:LIPIcs.SEA.2024.7,
  author =	{Conrads, Theo and Drexler, Lukas and K\"{o}nen, Joshua and Schmidt, Daniel R. and Schmidt, Melanie},
  title =	{{Local Search k-means++ with Foresight}},
  booktitle =	{22nd International Symposium on Experimental Algorithms (SEA 2024)},
  pages =	{7:1--7:20},
  series =	{Leibniz International Proceedings in Informatics (LIPIcs)},
  ISBN =	{978-3-95977-325-6},
  ISSN =	{1868-8969},
  year =	{2024},
  volume =	{301},
  editor =	{Liberti, Leo},
  publisher =	{Schloss Dagstuhl -- Leibniz-Zentrum f{\"u}r Informatik},
  address =	{Dagstuhl, Germany},
  URL =		{https://drops.dagstuhl.de/entities/document/10.4230/LIPIcs.SEA.2024.7},
  URN =		{urn:nbn:de:0030-drops-203727},
  doi =		{10.4230/LIPIcs.SEA.2024.7},
  annote =	{Keywords: k-means clustering, kmeans++, greedy, local search}
}
Document
Practical Computation of Graph VC-Dimension

Authors: David Coudert, Mónika Csikós, Guillaume Ducoffe, and Laurent Viennot


Abstract
For any set system ℋ = (V,ℛ), ℛ ⊆ 2^V, a subset S ⊆ V is called shattered if every S' ⊆ S results from the intersection of S with some set in ℛ. The VC-dimension of ℋ is the size of a largest shattered set in V. In this paper, we focus on the problem of computing the VC-dimension of graphs. In particular, given a graph G = (V,E), the VC-dimension of G is defined as the VC-dimension of (V, N), where N contains each subset of V that can be obtained as the closed neighborhood of some vertex v ∈ V in G. Our main contribution is an algorithm for computing the VC-dimension of any graph, whose effectiveness is shown through experiments on various types of practical graphs, including graphs with millions of vertices. A key aspect of its efficiency resides in the fact that practical graphs have small VC-dimension, up to 8 in our experiments. As a side-product, we present several new bounds relating the graph VC-dimension to other classical graph theoretical notions. We also establish the W[1]-hardness of the graph VC-dimension problem by extending a previous result for arbitrary set systems.
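
The definition translates directly into a brute-force checker, shown below in Python; it enumerates candidate sets by increasing size, which is exactly the exponential behaviour the paper's algorithm sidesteps on graphs with millions of vertices.

from itertools import combinations

def graph_vc_dimension(adj):
    # VC-dimension of the closed-neighborhood set system of a graph,
    # checked directly from the definition (tiny graphs only).
    n = len(adj)
    neigh = [frozenset(adj[v]) | {v} for v in range(n)]

    def shattered(S):
        return len({N & S for N in neigh}) == 2 ** len(S)

    best = 0
    for d in range(1, n + 1):
        if any(shattered(frozenset(S)) for S in combinations(range(n), d)):
            best = d
        else:
            break  # if no d-set is shattered, no larger set can be
    return best

# Path on 7 vertices: closed neighborhoods behave like intervals, and
# e.g. S = {2, 4} is shattered (by vertices 0, 1, 3, 5), so VC-dim = 2.
path7 = [[1], [0, 2], [1, 3], [2, 4], [3, 5], [4, 6], [5]]
print(graph_vc_dimension(path7))  # 2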

Cite as

David Coudert, Mónika Csikós, Guillaume Ducoffe, and Laurent Viennot. Practical Computation of Graph VC-Dimension. In 22nd International Symposium on Experimental Algorithms (SEA 2024). Leibniz International Proceedings in Informatics (LIPIcs), Volume 301, pp. 8:1-8:20, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2024)


BibTeX

@InProceedings{coudert_et_al:LIPIcs.SEA.2024.8,
  author =	{Coudert, David and Csik\'{o}s, M\'{o}nika and Ducoffe, Guillaume and Viennot, Laurent},
  title =	{{Practical Computation of Graph VC-Dimension}},
  booktitle =	{22nd International Symposium on Experimental Algorithms (SEA 2024)},
  pages =	{8:1--8:20},
  series =	{Leibniz International Proceedings in Informatics (LIPIcs)},
  ISBN =	{978-3-95977-325-6},
  ISSN =	{1868-8969},
  year =	{2024},
  volume =	{301},
  editor =	{Liberti, Leo},
  publisher =	{Schloss Dagstuhl -- Leibniz-Zentrum f{\"u}r Informatik},
  address =	{Dagstuhl, Germany},
  URL =		{https://drops.dagstuhl.de/entities/document/10.4230/LIPIcs.SEA.2024.8},
  URN =		{urn:nbn:de:0030-drops-203731},
  doi =		{10.4230/LIPIcs.SEA.2024.8},
  annote =	{Keywords: VC-dimension, graph, algorithm}
}
Document
Top-k Frequent Patterns in Streams and Parameterized-Space LZ Compression

Authors: Patrick Dinklage, Johannes Fischer, and Nicola Prezza


Abstract
We present novel online approximations of the Lempel-Ziv 77 (LZ77) and Lempel-Ziv 78 (LZ78) compression schemes [Lempel & Ziv, 1977/1978] with parameterizable space usage based on estimating which k patterns occur the most frequently in the streamed input for parameter k. This new approach overcomes the issue of finding only local repetitions, which is a natural limitation of algorithms that compress using a sliding window or by partitioning the input into blocks. For this, we introduce the top-k trie, a summary for maintaining online the top-k frequent consecutive patterns in a stream of characters based on a combination of the Lempel-Ziv 78 compression scheme and the Misra-Gries algorithm for frequent item estimation in streams. Using straightforward encoding, our implementations yield compression ratios (output over input size) competitive with established general-purpose LZ-based compression utilities such as gzip or xz.
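
As background, the Misra-Gries estimator used as a building block can be sketched in a few lines of Python; the paper's top-k trie applies this counter-maintenance idea to LZ78-style phrases rather than to single stream items as below.

def misra_gries(stream, k):
    # Keeps at most k - 1 counters; every item occurring more than
    # len(stream) / k times is guaranteed to survive in the summary.
    counters = {}
    for x in stream:
        if x in counters:
            counters[x] += 1
        elif len(counters) < k - 1:
            counters[x] = 1
        else:
            for y in list(counters):  # decrement all, drop zeros
                counters[y] -= 1
                if counters[y] == 0:
                    del counters[y]
    return counters

# 'a' occurs 5 times in 11 characters (> 11/3), so it must survive.
print(misra_gries("abracadabra", k=3))  # {'a': 2}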

Cite as

Patrick Dinklage, Johannes Fischer, and Nicola Prezza. Top-k Frequent Patterns in Streams and Parameterized-Space LZ Compression. In 22nd International Symposium on Experimental Algorithms (SEA 2024). Leibniz International Proceedings in Informatics (LIPIcs), Volume 301, pp. 9:1-9:20, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2024)


BibTeX

@InProceedings{dinklage_et_al:LIPIcs.SEA.2024.9,
  author =	{Dinklage, Patrick and Fischer, Johannes and Prezza, Nicola},
  title =	{{Top-k Frequent Patterns in Streams and Parameterized-Space LZ Compression}},
  booktitle =	{22nd International Symposium on Experimental Algorithms (SEA 2024)},
  pages =	{9:1--9:20},
  series =	{Leibniz International Proceedings in Informatics (LIPIcs)},
  ISBN =	{978-3-95977-325-6},
  ISSN =	{1868-8969},
  year =	{2024},
  volume =	{301},
  editor =	{Liberti, Leo},
  publisher =	{Schloss Dagstuhl -- Leibniz-Zentrum f{\"u}r Informatik},
  address =	{Dagstuhl, Germany},
  URL =		{https://drops.dagstuhl.de/entities/document/10.4230/LIPIcs.SEA.2024.9},
  URN =		{urn:nbn:de:0030-drops-203748},
  doi =		{10.4230/LIPIcs.SEA.2024.9},
  annote =	{Keywords: compression, streaming, heavy hitters, algorithm engineering}
}
Document
Taxonomic Classification with Maximal Exact Matches in KATKA Kernels and Minimizer Digests

Authors: Dominika Draesslerová, Omar Ahmed, Travis Gagie, Jan Holub, Ben Langmead, Giovanni Manzini, and Gonzalo Navarro


Abstract
For taxonomic classification, we are asked to index the genomes in a phylogenetic tree such that later, given a DNA read, we can quickly choose a small subtree likely to contain the genome from which that read was drawn. Although popular classifiers such as Kraken use k-mers, recent research indicates that using maximal exact matches (MEMs) can lead to better classifications. For example, we can

  • build an augmented FM-index over the genomes in the tree concatenated in left-to-right order;
  • for each MEM in a read, find the interval in the suffix array containing the starting positions of that MEM’s occurrences in those genomes;
  • find the minimum and maximum values stored in that interval;
  • take the lowest common ancestor (LCA) of the genomes containing the characters at those positions.

This solution is practical, however, only when the total size of the genomes in the tree is fairly small. In this paper we consider applying the same solution to three lossily compressed representations of the genomes' concatenation:

  • a KATKA kernel, which discards characters that are not in the first or last occurrence of any k_max-tuple, for a parameter k_max;
  • a minimizer digest;
  • a KATKA kernel of a minimizer digest.

With a test dataset and these three representations of it, simulated reads and various parameter settings, we checked how many reads' longest MEMs occurred only in the sequences from which those reads were generated ("true positive" reads). For some parameter settings we achieved significant compression while only slightly decreasing the true-positive rate.
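
Of the three representations, the minimizer digest is the simplest to sketch: for each window of w consecutive k-mers, keep the lexicographically smallest. The Python below is an illustrative toy; the digests and kernels used in the paper involve further details (e.g. how the selected k-mers are assembled).

def minimizer_digest(seq, k, w):
    # Select the minimizer (smallest k-mer, ties broken by leftmost
    # position) of every window of w consecutive k-mers.
    picked, last = [], None
    for i in range(len(seq) - k - w + 2):
        window = [(seq[j:j + k], j) for j in range(i, i + w)]
        kmer, pos = min(window)
        if pos != last:  # record each selected position once
            picked.append((kmer, pos))
            last = pos
    return picked

print(minimizer_digest("ACGTACGTGCA", k=3, w=4))
# [('ACG', 0), ('ACG', 4), ('CGT', 5)]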

Cite as

Dominika Draesslerová, Omar Ahmed, Travis Gagie, Jan Holub, Ben Langmead, Giovanni Manzini, and Gonzalo Navarro. Taxonomic Classification with Maximal Exact Matches in KATKA Kernels and Minimizer Digests. In 22nd International Symposium on Experimental Algorithms (SEA 2024). Leibniz International Proceedings in Informatics (LIPIcs), Volume 301, pp. 10:1-10:13, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2024)


BibTeX

@InProceedings{draesslerova_et_al:LIPIcs.SEA.2024.10,
  author =	{Draesslerov\'{a}, Dominika and Ahmed, Omar and Gagie, Travis and Holub, Jan and Langmead, Ben and Manzini, Giovanni and Navarro, Gonzalo},
  title =	{{Taxonomic Classification with Maximal Exact Matches in KATKA Kernels and Minimizer Digests}},
  booktitle =	{22nd International Symposium on Experimental Algorithms (SEA 2024)},
  pages =	{10:1--10:13},
  series =	{Leibniz International Proceedings in Informatics (LIPIcs)},
  ISBN =	{978-3-95977-325-6},
  ISSN =	{1868-8969},
  year =	{2024},
  volume =	{301},
  editor =	{Liberti, Leo},
  publisher =	{Schloss Dagstuhl -- Leibniz-Zentrum f{\"u}r Informatik},
  address =	{Dagstuhl, Germany},
  URL =		{https://drops.dagstuhl.de/entities/document/10.4230/LIPIcs.SEA.2024.10},
  URN =		{urn:nbn:de:0030-drops-203756},
  doi =		{10.4230/LIPIcs.SEA.2024.10},
  annote =	{Keywords: Taxonomic classification, metagenomics, KATKA, maximal exact matches, string kernels, minimizer digests}
}
Document
Engineering Weighted Connectivity Augmentation Algorithms

Authors: Marcelo Fonseca Faraj, Ernestine Großmann, Felix Joos, Thomas Möller, and Christian Schulz


Abstract
Increasing the connectivity of a graph is a pivotal challenge in robust network design. The weighted connectivity augmentation problem is a common version of the problem that takes link costs into consideration. The problem is then to find a minimum-cost subset of a given set of weighted links that increases the connectivity of a graph by one when the links are added to the edge set of the input instance. In this work, we give a first implementation of recently discovered better-than-2 approximations. Furthermore, we propose three new heuristics and one exact approach. These include a greedy algorithm considering link costs and the number of unique cuts covered, an approach based on minimum spanning trees, and a local search algorithm that may improve a given solution by swapping links of paths. Our exact approach uses an ILP formulation with efficient cut enumeration as well as a fast initialization routine. We then perform an extensive experimental evaluation which shows that our algorithms are faster and yield the best solutions compared to the current state of the art as well as the recently discovered better-than-2 approximation algorithms. Our novel local search algorithm can improve solution quality even further.

Cite as

Marcelo Fonseca Faraj, Ernestine Großmann, Felix Joos, Thomas Möller, and Christian Schulz. Engineering Weighted Connectivity Augmentation Algorithms. In 22nd International Symposium on Experimental Algorithms (SEA 2024). Leibniz International Proceedings in Informatics (LIPIcs), Volume 301, pp. 11:1-11:22, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2024)


BibTeX

@InProceedings{faraj_et_al:LIPIcs.SEA.2024.11,
  author =	{Faraj, Marcelo Fonseca and Gro{\ss}mann, Ernestine and Joos, Felix and M\"{o}ller, Thomas and Schulz, Christian},
  title =	{{Engineering Weighted Connectivity Augmentation Algorithms}},
  booktitle =	{22nd International Symposium on Experimental Algorithms (SEA 2024)},
  pages =	{11:1--11:22},
  series =	{Leibniz International Proceedings in Informatics (LIPIcs)},
  ISBN =	{978-3-95977-325-6},
  ISSN =	{1868-8969},
  year =	{2024},
  volume =	{301},
  editor =	{Liberti, Leo},
  publisher =	{Schloss Dagstuhl -- Leibniz-Zentrum f{\"u}r Informatik},
  address =	{Dagstuhl, Germany},
  URL =		{https://drops.dagstuhl.de/entities/document/10.4230/LIPIcs.SEA.2024.11},
  URN =		{urn:nbn:de:0030-drops-203768},
  doi =		{10.4230/LIPIcs.SEA.2024.11},
  annote =	{Keywords: weighted connectivity augmentation, approximation, heuristic, integer linear program, algorithm engineering}
}
Document
Streaming Matching and Edge Cover in Practice

Authors: S M Ferdous, Alex Pothen, and Mahantesh Halappanavar


Abstract
Graph algorithms with polynomial space and time requirements often become infeasible for massive graphs with billions of edges or more. State-of-the-art approaches therefore employ approximate serial, parallel, and distributed algorithms to tackle these challenges. However, such approaches require storing the entire graph in memory and thus need access to costly computing resources such as clusters and supercomputers. In this paper, we present practical streaming approaches for solving two prototypical massive graph problems using limited memory: maximum weighted matching and minimum weighted edge cover. For matching, we conduct a thorough computational study of two semi-streaming algorithms, including the recent breakthrough algorithm of Paz and Schwartzman [SODA, 2017], which achieves a 1/(2+ε)-approximation of the weight while using O(n log W / ε) memory (here n is the number of vertices and W is the maximum edge weight). Empirically, we show that the semi-streaming algorithms produce matchings whose weight is close to that of the best 1/2-approximate offline algorithm while requiring less time and an order of magnitude less memory. For minimum weighted edge cover, we develop three novel semi-streaming algorithms. Two of these algorithms require a single pass through the input graph, require O(n log n) memory, and provide a 2-approximation guarantee on the objective. We also leverage a relationship between approximate maximum weighted matching and approximate minimum weighted edge cover to develop a two-pass (3/2+ε)-approximate algorithm with the memory requirement of Paz and Schwartzman’s semi-streaming matching algorithm. These streaming approaches are compared against the state-of-the-art 3/2-approximate offline algorithm. The semi-streaming matching and the novel edge cover algorithms proposed in this paper can process graphs with several billions of edges in under 30 minutes using 6 GB of memory, which is at least an order-of-magnitude improvement over the offline (non-streaming) algorithms. For the largest graph, the best alternative offline parallel approximation algorithm (GPA+ROMA) could not finish in three hours even while employing hundreds of processors and 1 TB of memory. We also demonstrate an application of the semi-streaming matching algorithm by computing a matching in linearly bounded memory on intersection graphs derived from three machine learning datasets; the existing offline algorithms could not complete on one of these datasets because their memory requirement exceeded 1 TB.
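
The one-pass matching algorithm studied here admits a compact sketch: maintain vertex potentials, stack every edge whose weight beats the potentials of its endpoints, then unwind the stack greedily. The Python below shows the plain local-ratio variant, which gives a 1/2-approximation; the (1+ε) thresholding behind the stated memory bound is omitted.

def streaming_matching(edge_stream):
    phi = {}    # vertex potentials
    stack = []  # edges that were locally profitable when seen
    for u, v, w in edge_stream:  # single pass over the stream
        gain = w - phi.get(u, 0) - phi.get(v, 0)
        if gain > 0:
            stack.append((u, v))
            phi[u] = phi.get(u, 0) + gain
            phi[v] = phi.get(v, 0) + gain
    matched, matching = set(), []
    while stack:  # unwind: edges stacked later get priority
        u, v = stack.pop()
        if u not in matched and v not in matched:
            matching.append((u, v))
            matched.update((u, v))
    return matching

edges = [("a", "b", 3), ("b", "c", 4), ("c", "d", 3), ("a", "d", 1)]
print(streaming_matching(edges))  # [('c', 'd'), ('a', 'b')], weight 6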

Cite as

S M Ferdous, Alex Pothen, and Mahantesh Halappanavar. Streaming Matching and Edge Cover in Practice. In 22nd International Symposium on Experimental Algorithms (SEA 2024). Leibniz International Proceedings in Informatics (LIPIcs), Volume 301, pp. 12:1-12:22, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2024)


BibTeX

@InProceedings{ferdous_et_al:LIPIcs.SEA.2024.12,
  author =	{Ferdous, S M and Pothen, Alex and Halappanavar, Mahantesh},
  title =	{{Streaming Matching and Edge Cover in Practice}},
  booktitle =	{22nd International Symposium on Experimental Algorithms (SEA 2024)},
  pages =	{12:1--12:22},
  series =	{Leibniz International Proceedings in Informatics (LIPIcs)},
  ISBN =	{978-3-95977-325-6},
  ISSN =	{1868-8969},
  year =	{2024},
  volume =	{301},
  editor =	{Liberti, Leo},
  publisher =	{Schloss Dagstuhl -- Leibniz-Zentrum f{\"u}r Informatik},
  address =	{Dagstuhl, Germany},
  URL =		{https://drops.dagstuhl.de/entities/document/10.4230/LIPIcs.SEA.2024.12},
  URN =		{urn:nbn:de:0030-drops-203773},
  doi =		{10.4230/LIPIcs.SEA.2024.12},
  annote =	{Keywords: Matching, Edge Cover, Semi-Streaming Algorithm, Parallel Algorithms, Algorithm Engineering}
}
Document
3/2-Dual Approximation for CPU/GPU Scheduling

Authors: Bernhard Sebastian Germann, Klaus Jansen, Felix Ohnesorge, and Malte Tutas


Abstract
We present a fast and efficient 3/2-dual approximation algorithm for CPU/GPU scheduling under the objective of makespan minimization. In CPU/GPU scheduling, tasks can be scheduled on two different architectures. When executed on the CPU, a task is moldable and can be assigned to multiple cores; its running time becomes a function of the number of assigned cores. On a GPU, the task is a classical job with a fixed processing time. Both settings have drawn recent independent scientific interest. For moldable CPU scheduling, the current best known constant-rate approximation is a 3/2-approximation algorithm [Wu et al., EJOR volume 306]. The best efficient algorithm for this setting is a (3/2+ε)-approximation [Mounie et al., SIAM '07], whereas GPU scheduling admits a 13/11-approximation [Coffman, Garey, Johnson, SIAM '78]. We improve upon the current best known algorithms for CPU/GPU scheduling by Bleuse et al. by formulating a novel multidimensional multiple-choice knapsack problem to allot tasks to either architecture and schedule them there with known algorithms. This yields an improved running time over the current state of the art. We complement our theoretical results with experiments that show a significant speedup from practical optimizations and explore their efficacy.

Cite as

Bernhard Sebastian Germann, Klaus Jansen, Felix Ohnesorge, and Malte Tutas. 3/2-Dual Approximation for CPU/GPU Scheduling. In 22nd International Symposium on Experimental Algorithms (SEA 2024). Leibniz International Proceedings in Informatics (LIPIcs), Volume 301, pp. 13:1-13:18, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2024)


BibTeX

@InProceedings{germann_et_al:LIPIcs.SEA.2024.13,
  author =	{Germann, Bernhard Sebastian and Jansen, Klaus and Ohnesorge, Felix and Tutas, Malte},
  title =	{{3/2-Dual Approximation for CPU/GPU Scheduling}},
  booktitle =	{22nd International Symposium on Experimental Algorithms (SEA 2024)},
  pages =	{13:1--13:18},
  series =	{Leibniz International Proceedings in Informatics (LIPIcs)},
  ISBN =	{978-3-95977-325-6},
  ISSN =	{1868-8969},
  year =	{2024},
  volume =	{301},
  editor =	{Liberti, Leo},
  publisher =	{Schloss Dagstuhl -- Leibniz-Zentrum f{\"u}r Informatik},
  address =	{Dagstuhl, Germany},
  URL =		{https://drops.dagstuhl.de/entities/document/10.4230/LIPIcs.SEA.2024.13},
  URN =		{urn:nbn:de:0030-drops-203782},
  doi =		{10.4230/LIPIcs.SEA.2024.13},
  annote =	{Keywords: computing, machine scheduling, moldable, CPU/GPU}
}
Document
Accelerating ILP Solvers for Minimum Flow Decompositions Through Search Space and Dimensionality Reductions

Authors: Andreas Grigorjew, Fernando H. C. Dias, Andrea Cracco, Romeo Rizzi, and Alexandru I. Tomescu


Abstract
Given a flow network, the Minimum Flow Decomposition (MFD) problem asks for the smallest possible set of weighted paths whose superposition equals the flow. It is a classical, strongly NP-hard problem that has proven useful in RNA transcript assembly and in applications outside of Bioinformatics. We improve an existing ILP (Integer Linear Programming) model by Dias et al. [RECOMB 2022] for DAGs by decreasing the solver’s search space using solution safety and several other optimizations. This results in a significant speedup over the original ILP of up to 34× on average on the hardest instances. Moreover, we show that our optimizations also apply to MFD problem variants, resulting in speedups of up to 219× on the hardest instances. We also developed an ILP model of reduced dimensionality for an MFD variant in which the solution path weights are restricted to a given set. This model can find an optimal MFD solution for most instances, and overall, its accuracy significantly outperforms that of previous greedy algorithms while being up to an order of magnitude faster than our optimized ILP.
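
As a point of reference for the greedy algorithms mentioned above, a classical flow-decomposition heuristic repeatedly peels off the s-t path with the largest bottleneck. The Python sketch below assumes a valid flow on a DAG whose vertices 0..n-1 are already topologically ordered; it is a baseline for illustration, not the paper's ILP approach.

def greedy_flow_decomposition(n, flow, source, sink):
    # `flow` maps DAG edges (u, v) to positive flow values.
    paths = []
    while any(f > 0 for f in flow.values()):
        best = [0] * n  # best[v] = max bottleneck over source->v paths
        pred = [None] * n
        best[source] = float("inf")
        for u in range(n):  # DP in topological order
            for (a, b), f in flow.items():
                if a == u and f > 0 and min(best[u], f) > best[b]:
                    best[b] = min(best[u], f)
                    pred[b] = u
        bottleneck, v, path = best[sink], sink, [sink]
        while v != source:  # reconstruct the path backwards
            v = pred[v]
            path.append(v)
        path.reverse()
        for x, y in zip(path, path[1:]):
            flow[(x, y)] -= bottleneck
        paths.append((bottleneck, path))
    return paths

f = {(0, 1): 3, (0, 2): 2, (1, 3): 3, (2, 3): 2}
print(greedy_flow_decomposition(4, f, source=0, sink=3))
# [(3, [0, 1, 3]), (2, [0, 2, 3])]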

Cite as

Andreas Grigorjew, Fernando H. C. Dias, Andrea Cracco, Romeo Rizzi, and Alexandru I. Tomescu. Accelerating ILP Solvers for Minimum Flow Decompositions Through Search Space and Dimensionality Reductions. In 22nd International Symposium on Experimental Algorithms (SEA 2024). Leibniz International Proceedings in Informatics (LIPIcs), Volume 301, pp. 14:1-14:19, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2024)


BibTeX

@InProceedings{grigorjew_et_al:LIPIcs.SEA.2024.14,
  author =	{Grigorjew, Andreas and Dias, Fernando H. C. and Cracco, Andrea and Rizzi, Romeo and Tomescu, Alexandru I.},
  title =	{{Accelerating ILP Solvers for Minimum Flow Decompositions Through Search Space and Dimensionality Reductions}},
  booktitle =	{22nd International Symposium on Experimental Algorithms (SEA 2024)},
  pages =	{14:1--14:19},
  series =	{Leibniz International Proceedings in Informatics (LIPIcs)},
  ISBN =	{978-3-95977-325-6},
  ISSN =	{1868-8969},
  year =	{2024},
  volume =	{301},
  editor =	{Liberti, Leo},
  publisher =	{Schloss Dagstuhl -- Leibniz-Zentrum f{\"u}r Informatik},
  address =	{Dagstuhl, Germany},
  URL =		{https://drops.dagstuhl.de/entities/document/10.4230/LIPIcs.SEA.2024.14},
  URN =		{urn:nbn:de:0030-drops-203792},
  doi =		{10.4230/LIPIcs.SEA.2024.14},
  annote =	{Keywords: Flow decomposition, Integer Linear Programming, Safety, RNA-seq, RNA transcript assembly, isoform}
}
Document
Determining Fixed-Length Paths in Directed and Undirected Edge-Weighted Graphs

Authors: Daniel Hambly, Rhyd Lewis, and Padraig Corcoran


Abstract
In this paper, we examine the NP-hard problem of identifying fixed-length s-t paths in edge-weighted graphs, that is, paths of a desired length k from a source vertex s to a target vertex t. Many existing strategies consider paths whose lengths are determined by the number of edges in the path; we, however, define the length of a path as the sum of its edge weights. Three exact algorithms for this problem are proposed: the first based on an integer programming (IP) formulation, the second a backtracking algorithm, and the third an extension of Yen’s algorithm. Analysis of these algorithms on random graphs shows that the backtracking algorithm performs best for smaller values of k, whilst the IP is preferable for larger values of k.
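
The backtracking idea can be sketched directly: extend a partial simple path and prune as soon as the accumulated weight exceeds k, which is valid when all weights are positive. This Python toy follows the spirit of the paper's second algorithm, without any of its engineering.

def fixed_length_paths(adj, s, t, k):
    # Enumerate simple s-t paths whose edge weights sum to exactly k.
    paths, path = [], [s]

    def dfs(u, length):
        if u == t and length == k:
            paths.append(list(path))
        for v, w in adj[u]:
            if v not in path and length + w <= k:  # prune overweight branches
                path.append(v)
                dfs(v, length + w)
                path.pop()

    dfs(s, 0)
    return paths

# Weighted digraph with two 0->3 paths of length exactly 5.
adj = {0: [(1, 2), (2, 1)], 1: [(3, 3)], 2: [(3, 4)], 3: []}
print(fixed_length_paths(adj, 0, 3, 5))  # [[0, 1, 3], [0, 2, 3]]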

Cite as

Daniel Hambly, Rhyd Lewis, and Padraig Corcoran. Determining Fixed-Length Paths in Directed and Undirected Edge-Weighted Graphs. In 22nd International Symposium on Experimental Algorithms (SEA 2024). Leibniz International Proceedings in Informatics (LIPIcs), Volume 301, pp. 15:1-15:11, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2024)


BibTeX

@InProceedings{hambly_et_al:LIPIcs.SEA.2024.15,
  author =	{Hambly, Daniel and Lewis, Rhyd and Corcoran, Padraig},
  title =	{{Determining Fixed-Length Paths in Directed and Undirected Edge-Weighted Graphs}},
  booktitle =	{22nd International Symposium on Experimental Algorithms (SEA 2024)},
  pages =	{15:1--15:11},
  series =	{Leibniz International Proceedings in Informatics (LIPIcs)},
  ISBN =	{978-3-95977-325-6},
  ISSN =	{1868-8969},
  year =	{2024},
  volume =	{301},
  editor =	{Liberti, Leo},
  publisher =	{Schloss Dagstuhl -- Leibniz-Zentrum f{\"u}r Informatik},
  address =	{Dagstuhl, Germany},
  URL =		{https://drops.dagstuhl.de/entities/document/10.4230/LIPIcs.SEA.2024.15},
  URN =		{urn:nbn:de:0030-drops-203805},
  doi =		{10.4230/LIPIcs.SEA.2024.15},
  annote =	{Keywords: Graphs, paths, backtracking, integer programming, Yen’s algorithm}
}
Document
Solving the Optimal Experiment Design Problem with Mixed-Integer Convex Methods

Authors: Deborah Hendrych, Mathieu Besançon, and Sebastian Pokutta


Abstract
We tackle the Optimal Experiment Design Problem, which consists of choosing experiments to run or observations to select from a finite set to estimate the parameters of a system. The objective is to maximize some measure of information gained about the system from the observations, leading to a convex integer optimization problem. We leverage Boscia.jl, a recent algorithmic framework, which is based on a nonlinear branch-and-bound algorithm with node relaxations solved to approximate optimality using Frank-Wolfe algorithms. One particular advantage of the method is its efficient utilization of the polytope formed by the original constraints which is preserved by the method, unlike alternative methods relying on epigraph-based formulations. We assess our method against both generic and specialized convex mixed-integer approaches. Computational results highlight the performance of our proposed method, especially on large and challenging instances.
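
To make the Frank-Wolfe connection concrete, the sketch below runs a classical Fedorov-Wynn/Frank-Wolfe iteration on the continuous relaxation of D-optimal experiment design over the probability simplex (maximize log det of the information matrix). Boscia.jl's node relaxations and integrality handling are substantially more involved; this shows only the relaxed subproblem.

import numpy as np

def d_optimal_fw(A, steps=200):
    m, d = A.shape
    x = np.full(m, 1.0 / m)  # start at the barycenter of the simplex
    for t in range(steps):
        M = A.T @ (x[:, None] * A)  # information matrix sum_i x_i a_i a_i^T
        # gradient of log det M w.r.t. x_i is a_i^T M^{-1} a_i
        grad = np.einsum("ij,jk,ik->i", A, np.linalg.inv(M), A)
        i = int(np.argmax(grad))  # linear minimization oracle picks a vertex
        gamma = 2.0 / (t + 3)  # shifted FW step size keeps all x_i > 0
        x = (1 - gamma) * x
        x[i] += gamma
    return x

rng = np.random.default_rng(0)
A = rng.normal(size=(30, 4))  # 30 candidate experiments in R^4
print(np.round(d_optimal_fw(A), 3))  # weight concentrates on a few rows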

Cite as

Deborah Hendrych, Mathieu Besançon, and Sebastian Pokutta. Solving the Optimal Experiment Design Problem with Mixed-Integer Convex Methods. In 22nd International Symposium on Experimental Algorithms (SEA 2024). Leibniz International Proceedings in Informatics (LIPIcs), Volume 301, pp. 16:1-16:22, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2024)


BibTeX

@InProceedings{hendrych_et_al:LIPIcs.SEA.2024.16,
  author =	{Hendrych, Deborah and Besan\c{c}on, Mathieu and Pokutta, Sebastian},
  title =	{{Solving the Optimal Experiment Design Problem with Mixed-Integer Convex Methods}},
  booktitle =	{22nd International Symposium on Experimental Algorithms (SEA 2024)},
  pages =	{16:1--16:22},
  series =	{Leibniz International Proceedings in Informatics (LIPIcs)},
  ISBN =	{978-3-95977-325-6},
  ISSN =	{1868-8969},
  year =	{2024},
  volume =	{301},
  editor =	{Liberti, Leo},
  publisher =	{Schloss Dagstuhl -- Leibniz-Zentrum f{\"u}r Informatik},
  address =	{Dagstuhl, Germany},
  URL =		{https://drops.dagstuhl.de/entities/document/10.4230/LIPIcs.SEA.2024.16},
  URN =		{urn:nbn:de:0030-drops-203810},
  doi =		{10.4230/LIPIcs.SEA.2024.16},
  annote =	{Keywords: Mixed-Integer Non-Linear Optimization, Optimal Experiment Design, Frank-Wolfe, Boscia}
}
Document
Barcode Selection and Layout Optimization in Spatial Transcriptomics

Authors: Frederik L. Jatzkowski, Antonia Schmidt, Robert Mank, Steffen Schüler, and Matthias Müller-Hannemann


Abstract
An important special case of the quadratic assignment problem arises in the synthesis of DNA microarrays for high-resolution spatial transcriptomics. The task is to select a suitable subset from a set of barcodes, i.e., short DNA strings that serve as unique identifiers, and to assign the selected barcodes to positions on a two-dimensional array in such a way that a position-dependent cost function is minimized. A typical microarray with dimensions of 768×1024 requires 786,432 barcodes to be placed, leading to very challenging large-scale combinatorial optimization problems. The general quadratic assignment problem is well known for its hardness, both in theory and in practice. It turns out that this also holds for the special case of the barcode layout problem. We show that the problem is even hard to approximate: it is MaxSNP-hard. An ILP formulation theoretically allows the computation of optimal results, but it is only applicable to tiny instances. Therefore, we have developed layout-constructing and layout-improving heuristics with the aim of computing near-optimal solutions for instances of realistic size. These include a sorting-based algorithm, a greedy algorithm, 2-OPT-based local search and a genetic algorithm. To assess the quality of the results, we compare the generated solutions with the expected cost of a random layout and with lower bounds. A combination of the greedy algorithm and 2-OPT local search produces the most promising results in terms of both quality and runtime. Solutions to large-scale instances with arrays of dimension 768×1024 show a 37% reduction in cost over a random solution and can be computed in about 3 minutes. Since the universe of suitable barcodes is much larger than the number of barcodes needed, this can be exploited. Experiments with different surpluses of barcodes show that a significant improvement in layout quality can be achieved at the cost of a reasonable increase in runtime. Another interesting finding is that the restriction of the barcode design space by biochemical constraints is actually beneficial for the overall layout cost.

Cite as

Frederik L. Jatzkowski, Antonia Schmidt, Robert Mank, Steffen Schüler, and Matthias Müller-Hannemann. Barcode Selection and Layout Optimization in Spatial Transcriptomics. In 22nd International Symposium on Experimental Algorithms (SEA 2024). Leibniz International Proceedings in Informatics (LIPIcs), Volume 301, pp. 17:1-17:19, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2024)


BibTeX

@InProceedings{jatzkowski_et_al:LIPIcs.SEA.2024.17,
  author =	{Jatzkowski, Frederik L. and Schmidt, Antonia and Mank, Robert and Sch\"{u}ler, Steffen and M\"{u}ller-Hannemann, Matthias},
  title =	{{Barcode Selection and Layout Optimization in Spatial Transcriptomics}},
  booktitle =	{22nd International Symposium on Experimental Algorithms (SEA 2024)},
  pages =	{17:1--17:19},
  series =	{Leibniz International Proceedings in Informatics (LIPIcs)},
  ISBN =	{978-3-95977-325-6},
  ISSN =	{1868-8969},
  year =	{2024},
  volume =	{301},
  editor =	{Liberti, Leo},
  publisher =	{Schloss Dagstuhl -- Leibniz-Zentrum f{\"u}r Informatik},
  address =	{Dagstuhl, Germany},
  URL =		{https://drops.dagstuhl.de/entities/document/10.4230/LIPIcs.SEA.2024.17},
  URN =		{urn:nbn:de:0030-drops-203821},
  doi =		{10.4230/LIPIcs.SEA.2024.17},
  annote =	{Keywords: Spatial Transcriptomics, Array Layout, Optimization, Computational Complexity, GPU Computing, Integer Linear Programming, Metaheuristics}
}
Document
Experimental Analysis of LP Scaling Methods Based on Circuit Imbalance Minimization

Authors: Jakub Komárek and Martin Koutecký


Abstract
Linear programming (LP) is a fundamental problem with rich theory and wide applications. A ubiquitous technique in LP is scaling, where the input instance is transformed in some way to make its solution easier. Dadush et al. [STOC '20] have recently devised an algorithm which scales the columns of the constraint matrix of a linear program in a way that aims to minimize the circuit imbalance measure, a matrix condition number of growing theoretical interest. They show that this rescaling achieves favorable theoretical guarantees for certain LP algorithms. We follow up on their work in an experimental manner. First, we have implemented their algorithm, overcoming several engineering obstacles. Next, we have used our implementation to obtain a rescaling of 142 publicly available instances. Finally, we have performed experiments evaluating the effects of the obtained rescalings on the runtime of real-world LP solvers, and we have evaluated their quality with regard to the circuit imbalance measure.

Cite as

Jakub Komárek and Martin Koutecký. Experimental Analysis of LP Scaling Methods Based on Circuit Imbalance Minimization. In 22nd International Symposium on Experimental Algorithms (SEA 2024). Leibniz International Proceedings in Informatics (LIPIcs), Volume 301, pp. 18:1-18:21, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2024)


BibTeX

@InProceedings{komarek_et_al:LIPIcs.SEA.2024.18,
  author =	{Kom\'{a}rek, Jakub and Kouteck\'{y}, Martin},
  title =	{{Experimental Analysis of LP Scaling Methods Based on Circuit Imbalance Minimization}},
  booktitle =	{22nd International Symposium on Experimental Algorithms (SEA 2024)},
  pages =	{18:1--18:21},
  series =	{Leibniz International Proceedings in Informatics (LIPIcs)},
  ISBN =	{978-3-95977-325-6},
  ISSN =	{1868-8969},
  year =	{2024},
  volume =	{301},
  editor =	{Liberti, Leo},
  publisher =	{Schloss Dagstuhl -- Leibniz-Zentrum f{\"u}r Informatik},
  address =	{Dagstuhl, Germany},
  URL =		{https://drops.dagstuhl.de/entities/document/10.4230/LIPIcs.SEA.2024.18},
  URN =		{urn:nbn:de:0030-drops-203832},
  doi =		{10.4230/LIPIcs.SEA.2024.18},
  annote =	{Keywords: Linear programming, scaling, circuit imbalance measure}
}
Document
Algorithms for Gradual Polyline Simplification

Authors: Nick Krumbholz, Stefan Funke, Peter Schäfer, and Sabine Storandt


Abstract
Displaying line data is important in many visualization applications, and especially in the context of interactive geographical and cartographic visualization. When rendering linear features such as roads, rivers or movement data on zoomable maps, the challenge is to display the data at an appropriate level of detail: an overly detailed representation results in slow rendering and cluttered maps, while an overly coarse one might miss important aspects of the data. In this paper, we propose the gradual line simplification (GLS) problem, which aims to compute a fine-grained succession of consistent simplifications of a given input polyline with certain quality guarantees. The core concept of gradual simplification is to iteratively remove points from the polyline to obtain increasingly coarser representations. We devise two objective functions to guide this simplification process and present dynamic programs that compute the optimal solutions in 𝒪(n³) time for an input line with n points. For practical application to large inputs, we also devise significantly faster greedy algorithms that provide constant-factor guarantees for both problem variants at once. In an extensive experimental study on real-world data, we demonstrate that our algorithms are capable of producing simplification sequences of high quality within milliseconds on polylines consisting of over half a million points.
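
The core concept of iterative point removal can be illustrated with a Visvalingam-Whyatt-style greedy in Python: always delete the interior point whose triangle with its two neighbours has the smallest area. The area criterion is only a stand-in; the paper defines its own objective functions, exact 𝒪(n³) dynamic programs, and constant-factor greedy algorithms for them.

def gradual_simplification(points):
    # Returns the points in removal order; cutting the sequence at any
    # prefix yields one line of the coarsening succession.
    def area(a, b, c):
        return abs((b[0] - a[0]) * (c[1] - a[1])
                   - (c[0] - a[0]) * (b[1] - a[1])) / 2

    pts, order = list(points), []
    while len(pts) > 2:  # the two endpoints are always kept
        i = min(range(1, len(pts) - 1),
                key=lambda j: area(pts[j - 1], pts[j], pts[j + 1]))
        order.append(pts.pop(i))
    return order

line = [(0, 0), (1, 0.1), (2, -0.1), (3, 2), (4, 0)]
print(gradual_simplification(line))  # flattest points are removed first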

Cite as

Nick Krumbholz, Stefan Funke, Peter Schäfer, and Sabine Storandt. Algorithms for Gradual Polyline Simplification. In 22nd International Symposium on Experimental Algorithms (SEA 2024). Leibniz International Proceedings in Informatics (LIPIcs), Volume 301, pp. 19:1-19:18, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2024)


BibTeX

@InProceedings{krumbholz_et_al:LIPIcs.SEA.2024.19,
  author =	{Krumbholz, Nick and Funke, Stefan and Sch\"{a}fer, Peter and Storandt, Sabine},
  title =	{{Algorithms for Gradual Polyline Simplification}},
  booktitle =	{22nd International Symposium on Experimental Algorithms (SEA 2024)},
  pages =	{19:1--19:18},
  series =	{Leibniz International Proceedings in Informatics (LIPIcs)},
  ISBN =	{978-3-95977-325-6},
  ISSN =	{1868-8969},
  year =	{2024},
  volume =	{301},
  editor =	{Liberti, Leo},
  publisher =	{Schloss Dagstuhl -- Leibniz-Zentrum f{\"u}r Informatik},
  address =	{Dagstuhl, Germany},
  URL =		{https://drops.dagstuhl.de/entities/document/10.4230/LIPIcs.SEA.2024.19},
  URN =		{urn:nbn:de:0030-drops-203847},
  doi =		{10.4230/LIPIcs.SEA.2024.19},
  annote =	{Keywords: Polyline simplification, Progressive simplification, Fr\'{e}chet distance}
}
Document
Targeted Branching for the Maximum Independent Set Problem Using Graph Neural Networks

Authors: Kenneth Langedal, Demian Hespe, and Peter Sanders


Abstract
Identifying a maximum independent set is a fundamental NP-hard problem. This problem has several real-world applications and requires finding the largest possible set of vertices not adjacent to each other in an undirected graph. Over the past few years, branch-and-bound and branch-and-reduce algorithms have emerged as some of the most effective methods for solving the problem exactly. Specifically, the branch-and-reduce approach, which combines branch-and-bound principles with reduction rules, has proven particularly successful in tackling previously unmanageable real-world instances. This progress was largely made possible by the development of more effective reduction rules. Nevertheless, other key components that can impact the efficiency of these algorithms have not received the same level of interest. Among these is the branching strategy, which determines which vertex to branch on next. Until recently, the most widely used strategy was to choose the vertex of the highest degree. In this work, we present a graph neural network approach for selecting the next branching vertex. The intricate nature of current branch-and-bound solvers makes supervised and reinforcement learning difficult. Therefore, we use a population-based genetic algorithm to evolve the model’s parameters instead. Our proposed approach results in a speedup on 73% of the benchmark instances with a median speedup of 24%.

Cite as

Kenneth Langedal, Demian Hespe, and Peter Sanders. Targeted Branching for the Maximum Independent Set Problem Using Graph Neural Networks. In 22nd International Symposium on Experimental Algorithms (SEA 2024). Leibniz International Proceedings in Informatics (LIPIcs), Volume 301, pp. 20:1-20:21, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2024)


BibTeX

@InProceedings{langedal_et_al:LIPIcs.SEA.2024.20,
  author =	{Langedal, Kenneth and Hespe, Demian and Sanders, Peter},
  title =	{{Targeted Branching for the Maximum Independent Set Problem Using Graph Neural Networks}},
  booktitle =	{22nd International Symposium on Experimental Algorithms (SEA 2024)},
  pages =	{20:1--20:21},
  series =	{Leibniz International Proceedings in Informatics (LIPIcs)},
  ISBN =	{978-3-95977-325-6},
  ISSN =	{1868-8969},
  year =	{2024},
  volume =	{301},
  editor =	{Liberti, Leo},
  publisher =	{Schloss Dagstuhl -- Leibniz-Zentrum f{\"u}r Informatik},
  address =	{Dagstuhl, Germany},
  URL =		{https://drops.dagstuhl.de/entities/document/10.4230/LIPIcs.SEA.2024.20},
  URN =		{urn:nbn:de:0030-drops-203853},
  doi =		{10.4230/LIPIcs.SEA.2024.20},
  annote =	{Keywords: Graphs, Independent Set, Vertex Cover, Graph Neural Networks, Branch-and-Reduce}
}
Document
SPIDER: Improved Succinct Rank and Select Performance

Authors: Matthew D. Laws, Jocelyn Bliven, Kit Conklin, Elyes Laalai, Samuel McCauley, and Zach S. Sturdevant


Abstract
Rank and select data structures seek to preprocess a bit vector to quickly answer two kinds of queries: Rank(i) gives the number of 1 bits in slots 0 through i, and Select(j) gives the first slot s with Rank(s) = j. A succinct data structure can answer these queries while using space much smaller than the size of the original bit vector. State-of-the-art succinct rank and select data structures use as little as 4% extra space (over the underlying bit vector) while answering rank and select queries very quickly. Rank queries can be answered using only a handful of array accesses. Select queries can be answered by starting with similar array accesses, followed by a linear scan through the bit vector. Nonetheless, a tradeoff remains: data structures that use under 4% space are significantly slower at answering rank and select queries than less space-efficient data structures (using, say, over 20% extra space). In this paper we make significant progress towards closing this gap. We give a new data structure, SPIDER, which uses 3.82% extra space. SPIDER gives the best known rank query time for data sets of 8 billion or more bits, even compared to much less space-efficient data structures. For select queries, SPIDER outperforms all data structures that use less than 4% space, and significantly closes the gap in select performance between data structures that use less than 4% space and those that use more (over 20% for both rank and select). SPIDER makes two main technical contributions. For rank queries, it improves performance by interleaving the metadata with the bit vector to improve cache efficiency. For select queries, it uses predictions to almost eliminate the cost of the linear scan. These predictions are inspired by recent results on data structures with machine-learned predictions, adapted to the succinct data structure setting. Our results hold on both real and synthetic data, showing that these predictions are effective in practice.
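
A textbook one-level version of such a structure makes the query anatomy concrete: a metadata lookup followed by a short scan. The Python below stores per-block popcount prefixes; it is nowhere near SPIDER's 3.82% space, interleaved layout, or learned predictions, but the rank/select mechanics are the same.

import bisect

class RankSelect:
    def __init__(self, bits, block=64):
        self.bits, self.block = bits, block
        self.prefix = [0]  # number of 1s before each block
        for start in range(0, len(bits), block):
            self.prefix.append(self.prefix[-1] + sum(bits[start:start + block]))

    def rank(self, i):  # number of 1 bits in slots 0 through i
        b = (i + 1) // self.block
        return self.prefix[b] + sum(self.bits[b * self.block:i + 1])

    def select(self, j):  # first slot s with rank(s) = j, for j >= 1
        b = bisect.bisect_left(self.prefix, j) - 1  # block holding the j-th 1
        count = self.prefix[b]
        for s in range(b * self.block, len(self.bits)):  # the linear scan
            count += self.bits[s]
            if count == j:
                return s
        raise ValueError("fewer than j ones in the bit vector")

rs = RankSelect([1, 0, 1, 1, 0, 0, 1, 0], block=4)
print(rs.rank(3), rs.select(3))  # 3 3: slots 0..3 hold three 1s; the 3rd 1 is at slot 3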

Cite as

Matthew D. Laws, Jocelyn Bliven, Kit Conklin, Elyes Laalai, Samuel McCauley, and Zach S. Sturdevant. SPIDER: Improved Succinct Rank and Select Performance. In 22nd International Symposium on Experimental Algorithms (SEA 2024). Leibniz International Proceedings in Informatics (LIPIcs), Volume 301, pp. 21:1-21:18, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2024)



@InProceedings{laws_et_al:LIPIcs.SEA.2024.21,
  author =	{Laws, Matthew D. and Bliven, Jocelyn and Conklin, Kit and Laalai, Elyes and McCauley, Samuel and Sturdevant, Zach S.},
  title =	{{SPIDER: Improved Succinct Rank and Select Performance}},
  booktitle =	{22nd International Symposium on Experimental Algorithms (SEA 2024)},
  pages =	{21:1--21:18},
  series =	{Leibniz International Proceedings in Informatics (LIPIcs)},
  ISBN =	{978-3-95977-325-6},
  ISSN =	{1868-8969},
  year =	{2024},
  volume =	{301},
  editor =	{Liberti, Leo},
  publisher =	{Schloss Dagstuhl -- Leibniz-Zentrum f{\"u}r Informatik},
  address =	{Dagstuhl, Germany},
  URL =		{https://drops.dagstuhl.de/entities/document/10.4230/LIPIcs.SEA.2024.21},
  URN =		{urn:nbn:de:0030-drops-203865},
  doi =		{10.4230/LIPIcs.SEA.2024.21},
  annote =	{Keywords: Rank and Select, Succinct Data Structures, Data Structures, Cache Performance, Predictions}
}
Document
Efficient Computation of Topological Integral Transforms

Authors: Vadim Lebovici, Steve Oudot, and Hugo Passe


Abstract
Topological integral transforms have found many applications in shape analysis, from prediction of clinical outcomes in brain cancer to analysis of barley seeds. Using Euler characteristic as a measure, these objects record rich geometric information on weighted polytopal complexes. While some implementations exist, they only enable discretized representations of the transforms, and they do not handle weighted complexes (such as images). Moreover, recent hybrid transforms lack an implementation. In this paper, we introduce eucalc, a novel implementation of three topological integral transforms - the Euler characteristic transform, the Radon transform, and hybrid transforms - for weighted cubical complexes. Leveraging piecewise linear Morse theory and Euler calculus, the algorithms significantly reduce computational complexity by focusing on critical points. Our software provides exact representations of the transforms, handles both binary and grayscale images, and supports multi-core processing. It is publicly available as a C++ library with a Python wrapper. We present mathematical foundations, implementation details, and experimental evaluations, demonstrating eucalc’s efficiency.
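
For intuition on the Euler-characteristic viewpoint, the Python sketch below computes chi = V - E + F of the cubical complex spanned by the 1-pixels of a binary image. It only illustrates the measure these transforms integrate against; it is not eucalc's API, and the function name is made up.

def euler_characteristic(img):
    # img: 2D array of 0/1; each 1-pixel is a closed unit square (2-cell)
    F, verts, edges = 0, set(), set()
    for i, row in enumerate(img):
        for j, pixel in enumerate(row):
            if not pixel:
                continue
            F += 1
            verts.update({(i, j), (i + 1, j), (i, j + 1), (i + 1, j + 1)})
            # the four boundary edges, with endpoints in a canonical order
            # so that edges shared with a neighboring pixel are counted once
            edges.update({
                ((i, j), (i + 1, j)), ((i, j), (i, j + 1)),
                ((i + 1, j), (i + 1, j + 1)), ((i, j + 1), (i + 1, j + 1)),
            })
    return len(verts) - len(edges) + F  # chi = V - E + F

# a filled square is contractible (chi = 1); a ring with a hole has chi = 0
assert euler_characteristic([[1, 1], [1, 1]]) == 1
assert euler_characteristic([[1, 1, 1], [1, 0, 1], [1, 1, 1]]) == 0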

Cite as

Vadim Lebovici, Steve Oudot, and Hugo Passe. Efficient Computation of Topological Integral Transforms. In 22nd International Symposium on Experimental Algorithms (SEA 2024). Leibniz International Proceedings in Informatics (LIPIcs), Volume 301, pp. 22:1-22:19, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2024)



@InProceedings{lebovici_et_al:LIPIcs.SEA.2024.22,
  author =	{Lebovici, Vadim and Oudot, Steve and Passe, Hugo},
  title =	{{Efficient Computation of Topological Integral Transforms}},
  booktitle =	{22nd International Symposium on Experimental Algorithms (SEA 2024)},
  pages =	{22:1--22:19},
  series =	{Leibniz International Proceedings in Informatics (LIPIcs)},
  ISBN =	{978-3-95977-325-6},
  ISSN =	{1868-8969},
  year =	{2024},
  volume =	{301},
  editor =	{Liberti, Leo},
  publisher =	{Schloss Dagstuhl -- Leibniz-Zentrum f{\"u}r Informatik},
  address =	{Dagstuhl, Germany},
  URL =		{https://drops.dagstuhl.de/entities/document/10.4230/LIPIcs.SEA.2024.22},
  URN =		{urn:nbn:de:0030-drops-203878},
  doi =		{10.4230/LIPIcs.SEA.2024.22},
  annote =	{Keywords: Topological data analysis, Euler calculus, Topological integral transform, Euler characteristic transform, Hybrid transforms}
}
Document
Engineering A* Search for the Flip Distance of Plane Triangulations

Authors: Philip Mayer and Petra Mutzel


Abstract
The flip distance for two triangulations of a point set is defined as the smallest number of edge flips needed to transform one triangulation into another, where an edge flip is the act of replacing an edge of a triangulation by a different edge such that the result remains a triangulation. We adapt and engineer a sophisticated A* search algorithm acting on the so-called flip graph. In particular, we prove that previously proposed lower bounds for the flip distance form consistent heuristics for A* and show that they can be computed efficiently using dynamic algorithms. As an alternative approach, we present an integer linear program (ILP) for the flip distance problem. We experimentally evaluate our approaches on a new real-world benchmark data set based on an application in geodesy, namely sea surface reconstruction. Our evaluation reveals that A* search consistently outperforms our ILP formulation as well as a naive baseline, which is bidirectional breadth-first search. In particular, the runtime of our approach improves upon the baseline by more than two orders of magnitude. Furthermore, our A* search successfully solves most of the considered sea surface instances with up to 41 points. This is a substantial improvement compared to the baseline, which struggles with subsets of the real-world data of size 25. Lastly, to allow the consideration of global sea level data, we develop a decomposition-based heuristic for the flip distance. In our experiments it yields optimal flip distance values for most of the considered sea level data, and its fast runtime makes it applicable to large data sets.
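
The following generic unit-cost A* skeleton in Python shows the shape of such a search. The flip-graph neighborhood and the lower-bound heuristic h (which must be consistent, as the paper proves for its bounds) are left as abstract callables, so this is an illustration of the search scheme rather than the authors' engineered solver.

import heapq, itertools

def a_star(start, is_goal, neighbors, h):
    # h: consistent lower bound, i.e. h(x) <= 1 + h(y) for each flip x -> y
    g = {start: 0}
    tie = itertools.count()  # tie-breaker so states are never compared
    frontier = [(h(start), next(tie), start)]
    while frontier:
        f, _, x = heapq.heappop(frontier)
        if f > g[x] + h(x):
            continue  # stale queue entry; a shorter path to x was found
        if is_goal(x):
            return g[x]  # number of flips on a shortest path
        for y in neighbors(x):  # triangulations one edge flip away
            if y not in g or g[x] + 1 < g[y]:
                g[y] = g[x] + 1
                heapq.heappush(frontier, (g[y] + h(y), next(tie), y))
    return None  # goal unreachable

States must be hashable, e.g. a frozenset of the triangulation's edges; with a consistent h, the first settled goal state is optimal.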

Cite as

Philip Mayer and Petra Mutzel. Engineering A* Search for the Flip Distance of Plane Triangulations. In 22nd International Symposium on Experimental Algorithms (SEA 2024). Leibniz International Proceedings in Informatics (LIPIcs), Volume 301, pp. 23:1-23:20, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2024)



@InProceedings{mayer_et_al:LIPIcs.SEA.2024.23,
  author =	{Mayer, Philip and Mutzel, Petra},
  title =	{{Engineering A* Search for the Flip Distance of Plane Triangulations}},
  booktitle =	{22nd International Symposium on Experimental Algorithms (SEA 2024)},
  pages =	{23:1--23:20},
  series =	{Leibniz International Proceedings in Informatics (LIPIcs)},
  ISBN =	{978-3-95977-325-6},
  ISSN =	{1868-8969},
  year =	{2024},
  volume =	{301},
  editor =	{Liberti, Leo},
  publisher =	{Schloss Dagstuhl -- Leibniz-Zentrum f{\"u}r Informatik},
  address =	{Dagstuhl, Germany},
  URL =		{https://drops.dagstuhl.de/entities/document/10.4230/LIPIcs.SEA.2024.23},
  URN =		{urn:nbn:de:0030-drops-203887},
  doi =		{10.4230/LIPIcs.SEA.2024.23},
  annote =	{Keywords: Computational Geometry, Triangulations, Flip Distance, A-star Search, Integer Linear Programming}
}
Document
Efficient Exact Online String Matching Through Linked Weak Factors

Authors: Matthew N. Palmer, Simone Faro, and Stefano Scafiti


Abstract
Online exact string matching is a fundamental computational problem in computer science, involving the sequential search for a pattern within a large text without prior access to the entire text. Its significance is underscored by its diverse applications in data compression, data mining, text editing, and bioinformatics, to cite just a few, where efficient substring matching is crucial. While the problem has been a subject of study for years, recent decades have witnessed a heightened focus on experimental solutions, employing various techniques to achieve superior performance. Notably, approaches centered around weak factor recognition have emerged as leaders in experimental settings, gaining increasing attention. This paper introduces Hash Chain, a novel algorithm founded on a robust weak factor recognition approach that links adjacent factors through hashing. Building upon the efficacy of weak recognition techniques, the proposed algorithm incorporates innovative strategies for organizing data structures and optimizations to enhance performance. Despite its quadratic worst-case time complexity, the proposed algorithm demonstrates sublinear behavior in practice, outperforming the algorithms currently known in the literature.
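
The Hash Chain algorithm itself is more involved, but the underlying weak-factor idea can be sketched in a few lines of Python: record the pattern's length-q substrings, and skip any window whose last q characters are not among them. The function name and the choice of q are illustrative; a "weak" recognizer would store hashes (accepting occasional false positives), whereas this sketch stores the factors exactly.

def weak_factor_search(text, pattern, q=3):
    n, m = len(text), len(pattern)
    assert m >= q
    factors = {pattern[k:k + q] for k in range(m - q + 1)}
    occurrences, i = [], m - 1              # i = last position of the window
    while i < n:
        if text[i - q + 1:i + 1] in factors:
            if text[i - m + 1:i + 1] == pattern:  # verify this alignment
                occurrences.append(i - m + 1)
            i += 1
        else:
            i += m - q + 1  # no occurrence can cover this non-factor q-gram
    return occurrences

assert weak_factor_search("abracadabra", "abra") == [0, 7]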

Cite as

Matthew N. Palmer, Simone Faro, and Stefano Scafiti. Efficient Exact Online String Matching Through Linked Weak Factors. In 22nd International Symposium on Experimental Algorithms (SEA 2024). Leibniz International Proceedings in Informatics (LIPIcs), Volume 301, pp. 24:1-24:14, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2024)



@InProceedings{palmer_et_al:LIPIcs.SEA.2024.24,
  author =	{Palmer, Matthew N. and Faro, Simone and Scafiti, Stefano},
  title =	{{Efficient Exact Online String Matching Through Linked Weak Factors}},
  booktitle =	{22nd International Symposium on Experimental Algorithms (SEA 2024)},
  pages =	{24:1--24:14},
  series =	{Leibniz International Proceedings in Informatics (LIPIcs)},
  ISBN =	{978-3-95977-325-6},
  ISSN =	{1868-8969},
  year =	{2024},
  volume =	{301},
  editor =	{Liberti, Leo},
  publisher =	{Schloss Dagstuhl -- Leibniz-Zentrum f{\"u}r Informatik},
  address =	{Dagstuhl, Germany},
  URL =		{https://drops.dagstuhl.de/entities/document/10.4230/LIPIcs.SEA.2024.24},
  URN =		{urn:nbn:de:0030-drops-203896},
  doi =		{10.4230/LIPIcs.SEA.2024.24},
  annote =	{Keywords: String matching, text processing, weak recognition, hashing, experimental algorithms, design and analysis of algorithms}
}
Document
Convex Relaxation for the Generalized Maximum-Entropy Sampling Problem

Authors: Gabriel Ponte, Marcia Fampa, and Jon Lee


Abstract
The generalized maximum-entropy sampling problem (GMESP) is to select an order-s principal submatrix from an order-n covariance matrix, to maximize the product of its t greatest eigenvalues, 0 < t ≤ s < n. It is a problem that specializes to two fundamental problems in statistical design theory: (i) maximum-entropy sampling problem (MESP); (ii) binary D-optimality (D-Opt). In the general case, it is motivated by a selection problem in the context of PCA (principal component analysis). We introduce the first convex-optimization based relaxation for GMESP, study its behavior, compare it to an earlier spectral bound, and demonstrate its use in a branch-and-bound scheme. We find that such an approach is practical when s-t is very small.
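
In symbols (our notation here: lambda_ell(.) denotes the ell-th greatest eigenvalue, and M[S,S] the principal submatrix of M indexed by S), GMESP can be written as

\[
  \max\Bigl\{\, \prod_{\ell=1}^{t} \lambda_\ell\bigl(M[S,S]\bigr) \;:\; S \subseteq \{1,\dots,n\},\ |S| = s \,\Bigr\}, \qquad 0 < t \le s < n.
\]

Taking t = s recovers MESP, because the product of all s eigenvalues of M[S,S] is its determinant, whose logarithm gives, up to constants, the entropy of the corresponding Gaussian subvector.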

Cite as

Gabriel Ponte, Marcia Fampa, and Jon Lee. Convex Relaxation for the Generalized Maximum-Entropy Sampling Problem. In 22nd International Symposium on Experimental Algorithms (SEA 2024). Leibniz International Proceedings in Informatics (LIPIcs), Volume 301, pp. 25:1-25:14, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2024)



@InProceedings{ponte_et_al:LIPIcs.SEA.2024.25,
  author =	{Ponte, Gabriel and Fampa, Marcia and Lee, Jon},
  title =	{{Convex Relaxation for the Generalized Maximum-Entropy Sampling Problem}},
  booktitle =	{22nd International Symposium on Experimental Algorithms (SEA 2024)},
  pages =	{25:1--25:14},
  series =	{Leibniz International Proceedings in Informatics (LIPIcs)},
  ISBN =	{978-3-95977-325-6},
  ISSN =	{1868-8969},
  year =	{2024},
  volume =	{301},
  editor =	{Liberti, Leo},
  publisher =	{Schloss Dagstuhl -- Leibniz-Zentrum f{\"u}r Informatik},
  address =	{Dagstuhl, Germany},
  URL =		{https://drops.dagstuhl.de/entities/document/10.4230/LIPIcs.SEA.2024.25},
  URN =		{urn:nbn:de:0030-drops-203901},
  doi =		{10.4230/LIPIcs.SEA.2024.25},
  annote =	{Keywords: maximum-entropy sampling, D-optimality, convex relaxation, branch-and-bound, integer nonlinear optimization, principal component analysis}
}
Document
Scalable Hard Instances for Independent Set Reconfiguration

Authors: Takehide Soh, Takumu Watanabe, Jun Kawahara, Akira Suzuki, and Takehiro Ito


Abstract
The Token Jumping problem, also known as the independent set reconfiguration problem under the token jumping model, is defined as follows: Given a graph and two same-sized independent sets, determine whether one can be transformed into the other via a sequence of independent sets. Token Jumping has been extensively studied, mainly from the viewpoint of algorithmic theory, but its practical study has just begun. To develop a practically good solver, it is important to construct benchmark datasets that are scalable and hard. Here, "scalable" means that the scale of an instance can be changed, while maintaining its characteristics, by adjusting the given parameters; and "hard" means that the instance can become so difficult that it cannot be solved by a solver within a practical time frame. In this paper, we propose four types of instance series for Token Jumping. Our instance series are scalable in the sense that instance scales are controlled by the number of vertices. To establish their hardness, we focus on the number of transformation steps: our instance series require exponentially many steps with respect to the number of vertices. Interestingly, three types of instance series are constructed by importing theories developed by algorithmic research. We experimentally evaluate the scalability and hardness of the proposed instance series, using a SAT solver and award-winning solvers from the international competition for Token Jumping.
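
For concreteness, a brute-force baseline solver is a breadth-first search over independent sets that moves one token per step. The Python sketch below (illustrative names; the state space is exponential, which is exactly why hard benchmarks matter) returns the number of jumps, or None if no reconfiguration sequence exists.

from collections import deque

def token_jumping(adj, source, target):
    # adj: dict vertex -> set of neighbors; source/target: same-sized
    # independent sets, given as iterables of vertices
    source, target = frozenset(source), frozenset(target)
    dist, queue = {source: 0}, deque([source])
    while queue:
        S = queue.popleft()
        if S == target:
            return dist[S]
        for v in S:                        # pick up the token on v ...
            rest = S - {v}
            for u in adj:                  # ... and jump it to any vertex u
                if u not in S and not (adj[u] & rest):  # keep independence
                    T = rest | {u}
                    if T not in dist:
                        dist[T] = dist[S] + 1
                        queue.append(T)
    return None  # target not reachable from source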

Cite as

Takehide Soh, Takumu Watanabe, Jun Kawahara, Akira Suzuki, and Takehiro Ito. Scalable Hard Instances for Independent Set Reconfiguration. In 22nd International Symposium on Experimental Algorithms (SEA 2024). Leibniz International Proceedings in Informatics (LIPIcs), Volume 301, pp. 26:1-26:15, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2024)



@InProceedings{soh_et_al:LIPIcs.SEA.2024.26,
  author =	{Soh, Takehide and Watanabe, Takumu and Kawahara, Jun and Suzuki, Akira and Ito, Takehiro},
  title =	{{Scalable Hard Instances for Independent Set Reconfiguration}},
  booktitle =	{22nd International Symposium on Experimental Algorithms (SEA 2024)},
  pages =	{26:1--26:15},
  series =	{Leibniz International Proceedings in Informatics (LIPIcs)},
  ISBN =	{978-3-95977-325-6},
  ISSN =	{1868-8969},
  year =	{2024},
  volume =	{301},
  editor =	{Liberti, Leo},
  publisher =	{Schloss Dagstuhl -- Leibniz-Zentrum f{\"u}r Informatik},
  address =	{Dagstuhl, Germany},
  URL =		{https://drops.dagstuhl.de/entities/document/10.4230/LIPIcs.SEA.2024.26},
  URN =		{urn:nbn:de:0030-drops-203913},
  doi =		{10.4230/LIPIcs.SEA.2024.26},
  annote =	{Keywords: Combinatorial reconfiguration, Benchmark dataset, Graph Algorithm, PSPACE-complete}
}
Document
Improved Cut Strategy for Tensor Network Contraction Orders

Authors: Christoph Staudt, Mark Blacher, Julien Klaus, Farin Lippmann, and Joachim Giesen


Abstract
In the field of quantum computing, simulating quantum systems on classical computers is crucial, and tensor networks are a fundamental tool for this task. A tensor network is a collection of tensors that need to be contracted into a result tensor. Tensor contraction is a generalization of matrix multiplication to higher order tensors. The contractions can be performed in different orders, and the order has a significant impact on the number of floating point operations (flops) needed to get the result tensor. It is known that finding an optimal contraction order is NP-hard. The current state-of-the-art approach for finding efficient contraction orders is to combine graph partitioning with a greedy strategy. Although heavily used in practice, the current approach ignores so-called free indices, chooses node weights without regarding previous computations, and requires numerous hyperparameters that need to be tuned at runtime. In this paper, we address these shortcomings by developing a novel graph cut strategy. The proposed modifications yield contraction orders that significantly reduce the number of flops in the tensor contractions compared to the current state of the art. Moreover, by removing the need for hyperparameter tuning at runtime, our approach converges to an efficient solution faster, which reduces the required optimization time by at least an order of magnitude.
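
A matrix chain already shows how strongly the order matters. The NumPy snippet below (arbitrary, made-up sizes) compares the multiplication counts of the two possible orders and asks np.einsum_path for the cheaper one.

import numpy as np

A = np.random.rand(10, 1000)   # contracting A, B, C to a 10 x 2000 result
B = np.random.rand(1000, 5)
C = np.random.rand(5, 2000)

# (A B) C costs 10*1000*5 + 10*5*2000      =     150,000 multiplications
# A (B C) costs 1000*5*2000 + 10*1000*2000 =  30,000,000
path, report = np.einsum_path('ij,jk,kl->il', A, B, C, optimize='optimal')
print(report)  # shows the chosen contraction order and its flop estimate

For general tensor networks this search space is exponential, which is why heuristics such as the graph cut strategy above are needed.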

Cite as

Christoph Staudt, Mark Blacher, Julien Klaus, Farin Lippmann, and Joachim Giesen. Improved Cut Strategy for Tensor Network Contraction Orders. In 22nd International Symposium on Experimental Algorithms (SEA 2024). Leibniz International Proceedings in Informatics (LIPIcs), Volume 301, pp. 27:1-27:19, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2024)



@InProceedings{staudt_et_al:LIPIcs.SEA.2024.27,
  author =	{Staudt, Christoph and Blacher, Mark and Klaus, Julien and Lippmann, Farin and Giesen, Joachim},
  title =	{{Improved Cut Strategy for Tensor Network Contraction Orders}},
  booktitle =	{22nd International Symposium on Experimental Algorithms (SEA 2024)},
  pages =	{27:1--27:19},
  series =	{Leibniz International Proceedings in Informatics (LIPIcs)},
  ISBN =	{978-3-95977-325-6},
  ISSN =	{1868-8969},
  year =	{2024},
  volume =	{301},
  editor =	{Liberti, Leo},
  publisher =	{Schloss Dagstuhl -- Leibniz-Zentrum f{\"u}r Informatik},
  address =	{Dagstuhl, Germany},
  URL =		{https://drops.dagstuhl.de/entities/document/10.4230/LIPIcs.SEA.2024.27},
  URN =		{urn:nbn:de:0030-drops-203924},
  doi =		{10.4230/LIPIcs.SEA.2024.27},
  annote =	{Keywords: tensor network, contraction order, graph partitioning, quantum simulation}
}
Document
Finding the Minimum Cost Acceptable Element in a Sorted Matrix

Authors: Sebastián Urrutia and Vinicius dos Santos


Abstract
In this work we introduce the problem of finding a minimum cost acceptable element in an n × n matrix M whose columns and rows are sorted in non-decreasing order. More precisely, given a sorted matrix M and access to a given oracle function f: ℕ × ℕ → {True, False}, one has to find a pair (i, j) of indices such that f(i,j) returns True and the value M[i,j] is as small as possible. Assuming the computation of f(i,j) takes time bounded by a constant, a naive algorithm scanning all the positions of the matrix takes time O(n²). Another natural approach, based on a priority queue, takes time O(z log z), where z is the position of the first pair of indices for which the oracle returns True in a sorted list of all elements of M. In the worst case, when z = n², the naive algorithm is better than the priority queue one. In this work we introduce different algorithms with complexities depending on n and z, such as O(n √z) and O(min(n²,z²)), and compare them, both theoretically and experimentally, in terms of running time and number of calls to the oracle. Among other things, we find that in most cases our algorithms do not make significantly more calls to the oracle than the priority-queue-based algorithm, which achieves the minimum number of such calls when all elements of the matrix are distinct, while being much faster on large instances.
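
The priority-queue algorithm mentioned above is short enough to state exactly. This Python sketch pops candidates in nondecreasing order of value, pushing each cell's right and down neighbors, and stops at the first oracle hit; it performs O(z) pops, each costing O(log z).

import heapq

def min_cost_acceptable(M, f):
    # M: n x n matrix with rows and columns sorted in non-decreasing order
    # f: oracle taking indices (i, j) and returning True or False
    n = len(M)
    heap, seen = [(M[0][0], 0, 0)], {(0, 0)}
    while heap:
        value, i, j = heapq.heappop(heap)  # smallest not-yet-examined entry
        if f(i, j):
            return (i, j)                  # first hit has minimum cost
        for di, dj in ((1, 0), (0, 1)):    # extend the frontier right/down
            a, b = i + di, j + dj
            if a < n and b < n and (a, b) not in seen:
                seen.add((a, b))
                heapq.heappush(heap, (M[a][b], a, b))
    return None  # no acceptable element exists

The invariant is that every unexamined cell has a predecessor (above or to its left) still in the heap, so the popped value is always a global minimum among unexamined cells.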

Cite as

Sebastián Urrutia and Vinicius dos Santos. Finding the Minimum Cost Acceptable Element in a Sorted Matrix. In 22nd International Symposium on Experimental Algorithms (SEA 2024). Leibniz International Proceedings in Informatics (LIPIcs), Volume 301, pp. 28:1-28:14, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2024)



@InProceedings{urrutia_et_al:LIPIcs.SEA.2024.28,
  author =	{Urrutia, Sebasti\'{a}n and dos Santos, Vinicius},
  title =	{{Finding the Minimum Cost Acceptable Element in a Sorted Matrix}},
  booktitle =	{22nd International Symposium on Experimental Algorithms (SEA 2024)},
  pages =	{28:1--28:14},
  series =	{Leibniz International Proceedings in Informatics (LIPIcs)},
  ISBN =	{978-3-95977-325-6},
  ISSN =	{1868-8969},
  year =	{2024},
  volume =	{301},
  editor =	{Liberti, Leo},
  publisher =	{Schloss Dagstuhl -- Leibniz-Zentrum f{\"u}r Informatik},
  address =	{Dagstuhl, Germany},
  URL =		{https://drops.dagstuhl.de/entities/document/10.4230/LIPIcs.SEA.2024.28},
  URN =		{urn:nbn:de:0030-drops-203939},
  doi =		{10.4230/LIPIcs.SEA.2024.28},
  annote =	{Keywords: Search, Sorted matrix, Oracle function, Algorithm complexity}
}
