DROPS

Document

DOI: 10.4230/LIPIcs.SEA.2023.7

Simple Runs-Bounded FM-Index Designs Are Fast

Authors: Diego Díaz-Domínguez, Saska Dönges, Simon J. Puglisi, and Leena Salmela

Published in: LIPIcs, Volume 265, 21st International Symposium on Experimental Algorithms (SEA 2023)

Abstract

Given a string X of length n on alphabet σ, the FM-index data structure allows counting all occurrences of a pattern P of length m in O(m) time via an algorithm called backward search. An important difficulty when searching with an FM-index is to support queries on L, the Burrows-Wheeler transform of X, while L is in compressed form. This problem has been the subject of intense research for 25 years now. Run-length encoding of L is an effective way to reduce index size, in particular when the data being indexed is highly-repetitive, which is the case in many types of modern data, including those arising from versioned document collections and in pangenomics. This paper takes a back-to-basics look at supporting backward search in FM-indexes, exploring and engineering two simple designs. The first divides the BWT string into blocks containing b symbols each and then run-length compresses each block separately, possibly introducing new runs (compared to applying run-length encoding once, to the whole string). Each block stores counts of each symbol that occurs before the block. This method supports the operation rank_c(L, i) (i.e., count the number of times c occurs in the prefix L[1..i]) by first determining the block i/b in which i falls and scanning the block to the appropriate position counting occurrences of c along the way. This partial answer to rank_c(L, i) is then added to the stored count of c symbols before the block to determine the final answer. Our second design has a similar structure, but instead divides the run-length-encoded version of L into blocks containing an equal number of runs. The trick then is to determine the block in which a query falls, which is achieved via a predecessor query over the block starting positions. We show via extensive experiments on a wide range of repetitive text collections that these FM-indexes are not only easy to implement, but also fast and space efficient in practice.
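The first design described in the abstract can be sketched as follows. This is a minimal illustration, not the authors' implementation: the names `build_blocks` and `rank` are hypothetical, the per-block counts are held in plain dictionaries, and with 1-based positions the block index is taken as (i - 1) // b rather than i/b.

```python
from itertools import groupby


def build_blocks(L, b):
    """Split L into blocks of b symbols and run-length encode each block separately."""
    blocks = []
    counts = {}  # occurrences of each symbol before the current block
    for start in range(0, len(L), b):
        chunk = L[start:start + b]
        runs = [(c, len(list(g))) for c, g in groupby(chunk)]
        blocks.append((dict(counts), runs))
        for c, length in runs:
            counts[c] = counts.get(c, 0) + length
    return blocks


def rank(blocks, b, c, i):
    """Count occurrences of c in the prefix L[1..i] (1-based, inclusive)."""
    if i <= 0:
        return 0
    block_index = (i - 1) // b
    before, runs = blocks[block_index]
    remaining = i - block_index * b  # symbols still to scan inside the block
    total = before.get(c, 0)         # stored count of c before the block
    for symbol, length in runs:
        take = min(length, remaining)
        if symbol == c:
            total += take
        remaining -= take
        if remaining == 0:
            break
    return total
```

Splitting at block boundaries may cut a run in two, which is the "possibly introducing new runs" cost mentioned in the abstract; in exchange, locating the block is a single division.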

Cite as

Diego Díaz-Domínguez, Saska Dönges, Simon J. Puglisi, and Leena Salmela. Simple Runs-Bounded FM-Index Designs Are Fast. In 21st International Symposium on Experimental Algorithms (SEA 2023). Leibniz International Proceedings in Informatics (LIPIcs), Volume 265, pp. 7:1-7:16, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2023)

@InProceedings{diazdominguez_et_al:LIPIcs.SEA.2023.7,
  author =	{D{\'\i}az-Dom{\'\i}nguez, Diego and D\"{o}nges, Saska and Puglisi, Simon J. and Salmela, Leena},
  title =	{{Simple Runs-Bounded FM-Index Designs Are Fast}},
  booktitle =	{21st International Symposium on Experimental Algorithms (SEA 2023)},
  pages =	{7:1--7:16},
  series =	{Leibniz International Proceedings in Informatics (LIPIcs)},
  ISBN =	{978-3-95977-279-2},
  ISSN =	{1868-8969},
  year =	{2023},
  volume =	{265},
  editor =	{Georgiadis, Loukas},
  publisher =	{Schloss Dagstuhl -- Leibniz-Zentrum f{\"u}r Informatik},
  address =	{Dagstuhl, Germany},
  URL =		{https://drops-dev.dagstuhl.de/entities/document/10.4230/LIPIcs.SEA.2023.7},
  URN =		{urn:nbn:de:0030-drops-183579},
  doi =		{10.4230/LIPIcs.SEA.2023.7},
  annote =	{Keywords: data structures, efficient algorithms}
}

Document

DOI: 10.4230/LIPIcs.SoCG.2023.59

On Higher Dimensional Point Sets in General Position

Authors: Andrew Suk and Ji Zeng

Published in: LIPIcs, Volume 258, 39th International Symposium on Computational Geometry (SoCG 2023)

Abstract

A finite point set in ℝ^d is in general position if no d + 1 points lie on a common hyperplane. Let α_d(N) be the largest integer such that any set of N points in ℝ^d with no d + 2 members on a common hyperplane, contains a subset of size α_d(N) in general position. Using the method of hypergraph containers, Balogh and Solymosi showed that α₂(N) < N^{5/6 + o(1)}. In this paper, we also use the container method to obtain new upper bounds for α_d(N) when d ≥ 3. More precisely, we show that if d is odd, then α_d(N) < N^{1/2 + 1/(2d) + o(1)}, and if d is even, we have α_d(N) < N^{1/2 + 1/(d-1) + o(1)}. We also study the classical problem of determining the maximum number a(d,k,n) of points selected from the grid [n]^d such that no k + 2 members lie on a k-flat. For fixed d and k, we show that a(d,k,n)≤ O(n^{d/{2⌊(k+2)/4⌋}(1- 1/{2⌊(k+2)/4⌋d+1})}), which improves the previously best known bound of O(n^{d/⌊(k + 2)/2⌋}) due to Lefmann when k+2 is congruent to 0 or 1 mod 4.

Cite as

Andrew Suk and Ji Zeng. On Higher Dimensional Point Sets in General Position. In 39th International Symposium on Computational Geometry (SoCG 2023). Leibniz International Proceedings in Informatics (LIPIcs), Volume 258, pp. 59:1-59:13, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2023)

@InProceedings{suk_et_al:LIPIcs.SoCG.2023.59,
  author =	{Suk, Andrew and Zeng, Ji},
  title =	{{On Higher Dimensional Point Sets in General Position}},
  booktitle =	{39th International Symposium on Computational Geometry (SoCG 2023)},
  pages =	{59:1--59:13},
  series =	{Leibniz International Proceedings in Informatics (LIPIcs)},
  ISBN =	{978-3-95977-273-0},
  ISSN =	{1868-8969},
  year =	{2023},
  volume =	{258},
  editor =	{Chambers, Erin W. and Gudmundsson, Joachim},
  publisher =	{Schloss Dagstuhl -- Leibniz-Zentrum f{\"u}r Informatik},
  address =	{Dagstuhl, Germany},
  URL =		{https://drops.dagstuhl.de/entities/document/10.4230/LIPIcs.SoCG.2023.59},
  URN =		{urn:nbn:de:0030-drops-179097},
  doi =		{10.4230/LIPIcs.SoCG.2023.59},
  annote =	{Keywords: independent sets, hypergraph container method, generalised Sidon sets}
}

Document

DOI: 10.4230/LIPIcs.WABI.2022.24

Gene Orthology Inference via Large-Scale Rearrangements for Partially Assembled Genomes

Authors: Diego P. Rubert and Marília D. V. Braga

Published in: LIPIcs, Volume 242, 22nd International Workshop on Algorithms in Bioinformatics (WABI 2022)

Abstract

Recently we developed a gene orthology inference tool based on genome rearrangements (Journal of Bioinformatics and Computational Biology 19:6, 2021). Given a set of genomes our method first computes all pairwise gene similarities. Then it runs pairwise ILP comparisons to compute optimal gene matchings, which minimize, by taking the similarities into account, the weighted rearrangement distance between the analyzed genomes (a problem that is NP-hard). The gene matchings are then integrated into gene families in the final step. Although the ILP is quite efficient and could conceptually analyze genomes that are not completely assembled but split in several contigs, our tool failed in completing that task. The main reason is that each ILP pairwise comparison includes an optimal capping that connects each end of a linear segment of one genome to an end of a linear segment in the other genome, producing an exponential increase of the search space. In this work, we design and implement a heuristic capping algorithm that replaces the optimal capping by clustering (based on their gene content intersections) the linear segments into m ≥ 1 subsets, whose ends are capped independently. Furthermore, in each subset, instead of allowing all possible connections, we let only the ends of content-related segments be connected. Although there is no guarantee that m is much bigger than one, and with the possible side effect of resulting in sub-optimal instead of optimal gene matchings, the heuristic works very well in practice, from both the speed performance and the quality of computed solutions. Our experiments on real data show that we can now efficiently analyze fruit fly genomes with unfinished assemblies distributed in hundreds or even thousands of contigs, obtaining orthologies that are more similar to FlyBase orthologies when compared to orthologies computed by other inference tools. Moreover, for complete assemblies the version with heuristic capping reports orthologies that are very similar to the orthologies computed by the optimal version of our tool. Our approach is implemented into a pipeline incorporating the pre-computation of gene similarities.

Cite as

Diego P. Rubert and Marília D. V. Braga. Gene Orthology Inference via Large-Scale Rearrangements for Partially Assembled Genomes. In 22nd International Workshop on Algorithms in Bioinformatics (WABI 2022). Leibniz International Proceedings in Informatics (LIPIcs), Volume 242, pp. 24:1-24:22, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2022)

@InProceedings{rubert_et_al:LIPIcs.WABI.2022.24,
  author =	{Rubert, Diego P. and Braga, Mar{\'\i}lia D. V.},
  title =	{{Gene Orthology Inference via Large-Scale Rearrangements for Partially Assembled Genomes}},
  booktitle =	{22nd International Workshop on Algorithms in Bioinformatics (WABI 2022)},
  pages =	{24:1--24:22},
  series =	{Leibniz International Proceedings in Informatics (LIPIcs)},
  ISBN =	{978-3-95977-243-3},
  ISSN =	{1868-8969},
  year =	{2022},
  volume =	{242},
  editor =	{Boucher, Christina and Rahmann, Sven},
  publisher =	{Schloss Dagstuhl -- Leibniz-Zentrum f{\"u}r Informatik},
  address =	{Dagstuhl, Germany},
  URL =		{https://drops-dev.dagstuhl.de/entities/document/10.4230/LIPIcs.WABI.2022.24},
  URN =		{urn:nbn:de:0030-drops-170586},
  doi =		{10.4230/LIPIcs.WABI.2022.24},
  annote =	{Keywords: Comparative genomics, double-cut-and-join, indels, gene orthology}
}

Document

Track A: Algorithms, Complexity and Games

DOI: 10.4230/LIPIcs.ICALP.2022.49

Improved Approximation Algorithms for Dyck Edit Distance and RNA Folding

Authors: Debarati Das, Tomasz Kociumaka, and Barna Saha

Published in: LIPIcs, Volume 229, 49th International Colloquium on Automata, Languages, and Programming (ICALP 2022)

Abstract

The Dyck language, which consists of well-balanced sequences of parentheses, is one of the most fundamental context-free languages. The Dyck edit distance quantifies the number of edits (character insertions, deletions, and substitutions) required to make a given length-n parenthesis sequence well-balanced. RNA Folding involves a similar problem, where a closing parenthesis can match an opening parenthesis of the same type irrespective of their ordering. For example, in RNA Folding, both () and )( are valid matches, whereas the Dyck language only allows () as a match. Both of these problems have been studied extensively in the literature. Using fast matrix multiplication, it is possible to compute their exact solutions in time O(n^2.687) (Chi, Duan, Xie, Zhang, STOC'22), and a (1+ε)-multiplicative approximation is known with a running time of Ω(n^2.372). The impracticality of fast matrix multiplication often makes combinatorial algorithms much more desirable. Unfortunately, it is known that the problems of (exactly) computing the Dyck edit distance and the folding distance are at least as hard as Boolean matrix multiplication. Thereby, they are unlikely to admit truly subcubic-time combinatorial algorithms. In terms of fast approximation algorithms that are combinatorial in nature, the state of the art for Dyck edit distance is an O(log n)-factor approximation algorithm that runs in near-linear time (Saha, FOCS'14), whereas for RNA Folding only an ε n-additive approximation in Õ(n²/ε) time (Saha, FOCS'17) is known. In this paper, we make substantial improvements to the state of the art for Dyck edit distance (with any number of parenthesis types). We design a constant-factor approximation algorithm that runs in Õ(n^1.971) time (the first constant-factor approximation in subquadratic time). Moreover, we develop a (1+ε)-factor approximation algorithm running in Õ(n²/ε) time, which improves upon the earlier additive approximation. Finally, we design a (3+ε)-approximation that takes Õ(nd/ε) time, where d ≥ 1 is an upper bound on the sought distance. As for RNA folding, for any s ≥ 1, we design a factor-s approximation algorithm that runs in O(n+(n/s)³) time. To the best of our knowledge, this is the first nontrivial approximation algorithm for RNA Folding that can go below the n² barrier. All our algorithms are combinatorial in nature.
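To make the quantity being approximated concrete, the sketch below computes the Dyck edit distance exactly with a textbook cubic-time interval dynamic program. It is a baseline for illustration only and is none of the paper's algorithms; the function names and the three bracket types in `PAIRS` are assumptions, and the input is assumed to contain only bracket characters.

```python
PAIRS = {"(": ")", "[": "]", "{": "}"}  # assumed alphabet of parenthesis types
CLOSERS = set(PAIRS.values())


def pair_cost(left, right):
    """Substitutions needed so that `left` and `right` form a matched pair."""
    if left in PAIRS and right in CLOSERS:
        return 0 if PAIRS[left] == right else 1
    if left in CLOSERS and right in PAIRS:
        return 2  # both symbols face the wrong way
    return 1      # exactly one symbol has to be substituted


def dyck_edit_distance(s):
    n = len(s)
    # dp[i][j] = minimum number of edits that make s[i:j] well-balanced
    dp = [[0] * (n + 1) for _ in range(n + 1)]
    for length in range(1, n + 1):
        for i in range(n - length + 1):
            j = i + length
            # Delete s[i] or s[j-1]; inserting a partner costs the same.
            best = min(dp[i + 1][j], dp[i][j - 1]) + 1
            if length >= 2:
                # Match s[i] with s[j-1], substituting where needed.
                best = min(best, dp[i + 1][j - 1] + pair_cost(s[i], s[j - 1]))
            for k in range(i + 1, j):
                # Split into two independently balanced parts.
                best = min(best, dp[i][k] + dp[k][j])
            dp[i][j] = best
    return dp[0][n]
```

For instance, `dyck_edit_distance("(]")` is 1 (one substitution), while `dyck_edit_distance(")(")` is 2.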

Cite as

Debarati Das, Tomasz Kociumaka, and Barna Saha. Improved Approximation Algorithms for Dyck Edit Distance and RNA Folding. In 49th International Colloquium on Automata, Languages, and Programming (ICALP 2022). Leibniz International Proceedings in Informatics (LIPIcs), Volume 229, pp. 49:1-49:20, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2022)

@InProceedings{das_et_al:LIPIcs.ICALP.2022.49,
  author =	{Das, Debarati and Kociumaka, Tomasz and Saha, Barna},
  title =	{{Improved Approximation Algorithms for Dyck Edit Distance and RNA Folding}},
  booktitle =	{49th International Colloquium on Automata, Languages, and Programming (ICALP 2022)},
  pages =	{49:1--49:20},
  series =	{Leibniz International Proceedings in Informatics (LIPIcs)},
  ISBN =	{978-3-95977-235-8},
  ISSN =	{1868-8969},
  year =	{2022},
  volume =	{229},
  editor =	{Boja\'{n}czyk, Miko{\l}aj and Merelli, Emanuela and Woodruff, David P.},
  publisher =	{Schloss Dagstuhl -- Leibniz-Zentrum f{\"u}r Informatik},
  address =	{Dagstuhl, Germany},
  URL =		{https://drops-dev.dagstuhl.de/entities/document/10.4230/LIPIcs.ICALP.2022.49},
  URN =		{urn:nbn:de:0030-drops-163902},
  doi =		{10.4230/LIPIcs.ICALP.2022.49},
  annote =	{Keywords: Dyck Edit Distance, RNA Folding, String Algorithms}
}

Document

DOI: 10.4230/LIPIcs.CPM.2022.29

Efficient Construction of the BWT for Repetitive Text Using String Compression

Authors: Diego Díaz-Domínguez and Gonzalo Navarro

Published in: LIPIcs, Volume 223, 33rd Annual Symposium on Combinatorial Pattern Matching (CPM 2022)

Abstract

We present a new semi-external algorithm that builds the Burrows-Wheeler transform variant of Bauer et al. (a.k.a., BCR BWT) in linear expected time. Our method uses compression techniques to reduce the computational costs when the input is massive and repetitive. Concretely, we build on induced suffix sorting (ISS) and resort to run-length and grammar compression to maintain our intermediate results in compact form. Our compression format not only saves space, but it also speeds up the required computations. Our experiments show important savings in both space and computation time when the text is repetitive. On average, we are 3.7x faster than the baseline compressed approach, while maintaining a similar memory consumption. These results make our method stand out as the only one (to our knowledge) that can build the BCR BWT of a collection of 25 human genomes (75 GB) in about 7.3 hours, and using only 27 GB of working memory.

Cite as

Diego Díaz-Domínguez and Gonzalo Navarro. Efficient Construction of the BWT for Repetitive Text Using String Compression. In 33rd Annual Symposium on Combinatorial Pattern Matching (CPM 2022). Leibniz International Proceedings in Informatics (LIPIcs), Volume 223, pp. 29:1-29:18, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2022)

@InProceedings{diazdominguez_et_al:LIPIcs.CPM.2022.29,
  author =	{D{\'\i}az-Dom{\'\i}nguez, Diego and Navarro, Gonzalo},
  title =	{{Efficient Construction of the BWT for Repetitive Text Using String Compression}},
  booktitle =	{33rd Annual Symposium on Combinatorial Pattern Matching (CPM 2022)},
  pages =	{29:1--29:18},
  series =	{Leibniz International Proceedings in Informatics (LIPIcs)},
  ISBN =	{978-3-95977-234-1},
  ISSN =	{1868-8969},
  year =	{2022},
  volume =	{223},
  editor =	{Bannai, Hideo and Holub, Jan},
  publisher =	{Schloss Dagstuhl -- Leibniz-Zentrum f{\"u}r Informatik},
  address =	{Dagstuhl, Germany},
  URL =		{https://drops-dev.dagstuhl.de/entities/document/10.4230/LIPIcs.CPM.2022.29},
  URN =		{urn:nbn:de:0030-drops-161564},
  doi =		{10.4230/LIPIcs.CPM.2022.29},
  annote =	{Keywords: BWT, string compression, repetitive text}
}

Document

DOI: 10.4230/LIPIcs.ITCS.2022.69

On Semi-Algebraic Proofs and Algorithms

Authors: Noah Fleming, Mika Göös, Stefan Grosser, and Robert Robere

Published in: LIPIcs, Volume 215, 13th Innovations in Theoretical Computer Science Conference (ITCS 2022)

Abstract

We give a new characterization of the Sherali-Adams proof system, showing that there is a degree-d Sherali-Adams refutation of an unsatisfiable CNF formula C if and only if there is an ε > 0 and a degree-d conical junta J such that viol_C(x) - ε = J, where viol_C(x) counts the number of falsified clauses of C on an input x. Using this result we show that the linear separation complexity, a complexity measure recently studied by Hrubeš (and independently by de Oliveira Oliveira and Pudlák under the name of weak monotone linear programming gates), monotone feasibly interpolates Sherali-Adams proofs. We then investigate separation results for viol_C(x) - ε. In particular, we give a family of unsatisfiable CNF formulas C which have polynomial-size and small-width resolution proofs, but for which any representation of viol_C(x) - 1 by a conical junta requires degree Ω(n); this resolves an open question of Filmus, Mahajan, Sood, and Vinyals. Since Sherali-Adams can simulate resolution, this separates the non-negative degree of viol_C(x) - 1 and viol_C(x) - ε for arbitrarily small ε > 0. Finally, by applying lifting theorems, we translate this lower bound into new separation results between extension complexity and monotone circuit complexity.

Cite as

Noah Fleming, Mika Göös, Stefan Grosser, and Robert Robere. On Semi-Algebraic Proofs and Algorithms. In 13th Innovations in Theoretical Computer Science Conference (ITCS 2022). Leibniz International Proceedings in Informatics (LIPIcs), Volume 215, pp. 69:1-69:25, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2022)

@InProceedings{fleming_et_al:LIPIcs.ITCS.2022.69,
  author =	{Fleming, Noah and G\"{o}\"{o}s, Mika and Grosser, Stefan and Robere, Robert},
  title =	{{On Semi-Algebraic Proofs and Algorithms}},
  booktitle =	{13th Innovations in Theoretical Computer Science Conference (ITCS 2022)},
  pages =	{69:1--69:25},
  series =	{Leibniz International Proceedings in Informatics (LIPIcs)},
  ISBN =	{978-3-95977-217-4},
  ISSN =	{1868-8969},
  year =	{2022},
  volume =	{215},
  editor =	{Braverman, Mark},
  publisher =	{Schloss Dagstuhl -- Leibniz-Zentrum f{\"u}r Informatik},
  address =	{Dagstuhl, Germany},
  URL =		{https://drops-dev.dagstuhl.de/entities/document/10.4230/LIPIcs.ITCS.2022.69},
  URN =		{urn:nbn:de:0030-drops-156658},
  doi =		{10.4230/LIPIcs.ITCS.2022.69},
  annote =	{Keywords: Proof Complexity, Extended Formulations, Circuit Complexity, Sherali-Adams}
}

Document

DOI: 10.4230/LIPIcs.IPEC.2021.3

The Fine-Grained Complexity of Multi-Dimensional Ordering Properties

Authors: Haozhe An, Mohit Gurumukhani, Russell Impagliazzo, Michael Jaber, Marvin Künnemann, and Maria Paula Parga Nina

Published in: LIPIcs, Volume 214, 16th International Symposium on Parameterized and Exact Computation (IPEC 2021)

Abstract

We define a class of problems whose input is an n-sized set of d-dimensional vectors, and where the problem is first-order definable using comparisons between coordinates. This class captures a wide variety of tasks, such as complex types of orthogonal range search, model-checking first-order properties on geometric intersection graphs, and elementary questions on multidimensional data like verifying Pareto optimality of a choice of data points. Focusing on constant dimension d, we show that any k-quantifier, d-dimensional such problem is solvable in O(n^{k-1} log^{d-1} n) time. Furthermore, this algorithm is conditionally tight up to subpolynomial factors: we show that assuming the 3-uniform hyperclique hypothesis, there is a k-quantifier, (3k-3)-dimensional problem in this class that requires time Ω(n^{k-1-o(1)}). Towards identifying a single representative problem for this class, we study the existence of complete problems for the 3-quantifier setting (since 2-quantifier problems can already be solved in near-linear time O(nlog^{d-1} n), and k-quantifier problems with k > 3 reduce to the 3-quantifier case). We define a problem Vector Concatenated Non-Domination VCND_d (Given three sets of vectors X,Y and Z of dimension d,d and 2d, respectively, is there an x ∈ X and a y ∈ Y so that their concatenation x∘y is not dominated by any z ∈ Z, where vector u is dominated by vector v if u_i ≤ v_i for each coordinate 1 ≤ i ≤ d), and determine it as the "unique" candidate to be complete for this class (under fine-grained assumptions).
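The VCND_d problem defined above can be stated directly as a brute-force check. The sketch below is only a reference formulation of the definition, taking time proportional to |X|·|Y|·|Z|·d; it is not the algorithm from the paper, and the function names are hypothetical.

```python
def dominated(u, v):
    """True if u_i <= v_i holds in every coordinate."""
    return all(ui <= vi for ui, vi in zip(u, v))


def vcnd(X, Y, Z):
    """Is there x in X and y in Y whose concatenation is dominated by no z in Z?"""
    for x in X:
        for y in Y:
            xy = tuple(x) + tuple(y)  # concatenation, dimension 2d
            if not any(dominated(xy, z) for z in Z):
                return True
    return False
```

With d = 1, `vcnd([(1,)], [(1,)], [(2, 2)])` is false because (1, 1) is dominated by (2, 2), whereas replacing Z by `[(2, 0)]` makes it true.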

Cite as

Haozhe An, Mohit Gurumukhani, Russell Impagliazzo, Michael Jaber, Marvin Künnemann, and Maria Paula Parga Nina. The Fine-Grained Complexity of Multi-Dimensional Ordering Properties. In 16th International Symposium on Parameterized and Exact Computation (IPEC 2021). Leibniz International Proceedings in Informatics (LIPIcs), Volume 214, pp. 3:1-3:15, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2021)

@InProceedings{an_et_al:LIPIcs.IPEC.2021.3,
  author =	{An, Haozhe and Gurumukhani, Mohit and Impagliazzo, Russell and Jaber, Michael and K\"{u}nnemann, Marvin and Nina, Maria Paula Parga},
  title =	{{The Fine-Grained Complexity of Multi-Dimensional Ordering Properties}},
  booktitle =	{16th International Symposium on Parameterized and Exact Computation (IPEC 2021)},
  pages =	{3:1--3:15},
  series =	{Leibniz International Proceedings in Informatics (LIPIcs)},
  ISBN =	{978-3-95977-216-7},
  ISSN =	{1868-8969},
  year =	{2021},
  volume =	{214},
  editor =	{Golovach, Petr A. and Zehavi, Meirav},
  publisher =	{Schloss Dagstuhl -- Leibniz-Zentrum f{\"u}r Informatik},
  address =	{Dagstuhl, Germany},
  URL =		{https://drops-dev.dagstuhl.de/entities/document/10.4230/LIPIcs.IPEC.2021.3},
  URN =		{urn:nbn:de:0030-drops-153869},
  doi =		{10.4230/LIPIcs.IPEC.2021.3},
  annote =	{Keywords: Fine-grained complexity, First-order logic, Orthogonal vectors}
}

Document

Track A: Algorithms, Complexity and Games

DOI: 10.4230/LIPIcs.ICALP.2021.44

Lifting for Constant-Depth Circuits and Applications to MCSP

Authors: Marco Carmosino, Kenneth Hoover, Russell Impagliazzo, Valentine Kabanets, and Antonina Kolokolova

Published in: LIPIcs, Volume 198, 48th International Colloquium on Automata, Languages, and Programming (ICALP 2021)

Abstract

Lifting arguments show that the complexity of a function in one model is essentially that of a related function (often the composition of the original function with a small function called a gadget) in a more powerful model. Lifting has been used to prove strong lower bounds in communication complexity, proof complexity, circuit complexity and many other areas. We present a lifting construction for constant depth unbounded fan-in circuits. Given a function f, we construct a function g, so that the depth d+1 circuit complexity of g, with a certain restriction on bottom fan-in, is controlled by the depth d circuit complexity of f, with the same restriction. The function g is defined as f composed with a parity function. With some quantitative losses, average-case and general depth-d circuit complexity can be reduced to circuit complexity with this bottom fan-in restriction. As a consequence, an algorithm to approximate the depth d (for any d > 3) circuit complexity of given (truth tables of) Boolean functions yields an algorithm for approximating the depth 3 circuit complexity of functions, i.e., there are quasi-polynomial time mapping reductions between various gap-versions of AC⁰-MCSP. Our lifting results rely on a blockwise switching lemma that may be of independent interest. We also show some barriers on improving the efficiency of our reductions: such improvements would yield either surprisingly efficient algorithms for MCSP or stronger than known AC⁰ circuit lower bounds.

Cite as

Marco Carmosino, Kenneth Hoover, Russell Impagliazzo, Valentine Kabanets, and Antonina Kolokolova. Lifting for Constant-Depth Circuits and Applications to MCSP. In 48th International Colloquium on Automata, Languages, and Programming (ICALP 2021). Leibniz International Proceedings in Informatics (LIPIcs), Volume 198, pp. 44:1-44:20, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2021)

@InProceedings{carmosino_et_al:LIPIcs.ICALP.2021.44,
  author =	{Carmosino, Marco and Hoover, Kenneth and Impagliazzo, Russell and Kabanets, Valentine and Kolokolova, Antonina},
  title =	{{Lifting for Constant-Depth Circuits and Applications to MCSP}},
  booktitle =	{48th International Colloquium on Automata, Languages, and Programming (ICALP 2021)},
  pages =	{44:1--44:20},
  series =	{Leibniz International Proceedings in Informatics (LIPIcs)},
  ISBN =	{978-3-95977-195-5},
  ISSN =	{1868-8969},
  year =	{2021},
  volume =	{198},
  editor =	{Bansal, Nikhil and Merelli, Emanuela and Worrell, James},
  publisher =	{Schloss Dagstuhl -- Leibniz-Zentrum f{\"u}r Informatik},
  address =	{Dagstuhl, Germany},
  URL =		{https://drops-dev.dagstuhl.de/entities/document/10.4230/LIPIcs.ICALP.2021.44},
  URN =		{urn:nbn:de:0030-drops-141135},
  doi =		{10.4230/LIPIcs.ICALP.2021.44},
  annote =	{Keywords: circuit complexity, constant-depth circuits, lifting theorems, Minimum Circuit Size Problem, reductions, Switching Lemma}
}

Document

DOI: 10.4230/LIPIcs.SoCG.2021.37

Sunflowers in Set Systems of Bounded Dimension

Authors: Jacob Fox, János Pach, and Andrew Suk

Published in: LIPIcs, Volume 189, 37th International Symposium on Computational Geometry (SoCG 2021)

Abstract

Given a family F of k-element sets, S₁,…,S_r ∈ F form an r-sunflower if S_i ∩ S_j = S_{i'} ∩ S_{j'} for all i ≠ j and i' ≠ j'. According to a famous conjecture of Erdős and Rado (1960), there is a constant c = c(r) such that if |F| ≥ c^k, then F contains an r-sunflower. We come close to proving this conjecture for families of bounded Vapnik-Chervonenkis dimension, VC-dim(F) ≤ d. In this case, we show that r-sunflowers exist under the slightly stronger assumption |F| ≥ 2^{10k(dr)^{2log^{*} k}}. Here, log^* denotes the iterated logarithm function. We also verify the Erdős-Rado conjecture for families F of bounded Littlestone dimension and for some geometrically defined set systems.
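The sunflower condition above (all pairwise intersections coincide) is easy to check mechanically. The following sketch tests the definition and searches a small family exhaustively; the names `is_sunflower` and `find_sunflower` are hypothetical, the family is assumed to consist of distinct sets, and the search is exponential in general, so it is meant only for toy inputs.

```python
from itertools import combinations


def is_sunflower(sets):
    """True if every pair of distinct sets has the same intersection."""
    pairwise = {frozenset(a) & frozenset(b) for a, b in combinations(sets, 2)}
    return len(pairwise) <= 1


def find_sunflower(family, r):
    """Return some r sets of the family forming an r-sunflower, or None."""
    for candidate in combinations(family, r):
        if is_sunflower(candidate):
            return candidate
    return None
```

For example, {1,2}, {1,3}, {1,4} form a 3-sunflower with common intersection {1}, while {1,2}, {2,3}, {1,3} do not.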

Cite as

Jacob Fox, János Pach, and Andrew Suk. Sunflowers in Set Systems of Bounded Dimension. In 37th International Symposium on Computational Geometry (SoCG 2021). Leibniz International Proceedings in Informatics (LIPIcs), Volume 189, pp. 37:1-37:13, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2021)

@InProceedings{fox_et_al:LIPIcs.SoCG.2021.37,
  author =	{Fox, Jacob and Pach, J\'{a}nos and Suk, Andrew},
  title =	{{Sunflowers in Set Systems of Bounded Dimension}},
  booktitle =	{37th International Symposium on Computational Geometry (SoCG 2021)},
  pages =	{37:1--37:13},
  series =	{Leibniz International Proceedings in Informatics (LIPIcs)},
  ISBN =	{978-3-95977-184-9},
  ISSN =	{1868-8969},
  year =	{2021},
  volume =	{189},
  editor =	{Buchin, Kevin and Colin de Verdi\`{e}re, \'{E}ric},
  publisher =	{Schloss Dagstuhl -- Leibniz-Zentrum f{\"u}r Informatik},
  address =	{Dagstuhl, Germany},
  URL =		{https://drops-dev.dagstuhl.de/entities/document/10.4230/LIPIcs.SoCG.2021.37},
  URN =		{urn:nbn:de:0030-drops-138366},
  doi =		{10.4230/LIPIcs.SoCG.2021.37},
  annote =	{Keywords: Sunflower, VC-dimension, Littlestone dimension, pseudodisks}
}

Document

DOI: 10.4230/LIPIcs.ICDT.2021.12

Maximum Coverage in the Data Stream Model: Parameterized and Generalized

Authors: Andrew McGregor, David Tench, and Hoa T. Vu

Published in: LIPIcs, Volume 186, 24th International Conference on Database Theory (ICDT 2021)

Abstract

We present algorithms for the Max Coverage and Max Unique Coverage problems in the data stream model. The input to both problems consists of m subsets of a universe of size n and a value k ∈ [m]. In Max Coverage, the problem is to find a collection of at most k sets such that the number of elements covered by at least one set is maximized. In Max Unique Coverage, the problem is to find a collection of at most k sets such that the number of elements covered by exactly one set is maximized. These problems are closely related to a range of graph problems including matching, partial vertex cover, and capacitated maximum cut. In the data stream model, we assume k is given and the sets are revealed online. Our goal is to design single-pass algorithms that use space that is sublinear in the input size. Our main algorithmic results are:
- If the sets have size at most d, there exist single-pass algorithms using O(d^{d+1} k^d) space that solve both problems exactly. This is optimal up to polylogarithmic factors for constant d.
- If each element appears in at most r sets, we present single-pass algorithms using Õ(k² r/ε³) space that return a 1+ε approximation in the case of Max Coverage. We also present a single-pass algorithm using slightly more memory, i.e., Õ(k³ r/ε⁴) space, that 1+ε approximates Max Unique Coverage.
In contrast to the above results, when d and r are arbitrary, any constant-pass 1+ε approximation algorithm for either problem requires Ω(ε^{-2}m) space but a single-pass O(ε^{-2}mk) space algorithm exists. In fact any constant-pass algorithm with an approximation better than e/(e-1) and e^{1-1/k} for Max Coverage and Max Unique Coverage respectively requires Ω(m/k²) space when d and r are unrestricted. En route, we also obtain an algorithm for a parameterized version of the streaming Set Cover problem.
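The two objectives can differ on the same input, which a small offline example makes visible. The sketch below evaluates both objectives and finds an optimum by exhaustive search; it is not a streaming algorithm and not one of the paper's methods, and the function names are hypothetical.

```python
from collections import Counter
from itertools import combinations


def coverage(chosen):
    """Number of elements covered by at least one chosen set."""
    return len(set().union(*chosen))


def unique_coverage(chosen):
    """Number of elements covered by exactly one chosen set."""
    counts = Counter(e for s in chosen for e in s)
    return sum(1 for c in counts.values() if c == 1)


def best_collection(sets, k, objective):
    """Exhaustively pick at most k sets maximizing the given objective."""
    best_value, best_choice = 0, ()
    for size in range(1, k + 1):
        for choice in combinations(sets, size):
            value = objective(choice)
            if value > best_value:
                best_value, best_choice = value, choice
    return best_value, best_choice
```

On the sets {1,2}, {2,3}, {1,3} with k = 3, the best coverage is 3, but the best unique coverage is only 2, since adding a second or third set makes shared elements stop counting.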

Cite as

Andrew McGregor, David Tench, and Hoa T. Vu. Maximum Coverage in the Data Stream Model: Parameterized and Generalized. In 24th International Conference on Database Theory (ICDT 2021). Leibniz International Proceedings in Informatics (LIPIcs), Volume 186, pp. 12:1-12:20, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2021)

@InProceedings{mcgregor_et_al:LIPIcs.ICDT.2021.12,
  author =	{McGregor, Andrew and Tench, David and Vu, Hoa T.},
  title =	{{Maximum Coverage in the Data Stream Model: Parameterized and Generalized}},
  booktitle =	{24th International Conference on Database Theory (ICDT 2021)},
  pages =	{12:1--12:20},
  series =	{Leibniz International Proceedings in Informatics (LIPIcs)},
  ISBN =	{978-3-95977-179-5},
  ISSN =	{1868-8969},
  year =	{2021},
  volume =	{186},
  editor =	{Yi, Ke and Wei, Zhewei},
  publisher =	{Schloss Dagstuhl -- Leibniz-Zentrum f{\"u}r Informatik},
  address =	{Dagstuhl, Germany},
  URL =		{https://drops-dev.dagstuhl.de/entities/document/10.4230/LIPIcs.ICDT.2021.12},
  URN =		{urn:nbn:de:0030-drops-137208},
  doi =		{10.4230/LIPIcs.ICDT.2021.12},
  annote =	{Keywords: Data streams, maximum coverage, maximum unique coverage, set cover}
}

Document

Artifact

DOI: 10.4230/DARTS.6.2.8

The Duality of Subtyping (Artifact)

Authors: Bruno C. d. S. Oliveira, Cui Shaobo, and Baber Rehman

Published in: DARTS, Volume 6, Issue 2, Special Issue of the 34th European Conference on Object-Oriented Programming (ECOOP 2020)

Abstract

This artifact contains the Coq formalization associated with the paper The Duality of Subtyping, submitted to ECOOP 2020. This document explains how to run the Coq formalization. The artifact can either be compiled in the pre-built Docker image, which has all the dependencies installed, or be built from scratch. Sections 1-7 give basic information about the artifact. Section A explains how to get the Docker image for the artifact. Section B explains the prerequisites and the steps to run the Coq files from scratch. Section C briefly explains the Coq files. Section D shows the correspondence between important lemmas discussed in the paper and their respective Coq formalizations. The term MonoTyping used in the artifact corresponds to the standard subtyping systems.

Cite as

Bruno C. d. S. Oliveira, Cui Shaobo, and Baber Rehman. The Duality of Subtyping (Artifact). In Special Issue of the 34th European Conference on Object-Oriented Programming (ECOOP 2020). Dagstuhl Artifacts Series (DARTS), Volume 6, Issue 2, pp. 8:1-8:6, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2020)

@Article{oliveira_et_al:DARTS.6.2.8,
  author =	{Oliveira, Bruno C. d. S. and Shaobo, Cui and Rehman, Baber},
  title =	{{The Duality of Subtyping (Artifact)}},
  pages =	{8:1--8:6},
  journal =	{Dagstuhl Artifacts Series},
  ISSN =	{2509-8195},
  year =	{2020},
  volume =	{6},
  number =	{2},
  editor =	{Oliveira, Bruno C. d. S. and Shaobo, Cui and Rehman, Baber},
  publisher =	{Schloss Dagstuhl -- Leibniz-Zentrum f{\"u}r Informatik},
  address =	{Dagstuhl, Germany},
  URL =		{https://drops-dev.dagstuhl.de/entities/document/10.4230/DARTS.6.2.8},
  URN =		{urn:nbn:de:0030-drops-132051},
  doi =		{10.4230/DARTS.6.2.8},
  annote =	{Keywords: DuoTyping, OOP, Duality, Subtyping, Supertyping}
}

Document

DOI: 10.4230/LIPIcs.ECOOP.2020.29

The Duality of Subtyping

Authors: Bruno C. d. S. Oliveira, Cui Shaobo, and Baber Rehman

Published in: LIPIcs, Volume 166, 34th European Conference on Object-Oriented Programming (ECOOP 2020)

Abstract

Subtyping is a concept frequently encountered in many programming languages and calculi. Various forms of subtyping exist for different type system features, including intersection types, union types or bounded quantification. Normally these features are designed independently of each other, without exploiting obvious similarities (or dualities) between features. This paper proposes a novel methodology for designing subtyping relations that exploits duality between features. At the core of our methodology is a generalization of subtyping relations, which we call Duotyping. Duotyping is parameterized by the mode of the relation. One of these modes is the usual subtyping, while another mode is supertyping (the dual of subtyping). Using the mode it is possible to generalize the usual rules of subtyping to account not only for the intended behaviour of one particular language construct, but also of its dual. Duotyping brings multiple benefits, including: shorter specifications and implementations, dual features that come essentially for free, as well as new proof techniques for various properties of subtyping. To evaluate a design based on Duotyping against traditional designs, we formalized various calculi with common OOP features (including union types, intersection types and bounded quantification) in Coq in both styles. Our results show that the metatheory when using Duotyping does not come at a significant cost: the metatheory with Duotyping has similar complexity and size compared to the metatheory for traditional designs. However, we discover new features as duals to well-known features. Furthermore, we also show that Duotyping can significantly simplify transitivity proofs for many of the calculi studied by us.

Cite as

Bruno C. d. S. Oliveira, Cui Shaobo, and Baber Rehman. The Duality of Subtyping. In 34th European Conference on Object-Oriented Programming (ECOOP 2020). Leibniz International Proceedings in Informatics (LIPIcs), Volume 166, pp. 29:1-29:29, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2020)

@InProceedings{oliveira_et_al:LIPIcs.ECOOP.2020.29,
  author =	{Oliveira, Bruno C. d. S. and Shaobo, Cui and Rehman, Baber},
  title =	{{The Duality of Subtyping}},
  booktitle =	{34th European Conference on Object-Oriented Programming (ECOOP 2020)},
  pages =	{29:1--29:29},
  series =	{Leibniz International Proceedings in Informatics (LIPIcs)},
  ISBN =	{978-3-95977-154-2},
  ISSN =	{1868-8969},
  year =	{2020},
  volume =	{166},
  editor =	{Hirschfeld, Robert and Pape, Tobias},
  publisher =	{Schloss Dagstuhl -- Leibniz-Zentrum f{\"u}r Informatik},
  address =	{Dagstuhl, Germany},
  URL =		{https://drops-dev.dagstuhl.de/entities/document/10.4230/LIPIcs.ECOOP.2020.29},
  URN =		{urn:nbn:de:0030-drops-131864},
  doi =		{10.4230/LIPIcs.ECOOP.2020.29},
  annote =	{Keywords: DuoTyping, OOP, Duality, Subtyping, Supertyping}
}

Document

DOI: 10.4230/LIPIcs.DISC.2020.15

Distributed Dense Subgraph Detection and Low Outdegree Orientation

Authors: Hsin-Hao Su and Hoa T. Vu

Published in: LIPIcs, Volume 179, 34th International Symposium on Distributed Computing (DISC 2020)

Abstract

The densest subgraph problem, introduced in the 80s by Picard and Queyranne [Networks 1982] as well as Goldberg [Tech. Report 1984], is a classic problem in combinatorial optimization with a wide range of applications. The lowest outdegree orientation problem is known to be its dual problem. We study both the problem of finding dense subgraphs and the problem of computing a low outdegree orientation in the distributed setting. Suppose G = (V,E) is the underlying network as well as the input graph. Let D denote the density of the maximum density subgraph of G. Our main results are as follows.

- Given a value D̃ ≤ D and 0 < ε < 1, we show that a subgraph with density at least (1-ε)D̃ can be identified deterministically in O((log n) / ε) rounds in the LOCAL model. We also present a lower bound showing that our result for the LOCAL model is tight up to an O(log n) factor. In the CONGEST model, we show that such a subgraph can be identified in O((log³ n) / ε³) rounds with high probability. Our techniques also lead to an O(diameter + (log⁴ n)/ε⁴)-round algorithm that yields a 1-ε approximation to the densest subgraph. This improves upon the previous O(diameter /ε ⋅ log n)-round algorithm by Das Sarma et al. [DISC 2012] that only yields a 1/2-ε approximation.
- Given an integer D̃ ≥ D and Ω(1/D̃) < ε < 1/4, we give a deterministic, Õ((log² n) /ε²)-round algorithm in the CONGEST model that computes an orientation where the outdegree of every vertex is upper bounded by (1+ε)D̃. Previously, the best deterministic algorithm and randomized algorithm by Harris [FOCS 2019] run in Õ((log⁶ n)/ ε⁴) rounds and Õ((log³ n) /ε³) rounds respectively and only work in the LOCAL model.
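
The two quantities this abstract relates can be computed by brute force on a tiny graph. The sketch below is centralized and takes exponential time, so it is not the distributed algorithm of the paper. It assumes the standard definition of density as the number of edges inside a vertex subset divided by the size of the subset, which the abstract does not spell out. The function names and the example graph are hypothetical.

```python
from fractions import Fraction
from itertools import combinations, product

def max_density(vertices, edges):
    """D: the largest value of |E(S)| / |S| over all non-empty vertex subsets S."""
    best = Fraction(0)
    for size in range(1, len(vertices) + 1):
        for subset in combinations(vertices, size):
            s = set(subset)
            inside = sum(1 for u, v in edges if u in s and v in s)
            best = max(best, Fraction(inside, size))
    return best

def min_max_outdegree(vertices, edges):
    """Smallest achievable maximum outdegree over all orientations of the edges."""
    best = len(edges)
    for flips in product([False, True], repeat=len(edges)):
        out = {v: 0 for v in vertices}
        for (u, v), flip in zip(edges, flips):
            out[v if flip else u] += 1  # the tail of each oriented edge gains outdegree
        best = min(best, max(out.values()))
    return best

# Hypothetical example: a clique on vertices 0..3 plus a pendant vertex 4.
vertices = [0, 1, 2, 3, 4]
edges = [(0, 1), (0, 2), (0, 3), (1, 2), (1, 3), (2, 3), (3, 4)]

print(max_density(vertices, edges))        # density of the densest subgraph
print(min_max_outdegree(vertices, edges))  # best possible maximum outdegree
```

On this instance the densest subgraph is the 4-clique with density 3/2, and no orientation keeps every outdegree below 2, so the optimal orientation value sits just above D.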

Cite as

Hsin-Hao Su and Hoa T. Vu. Distributed Dense Subgraph Detection and Low Outdegree Orientation. In 34th International Symposium on Distributed Computing (DISC 2020). Leibniz International Proceedings in Informatics (LIPIcs), Volume 179, pp. 15:1-15:18, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2020)

@InProceedings{su_et_al:LIPIcs.DISC.2020.15,
  author =	{Su, Hsin-Hao and Vu, Hoa T.},
  title =	{{Distributed Dense Subgraph Detection and Low Outdegree Orientation}},
  booktitle =	{34th International Symposium on Distributed Computing (DISC 2020)},
  pages =	{15:1--15:18},
  series =	{Leibniz International Proceedings in Informatics (LIPIcs)},
  ISBN =	{978-3-95977-168-9},
  ISSN =	{1868-8969},
  year =	{2020},
  volume =	{179},
  editor =	{Attiya, Hagit},
  publisher =	{Schloss Dagstuhl -- Leibniz-Zentrum f{\"u}r Informatik},
  address =	{Dagstuhl, Germany},
  URL =		{https://drops-dev.dagstuhl.de/entities/document/10.4230/LIPIcs.DISC.2020.15},
  URN =		{urn:nbn:de:0030-drops-130938},
  doi =		{10.4230/LIPIcs.DISC.2020.15},
  annote =	{Keywords: Distributed Algorithms, Network Algorithms}
}

Document

DOI: 10.4230/LIPIcs.CONCUR.2020.36

Universality Problem for Unambiguous VASS

Authors: Wojciech Czerwiński, Diego Figueira, and Piotr Hofman

Published in: LIPIcs, Volume 171, 31st International Conference on Concurrency Theory (CONCUR 2020)

Abstract

We study languages of unambiguous VASS, that is, Vector Addition Systems with States, whose transitions read letters from a finite alphabet, and whose acceptance condition is defined by a set of final states (i.e., the coverability language). We show that the problem of universality for unambiguous VASS is ExpSpace-complete, in stark contrast to Ackermann-completeness for arbitrary VASS, even in dimension 1. When the dimension d ∈ ℕ is fixed, the universality problem is PSpace-complete if d ≥ 2, and coNP-hard for 1-dimensional VASS (also known as One Counter Nets).

Cite as

Wojciech Czerwiński, Diego Figueira, and Piotr Hofman. Universality Problem for Unambiguous VASS. In 31st International Conference on Concurrency Theory (CONCUR 2020). Leibniz International Proceedings in Informatics (LIPIcs), Volume 171, pp. 36:1-36:15, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2020)

@InProceedings{czerwinski_et_al:LIPIcs.CONCUR.2020.36,
  author =	{Czerwi\'{n}ski, Wojciech and Figueira, Diego and Hofman, Piotr},
  title =	{{Universality Problem for Unambiguous VASS}},
  booktitle =	{31st International Conference on Concurrency Theory (CONCUR 2020)},
  pages =	{36:1--36:15},
  series =	{Leibniz International Proceedings in Informatics (LIPIcs)},
  ISBN =	{978-3-95977-160-3},
  ISSN =	{1868-8969},
  year =	{2020},
  volume =	{171},
  editor =	{Konnov, Igor and Kov\'{a}cs, Laura},
  publisher =	{Schloss Dagstuhl -- Leibniz-Zentrum f{\"u}r Informatik},
  address =	{Dagstuhl, Germany},
  URL =		{https://drops-dev.dagstuhl.de/entities/document/10.4230/LIPIcs.CONCUR.2020.36},
  URN =		{urn:nbn:de:0030-drops-128486},
  doi =		{10.4230/LIPIcs.CONCUR.2020.36},
  annote =	{Keywords: unambiguity, vector addition systems, universality problems}
}

Document

DOI: 10.4230/LIPIcs.WABI.2020.3

Natural Family-Free Genomic Distance

Authors: Diego P. Rubert, Fábio V. Martinez, and Marília D. V. Braga

Published in: LIPIcs, Volume 172, 20th International Workshop on Algorithms in Bioinformatics (WABI 2020)

Abstract

A classical problem in comparative genomics is to compute the rearrangement distance, that is, the minimum number of large-scale rearrangements required to transform a given genome into another given genome. While the most traditional approaches in this area are family-based, i.e., require the classification of DNA fragments of both genomes into families, more recently an alternative model was proposed, which, instead of family classification, simply uses the pairwise similarities between DNA fragments of both genomes to compute their rearrangement distance. This model represents structural rearrangements by the generic double cut and join (DCJ) operation and is therefore called the family-free DCJ distance. It computes the DCJ distance between the two genomes by searching for a matching of their genes based on the given pairwise similarities, therefore helping to find gene homologies. The drawback is that its computation is NP-hard. Another point is that the family-free DCJ distance must correspond to a maximal matching of the genes, because unmatched genes are simply ignored: maximizing the matching prevents the free lunch artifact of empty or almost empty matchings giving the smaller distances. In this paper, besides DCJ operations, we allow content-modifying operations of insertions and deletions of DNA segments and propose a new and more general family-free genomic distance. In our model we use the pairwise similarities to assign weights to both matched and unmatched genes, so that an optimal solution does not necessarily maximize the matching. Our model then results in a natural family-free genomic distance that takes into consideration all given genes and has a search space composed of matchings of any size. We provide an efficient ILP formulation to solve it, by extending the previous formulations for computing family-based genomic distances from Shao et al. (J. Comput. Biol., 2015) and Bohnenkämper et al. (Proc. of RECOMB, 2020). Our experiments show that the ILP can handle not only bacterial genomes, but also fungi and insects, or sets of chromosomes of mammals and plants. In a comparison study of six fruit fly genomes, we obtained accurate results.

Cite as

Diego P. Rubert, Fábio V. Martinez, and Marília D. V. Braga. Natural Family-Free Genomic Distance. In 20th International Workshop on Algorithms in Bioinformatics (WABI 2020). Leibniz International Proceedings in Informatics (LIPIcs), Volume 172, pp. 3:1-3:23, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2020)

@InProceedings{rubert_et_al:LIPIcs.WABI.2020.3,
  author =	{Rubert, Diego P. and Martinez, F\'{a}bio V. and Braga, Mar{\'\i}lia D. V.},
  title =	{{Natural Family-Free Genomic Distance}},
  booktitle =	{20th International Workshop on Algorithms in Bioinformatics (WABI 2020)},
  pages =	{3:1--3:23},
  series =	{Leibniz International Proceedings in Informatics (LIPIcs)},
  ISBN =	{978-3-95977-161-0},
  ISSN =	{1868-8969},
  year =	{2020},
  volume =	{172},
  editor =	{Kingsford, Carl and Pisanti, Nadia},
  publisher =	{Schloss Dagstuhl -- Leibniz-Zentrum f{\"u}r Informatik},
  address =	{Dagstuhl, Germany},
  URL =		{https://drops-dev.dagstuhl.de/entities/document/10.4230/LIPIcs.WABI.2020.3},
  URN =		{urn:nbn:de:0030-drops-127926},
  doi =		{10.4230/LIPIcs.WABI.2020.3},
  annote =	{Keywords: Comparative genomics, Genome rearrangement, DCJ-indel distance}
}

28 Search Results for "Díaz-Domínguez, Diego"
