51 Search Results for "Grossi, Roberto"


Volume

LIPIcs, Volume 54

27th Annual Symposium on Combinatorial Pattern Matching (CPM 2016)

CPM 2016, June 27-29, 2016, Tel Aviv, Israel

Editors: Roberto Grossi and Moshe Lewenstein

Document
A Compact DAG for Storing and Searching Maximal Common Subsequences

Authors: Alessio Conte, Roberto Grossi, Giulia Punzi, and Takeaki Uno

Published in: LIPIcs, Volume 283, 34th International Symposium on Algorithms and Computation (ISAAC 2023)


Abstract
Maximal Common Subsequences (MCSs) between two strings X and Y are subsequences of both X and Y that are maximal under inclusion. MCSs relax and generalize the well-known and widely used concept of Longest Common Subsequences (LCSs), which can be seen as MCSs of maximum length. While the number of both LCSs and MCSs can be exponential in the length of the strings, LCSs have long been exploited for string and text analysis, as simple compact representations of all LCSs between two strings, built via dynamic programming or automata, have been known since the '70s. MCSs appear to have a more challenging structure: even listing them efficiently was an open problem until recently, and although its solution narrowed the complexity difference between the two problems, the gap remained significant. In this paper we close the complexity gap: we show how to build a DAG of polynomial size, in polynomial time, which allows for efficient operations on the set of all MCSs, such as enumeration in Constant Amortized Time per solution (CAT), counting, and random access to the i-th element (i.e., rank and select operations). Besides improving known algorithmic results, this work paves the way for new sequence analysis methods based on MCSs.
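
As a quick illustration of the definition, the following brute-force Python sketch enumerates all common subsequences of two short, made-up strings and keeps those that are maximal under inclusion. It is exponential in the string length and only clarifies what an MCS is; it is not the polynomial-size DAG construction described in the paper.

    from itertools import combinations

    def is_subsequence(s, t):
        # True if s can be obtained from t by deleting characters.
        it = iter(t)
        return all(c in it for c in s)

    def maximal_common_subsequences(x, y):
        # All distinct common subsequences of x and y (exponential; tiny inputs only),
        # filtered down to those not contained in another common subsequence.
        subs_x = {"".join(c) for r in range(len(x) + 1) for c in combinations(x, r)}
        common = {s for s in subs_x if is_subsequence(s, y)}
        return {s for s in common
                if not any(s != t and is_subsequence(s, t) for t in common)}

    # For this made-up pair the output is ['AB', 'BCA', 'BCB']: the maximal common
    # subsequence 'AB' is shorter than the LCSs, which is why MCSs carry extra information.
    print(sorted(maximal_common_subsequences("ABCBA", "BCAB")))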

Cite as

Alessio Conte, Roberto Grossi, Giulia Punzi, and Takeaki Uno. A Compact DAG for Storing and Searching Maximal Common Subsequences. In 34th International Symposium on Algorithms and Computation (ISAAC 2023). Leibniz International Proceedings in Informatics (LIPIcs), Volume 283, pp. 21:1-21:15, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2023)


BibTeX

@InProceedings{conte_et_al:LIPIcs.ISAAC.2023.21,
  author =	{Conte, Alessio and Grossi, Roberto and Punzi, Giulia and Uno, Takeaki},
  title =	{{A Compact DAG for Storing and Searching Maximal Common Subsequences}},
  booktitle =	{34th International Symposium on Algorithms and Computation (ISAAC 2023)},
  pages =	{21:1--21:15},
  series =	{Leibniz International Proceedings in Informatics (LIPIcs)},
  ISBN =	{978-3-95977-289-1},
  ISSN =	{1868-8969},
  year =	{2023},
  volume =	{283},
  editor =	{Iwata, Satoru and Kakimura, Naonori},
  publisher =	{Schloss Dagstuhl -- Leibniz-Zentrum f{\"u}r Informatik},
  address =	{Dagstuhl, Germany},
  URL =		{https://drops-dev.dagstuhl.de/entities/document/10.4230/LIPIcs.ISAAC.2023.21},
  URN =		{urn:nbn:de:0030-drops-193231},
  doi =		{10.4230/LIPIcs.ISAAC.2023.21},
  annote =	{Keywords: Maximal common subsequence, DAG, Compact data structures, Enumeration, Constant amortized time, Random access}
}
Document
phyBWT: Alignment-Free Phylogeny via eBWT Positional Clustering

Authors: Veronica Guerrini, Alessio Conte, Roberto Grossi, Gianni Liti, Giovanna Rosone, and Lorenzo Tattini

Published in: LIPIcs, Volume 242, 22nd International Workshop on Algorithms in Bioinformatics (WABI 2022)


Abstract
Molecular phylogenetics is a fundamental branch of biology. It studies the evolutionary relationships among the individuals of a population through their biological sequences, and may provide insights about the origin and the evolution of viral diseases, or highlight complex evolutionary trajectories. In this paper we develop a method called phyBWT, describing how to use the extended Burrows-Wheeler Transform (eBWT) of a collection of DNA sequences to reconstruct phylogeny directly, bypassing both alignment against a reference genome and de novo assembly. Our phyBWT hinges on the combinatorial properties of the eBWT positional clustering framework. We employ the eBWT to detect relevant blocks of the longest shared substrings of varying length (unlike k-mer-based approaches, which need to fix the length k a priori), and build a suitable decomposition leading to a phylogenetic tree, step by step. As a result, phyBWT is a new alignment-, assembly-, and reference-free method that builds a partition tree without relying on the pairwise comparison of sequences, thus avoiding the use of a distance matrix to infer phylogeny. Preliminary experimental results on sequencing data show that our method can handle datasets of different types (short reads, contigs, or entire genomes), producing trees of quality comparable to that of the benchmark phylogeny.
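
For background, the sketch below computes the textbook Burrows-Wheeler Transform of a single made-up sequence via sorted rotations. The paper instead builds the extended BWT of a whole collection and exploits its positional clustering, which is not reproduced here.

    def bwt(text, sentinel="$"):
        # Textbook BWT: last column of the sorted rotations of text + sentinel.
        s = text + sentinel
        rotations = sorted(s[i:] + s[:i] for i in range(len(s)))
        return "".join(rotation[-1] for rotation in rotations)

    # Symbols coming from similar contexts end up adjacent in the transform,
    # which is the property that positional clustering builds on.
    print(bwt("GATTACA"))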

Cite as

Veronica Guerrini, Alessio Conte, Roberto Grossi, Gianni Liti, Giovanna Rosone, and Lorenzo Tattini. phyBWT: Alignment-Free Phylogeny via eBWT Positional Clustering. In 22nd International Workshop on Algorithms in Bioinformatics (WABI 2022). Leibniz International Proceedings in Informatics (LIPIcs), Volume 242, pp. 23:1-23:19, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2022)


BibTeX

@InProceedings{guerrini_et_al:LIPIcs.WABI.2022.23,
  author =	{Guerrini, Veronica and Conte, Alessio and Grossi, Roberto and Liti, Gianni and Rosone, Giovanna and Tattini, Lorenzo},
  title =	{{phyBWT: Alignment-Free Phylogeny via eBWT Positional Clustering}},
  booktitle =	{22nd International Workshop on Algorithms in Bioinformatics (WABI 2022)},
  pages =	{23:1--23:19},
  series =	{Leibniz International Proceedings in Informatics (LIPIcs)},
  ISBN =	{978-3-95977-243-3},
  ISSN =	{1868-8969},
  year =	{2022},
  volume =	{242},
  editor =	{Boucher, Christina and Rahmann, Sven},
  publisher =	{Schloss Dagstuhl -- Leibniz-Zentrum f{\"u}r Informatik},
  address =	{Dagstuhl, Germany},
  URL =		{https://drops-dev.dagstuhl.de/entities/document/10.4230/LIPIcs.WABI.2022.23},
  URN =		{urn:nbn:de:0030-drops-170577},
  doi =		{10.4230/LIPIcs.WABI.2022.23},
  annote =	{Keywords: Phylogeny, partition tree, BWT, positional cluster, alignment-free, reference-free, assembly-free}
}
Document
Invited Talk
Compact Text Indexing for Advanced Pattern Matching Problems: Parameterized, Order-Isomorphic, 2D, etc. (Invited Talk)

Authors: Sharma V. Thankachan

Published in: LIPIcs, Volume 223, 33rd Annual Symposium on Combinatorial Pattern Matching (CPM 2022)


Abstract
In the past two decades, we have witnessed the design of various compact data structures for pattern matching over an indexed text [Navarro, 2016]. Popular indexes like the FM-index [Paolo Ferragina and Giovanni Manzini, 2005], compressed suffix arrays/trees [Roberto Grossi and Jeffrey Scott Vitter, 2005; Kunihiko Sadakane, 2007], the recent r-index [Travis Gagie et al., 2020; Takaaki Nishimoto and Yasuo Tabei, 2021], etc., capture the key functionalities of classic suffix arrays/trees [Udi Manber and Eugene W. Myers, 1993; Peter Weiner, 1973] in compact space. Mostly, they rely on the Burrows-Wheeler Transform (BWT) and its associated operations [Burrows and Wheeler, 1994]. However, compactly encoding some advanced suffix tree (ST) variants, like the parameterized ST [Brenda S. Baker, 1993; S. Rao Kosaraju, 1995; Juan Mendivelso et al., 2020], the order-isomorphic/preserving ST [Maxime Crochemore et al., 2016], the two-dimensional ST [Raffaele Giancarlo, 1995; Dong Kyue Kim et al., 1998], etc. [Sung Gwan Park et al., 2019; Tetsuo Shibuya, 2000] - collectively known as suffix trees with missing suffix links [Richard Cole and Ramesh Hariharan, 2003] - has been challenging. The previous techniques are not easily extendable because these variants do not retain some of the structural properties of the standard ST that enable compression. However, some limited progress has been made in these directions recently [Arnab Ganguly et al., 2017; Travis Gagie et al., 2017; Gianni Decaroli et al., 2017; Dhrumil Patel and Rahul Shah, 2021; Arnab Ganguly et al., 2021; Sung-Hwan Kim and Hwan-Gue Cho, 2021; Sung-Hwan Kim and Hwan-Gue Cho, 2021; Arnab Ganguly et al., 2017; Arnab Ganguly et al., 2022; Arnab Ganguly et al., 2021]. This talk will briefly survey them and highlight some interesting open problems.

Cite as

Sharma V. Thankachan. Compact Text Indexing for Advanced Pattern Matching Problems: Parameterized, Order-Isomorphic, 2D, etc. (Invited Talk). In 33rd Annual Symposium on Combinatorial Pattern Matching (CPM 2022). Leibniz International Proceedings in Informatics (LIPIcs), Volume 223, pp. 3:1-3:3, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2022)


BibTeX

@InProceedings{thankachan:LIPIcs.CPM.2022.3,
  author =	{Thankachan, Sharma V.},
  title =	{{Compact Text Indexing for Advanced Pattern Matching Problems: Parameterized, Order-Isomorphic, 2D, etc.}},
  booktitle =	{33rd Annual Symposium on Combinatorial Pattern Matching (CPM 2022)},
  pages =	{3:1--3:3},
  series =	{Leibniz International Proceedings in Informatics (LIPIcs)},
  ISBN =	{978-3-95977-234-1},
  ISSN =	{1868-8969},
  year =	{2022},
  volume =	{223},
  editor =	{Bannai, Hideo and Holub, Jan},
  publisher =	{Schloss Dagstuhl -- Leibniz-Zentrum f{\"u}r Informatik},
  address =	{Dagstuhl, Germany},
  URL =		{https://drops-dev.dagstuhl.de/entities/document/10.4230/LIPIcs.CPM.2022.3},
  URN =		{urn:nbn:de:0030-drops-161300},
  doi =		{10.4230/LIPIcs.CPM.2022.3},
  annote =	{Keywords: Text Indexing, Suffix Trees, String Matching}
}
Document
On Strings Having the Same Length-k Substrings

Authors: Giulia Bernardini, Alessio Conte, Esteban Gabory, Roberto Grossi, Grigorios Loukides, Solon P. Pissis, Giulia Punzi, and Michelle Sweering

Published in: LIPIcs, Volume 223, 33rd Annual Symposium on Combinatorial Pattern Matching (CPM 2022)


Abstract
Let Substr_k(X) denote the set of length-k substrings of a given string X for a given integer k > 0. We study the following basic string problem, called z-Shortest 𝒮_k-Equivalent Strings: Given a set 𝒮_k of n length-k strings and an integer z > 0, list z shortest distinct strings T₁,…,T_z such that Substr_k(T_i) = 𝒮_k, for all i ∈ [1,z]. The z-Shortest 𝒮_k-Equivalent Strings problem arises naturally as an encoding problem in many real-world applications; e.g., in data privacy, in data compression, and in bioinformatics. The 1-Shortest 𝒮_k-Equivalent Strings, referred to as Shortest 𝒮_k-Equivalent String, asks for a shortest string X such that Substr_k(X) = 𝒮_k. Our main contributions are summarized below: - Given a directed graph G(V,E), the Directed Chinese Postman (DCP) problem asks for a shortest closed walk that visits every edge of G at least once. DCP can be solved in 𝒪̃(|E||V|) time using an algorithm for min-cost flow. We show, via a non-trivial reduction, that if Shortest 𝒮_k-Equivalent String over a binary alphabet has a near-linear-time solution then so does DCP. - We show that the length of a shortest string output by Shortest 𝒮_k-Equivalent String is in 𝒪(k+n²). We generalize this bound by showing that the total length of z shortest strings is in 𝒪(zk+zn²+z²n). We derive these upper bounds by showing (asymptotically tight) bounds on the total length of z shortest Eulerian walks in general directed graphs. - We present an algorithm for solving z-Shortest 𝒮_k-Equivalent Strings in 𝒪(nk+n²log²n+zn²log n+|output|) time. If z = 1, the time becomes 𝒪(nk+n²log²n) by the fact that the size of the input is Θ(nk) and the size of the output is 𝒪(k+n²).
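
To make the problem statement concrete, the following brute-force sketch computes Substr_k and searches for a shortest equivalent string over the alphabet of a made-up set, enumerating candidates exhaustively up to an arbitrary length cap. It bears no relation to the graph-based techniques and bounds of the paper.

    from itertools import product

    def substr_k(x, k):
        # Set of length-k substrings of x.
        return {x[i:i + k] for i in range(len(x) - k + 1)}

    def shortest_equivalent_string(s_k, k, max_len=10):
        # Brute force: shortest T with substr_k(T) == s_k (tiny instances only).
        alphabet = sorted({c for s in s_k for c in s})
        for length in range(k, max_len + 1):
            for chars in product(alphabet, repeat=length):
                t = "".join(chars)
                if substr_k(t, k) == s_k:
                    return t
        return None

    print(shortest_equivalent_string({"ab", "ba"}, k=2))   # "aba"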

Cite as

Giulia Bernardini, Alessio Conte, Esteban Gabory, Roberto Grossi, Grigorios Loukides, Solon P. Pissis, Giulia Punzi, and Michelle Sweering. On Strings Having the Same Length-k Substrings. In 33rd Annual Symposium on Combinatorial Pattern Matching (CPM 2022). Leibniz International Proceedings in Informatics (LIPIcs), Volume 223, pp. 16:1-16:17, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2022)


BibTeX

@InProceedings{bernardini_et_al:LIPIcs.CPM.2022.16,
  author =	{Bernardini, Giulia and Conte, Alessio and Gabory, Esteban and Grossi, Roberto and Loukides, Grigorios and Pissis, Solon P. and Punzi, Giulia and Sweering, Michelle},
  title =	{{On Strings Having the Same Length-k Substrings}},
  booktitle =	{33rd Annual Symposium on Combinatorial Pattern Matching (CPM 2022)},
  pages =	{16:1--16:17},
  series =	{Leibniz International Proceedings in Informatics (LIPIcs)},
  ISBN =	{978-3-95977-234-1},
  ISSN =	{1868-8969},
  year =	{2022},
  volume =	{223},
  editor =	{Bannai, Hideo and Holub, Jan},
  publisher =	{Schloss Dagstuhl -- Leibniz-Zentrum f{\"u}r Informatik},
  address =	{Dagstuhl, Germany},
  URL =		{https://drops-dev.dagstuhl.de/entities/document/10.4230/LIPIcs.CPM.2022.16},
  URN =		{urn:nbn:de:0030-drops-161439},
  doi =		{10.4230/LIPIcs.CPM.2022.16},
  annote =	{Keywords: string algorithms, combinatorics on words, de Bruijn graph, Chinese Postman}
}
Document
Inverse Suffix Array Queries for 2-Dimensional Pattern Matching in Near-Compact Space

Authors: Dhrumil Patel and Rahul Shah

Published in: LIPIcs, Volume 212, 32nd International Symposium on Algorithms and Computation (ISAAC 2021)


Abstract
In a 2-dimensional (2D) pattern matching problem, the text is arranged as a matrix 𝖬[1..n, 1..n] and consists of N = n × n symbols drawn from an alphabet set Σ of size σ. The query consists of an m × m square matrix 𝖯[1..m, 1..m] drawn from the same alphabet set Σ, and the task is to find all the locations in 𝖬 where 𝖯 appears as a (contiguous) submatrix. The patterns can be of any size, but as long as they are square in shape, data structures like suffix trees and suffix arrays exist [Raffaele Giancarlo, 1995; Dong Kyue Kim et al., 1998] for the task of efficient pattern matching. These are essentially 2D counterparts of the classic suffix trees and arrays known for traditional 1-dimensional (1D) pattern matching. They work based on a linearization of 2D suffixes which preserves the prefix match property (i.e., every pattern match is a prefix of some suffix). The main limitation of the suffix trees and the suffix arrays (in 1D) was their space utilization of O(N log N) bits, where N is the size of the text. This was suboptimal compared to N log σ bits of space, which is the information-theoretic optimum for the text. With the advent of the field of succinct/compressed data structures, it was possible to develop compressed variants of suffix trees and arrays based on the Burrows-Wheeler Transform and LF-mapping (or the Φ function) [Roberto Grossi and Jeffrey Scott Vitter, 2005; Paolo Ferragina and Giovanni Manzini, 2005; Kunihiko Sadakane, 2007]. These data structures indeed achieve O(N log σ) bits of space or better. This gives rise to the question: analogous to the 1D case, can we design a succinct or compressed index for 2D pattern matching? Can there be a 2D compressed suffix tree? Are there analogues of the Burrows-Wheeler Transform or LF-mapping? The problem has been acknowledged for over a decade now and there have been a few attempts at applying the Φ function [Ankur Gupta, 2004] and achieving entropy-based compression [Veli Mäkinen and Gonzalo Navarro, 2008]. However, a complexity breakthrough akin to the 1D case has yet to be found. We still do not know how to answer suffix array queries in O(N log σ) bits of space, which would lead to efficient pattern matching. However, for the first time, we show that it is indeed possible to compute inverse suffix array (ISA) queries in near-compact space in O(polylog n) time. Our 2D succinct text index design is based on two 1D compressed suffix trees and takes O(N log log N + N log σ) bits of space, which is much smaller than its naive design taking O(N log N) bits. Although the main problem remains elusive, this index gives hope for the existence of a full 2D succinct index with all functionalities similar to those of the 1D case.
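
For contrast with the compact indexes discussed above, here is the naive O(N m²)-time 2D matcher that scans every candidate position of a made-up text and pattern; none of the succinct machinery of the paper appears here.

    def match_2d(M, P):
        # All top-left positions where the m x m pattern P occurs as a
        # contiguous submatrix of the n x n text M (naive scan).
        n, m = len(M), len(P)
        hits = []
        for i in range(n - m + 1):
            for j in range(n - m + 1):
                if all(M[i + a][j:j + m] == P[a] for a in range(m)):
                    hits.append((i, j))
        return hits

    M = ["abab",
         "baba",
         "abab",
         "baba"]
    P = ["ab",
         "ba"]
    print(match_2d(M, P))   # [(0, 0), (0, 2), (1, 1), (2, 0), (2, 2)]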

Cite as

Dhrumil Patel and Rahul Shah. Inverse Suffix Array Queries for 2-Dimensional Pattern Matching in Near-Compact Space. In 32nd International Symposium on Algorithms and Computation (ISAAC 2021). Leibniz International Proceedings in Informatics (LIPIcs), Volume 212, pp. 60:1-60:14, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2021)


BibTeX

@InProceedings{patel_et_al:LIPIcs.ISAAC.2021.60,
  author =	{Patel, Dhrumil and Shah, Rahul},
  title =	{{Inverse Suffix Array Queries for 2-Dimensional Pattern Matching in Near-Compact Space}},
  booktitle =	{32nd International Symposium on Algorithms and Computation (ISAAC 2021)},
  pages =	{60:1--60:14},
  series =	{Leibniz International Proceedings in Informatics (LIPIcs)},
  ISBN =	{978-3-95977-214-3},
  ISSN =	{1868-8969},
  year =	{2021},
  volume =	{212},
  editor =	{Ahn, Hee-Kap and Sadakane, Kunihiko},
  publisher =	{Schloss Dagstuhl -- Leibniz-Zentrum f{\"u}r Informatik},
  address =	{Dagstuhl, Germany},
  URL =		{https://drops-dev.dagstuhl.de/entities/document/10.4230/LIPIcs.ISAAC.2021.60},
  URN =		{urn:nbn:de:0030-drops-154932},
  doi =		{10.4230/LIPIcs.ISAAC.2021.60},
  annote =	{Keywords: Pattern Matching, Succinct Data Structures}
}
Document
On Computing the Diameter of (Weighted) Link Streams

Authors: Marco Calamai, Pierluigi Crescenzi, and Andrea Marino

Published in: LIPIcs, Volume 190, 19th International Symposium on Experimental Algorithms (SEA 2021)


Abstract
A weighted link stream is a pair (V,𝔼) comprising V, the set of nodes, and 𝔼, the list of temporal edges (u,v,t,λ), where u,v are two nodes in V, t is the starting time of the temporal edge, and λ is its travel time. By making use of this model, different notions of diameter can be defined, which refer to the following distances: earliest arrival time, latest departure time, fastest time, and shortest time. After proving that none of these diameters can be computed in time sub-quadratic in the number of temporal edges, we propose different algorithms (inspired by the approach used for computing the diameter of graphs) which allow us to compute very efficiently, in practice, the diameter of quite large real-world weighted link streams for several definitions of the diameter. Indeed, the proposed algorithms very often require only a low number of single-source (or single-target) best-path computations. We verify the effectiveness of our approach by means of an extensive set of experiments on real-world link streams. We also show experimentally that the temporal version of the well-known 2-sweep technique for computing a lower bound on the diameter of a graph is quite effective in the case of weighted link streams, very often returning tight bounds.
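
As an example of one of the four distances involved, the sketch below computes single-source earliest-arrival times with a single scan of the temporal edges in nondecreasing starting time. This is a standard building block, not the diameter algorithms of the paper, and the link stream is made up.

    def earliest_arrival(edges, source, nodes):
        # edges: temporal edges (u, v, t, lam); an edge is usable only if we are
        # already at u by time t, in which case v is reached at time t + lam.
        INF = float("inf")
        ea = {x: INF for x in nodes}
        ea[source] = 0
        for u, v, t, lam in sorted(edges, key=lambda e: e[2]):  # scan by starting time
            if ea[u] <= t and t + lam < ea[v]:
                ea[v] = t + lam
        return ea

    edges = [("a", "b", 1, 2), ("a", "c", 2, 10), ("b", "c", 4, 1)]
    print(earliest_arrival(edges, "a", ["a", "b", "c"]))   # {'a': 0, 'b': 3, 'c': 5}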

Cite as

Marco Calamai, Pierluigi Crescenzi, and Andrea Marino. On Computing the Diameter of (Weighted) Link Streams. In 19th International Symposium on Experimental Algorithms (SEA 2021). Leibniz International Proceedings in Informatics (LIPIcs), Volume 190, pp. 11:1-11:21, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2021)


BibTeX

@InProceedings{calamai_et_al:LIPIcs.SEA.2021.11,
  author =	{Calamai, Marco and Crescenzi, Pierluigi and Marino, Andrea},
  title =	{{On Computing the Diameter of (Weighted) Link Streams}},
  booktitle =	{19th International Symposium on Experimental Algorithms (SEA 2021)},
  pages =	{11:1--11:21},
  series =	{Leibniz International Proceedings in Informatics (LIPIcs)},
  ISBN =	{978-3-95977-185-6},
  ISSN =	{1868-8969},
  year =	{2021},
  volume =	{190},
  editor =	{Coudert, David and Natale, Emanuele},
  publisher =	{Schloss Dagstuhl -- Leibniz-Zentrum f{\"u}r Informatik},
  address =	{Dagstuhl, Germany},
  URL =		{https://drops-dev.dagstuhl.de/entities/document/10.4230/LIPIcs.SEA.2021.11},
  URN =		{urn:nbn:de:0030-drops-137836},
  doi =		{10.4230/LIPIcs.SEA.2021.11},
  annote =	{Keywords: Temporal graph, shortest path, diameter}
}
Document
Finding Structurally and Temporally Similar Trajectories in Graphs

Authors: Roberto Grossi, Andrea Marino, and Shima Moghtasedi

Published in: LIPIcs, Volume 160, 18th International Symposium on Experimental Algorithms (SEA 2020)


Abstract
The analysis of similar motions in a network provides useful information for different applications like route recommendation. We are interested in algorithms to efficiently retrieve trajectories that are similar to a given query trajectory. For this task many studies have focused on extracting the geometrical information of trajectories. In this paper we investigate the properties of trajectories moving along the paths of a network. We provide a similarity function by making use of both the temporal aspect of trajectories and the structure of the underlying network. We propose an approximation technique that offers the top-k similar trajectories with respect to a query trajectory in an efficient way with acceptable precision. We investigate our method over real-world networks, and our experimental results show the effectiveness of the proposed method.

Cite as

Roberto Grossi, Andrea Marino, and Shima Moghtasedi. Finding Structurally and Temporally Similar Trajectories in Graphs. In 18th International Symposium on Experimental Algorithms (SEA 2020). Leibniz International Proceedings in Informatics (LIPIcs), Volume 160, pp. 24:1-24:13, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2020)


BibTeX

@InProceedings{grossi_et_al:LIPIcs.SEA.2020.24,
  author =	{Grossi, Roberto and Marino, Andrea and Moghtasedi, Shima},
  title =	{{Finding Structurally and Temporally Similar Trajectories in Graphs}},
  booktitle =	{18th International Symposium on Experimental Algorithms (SEA 2020)},
  pages =	{24:1--24:13},
  series =	{Leibniz International Proceedings in Informatics (LIPIcs)},
  ISBN =	{978-3-95977-148-1},
  ISSN =	{1868-8969},
  year =	{2020},
  volume =	{160},
  editor =	{Faro, Simone and Cantone, Domenico},
  publisher =	{Schloss Dagstuhl -- Leibniz-Zentrum f{\"u}r Informatik},
  address =	{Dagstuhl, Germany},
  URL =		{https://drops-dev.dagstuhl.de/entities/document/10.4230/LIPIcs.SEA.2020.24},
  URN =		{urn:nbn:de:0030-drops-120989},
  doi =		{10.4230/LIPIcs.SEA.2020.24},
  annote =	{Keywords: Graph trajectory, approximated similarity, top-k similarity query}
}
Document
Finding the Anticover of a String

Authors: Mai Alzamel, Alessio Conte, Shuhei Denzumi, Roberto Grossi, Costas S. Iliopoulos, Kazuhiro Kurita, and Kunihiro Wasa

Published in: LIPIcs, Volume 161, 31st Annual Symposium on Combinatorial Pattern Matching (CPM 2020)


Abstract
A k-anticover of a string x is a set of pairwise distinct factors of x of equal length k, such that every symbol of x is contained in an occurrence of at least one of those factors. The existence of a k-anticover can be seen as a notion of non-redundancy, which has applications in computational biology, where anticovers are associated with various non-regulatory mechanisms. In this paper we address the complexity of the problem of finding a k-anticover of a string x if it exists, showing that the decision problem is NP-complete on general strings for k ≥ 3. We also show that the problem admits a polynomial-time solution for k = 2. For unbounded k, we provide an exact exponential algorithm to find a k-anticover of a string of length n (or determine that none exists), which runs in O*(min{3^{(n-k)/3}, (k(k+1)/2)^{n/(k+1)}}) time using polynomial space.
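
The brute-force sketch below makes the definition concrete under the occurrence-based reading: it searches for a set of occurrence positions whose length-k factors are pairwise distinct and jointly cover a made-up string. It is exponential and unrelated to the exact algorithm of the paper.

    from itertools import combinations

    def find_k_anticover(x, k):
        # Brute force: a set of occurrence positions whose length-k factors are
        # pairwise distinct and together cover every position of x (tiny inputs only).
        n = len(x)
        positions = range(n - k + 1)
        for r in range(1, n - k + 2):
            for chosen in combinations(positions, r):
                factors = [x[i:i + k] for i in chosen]
                if len(set(factors)) != len(factors):        # factors must be distinct
                    continue
                covered = {j for i in chosen for j in range(i, i + k)}
                if covered == set(range(n)):                 # every symbol covered
                    return [(i, x[i:i + k]) for i in chosen]
        return None

    print(find_k_anticover("aabbab", 2))   # [(0, 'aa'), (2, 'bb'), (4, 'ab')]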

Cite as

Mai Alzamel, Alessio Conte, Shuhei Denzumi, Roberto Grossi, Costas S. Iliopoulos, Kazuhiro Kurita, and Kunihiro Wasa. Finding the Anticover of a String. In 31st Annual Symposium on Combinatorial Pattern Matching (CPM 2020). Leibniz International Proceedings in Informatics (LIPIcs), Volume 161, pp. 2:1-2:11, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2020)


BibTeX

@InProceedings{alzamel_et_al:LIPIcs.CPM.2020.2,
  author =	{Alzamel, Mai and Conte, Alessio and Denzumi, Shuhei and Grossi, Roberto and Iliopoulos, Costas S. and Kurita, Kazuhiro and Wasa, Kunihiro},
  title =	{{Finding the Anticover of a String}},
  booktitle =	{31st Annual Symposium on Combinatorial Pattern Matching (CPM 2020)},
  pages =	{2:1--2:11},
  series =	{Leibniz International Proceedings in Informatics (LIPIcs)},
  ISBN =	{978-3-95977-149-8},
  ISSN =	{1868-8969},
  year =	{2020},
  volume =	{161},
  editor =	{G{\o}rtz, Inge Li and Weimann, Oren},
  publisher =	{Schloss Dagstuhl -- Leibniz-Zentrum f{\"u}r Informatik},
  address =	{Dagstuhl, Germany},
  URL =		{https://drops-dev.dagstuhl.de/entities/document/10.4230/LIPIcs.CPM.2020.2},
  URN =		{urn:nbn:de:0030-drops-121270},
  doi =		{10.4230/LIPIcs.CPM.2020.2},
  annote =	{Keywords: Anticover, String algorithms, Stringology, NP-complete}
}
Document
Listing Induced Steiner Subgraphs as a Compact Way to Discover Steiner Trees in Graphs

Authors: Alessio Conte, Roberto Grossi, Mamadou Moustapha Kanté, Andrea Marino, Takeaki Uno, and Kunihiro Wasa

Published in: LIPIcs, Volume 138, 44th International Symposium on Mathematical Foundations of Computer Science (MFCS 2019)


Abstract
This paper investigates induced Steiner subgraphs as a variant of the classical Steiner trees, so as to compactly represent the (exponentially many) Steiner trees sharing the same underlying induced subgraph. We prove that the enumeration of all (inclusion-minimal) induced Steiner subgraphs is harder than the well-known Hypergraph Transversal enumeration problem if the number of terminals is not fixed. When the number of terminals is fixed, we propose a polynomial delay algorithm for listing all induced Steiner subgraphs of minimum size. We also propose a polynomial delay algorithm for listing the set of minimal induced Steiner subgraphs when the number of terminals is 3.
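
To make the enumerated objects concrete, here is a brute-force sketch for a tiny, made-up graph: it lists vertex sets that contain the terminals, induce a connected subgraph, and are minimal under inclusion. It is exponential and does not reflect the polynomial-delay machinery of the paper.

    from itertools import combinations

    def is_connected(adj, nodes):
        # DFS connectivity check on the subgraph induced by `nodes`.
        nodes = set(nodes)
        start = next(iter(nodes))
        seen, stack = {start}, [start]
        while stack:
            u = stack.pop()
            for v in adj[u]:
                if v in nodes and v not in seen:
                    seen.add(v)
                    stack.append(v)
        return seen == nodes

    def minimal_induced_steiner_subgraphs(adj, terminals):
        # Brute force: vertex sets W containing the terminals with G[W] connected
        # and no proper subset having the same property (tiny graphs only).
        terminals = set(terminals)
        others = sorted(set(adj) - terminals)
        candidates = [terminals | set(extra)
                      for r in range(len(others) + 1)
                      for extra in combinations(others, r)
                      if is_connected(adj, terminals | set(extra))]
        return [w for w in candidates if not any(w2 < w for w2 in candidates)]

    # Hypothetical 5-cycle 0-1-2-3-4-0 with terminals {0, 2}: the two minimal
    # solutions correspond to the two paths around the cycle.
    adj = {0: [1, 4], 1: [0, 2], 2: [1, 3], 3: [2, 4], 4: [3, 0]}
    print(minimal_induced_steiner_subgraphs(adj, {0, 2}))   # [{0, 1, 2}, {0, 2, 3, 4}]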

Cite as

Alessio Conte, Roberto Grossi, Mamadou Moustapha Kanté, Andrea Marino, Takeaki Uno, and Kunihiro Wasa. Listing Induced Steiner Subgraphs as a Compact Way to Discover Steiner Trees in Graphs. In 44th International Symposium on Mathematical Foundations of Computer Science (MFCS 2019). Leibniz International Proceedings in Informatics (LIPIcs), Volume 138, pp. 73:1-73:14, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2019)


BibTeX

@InProceedings{conte_et_al:LIPIcs.MFCS.2019.73,
  author =	{Conte, Alessio and Grossi, Roberto and Kant\'{e}, Mamadou Moustapha and Marino, Andrea and Uno, Takeaki and Wasa, Kunihiro},
  title =	{{Listing Induced Steiner Subgraphs as a Compact Way to Discover Steiner Trees in Graphs}},
  booktitle =	{44th International Symposium on Mathematical Foundations of Computer Science (MFCS 2019)},
  pages =	{73:1--73:14},
  series =	{Leibniz International Proceedings in Informatics (LIPIcs)},
  ISBN =	{978-3-95977-117-7},
  ISSN =	{1868-8969},
  year =	{2019},
  volume =	{138},
  editor =	{Rossmanith, Peter and Heggernes, Pinar and Katoen, Joost-Pieter},
  publisher =	{Schloss Dagstuhl -- Leibniz-Zentrum f{\"u}r Informatik},
  address =	{Dagstuhl, Germany},
  URL =		{https://drops-dev.dagstuhl.de/entities/document/10.4230/LIPIcs.MFCS.2019.73},
  URN =		{urn:nbn:de:0030-drops-110174},
  doi =		{10.4230/LIPIcs.MFCS.2019.73},
  annote =	{Keywords: Graph algorithms, enumeration, listing and counting, Steiner trees, induced subgraphs}
}
Document
Track A: Algorithms, Complexity and Games
On the Complexity of String Matching for Graphs

Authors: Massimo Equi, Roberto Grossi, Veli Mäkinen, and Alexandru I. Tomescu

Published in: LIPIcs, Volume 132, 46th International Colloquium on Automata, Languages, and Programming (ICALP 2019)


Abstract
Exact string matching in labeled graphs is the problem of searching paths of a graph G=(V,E) such that the concatenation of their node labels is equal to the given pattern string P[1..m]. This basic problem can be found at the heart of more complex operations on variation graphs in computational biology, of query operations in graph databases, and of analysis operations in heterogeneous networks. We prove a conditional lower bound stating that, for any constant epsilon>0, an O(|E|^{1 - epsilon} m)-time, or an O(|E| m^{1 - epsilon})-time algorithm for exact string matching in graphs, with node labels and patterns drawn from a binary alphabet, cannot be achieved unless the Strong Exponential Time Hypothesis (SETH) is false. This holds even if restricted to undirected graphs with maximum node degree two, i.e. to zig-zag matching in bidirectional strings, or to deterministic directed acyclic graphs whose nodes have maximum sum of indegree and outdegree three. These restricted cases make the lower bound stricter than what can be directly derived from related bounds on regular expression matching (Backurs and Indyk, FOCS'16). In fact, our bounds are tight in the sense that lowering the degree or the alphabet size yields linear-time solvable problems. An interesting corollary is that exact and approximate matching are equally hard (quadratic time) in graphs under SETH. In comparison, the same problems restricted to strings have linear-time vs quadratic-time solutions, respectively (approximate pattern matching having also a matching SETH lower bound (Backurs and Indyk, STOC'15)).
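
As a baseline for the problem being lower-bounded, here is a naive matcher that tries to spell the pattern along walks of a small, made-up node-labeled graph (one character per node). It is only meant to make the problem concrete; the paper shows that, under SETH, no algorithm can be significantly faster than quadratic in general.

    def graph_string_match(adj, label, pattern):
        # All walks (as node lists) whose concatenated node labels equal `pattern`.
        matches = []

        def extend(node, i, path):
            if label[node] != pattern[i]:
                return
            if i == len(pattern) - 1:
                matches.append(path + [node])
                return
            for nxt in adj[node]:
                extend(nxt, i + 1, path + [node])

        for v in adj:
            extend(v, 0, [])
        return matches

    # Hypothetical labeled DAG: 1->2, 1->3, 2->4, 3->4.
    adj = {1: [2, 3], 2: [4], 3: [4], 4: []}
    label = {1: "A", 2: "B", 3: "C", 4: "A"}
    print(graph_string_match(adj, label, "ABA"))   # [[1, 2, 4]]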

Cite as

Massimo Equi, Roberto Grossi, Veli Mäkinen, and Alexandru I. Tomescu. On the Complexity of String Matching for Graphs. In 46th International Colloquium on Automata, Languages, and Programming (ICALP 2019). Leibniz International Proceedings in Informatics (LIPIcs), Volume 132, pp. 55:1-55:15, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2019)


BibTeX

@InProceedings{equi_et_al:LIPIcs.ICALP.2019.55,
  author =	{Equi, Massimo and Grossi, Roberto and M\"{a}kinen, Veli and Tomescu, Alexandru I.},
  title =	{{On the Complexity of String Matching for Graphs}},
  booktitle =	{46th International Colloquium on Automata, Languages, and Programming (ICALP 2019)},
  pages =	{55:1--55:15},
  series =	{Leibniz International Proceedings in Informatics (LIPIcs)},
  ISBN =	{978-3-95977-109-2},
  ISSN =	{1868-8969},
  year =	{2019},
  volume =	{132},
  editor =	{Baier, Christel and Chatzigiannakis, Ioannis and Flocchini, Paola and Leonardi, Stefano},
  publisher =	{Schloss Dagstuhl -- Leibniz-Zentrum f{\"u}r Informatik},
  address =	{Dagstuhl, Germany},
  URL =		{https://drops-dev.dagstuhl.de/entities/document/10.4230/LIPIcs.ICALP.2019.55},
  URN =		{urn:nbn:de:0030-drops-106314},
  doi =		{10.4230/LIPIcs.ICALP.2019.55},
  annote =	{Keywords: exact pattern matching, graph query, graph search, labeled graphs, string matching, string search, strong exponential time hypothesis, heterogeneous networks, variation graphs}
}
Document
Listing Subgraphs by Cartesian Decomposition

Authors: Alessio Conte, Roberto Grossi, Andrea Marino, Romeo Rizzi, and Luca Versari

Published in: LIPIcs, Volume 117, 43rd International Symposium on Mathematical Foundations of Computer Science (MFCS 2018)


Abstract
We investigate a decomposition technique for listing problems in graphs and set systems. It is based on the Cartesian product of some iterators, which list the solutions of simpler problems. Our ideas apply to several problems, and we illustrate one of them in depth, namely, listing all minimum spanning trees of a weighted graph G. Here iterators over the spanning trees for unweighted graphs can be obtained by a suitable modification of the listing algorithm by [Shioura et al., SICOMP 1997], and the decomposition of G is obtained by suitably partitioning its edges according to their weights. By combining these iterators in a Cartesian product scheme that employs Gray coding, we give the first algorithm which lists all minimum spanning trees of G in constant delay, where the delay is the time elapsed between any two consecutive outputs. Our solution requires polynomial preprocessing time and uses polynomial space.
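
The sketch below shows only the plain Cartesian-product combination scheme on made-up parts: given independent iterators of partial solutions, every way of picking one item per part is a global solution. The paper refines exactly this scheme with Gray coding, and with iterators derived from the Shioura et al. algorithm, to reach constant delay; that machinery is not reproduced here.

    from itertools import product

    def cartesian_listing(parts):
        # parts: one list of interchangeable partial solutions per component;
        # each global solution picks exactly one partial solution from every part.
        for combo in product(*parts):
            yield [edge for part in combo for edge in part]

    # Made-up decomposition: two interchangeable choices in the first weight class,
    # three in the second; all six pairings are listed.
    part1 = [[("a", "b")], [("a", "c")]]
    part2 = [[("c", "d")], [("b", "d")], [("a", "d")]]
    for solution in cartesian_listing([part1, part2]):
        print(solution)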

Cite as

Alessio Conte, Roberto Grossi, Andrea Marino, Romeo Rizzi, and Luca Versari. Listing Subgraphs by Cartesian Decomposition. In 43rd International Symposium on Mathematical Foundations of Computer Science (MFCS 2018). Leibniz International Proceedings in Informatics (LIPIcs), Volume 117, pp. 84:1-84:16, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2018)


BibTeX

@InProceedings{conte_et_al:LIPIcs.MFCS.2018.84,
  author =	{Conte, Alessio and Grossi, Roberto and Marino, Andrea and Rizzi, Romeo and Versari, Luca},
  title =	{{Listing Subgraphs by Cartesian Decomposition}},
  booktitle =	{43rd International Symposium on Mathematical Foundations of Computer Science (MFCS 2018)},
  pages =	{84:1--84:16},
  series =	{Leibniz International Proceedings in Informatics (LIPIcs)},
  ISBN =	{978-3-95977-086-6},
  ISSN =	{1868-8969},
  year =	{2018},
  volume =	{117},
  editor =	{Potapov, Igor and Spirakis, Paul and Worrell, James},
  publisher =	{Schloss Dagstuhl -- Leibniz-Zentrum f{\"u}r Informatik},
  address =	{Dagstuhl, Germany},
  URL =		{https://drops-dev.dagstuhl.de/entities/document/10.4230/LIPIcs.MFCS.2018.84},
  URN =		{urn:nbn:de:0030-drops-96666},
  doi =		{10.4230/LIPIcs.MFCS.2018.84},
  annote =	{Keywords: Graph algorithms, listing, minimum spanning trees, constant delay}
}
Document
Round-Hashing for Data Storage: Distributed Servers and External-Memory Tables

Authors: Roberto Grossi and Luca Versari

Published in: LIPIcs, Volume 112, 26th Annual European Symposium on Algorithms (ESA 2018)


Abstract
This paper proposes round-hashing, which is suitable for data storage on distributed servers and for implementing external-memory tables in which each lookup retrieves at most one single block of external memory, using a stash. For data storage, round-hashing is like consistent hashing, as it avoids a full rehashing of the keys when new servers are added. Experiments show that the speed at which requests are served is tenfold or more compared to the state of the art. In distributed data storage, this guarantees better throughput for serving requests and, moreover, greatly reduces the time needed to decide which data should move to new servers, as rescanning the data is much faster.
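
For context on the data-storage side, here is a minimal sketch of classic consistent hashing with a sorted ring and virtual server points (all names are made up). Round-hashing is a different scheme that the paper shows to be substantially faster; it is not reproduced here.

    import bisect
    import hashlib

    def h(key):
        # Stable hash of a string key onto a 32-bit ring.
        return int(hashlib.sha1(key.encode()).hexdigest(), 16) % (2 ** 32)

    class ConsistentHashRing:
        # Each key goes to the first server point clockwise from the key's hash,
        # so adding a server moves only a fraction of the keys.

        def __init__(self, servers, points_per_server=4):
            self.ring = sorted(
                (h(f"{s}#{i}"), s)
                for s in servers
                for i in range(points_per_server)
            )

        def lookup(self, key):
            pos = bisect.bisect(self.ring, (h(key),)) % len(self.ring)
            return self.ring[pos][1]

    ring = ConsistentHashRing(["srv0", "srv1", "srv2"])
    print(ring.lookup("user:42"))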

Cite as

Roberto Grossi and Luca Versari. Round-Hashing for Data Storage: Distributed Servers and External-Memory Tables. In 26th Annual European Symposium on Algorithms (ESA 2018). Leibniz International Proceedings in Informatics (LIPIcs), Volume 112, pp. 43:1-43:14, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2018)


BibTeX

@InProceedings{grossi_et_al:LIPIcs.ESA.2018.43,
  author =	{Grossi, Roberto and Versari, Luca},
  title =	{{Round-Hashing for Data Storage: Distributed Servers and External-Memory Tables}},
  booktitle =	{26th Annual European Symposium on Algorithms (ESA 2018)},
  pages =	{43:1--43:14},
  series =	{Leibniz International Proceedings in Informatics (LIPIcs)},
  ISBN =	{978-3-95977-081-1},
  ISSN =	{1868-8969},
  year =	{2018},
  volume =	{112},
  editor =	{Azar, Yossi and Bast, Hannah and Herman, Grzegorz},
  publisher =	{Schloss Dagstuhl -- Leibniz-Zentrum f{\"u}r Informatik},
  address =	{Dagstuhl, Germany},
  URL =		{https://drops-dev.dagstuhl.de/entities/document/10.4230/LIPIcs.ESA.2018.43},
  URN =		{urn:nbn:de:0030-drops-95062},
  doi =		{10.4230/LIPIcs.ESA.2018.43},
  annote =	{Keywords: consistent hashing, external memory, hash tables}
}
Document
Degenerate String Comparison and Applications

Authors: Mai Alzamel, Lorraine A. K. Ayad, Giulia Bernardini, Roberto Grossi, Costas S. Iliopoulos, Nadia Pisanti, Solon P. Pissis, and Giovanna Rosone

Published in: LIPIcs, Volume 113, 18th International Workshop on Algorithms in Bioinformatics (WABI 2018)


Abstract
A generalised degenerate string (GD string) S^ is a sequence of n sets of strings of total size N, where the ith set contains strings of the same length k_i but this length can vary between different sets. We denote the sum of these lengths k_0, k_1,...,k_{n-1} by W. This type of uncertain sequence can represent, for example, a gapless multiple sequence alignment of width W in a compact form. Our first result in this paper is an O(N+M)-time algorithm for deciding whether the intersection of two GD strings of total sizes N and M, respectively, over an integer alphabet, is non-empty. This result is based on a combinatorial result of independent interest: although the intersection of two GD strings can be exponential in the total size of the two strings, it can be represented in only linear space. A similar result can be obtained by employing an automata-based approach but its cost is alphabet-dependent. We then apply our string comparison algorithm to compute palindromes in GD strings. We present an O(min{W,n^2}N)-time algorithm for computing all palindromes in S^. Furthermore, we show a similar conditional lower bound for computing maximal palindromes in S^. Finally, proof-of-concept experimental results are presented using real protein datasets.
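
To make the objects concrete, the sketch below expands two tiny, made-up GD strings (each given as a list of sets of equal-length strings) into the plain strings they represent and intersects the results. This expansion can be exponential, which is precisely what the paper's linear-space comparison algorithm avoids.

    from itertools import product

    def expand(gd):
        # All plain strings represented by a GD string, given as a list of sets
        # of equal-length strings per position (exponential; tiny inputs only).
        return {"".join(choice) for choice in product(*gd)}

    def gd_intersection_nonempty(gd1, gd2):
        # Naive check that two GD strings have a common plain string.
        return bool(expand(gd1) & expand(gd2))

    S1 = [{"AC", "GT"}, {"A"}, {"CCA", "CTA"}]
    S2 = [{"ACA", "GTA"}, {"CTA", "TTT"}]
    print(gd_intersection_nonempty(S1, S2))   # True: e.g. both represent "GTACTA"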

Cite as

Mai Alzamel, Lorraine A. K. Ayad, Giulia Bernardini, Roberto Grossi, Costas S. Iliopoulos, Nadia Pisanti, Solon P. Pissis, and Giovanna Rosone. Degenerate String Comparison and Applications. In 18th International Workshop on Algorithms in Bioinformatics (WABI 2018). Leibniz International Proceedings in Informatics (LIPIcs), Volume 113, pp. 21:1-21:14, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2018)


BibTeX

@InProceedings{alzamel_et_al:LIPIcs.WABI.2018.21,
  author =	{Alzamel, Mai and Ayad, Lorraine A. K. and Bernardini, Giulia and Grossi, Roberto and Iliopoulos, Costas S. and Pisanti, Nadia and Pissis, Solon P. and Rosone, Giovanna},
  title =	{{Degenerate String Comparison and Applications}},
  booktitle =	{18th International Workshop on Algorithms in Bioinformatics (WABI 2018)},
  pages =	{21:1--21:14},
  series =	{Leibniz International Proceedings in Informatics (LIPIcs)},
  ISBN =	{978-3-95977-082-8},
  ISSN =	{1868-8969},
  year =	{2018},
  volume =	{113},
  editor =	{Parida, Laxmi and Ukkonen, Esko},
  publisher =	{Schloss Dagstuhl -- Leibniz-Zentrum f{\"u}r Informatik},
  address =	{Dagstuhl, Germany},
  URL =		{https://drops-dev.dagstuhl.de/entities/document/10.4230/LIPIcs.WABI.2018.21},
  URN =		{urn:nbn:de:0030-drops-93236},
  doi =		{10.4230/LIPIcs.WABI.2018.21},
  annote =	{Keywords: degenerate strings, generalised degenerate strings, elastic-degenerate strings, string comparison, palindromes}
}
Document
On-Line Pattern Matching on Similar Texts

Authors: Roberto Grossi, Costas S. Iliopoulos, Chang Liu, Nadia Pisanti, Solon P. Pissis, Ahmad Retha, Giovanna Rosone, Fatima Vayani, and Luca Versari

Published in: LIPIcs, Volume 78, 28th Annual Symposium on Combinatorial Pattern Matching (CPM 2017)


Abstract
Pattern matching on a set of similar texts has received much attention, especially recently, mainly due to its application in cataloguing human genetic variation. In particular, many different algorithms have been proposed for the off-line version of this problem; that is, constructing a compressed index for a set of similar texts in order to answer pattern matching queries efficiently. However, the on-line, more fundamental, version of this problem is a rather undeveloped topic. Solutions to the on-line version can be beneficial for a number of reasons; for instance, efficient on-line solutions can be used in combination with partial indexes as practical trade-offs. We make here an attempt to close this gap via proposing two efficient algorithms for this problem. Notably, one of the algorithms requires time linear in the size of the texts' representation, for short patterns. Furthermore, experimental results confirm our theoretical findings in practical terms.

Cite as

Roberto Grossi, Costas S. Iliopoulos, Chang Liu, Nadia Pisanti, Solon P. Pissis, Ahmad Retha, Giovanna Rosone, Fatima Vayani, and Luca Versari. On-Line Pattern Matching on Similar Texts. In 28th Annual Symposium on Combinatorial Pattern Matching (CPM 2017). Leibniz International Proceedings in Informatics (LIPIcs), Volume 78, pp. 9:1-9:14, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2017)


BibTeX

@InProceedings{grossi_et_al:LIPIcs.CPM.2017.9,
  author =	{Grossi, Roberto and Iliopoulos, Costas S. and Liu, Chang and Pisanti, Nadia and Pissis, Solon P. and Retha, Ahmad and Rosone, Giovanna and Vayani, Fatima and Versari, Luca},
  title =	{{On-Line Pattern Matching on Similar Texts}},
  booktitle =	{28th Annual Symposium on Combinatorial Pattern Matching (CPM 2017)},
  pages =	{9:1--9:14},
  series =	{Leibniz International Proceedings in Informatics (LIPIcs)},
  ISBN =	{978-3-95977-039-2},
  ISSN =	{1868-8969},
  year =	{2017},
  volume =	{78},
  editor =	{K\"{a}rkk\"{a}inen, Juha and Radoszewski, Jakub and Rytter, Wojciech},
  publisher =	{Schloss Dagstuhl -- Leibniz-Zentrum f{\"u}r Informatik},
  address =	{Dagstuhl, Germany},
  URL =		{https://drops-dev.dagstuhl.de/entities/document/10.4230/LIPIcs.CPM.2017.9},
  URN =		{urn:nbn:de:0030-drops-73379},
  doi =		{10.4230/LIPIcs.CPM.2017.9},
  annote =	{Keywords: string algorithms, pattern matching, degenerate strings, elastic-degenerate strings, on-line algorithms}
}
  • Refine by Author
  • 18 Grossi, Roberto
  • 7 Conte, Alessio
  • 5 Marino, Andrea
  • 4 Iliopoulos, Costas S.
  • 4 Pissis, Solon P.
  • Show More...

  • Refine by Classification
  • 5 Theory of computation → Pattern matching
  • 2 Mathematics of computing → Combinatorial algorithms
  • 1 Applied computing → Bioinformatics
  • 1 Information systems → Data dictionaries
  • 1 Information systems → Nearest-neighbor search
  • Show More...

  • Refine by Keyword
  • 3 pattern matching
  • 2 Graph algorithms
  • 2 Pattern Matching
  • 2 Text Indexing
  • 2 degenerate strings
  • Show More...

  • Refine by Type
  • 50 document
  • 1 volume

  • Refine by Publication Year
  • 33 2016
  • 3 2018
  • 3 2022
  • 2 2009
  • 2 2019
  • Show More...
