Search Results

Documents authored by Punzi, Giulia


Document
Research
Designing Output Sensitive Algorithms for Subgraph Enumeration

Authors: Alessio Conte, Kazuhiro Kurita, Andrea Marino, Giulia Punzi, Takeaki Uno, and Kunihiro Wasa

Published in: OASIcs, Volume 132, From Strings to Graphs, and Back Again: A Festschrift for Roberto Grossi's 60th Birthday (2025)


Abstract
The enumeration of all subgraphs respecting some structural property is a fundamental task in theoretical computer science, with practical applications in many branches of data mining and network analysis. It is often of interest to only consider solutions (subgraphs) that are maximal under inclusion, and to achieve output-sensitive complexity, i.e., bounding the running time with respect to the number of subgraphs produced. In this paper, we provide a survey of techniques for designing output-sensitive algorithms for subgraph enumeration, including partition-based approaches such as flashlight search, solution-graph traversal methods such as reverse search, and cost amortization strategies such as push-out amortization. We also briefly discuss classes of efficiency, hardness of enumeration, and variants such as approximate enumeration. The paper is meant as an accessible handbook for learning the basics of the field and as a practical reference for selecting state-of-the-art subgraph enumeration strategies fitting to one’s needs.

Cite as

Alessio Conte, Kazuhiro Kurita, Andrea Marino, Giulia Punzi, Takeaki Uno, and Kunihiro Wasa. Designing Output Sensitive Algorithms for Subgraph Enumeration. In From Strings to Graphs, and Back Again: A Festschrift for Roberto Grossi's 60th Birthday. Open Access Series in Informatics (OASIcs), Volume 132, pp. 19:1-19:40, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2025)


Copy BibTex To Clipboard

@InProceedings{conte_et_al:OASIcs.Grossi.19,
  author =	{Conte, Alessio and Kurita, Kazuhiro and Marino, Andrea and Punzi, Giulia and Uno, Takeaki and Wasa, Kunihiro},
  title =	{{Designing Output Sensitive Algorithms for Subgraph Enumeration}},
  booktitle =	{From Strings to Graphs, and Back Again: A Festschrift for Roberto Grossi's 60th Birthday},
  pages =	{19:1--19:40},
  series =	{Open Access Series in Informatics (OASIcs)},
  ISBN =	{978-3-95977-391-1},
  ISSN =	{2190-6807},
  year =	{2025},
  volume =	{132},
  editor =	{Conte, Alessio and Marino, Andrea and Rosone, Giovanna and Vitter, Jeffrey Scott},
  publisher =	{Schloss Dagstuhl -- Leibniz-Zentrum f{\"u}r Informatik},
  address =	{Dagstuhl, Germany},
  URL =		{https://drops.dagstuhl.de/entities/document/10.4230/OASIcs.Grossi.19},
  URN =		{urn:nbn:de:0030-drops-238180},
  doi =		{10.4230/OASIcs.Grossi.19},
  annote =	{Keywords: Graph algorithms, Graph enumeration, Output sensitive enumeration}
}
Document
Research
Subsequence-Based Indices for Genome Sequence Analysis

Authors: Giovanni Buzzega, Alessio Conte, Veronica Guerrini, Giulia Punzi, Giovanna Rosone, and Lorenzo Tattini

Published in: OASIcs, Volume 132, From Strings to Graphs, and Back Again: A Festschrift for Roberto Grossi's 60th Birthday (2025)


Abstract
Compact indices are a fundamental tool in string analysis, even more so in bioinformatics, where genomic sequences can reach billions in length. This paper presents some recent results in which Roberto Grossi has been involved, showing how some of these indices do more than just efficiently represent data, but rather are able to bring out salient information within it, which can be exploited for their downstream analysis. Specifically, we first review a recently-introduced method [Guerrini et al., 2023] that employs the Burrows-Wheeler Transform to build reasonably accurate phylogenetic trees in an assembly-free scenario. We then describe a recent practical tool [Buzzega et al., 2025] for indexing Maximal Common Subsequences between strings, which can enable analysis of genomic sequence similarity. Experimentally, we show that the results produced by the one index are consistent with the expectations about the results of the other index.

Cite as

Giovanni Buzzega, Alessio Conte, Veronica Guerrini, Giulia Punzi, Giovanna Rosone, and Lorenzo Tattini. Subsequence-Based Indices for Genome Sequence Analysis. In From Strings to Graphs, and Back Again: A Festschrift for Roberto Grossi's 60th Birthday. Open Access Series in Informatics (OASIcs), Volume 132, pp. 20:1-20:21, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2025)


Copy BibTex To Clipboard

@InProceedings{buzzega_et_al:OASIcs.Grossi.20,
  author =	{Buzzega, Giovanni and Conte, Alessio and Guerrini, Veronica and Punzi, Giulia and Rosone, Giovanna and Tattini, Lorenzo},
  title =	{{Subsequence-Based Indices for Genome Sequence Analysis}},
  booktitle =	{From Strings to Graphs, and Back Again: A Festschrift for Roberto Grossi's 60th Birthday},
  pages =	{20:1--20:21},
  series =	{Open Access Series in Informatics (OASIcs)},
  ISBN =	{978-3-95977-391-1},
  ISSN =	{2190-6807},
  year =	{2025},
  volume =	{132},
  editor =	{Conte, Alessio and Marino, Andrea and Rosone, Giovanna and Vitter, Jeffrey Scott},
  publisher =	{Schloss Dagstuhl -- Leibniz-Zentrum f{\"u}r Informatik},
  address =	{Dagstuhl, Germany},
  URL =		{https://drops.dagstuhl.de/entities/document/10.4230/OASIcs.Grossi.20},
  URN =		{urn:nbn:de:0030-drops-238199},
  doi =		{10.4230/OASIcs.Grossi.20},
  annote =	{Keywords: String Indices, Burrows-Wheeler Transform, Maximal Common Subsequences, Sequence Analysis, Phylogeny}
}
Document
McDag: Indexing Maximal Common Subsequences in Practice

Authors: Giovanni Buzzega, Alessio Conte, Roberto Grossi, and Giulia Punzi

Published in: LIPIcs, Volume 312, 24th International Workshop on Algorithms in Bioinformatics (WABI 2024)


Abstract
Analyzing and comparing sequences of symbols is among the most fundamental problems in computer science, possibly even more so in bioinformatics. Maximal Common Subsequences (MCSs), i.e., inclusion-maximal sequences of non-contiguous symbols common to two or more strings, have only recently received attention in this area, despite being a basic notion and a natural generalization of more common tools like Longest Common Substrings/Subsequences. In this paper we simplify and engineer recent advancements on MCSs into a practical tool called McDag, the first publicly available tool that can index MCSs of real genomic data. We demonstrate that our tool can index sequences exceeding 10,000 base pairs within minutes, utilizing only 4-7% more than the minimum required nodes, while also extracting relevant insights.

Cite as

Giovanni Buzzega, Alessio Conte, Roberto Grossi, and Giulia Punzi. McDag: Indexing Maximal Common Subsequences in Practice. In 24th International Workshop on Algorithms in Bioinformatics (WABI 2024). Leibniz International Proceedings in Informatics (LIPIcs), Volume 312, pp. 21:1-21:18, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2024)


Copy BibTex To Clipboard

@InProceedings{buzzega_et_al:LIPIcs.WABI.2024.21,
  author =	{Buzzega, Giovanni and Conte, Alessio and Grossi, Roberto and Punzi, Giulia},
  title =	{{McDag: Indexing Maximal Common Subsequences in Practice}},
  booktitle =	{24th International Workshop on Algorithms in Bioinformatics (WABI 2024)},
  pages =	{21:1--21:18},
  series =	{Leibniz International Proceedings in Informatics (LIPIcs)},
  ISBN =	{978-3-95977-340-9},
  ISSN =	{1868-8969},
  year =	{2024},
  volume =	{312},
  editor =	{Pissis, Solon P. and Sung, Wing-Kin},
  publisher =	{Schloss Dagstuhl -- Leibniz-Zentrum f{\"u}r Informatik},
  address =	{Dagstuhl, Germany},
  URL =		{https://drops.dagstuhl.de/entities/document/10.4230/LIPIcs.WABI.2024.21},
  URN =		{urn:nbn:de:0030-drops-206650},
  doi =		{10.4230/LIPIcs.WABI.2024.21},
  annote =	{Keywords: Index data structure, DAG, Common subsequence, Inclusion-wise maximality, LCS}
}
Document
Finding Diverse Strings and Longest Common Subsequences in a Graph

Authors: Yuto Shida, Giulia Punzi, Yasuaki Kobayashi, Takeaki Uno, and Hiroki Arimura

Published in: LIPIcs, Volume 296, 35th Annual Symposium on Combinatorial Pattern Matching (CPM 2024)


Abstract
In this paper, we study for the first time the Diverse Longest Common Subsequences (LCSs) problem under Hamming distance. Given a set of a constant number of input strings, the problem asks to decide if there exists some subset X of K longest common subsequences whose diversity is no less than a specified threshold Δ, where we consider two types of diversities of a set X of strings of equal length: the Sum diversity and the Min diversity defined as the sum and the minimum of the pairwise Hamming distance between any two strings in X, respectively. We analyze the computational complexity of the respective problems with Sum- and Min-diversity measures, called the Max-Sum and Max-Min Diverse LCSs, respectively, considering both approximation algorithms and parameterized complexity. Our results are summarized as follows. When K is bounded, both problems are polynomial time solvable. In contrast, when K is unbounded, both problems become NP-hard, while Max-Sum Diverse LCSs problem admits a PTAS. Furthermore, we analyze the parameterized complexity of both problems with combinations of parameters K and r, where r is the length of the candidate strings to be selected. Importantly, all positive results above are proven in a more general setting, where an input is an edge-labeled directed acyclic graph (DAG) that succinctly represents a set of strings of the same length. Negative results are proven in the setting where an input is explicitly given as a set of strings. The latter results are equipped with an encoding such a set as the longest common subsequences of a specific input string set.

Cite as

Yuto Shida, Giulia Punzi, Yasuaki Kobayashi, Takeaki Uno, and Hiroki Arimura. Finding Diverse Strings and Longest Common Subsequences in a Graph. In 35th Annual Symposium on Combinatorial Pattern Matching (CPM 2024). Leibniz International Proceedings in Informatics (LIPIcs), Volume 296, pp. 27:1-27:19, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2024)


Copy BibTex To Clipboard

@InProceedings{shida_et_al:LIPIcs.CPM.2024.27,
  author =	{Shida, Yuto and Punzi, Giulia and Kobayashi, Yasuaki and Uno, Takeaki and Arimura, Hiroki},
  title =	{{Finding Diverse Strings and Longest Common Subsequences in a Graph}},
  booktitle =	{35th Annual Symposium on Combinatorial Pattern Matching (CPM 2024)},
  pages =	{27:1--27:19},
  series =	{Leibniz International Proceedings in Informatics (LIPIcs)},
  ISBN =	{978-3-95977-326-3},
  ISSN =	{1868-8969},
  year =	{2024},
  volume =	{296},
  editor =	{Inenaga, Shunsuke and Puglisi, Simon J.},
  publisher =	{Schloss Dagstuhl -- Leibniz-Zentrum f{\"u}r Informatik},
  address =	{Dagstuhl, Germany},
  URL =		{https://drops.dagstuhl.de/entities/document/10.4230/LIPIcs.CPM.2024.27},
  URN =		{urn:nbn:de:0030-drops-201370},
  doi =		{10.4230/LIPIcs.CPM.2024.27},
  annote =	{Keywords: Sequence analysis, longest common subsequence, Hamming distance, dispersion, approximation algorithms, parameterized complexity}
}
Document
A Compact DAG for Storing and Searching Maximal Common Subsequences

Authors: Alessio Conte, Roberto Grossi, Giulia Punzi, and Takeaki Uno

Published in: LIPIcs, Volume 283, 34th International Symposium on Algorithms and Computation (ISAAC 2023)


Abstract
Maximal Common Subsequences (MCSs) between two strings X and Y are subsequences of both X and Y that are maximal under inclusion. MCSs relax and generalize the well known and widely used concept of Longest Common Subsequences (LCSs), which can be seen as MCSs of maximum length. While the number both LCSs and MCSs can be exponential in the length of the strings, LCSs have been long exploited for string and text analysis, as simple compact representations of all LCSs between two strings, built via dynamic programming or automata, have been known since the '70s. MCSs appear to have a more challenging structure: even listing them efficiently was an open problem open until recently, thus narrowing the complexity difference between the two problems, but the gap remained significant. In this paper we close the complexity gap: we show how to build DAG of polynomial size - in polynomial time - which allows for efficient operations on the set of all MCSs such as enumeration in Constant Amortized Time per solution (CAT), counting, and random access to the i-th element (i.e., rank and select operations). Other than improving known algorithmic results, this work paves the way for new sequence analysis methods based on MCSs.

Cite as

Alessio Conte, Roberto Grossi, Giulia Punzi, and Takeaki Uno. A Compact DAG for Storing and Searching Maximal Common Subsequences. In 34th International Symposium on Algorithms and Computation (ISAAC 2023). Leibniz International Proceedings in Informatics (LIPIcs), Volume 283, pp. 21:1-21:15, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2023)


Copy BibTex To Clipboard

@InProceedings{conte_et_al:LIPIcs.ISAAC.2023.21,
  author =	{Conte, Alessio and Grossi, Roberto and Punzi, Giulia and Uno, Takeaki},
  title =	{{A Compact DAG for Storing and Searching Maximal Common Subsequences}},
  booktitle =	{34th International Symposium on Algorithms and Computation (ISAAC 2023)},
  pages =	{21:1--21:15},
  series =	{Leibniz International Proceedings in Informatics (LIPIcs)},
  ISBN =	{978-3-95977-289-1},
  ISSN =	{1868-8969},
  year =	{2023},
  volume =	{283},
  editor =	{Iwata, Satoru and Kakimura, Naonori},
  publisher =	{Schloss Dagstuhl -- Leibniz-Zentrum f{\"u}r Informatik},
  address =	{Dagstuhl, Germany},
  URL =		{https://drops.dagstuhl.de/entities/document/10.4230/LIPIcs.ISAAC.2023.21},
  URN =		{urn:nbn:de:0030-drops-193231},
  doi =		{10.4230/LIPIcs.ISAAC.2023.21},
  annote =	{Keywords: Maximal common subsequence, DAG, Compact data structures, Enumeration, Constant amortized time, Random access}
}
Document
On Strings Having the Same Length- k Substrings

Authors: Giulia Bernardini, Alessio Conte, Esteban Gabory, Roberto Grossi, Grigorios Loukides, Solon P. Pissis, Giulia Punzi, and Michelle Sweering

Published in: LIPIcs, Volume 223, 33rd Annual Symposium on Combinatorial Pattern Matching (CPM 2022)


Abstract
Let Substr_k(X) denote the set of length-k substrings of a given string X for a given integer k > 0. We study the following basic string problem, called z-Shortest 𝒮_k-Equivalent Strings: Given a set 𝒮_k of n length-k strings and an integer z > 0, list z shortest distinct strings T₁,…,T_z such that Substr_k(T_i) = 𝒮_k, for all i ∈ [1,z]. The z-Shortest 𝒮_k-Equivalent Strings problem arises naturally as an encoding problem in many real-world applications; e.g., in data privacy, in data compression, and in bioinformatics. The 1-Shortest 𝒮_k-Equivalent Strings, referred to as Shortest 𝒮_k-Equivalent String, asks for a shortest string X such that Substr_k(X) = 𝒮_k. Our main contributions are summarized below: - Given a directed graph G(V,E), the Directed Chinese Postman (DCP) problem asks for a shortest closed walk that visits every edge of G at least once. DCP can be solved in 𝒪̃(|E||V|) time using an algorithm for min-cost flow. We show, via a non-trivial reduction, that if Shortest 𝒮_k-Equivalent String over a binary alphabet has a near-linear-time solution then so does DCP. - We show that the length of a shortest string output by Shortest 𝒮_k-Equivalent String is in 𝒪(k+n²). We generalize this bound by showing that the total length of z shortest strings is in 𝒪(zk+zn²+z²n). We derive these upper bounds by showing (asymptotically tight) bounds on the total length of z shortest Eulerian walks in general directed graphs. - We present an algorithm for solving z-Shortest 𝒮_k-Equivalent Strings in 𝒪(nk+n²log²n+zn²log n+|output|) time. If z = 1, the time becomes 𝒪(nk+n²log²n) by the fact that the size of the input is Θ(nk) and the size of the output is 𝒪(k+n²).

Cite as

Giulia Bernardini, Alessio Conte, Esteban Gabory, Roberto Grossi, Grigorios Loukides, Solon P. Pissis, Giulia Punzi, and Michelle Sweering. On Strings Having the Same Length- k Substrings. In 33rd Annual Symposium on Combinatorial Pattern Matching (CPM 2022). Leibniz International Proceedings in Informatics (LIPIcs), Volume 223, pp. 16:1-16:17, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2022)


Copy BibTex To Clipboard

@InProceedings{bernardini_et_al:LIPIcs.CPM.2022.16,
  author =	{Bernardini, Giulia and Conte, Alessio and Gabory, Esteban and Grossi, Roberto and Loukides, Grigorios and Pissis, Solon P. and Punzi, Giulia and Sweering, Michelle},
  title =	{{On Strings Having the Same Length- k Substrings}},
  booktitle =	{33rd Annual Symposium on Combinatorial Pattern Matching (CPM 2022)},
  pages =	{16:1--16:17},
  series =	{Leibniz International Proceedings in Informatics (LIPIcs)},
  ISBN =	{978-3-95977-234-1},
  ISSN =	{1868-8969},
  year =	{2022},
  volume =	{223},
  editor =	{Bannai, Hideo and Holub, Jan},
  publisher =	{Schloss Dagstuhl -- Leibniz-Zentrum f{\"u}r Informatik},
  address =	{Dagstuhl, Germany},
  URL =		{https://drops.dagstuhl.de/entities/document/10.4230/LIPIcs.CPM.2022.16},
  URN =		{urn:nbn:de:0030-drops-161439},
  doi =		{10.4230/LIPIcs.CPM.2022.16},
  annote =	{Keywords: string algorithms, combinatorics on words, de Bruijn graph, Chinese Postman}
}
Document
Enumeration of s-d Separators in DAGs with Application to Reliability Analysis in Temporal Graphs

Authors: Alessio Conte, Pierluigi Crescenzi, Andrea Marino, and Giulia Punzi

Published in: LIPIcs, Volume 170, 45th International Symposium on Mathematical Foundations of Computer Science (MFCS 2020)


Abstract
Temporal graphs are graphs in which arcs have temporal labels, specifying at which time they can be traversed. Motivated by recent results concerning the reliability analysis of a temporal graph through the enumeration of minimal cutsets in the corresponding line graph, in this paper we attack the problem of enumerating minimal s-d separators in s-d directed acyclic graphs (in short, s-d DAGs), also known as 2-terminal DAGs or s-t digraphs. Our main result is an algorithm for enumerating all the minimal s-d separators in a DAG with O(nm) delay, where n and m are respectively the number of nodes and arcs, and the delay is the time between the output of two consecutive solutions. To this aim, we give a characterization of the minimal s-d separators in a DAG through vertex cuts of an expanded version of the DAG itself. As a consequence of our main result, we provide an algorithm for enumerating all the minimal s-d cutsets in a temporal graph with delay O(m³), where m is the number of temporal arcs.

Cite as

Alessio Conte, Pierluigi Crescenzi, Andrea Marino, and Giulia Punzi. Enumeration of s-d Separators in DAGs with Application to Reliability Analysis in Temporal Graphs. In 45th International Symposium on Mathematical Foundations of Computer Science (MFCS 2020). Leibniz International Proceedings in Informatics (LIPIcs), Volume 170, pp. 25:1-25:14, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2020)


Copy BibTex To Clipboard

@InProceedings{conte_et_al:LIPIcs.MFCS.2020.25,
  author =	{Conte, Alessio and Crescenzi, Pierluigi and Marino, Andrea and Punzi, Giulia},
  title =	{{Enumeration of s-d Separators in DAGs with Application to Reliability Analysis in Temporal Graphs}},
  booktitle =	{45th International Symposium on Mathematical Foundations of Computer Science (MFCS 2020)},
  pages =	{25:1--25:14},
  series =	{Leibniz International Proceedings in Informatics (LIPIcs)},
  ISBN =	{978-3-95977-159-7},
  ISSN =	{1868-8969},
  year =	{2020},
  volume =	{170},
  editor =	{Esparza, Javier and Kr\'{a}l', Daniel},
  publisher =	{Schloss Dagstuhl -- Leibniz-Zentrum f{\"u}r Informatik},
  address =	{Dagstuhl, Germany},
  URL =		{https://drops.dagstuhl.de/entities/document/10.4230/LIPIcs.MFCS.2020.25},
  URN =		{urn:nbn:de:0030-drops-126932},
  doi =		{10.4230/LIPIcs.MFCS.2020.25},
  annote =	{Keywords: minimal cutset, temporal graph, minimal separator, directed acyclic graph}
}
Any Issues?
X

Feedback on the Current Page

CAPTCHA

Thanks for your feedback!

Feedback submitted to Dagstuhl Publishing

Could not send message

Please try again later or send an E-mail