Search Results

Documents authored by Cairo, Massimo


Document
Cut Paths and Their Remainder Structure, with Applications

Authors: Massimo Cairo, Shahbaz Khan, Romeo Rizzi, Sebastian Schmidt, Alexandru I. Tomescu, and Elia C. Zirondelli

Published in: LIPIcs, Volume 254, 40th International Symposium on Theoretical Aspects of Computer Science (STACS 2023)


Abstract
In a strongly connected graph G = (V,E), a cut arc (also called strong bridge) is an arc e ∈ E whose removal makes the graph no longer strongly connected. Equivalently, there exist u,v ∈ V, such that all u-v walks contain e. Cut arcs are a fundamental graph-theoretic notion, with countless applications, especially in reachability problems. In this paper we initiate the study of cut paths, as a generalisation of cut arcs, which we naturally define as those paths P for which there exist u,v ∈ V, such that all u-v walks contain P as subwalk. We first prove various properties of cut paths and define their remainder structures, which we use to present a simple O(m)-time verification algorithm for a cut path (|V| = n, |E| = m). Secondly, we apply cut paths and their remainder structures to improve several reachability problems from bioinformatics, as follows. A walk is called safe if it is a subwalk of every node-covering closed walk of a strongly connected graph. Multi-safety is defined analogously, by considering node-covering sets of closed walks instead. We show that cut paths provide simple O(m)-time algorithms verifying if a walk is safe or multi-safe. For multi-safety, we present the first linear time algorithm, while for safety, we present a simple algorithm where the state-of-the-art employed complex data structures. Finally we show that the simultaneous computation of remainder structures of all subwalks of a cut path can be performed in linear time, since they are related in a structured way. These properties yield an O(mn)-time algorithm outputting all maximal multi-safe walks, improving over the state-of-the-art algorithm running in time O(m²+n³). The results of this paper only scratch the surface in the study of cut paths, and we believe a rich structure of a graph can be revealed, considering the perspective of a path, instead of just an arc.

Cite as

Massimo Cairo, Shahbaz Khan, Romeo Rizzi, Sebastian Schmidt, Alexandru I. Tomescu, and Elia C. Zirondelli. Cut Paths and Their Remainder Structure, with Applications. In 40th International Symposium on Theoretical Aspects of Computer Science (STACS 2023). Leibniz International Proceedings in Informatics (LIPIcs), Volume 254, pp. 17:1-17:17, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2023)


Copy BibTex To Clipboard

@InProceedings{cairo_et_al:LIPIcs.STACS.2023.17,
  author =	{Cairo, Massimo and Khan, Shahbaz and Rizzi, Romeo and Schmidt, Sebastian and Tomescu, Alexandru I. and Zirondelli, Elia C.},
  title =	{{Cut Paths and Their Remainder Structure, with Applications}},
  booktitle =	{40th International Symposium on Theoretical Aspects of Computer Science (STACS 2023)},
  pages =	{17:1--17:17},
  series =	{Leibniz International Proceedings in Informatics (LIPIcs)},
  ISBN =	{978-3-95977-266-2},
  ISSN =	{1868-8969},
  year =	{2023},
  volume =	{254},
  editor =	{Berenbrink, Petra and Bouyer, Patricia and Dawar, Anuj and Kant\'{e}, Mamadou Moustapha},
  publisher =	{Schloss Dagstuhl -- Leibniz-Zentrum f{\"u}r Informatik},
  address =	{Dagstuhl, Germany},
  URL =		{https://drops.dagstuhl.de/entities/document/10.4230/LIPIcs.STACS.2023.17},
  URN =		{urn:nbn:de:0030-drops-176690},
  doi =		{10.4230/LIPIcs.STACS.2023.17},
  annote =	{Keywords: reachability, cut arc, strong bridge, covering walk, safety, persistence, essentiality, genome assembly}
}
Document
Width Helps and Hinders Splitting Flows

Authors: Manuel Cáceres, Massimo Cairo, Andreas Grigorjew, Shahbaz Khan, Brendan Mumey, Romeo Rizzi, Alexandru I. Tomescu, and Lucia Williams

Published in: LIPIcs, Volume 244, 30th Annual European Symposium on Algorithms (ESA 2022)


Abstract
Minimum flow decomposition (MFD) is the NP-hard problem of finding a smallest decomposition of a network flow X on directed graph G into weighted source-to-sink paths whose superposition equals X. We focus on a common formulation of the problem where the path weights must be non-negative integers and also on a new variant where these weights can be negative. We show that, for acyclic graphs, considering the width of the graph (the minimum number of s-t paths needed to cover all of its edges) yields advances in our understanding of its approximability. For the non-negative version, we show that a popular heuristic is a O(log |X|)-approximation (|X| being the total flow of X) on graphs satisfying two properties related to the width (satisfied by e.g., series-parallel graphs), and strengthen its worst-case approximation ratio from Ω(√m) to Ω(m / log m) for sparse graphs, where m is the number of edges in the graph. For the negative version, we give a (⌈log ║X║⌉+1)-approximation (║X║ being the maximum absolute value of X on any edge) using a power-of-two approach, combined with parity fixing arguments and a decomposition of unitary flows (║X║ ≤ 1) into at most width paths. We also disprove a conjecture about the linear independence of minimum (non-negative) flow decompositions posed by Kloster et al. [ALENEX 2018], but show that its useful implication (polynomial-time assignments of weights to a given set of paths to decompose a flow) holds for the negative version.

Cite as

Manuel Cáceres, Massimo Cairo, Andreas Grigorjew, Shahbaz Khan, Brendan Mumey, Romeo Rizzi, Alexandru I. Tomescu, and Lucia Williams. Width Helps and Hinders Splitting Flows. In 30th Annual European Symposium on Algorithms (ESA 2022). Leibniz International Proceedings in Informatics (LIPIcs), Volume 244, pp. 31:1-31:14, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2022)


Copy BibTex To Clipboard

@InProceedings{caceres_et_al:LIPIcs.ESA.2022.31,
  author =	{C\'{a}ceres, Manuel and Cairo, Massimo and Grigorjew, Andreas and Khan, Shahbaz and Mumey, Brendan and Rizzi, Romeo and Tomescu, Alexandru I. and Williams, Lucia},
  title =	{{Width Helps and Hinders Splitting Flows}},
  booktitle =	{30th Annual European Symposium on Algorithms (ESA 2022)},
  pages =	{31:1--31:14},
  series =	{Leibniz International Proceedings in Informatics (LIPIcs)},
  ISBN =	{978-3-95977-247-1},
  ISSN =	{1868-8969},
  year =	{2022},
  volume =	{244},
  editor =	{Chechik, Shiri and Navarro, Gonzalo and Rotenberg, Eva and Herman, Grzegorz},
  publisher =	{Schloss Dagstuhl -- Leibniz-Zentrum f{\"u}r Informatik},
  address =	{Dagstuhl, Germany},
  URL =		{https://drops.dagstuhl.de/entities/document/10.4230/LIPIcs.ESA.2022.31},
  URN =		{urn:nbn:de:0030-drops-169695},
  doi =		{10.4230/LIPIcs.ESA.2022.31},
  annote =	{Keywords: Flow decomposition, approximation algorithms, graph width}
}
Document
Track A: Algorithms, Complexity and Games
Genome Assembly, from Practice to Theory: Safe, Complete and Linear-Time

Authors: Massimo Cairo, Romeo Rizzi, Alexandru I. Tomescu, and Elia C. Zirondelli

Published in: LIPIcs, Volume 198, 48th International Colloquium on Automata, Languages, and Programming (ICALP 2021)


Abstract
Genome assembly asks to reconstruct an unknown string from many shorter substrings of it. Even though it is one of the key problems in Bioinformatics, it is generally lacking major theoretical advances. Its hardness stems both from practical issues (size and errors of real data), and from the fact that problem formulations inherently admit multiple solutions. Given these, at their core, most state-of-the-art assemblers are based on finding non-branching paths (unitigs) in an assembly graph. While such paths constitute only partial assemblies, they are likely to be correct. More precisely, if one defines a genome assembly solution as a closed arc-covering walk of the graph, then unitigs appear in all solutions, being thus safe partial solutions. Until recently, it was open what are all the safe walks of an assembly graph. Tomescu and Medvedev (RECOMB 2016) characterized all such safe walks (omnitigs), thus giving the first safe and complete genome assembly algorithm. Even though omnitig finding was later improved to quadratic time, it remained open whether the crucial linear-time feature of finding unitigs can be attained with omnitigs. We answer this question affirmatively, by describing a surprising O(m)-time algorithm to identify all maximal omnitigs of a graph with n nodes and m arcs, notwithstanding the existence of families of graphs with Θ(mn) total maximal omnitig size. This is based on the discovery of a family of walks (macrotigs) with the property that all the non-trivial omnitigs are univocal extensions of subwalks of a macrotig. This has two consequences: (1) A linear-time output-sensitive algorithm enumerating all maximal omnitigs. (2) A compact O(m) representation of all maximal omnitigs, which allows, e.g., for O(m)-time computation of various statistics on them. Our results close a long-standing theoretical question inspired by practical genome assemblers, originating with the use of unitigs in 1995. We envision our results to be at the core of a reverse transfer from theory to practical and complete genome assembly programs, as has been the case for other key Bioinformatics problems.

Cite as

Massimo Cairo, Romeo Rizzi, Alexandru I. Tomescu, and Elia C. Zirondelli. Genome Assembly, from Practice to Theory: Safe, Complete and Linear-Time. In 48th International Colloquium on Automata, Languages, and Programming (ICALP 2021). Leibniz International Proceedings in Informatics (LIPIcs), Volume 198, pp. 43:1-43:18, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2021)


Copy BibTex To Clipboard

@InProceedings{cairo_et_al:LIPIcs.ICALP.2021.43,
  author =	{Cairo, Massimo and Rizzi, Romeo and Tomescu, Alexandru I. and Zirondelli, Elia C.},
  title =	{{Genome Assembly, from Practice to Theory: Safe, Complete and Linear-Time}},
  booktitle =	{48th International Colloquium on Automata, Languages, and Programming (ICALP 2021)},
  pages =	{43:1--43:18},
  series =	{Leibniz International Proceedings in Informatics (LIPIcs)},
  ISBN =	{978-3-95977-195-5},
  ISSN =	{1868-8969},
  year =	{2021},
  volume =	{198},
  editor =	{Bansal, Nikhil and Merelli, Emanuela and Worrell, James},
  publisher =	{Schloss Dagstuhl -- Leibniz-Zentrum f{\"u}r Informatik},
  address =	{Dagstuhl, Germany},
  URL =		{https://drops.dagstuhl.de/entities/document/10.4230/LIPIcs.ICALP.2021.43},
  URN =		{urn:nbn:de:0030-drops-141122},
  doi =		{10.4230/LIPIcs.ICALP.2021.43},
  annote =	{Keywords: Graph algorithm, strong connectivity, reachability under failures}
}
Document
Faster Dynamic Controllability Checking for Simple Temporal Networks with Uncertainty

Authors: Massimo Cairo, Luke Hunsberger, and Romeo Rizzi

Published in: LIPIcs, Volume 120, 25th International Symposium on Temporal Representation and Reasoning (TIME 2018)


Abstract
Simple Temporal Networks (STNs) are a well-studied model for representing and reasoning about time. An STN comprises a set of real-valued variables called time-points, together with a set of binary constraints, each of the form Y <= X+w. The problem of finding a feasible schedule (i.e., an assignment of real numbers to time-points such that all of the constraints are satisfied) is equivalent to the Single Source Shortest Path problem (SSSP) in the STN graph. Simple Temporal Networks with Uncertainty (STNUs) augment STNs to include contingent links that can be used, for example, to represent actions with uncertain durations. The duration of a contingent link is not controlled by the planner, but is instead controlled by a (possibly adversarial) environment. Each contingent link has the form, <A,l,u,C>, where 0 < l <= u < infty. Once the planner executes the activation time-point A, the environment must execute the contingent time-point C at some time A+Delta, where Delta in [l,u]. Crucially, the planner does not know the value of Delta in advance, but only discovers it when C executes. An STNU is dynamically controllable (DC) if there is a strategy that the planner can use to execute all of the non-contingent time-points, such that all of the constraints are guaranteed to be satisfied no matter which durations the environment chooses for the contingent links. The strategy can be dynamic in that it can react in real time to the contingent durations it observes. Recently, an upper bound of O(N^3) was given for the DC-checking problem for STNUs, where N is the number of time-points. This paper introduces a new algorithm, called the RUL^- algorithm, for solving the DC-checking problem for STNUs that improves on the O(N^3) bound. The worst-case complexity of the RUL^- algorithm is O(MN+K^2N+KN log N), where N is the number of time-points, M is the number of constraints, and K is the number of contingent time-points. If M is O(N^2), then the complexity reduces to O(N^3); however, in sparse graphs the complexity can be much less. For example, if M is O(N log N), and K is O(sqrt{N}), then the complexity of the RUL^- algorithm reduces to O(N^2 log N). The RUL^- algorithm begins by using the Bellman-Ford algorithm to compute a potential function. It then performs at most 2K rounds of computations, interleaving novel applications of Dijkstra's algorithm to (1) generate new edges and (2) update the potential function in response to those new edges. The constraint-propagation/edge-generation rules used by the RUL^- algorithm are distinguished from related work in two ways. First, they only generate unlabeled edges. Second, their applicability conditions are more restrictive. As a result, the RUL^- algorithm requires only O(K) rounds of Dijkstra's algorithm, instead of the O(N) rounds required by other approaches. The paper proves that the RUL^- algorithm is sound and complete for the DC-checking problem for STNUs.

Cite as

Massimo Cairo, Luke Hunsberger, and Romeo Rizzi. Faster Dynamic Controllability Checking for Simple Temporal Networks with Uncertainty. In 25th International Symposium on Temporal Representation and Reasoning (TIME 2018). Leibniz International Proceedings in Informatics (LIPIcs), Volume 120, pp. 8:1-8:16, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2018)


Copy BibTex To Clipboard

@InProceedings{cairo_et_al:LIPIcs.TIME.2018.8,
  author =	{Cairo, Massimo and Hunsberger, Luke and Rizzi, Romeo},
  title =	{{Faster Dynamic Controllability Checking for Simple Temporal Networks with Uncertainty}},
  booktitle =	{25th International Symposium on Temporal Representation and Reasoning (TIME 2018)},
  pages =	{8:1--8:16},
  series =	{Leibniz International Proceedings in Informatics (LIPIcs)},
  ISBN =	{978-3-95977-089-7},
  ISSN =	{1868-8969},
  year =	{2018},
  volume =	{120},
  editor =	{Alechina, Natasha and N{\o}rv\r{a}g, Kjetil and Penczek, Wojciech},
  publisher =	{Schloss Dagstuhl -- Leibniz-Zentrum f{\"u}r Informatik},
  address =	{Dagstuhl, Germany},
  URL =		{https://drops.dagstuhl.de/entities/document/10.4230/LIPIcs.TIME.2018.8},
  URN =		{urn:nbn:de:0030-drops-97734},
  doi =		{10.4230/LIPIcs.TIME.2018.8},
  annote =	{Keywords: Simple Temporal Networks with Uncertainty, Dynamic Controllability, Temporal Planning under Uncertainty}
}
Document
Dynamic Controllability Made Simple

Authors: Massimo Cairo and Romeo Rizzi

Published in: LIPIcs, Volume 90, 24th International Symposium on Temporal Representation and Reasoning (TIME 2017)


Abstract
Simple Temporal Networks with Uncertainty (STNUs) are a well-studied model for representing temporal constraints, where some intervals (contingent links) have an unknown but bounded duration, discovered only during execution. An STNU is dynamically controllable (DC) if there exists a strategy to execute its time-points satisfying all the constraints, regardless of the actual duration of contingent links revealed during execution. In this work we present a new system of constraint propagation rules for STNUs, which is sound-and-complete for DC checking. Our system comprises just three rules which, differently from the ones proposed in all previous works, only generate unconditioned constraints. In particular, after applying our sound rules, the network remains an STNU in all respects. Moreover, our completeness proof is short and non-algorithmic, based on the explicit construction of a valid execution strategy. This is a substantial simplification of the theory which underlies all the polynomial-time algorithms for DC-checking. Our analysis also shows: (1) the existence of late execution strategies for STNUs, (2) the equivalence of several variants of the notion of DC, (3) the existence of a fast algorithm for real-time execution of STNUs, which runs in O(KN) total time in a network with K contingent links and N time points, considerably improving the previous O(N^3)-time bound.

Cite as

Massimo Cairo and Romeo Rizzi. Dynamic Controllability Made Simple. In 24th International Symposium on Temporal Representation and Reasoning (TIME 2017). Leibniz International Proceedings in Informatics (LIPIcs), Volume 90, pp. 8:1-8:16, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2017)


Copy BibTex To Clipboard

@InProceedings{cairo_et_al:LIPIcs.TIME.2017.8,
  author =	{Cairo, Massimo and Rizzi, Romeo},
  title =	{{Dynamic Controllability Made Simple}},
  booktitle =	{24th International Symposium on Temporal Representation and Reasoning (TIME 2017)},
  pages =	{8:1--8:16},
  series =	{Leibniz International Proceedings in Informatics (LIPIcs)},
  ISBN =	{978-3-95977-052-1},
  ISSN =	{1868-8969},
  year =	{2017},
  volume =	{90},
  editor =	{Schewe, Sven and Schneider, Thomas and Wijsen, Jef},
  publisher =	{Schloss Dagstuhl -- Leibniz-Zentrum f{\"u}r Informatik},
  address =	{Dagstuhl, Germany},
  URL =		{https://drops.dagstuhl.de/entities/document/10.4230/LIPIcs.TIME.2017.8},
  URN =		{urn:nbn:de:0030-drops-79136},
  doi =		{10.4230/LIPIcs.TIME.2017.8},
  annote =	{Keywords: Simple Temporal Network with Uncertainty, Dynamic controllability}
}
Document
Incorporating Decision Nodes into Conditional Simple Temporal Networks

Authors: Massimo Cairo, Carlo Combi, Carlo Comin, Luke Hunsberger, Roberto Posenato, Romeo Rizzi, and Matteo Zavatteri

Published in: LIPIcs, Volume 90, 24th International Symposium on Temporal Representation and Reasoning (TIME 2017)


Abstract
A Conditional Simple Temporal Network (CSTN) augments a Simple Temporal Network (STN) to include special time-points, called observation time-points. In a CSTN, the agent executing the network controls the execution of every time-point. However, each observation time-point has a unique propositional letter associated with it and, when the agent executes that time-point, the environment assigns a truth value to the corresponding letter. Thus, the agent observes but, does not control the assignment of truth values. A CSTN is dynamically consistent (DC) if there exists a strategy for executing its time-points such that all relevant constraints will be satisfied no matter which truth values the environment assigns to the propositional letters. Alternatively, in a Labeled Simple Temporal Network (Labeled STN) - also called a Temporal Plan with Choice - the agent executing the network controls the assignment of values to the so-called choice variables. Furthermore, the agent can make those assignments at any time. For this reason, a Labeled STN is equivalent to a Disjunctive Temporal Network. This paper incorporates both of the above extensions by augmenting a CSTN to include not only observation time-points but also decision time-points. A decision time-point is like an observation time-point in that it has an associated propositional letter whose value is determined when the decision time-point is executed. It differs in that the agent - not the environment - selects that value. The resulting network is called a CSTN with Decisions (CSTND). This paper shows that a CSTND generalizes both CSTNs and Labeled STNs, and proves that the problem of determining whether any given CSTND is dynamically consistent is PSPACE-complete. It also presents algorithms that address two sub-classes of CSTNDs: (1) those that contain only decision time-points; and (2) those in which all decisions are made before execution begins.

Cite as

Massimo Cairo, Carlo Combi, Carlo Comin, Luke Hunsberger, Roberto Posenato, Romeo Rizzi, and Matteo Zavatteri. Incorporating Decision Nodes into Conditional Simple Temporal Networks. In 24th International Symposium on Temporal Representation and Reasoning (TIME 2017). Leibniz International Proceedings in Informatics (LIPIcs), Volume 90, pp. 9:1-9:18, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2017)


Copy BibTex To Clipboard

@InProceedings{cairo_et_al:LIPIcs.TIME.2017.9,
  author =	{Cairo, Massimo and Combi, Carlo and Comin, Carlo and Hunsberger, Luke and Posenato, Roberto and Rizzi, Romeo and Zavatteri, Matteo},
  title =	{{Incorporating Decision Nodes into Conditional Simple Temporal Networks}},
  booktitle =	{24th International Symposium on Temporal Representation and Reasoning (TIME 2017)},
  pages =	{9:1--9:18},
  series =	{Leibniz International Proceedings in Informatics (LIPIcs)},
  ISBN =	{978-3-95977-052-1},
  ISSN =	{1868-8969},
  year =	{2017},
  volume =	{90},
  editor =	{Schewe, Sven and Schneider, Thomas and Wijsen, Jef},
  publisher =	{Schloss Dagstuhl -- Leibniz-Zentrum f{\"u}r Informatik},
  address =	{Dagstuhl, Germany},
  URL =		{https://drops.dagstuhl.de/entities/document/10.4230/LIPIcs.TIME.2017.9},
  URN =		{urn:nbn:de:0030-drops-79155},
  doi =		{10.4230/LIPIcs.TIME.2017.9},
  annote =	{Keywords: Conditional Simple Temporal Networks with Decisions, Dynamic Consistency, SAT Solver, Hyper Temporal Networks, PSPACE}
}
Document
A Streamlined Model of Conditional Simple Temporal Networks - Semantics and Equivalence Results

Authors: Massimo Cairo, Luke Hunsberger, Roberto Posenato, and Romeo Rizzi

Published in: LIPIcs, Volume 90, 24th International Symposium on Temporal Representation and Reasoning (TIME 2017)


Abstract
A Conditional Simple Temporal Network (CSTN) augments a Simple Temporal Network to include a new kind of time-points, called observation time-points. The execution of an observation time-point generates information in real time, specifically, the truth value of a propositional letter. In addition, time-points and temporal constraints may be labeled by conjunctions of (positive or negative) propositional letters. A CSTN is called dynamically consistent (DC) if there exists a dynamic strategy for executing its time-points such that no matter how the observations turn out during execution, the time-points whose labels are consistent with those observations have all been executed, and the constraints whose labels are consistent with those observations have all been satisfied. The strategy is dynamic in that its execution decisions may react to observations. The original formulation of CSTNs included propositional labels only on time-points, but the DC-checking algorithm was impractical because it was based on a conversion of the semantic constraints into an exponentially-sized Disjunctive Temporal Network. Later work added propositional labels to temporal constraints, and yielded a sound-and-complete propagation-based DC-checking algorithm, empirically demonstrated to be practical across a variety of CSTNs. This paper introduces a streamlined version of a CSTN in which propositional labels may appear on constraints, but not on time-points. This change simplifies the definition of the DC property, as well as the propagation rules for the DC-checking algorithm. It also simplifies the proofs of the soundness and completeness of those rules. This paper provides two translations from traditional CSTNs to streamlined CSTNs. Each translation preserves the DC property and, for any DC network, ensures that any dynamic execution strategy for that network can be extended to a strategy for its streamlined counterpart. Finally, this paper presents an empirical comparison of two versions of the DC-checking algorithm: the original version and a simplified version for streamlined CSTNs. The comparison is based on CSTN benchmarks from earlier work. For small-sized CSTNs, the original version shows the best performance, but the performance difference between the two versions decreases as the number of time-points in the CSTN increases. We conclude that the simplified algorithm is a practical alternative for checking the dynamic consistency of CSTNs.

Cite as

Massimo Cairo, Luke Hunsberger, Roberto Posenato, and Romeo Rizzi. A Streamlined Model of Conditional Simple Temporal Networks - Semantics and Equivalence Results. In 24th International Symposium on Temporal Representation and Reasoning (TIME 2017). Leibniz International Proceedings in Informatics (LIPIcs), Volume 90, pp. 10:1-10:19, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2017)


Copy BibTex To Clipboard

@InProceedings{cairo_et_al:LIPIcs.TIME.2017.10,
  author =	{Cairo, Massimo and Hunsberger, Luke and Posenato, Roberto and Rizzi, Romeo},
  title =	{{A Streamlined Model of Conditional Simple Temporal Networks - Semantics and Equivalence Results}},
  booktitle =	{24th International Symposium on Temporal Representation and Reasoning (TIME 2017)},
  pages =	{10:1--10:19},
  series =	{Leibniz International Proceedings in Informatics (LIPIcs)},
  ISBN =	{978-3-95977-052-1},
  ISSN =	{1868-8969},
  year =	{2017},
  volume =	{90},
  editor =	{Schewe, Sven and Schneider, Thomas and Wijsen, Jef},
  publisher =	{Schloss Dagstuhl -- Leibniz-Zentrum f{\"u}r Informatik},
  address =	{Dagstuhl, Germany},
  URL =		{https://drops.dagstuhl.de/entities/document/10.4230/LIPIcs.TIME.2017.10},
  URN =		{urn:nbn:de:0030-drops-79145},
  doi =		{10.4230/LIPIcs.TIME.2017.10},
  annote =	{Keywords: Conditional Simple Temporal Networks, Dynamic Consistency, Temporal Constraints}
}
Document
Optimal Omnitig Listing for Safe and Complete Contig Assembly

Authors: Massimo Cairo, Paul Medvedev, Nidia Obscura Acosta, Romeo Rizzi, and Alexandru I. Tomescu

Published in: LIPIcs, Volume 78, 28th Annual Symposium on Combinatorial Pattern Matching (CPM 2017)


Abstract
Genome assembly is the problem of reconstructing a genome sequence from a set of reads from a sequencing experiment. Typical formulations of the assembly problem admit in practice many genomic reconstructions, and actual genome assemblers usually output contigs, namely substrings that are promised to occur in the genome. To bridge the theory and practice, Tomescu and Medvedev [RECOMB 2016] reformulated contig assembly as finding all substrings common to all genomic reconstructions. They also gave a characterization of those walks (omnitigs) that are common to all closed edge-covering walks of a (directed) graph, a typical notion of genomic reconstruction. An algorithm for listing all maximal omnitigs was also proposed, by launching an exhaustive visit from every edge. In this paper, we prove new insights about the structure of omnitigs and solve several open questions about them. We combine these to achieve an O(nm)-time algorithm for outputting all the maximal omnitigs of a graph (with n nodes and m edges). This is also optimal, as we show families of graphs whose total omnitig length is Omega(nm). We implement this algorithm and show that it is 9-12 times faster in practice than the one of Tomescu and Medvedev [RECOMB 2016].

Cite as

Massimo Cairo, Paul Medvedev, Nidia Obscura Acosta, Romeo Rizzi, and Alexandru I. Tomescu. Optimal Omnitig Listing for Safe and Complete Contig Assembly. In 28th Annual Symposium on Combinatorial Pattern Matching (CPM 2017). Leibniz International Proceedings in Informatics (LIPIcs), Volume 78, pp. 29:1-29:12, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2017)


Copy BibTex To Clipboard

@InProceedings{cairo_et_al:LIPIcs.CPM.2017.29,
  author =	{Cairo, Massimo and Medvedev, Paul and Obscura Acosta, Nidia and Rizzi, Romeo and Tomescu, Alexandru I.},
  title =	{{Optimal Omnitig Listing for Safe and Complete Contig Assembly}},
  booktitle =	{28th Annual Symposium on Combinatorial Pattern Matching (CPM 2017)},
  pages =	{29:1--29:12},
  series =	{Leibniz International Proceedings in Informatics (LIPIcs)},
  ISBN =	{978-3-95977-039-2},
  ISSN =	{1868-8969},
  year =	{2017},
  volume =	{78},
  editor =	{K\"{a}rkk\"{a}inen, Juha and Radoszewski, Jakub and Rytter, Wojciech},
  publisher =	{Schloss Dagstuhl -- Leibniz-Zentrum f{\"u}r Informatik},
  address =	{Dagstuhl, Germany},
  URL =		{https://drops.dagstuhl.de/entities/document/10.4230/LIPIcs.CPM.2017.29},
  URN =		{urn:nbn:de:0030-drops-73423},
  doi =		{10.4230/LIPIcs.CPM.2017.29},
  annote =	{Keywords: genome assembly, graph algorithm, edge-covering walk, strong bridge}
}