DROPS

Document

DOI: 10.4230/LIPIcs.WABI.2022.13

A Linear Time Algorithm for an Extended Version of the Breakpoint Double Distance

Authors: Marília D. V. Braga, Leonie R. Brockmann, Katharina Klerx, and Jens Stoye

Published in: LIPIcs, Volume 242, 22nd International Workshop on Algorithms in Bioinformatics (WABI 2022)

Abstract

Two genomes over the same set of gene families form a canonical pair when each of them has exactly one gene from each family. A genome is circular when it contains only circular chromosomes. Different distances between canonical circular genomes can be derived from a structure called the breakpoint graph, which represents the relation between the two given genomes as a collection of cycles of even length. The breakpoint distance is then equal to n-c_2, where n is the number of genes and c_2 is the number of cycles of length 2. Similarly, when the considered rearrangements are those modeled by the double-cut-and-join (DCJ) operation, the rearrangement distance is n-c, where c is the total number of cycles. The distance problem is a basic unit for several other combinatorial problems related to genome evolution and ancestral reconstruction, such as median or double distance. Interestingly, both the median and double distance problems can be solved in polynomial time for the breakpoint distance, while they are NP-hard for the rearrangement distance. One way of exploring the complexity space between these two extremes is to consider a σ_k distance, defined to be n-(c_2+c_4+…+c_k), and to investigate the complexities of median and double distance for successively larger k: first the σ_4 distance, then the σ_6 distance, and so on. For the median, much effort in our and in other research groups has yielded no progress even for the σ_4 distance. For the double distance under the σ_4 and σ_6 distances, however, we could devise linear time algorithms, which we present here.
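The distance formulas in the abstract can be illustrated with a small sketch. It is not the paper's double distance algorithm; it only counts the cycles of the breakpoint graph of two canonical circular genomes and evaluates n-c_2, n-c and the σ_k distance. The encoding (each chromosome a list of signed integers, read circularly) and the names adjacencies, cycle_lengths and sigma_distance are assumptions made for this example.

```python
from collections import Counter

def adjacencies(genome):
    """Map every gene extremity to its neighbour in a circular genome.

    Assumed encoding: a genome is a list of circular chromosomes, each a
    list of signed integers. Gene g has extremities (g, 't') and (g, 'h').
    """
    partner = {}
    for chrom in genome:
        for i, g in enumerate(chrom):
            nxt = chrom[(i + 1) % len(chrom)]  # wrap around: circular
            right = (abs(g), 'h' if g > 0 else 't')
            left = (abs(nxt), 't' if nxt > 0 else 'h')
            partner[right] = left
            partner[left] = right
    return partner

def cycle_lengths(a, b):
    """Count breakpoint graph cycles by length (number of edges)."""
    pa, pb = adjacencies(a), adjacencies(b)
    seen, counts = set(), Counter()
    for start in pa:
        if start in seen:
            continue
        length, v = 0, start
        while v not in seen:
            seen.add(v)
            w = pa[v]      # follow the adjacency of genome a
            seen.add(w)
            v = pb[w]      # then the adjacency of genome b
            length += 2
        counts[length] += 1
    return counts

def sigma_distance(a, b, k=None):
    """n - (c_2 + c_4 + ... + c_k); k=None counts all cycles."""
    n = sum(len(chrom) for chrom in a)
    counts = cycle_lengths(a, b)
    return n - sum(c for length, c in counts.items()
                   if k is None or length <= k)

# Two genomes differing by an inversion of gene 3.
A = [[1, 2, 3, 4]]
B = [[1, 2, -3, 4]]
print(sigma_distance(A, B, k=2))  # breakpoint distance n - c_2
print(sigma_distance(A, B, k=4))  # sigma_4 distance
print(sigma_distance(A, B))       # DCJ distance n - c
```

With k=2 the function gives the breakpoint distance, and with k=None it gives the DCJ distance, so the σ_k distances interpolate between the two as k grows.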

Cite as

Marília D. V. Braga, Leonie R. Brockmann, Katharina Klerx, and Jens Stoye. A Linear Time Algorithm for an Extended Version of the Breakpoint Double Distance. In 22nd International Workshop on Algorithms in Bioinformatics (WABI 2022). Leibniz International Proceedings in Informatics (LIPIcs), Volume 242, pp. 13:1-13:16, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2022)

@InProceedings{braga_et_al:LIPIcs.WABI.2022.13,
  author =	{Braga, Mar{\'\i}lia D. V. and Brockmann, Leonie R. and Klerx, Katharina and Stoye, Jens},
  title =	{{A Linear Time Algorithm for an Extended Version of the Breakpoint Double Distance}},
  booktitle =	{22nd International Workshop on Algorithms in Bioinformatics (WABI 2022)},
  pages =	{13:1--13:16},
  series =	{Leibniz International Proceedings in Informatics (LIPIcs)},
  ISBN =	{978-3-95977-243-3},
  ISSN =	{1868-8969},
  year =	{2022},
  volume =	{242},
  editor =	{Boucher, Christina and Rahmann, Sven},
  publisher =	{Schloss Dagstuhl -- Leibniz-Zentrum f{\"u}r Informatik},
  address =	{Dagstuhl, Germany},
  URL =		{https://drops.dagstuhl.de/entities/document/10.4230/LIPIcs.WABI.2022.13},
  URN =		{urn:nbn:de:0030-drops-170472},
  doi =		{10.4230/LIPIcs.WABI.2022.13},
  annote =	{Keywords: Comparative genomics, genome rearrangement, breakpoint distance, double-cut-and-join (DCJ) distance, double distance}
}

Document

DOI: 10.4230/DagRep.9.6.55

25 Years of the Burrows-Wheeler Transform (Dagstuhl Seminar 19241)

Authors: Travis Gagie, Giovanni Manzini, Gonzalo Navarro, and Jens Stoye

Published in: Dagstuhl Reports, Volume 9, Issue 6 (2020)

Abstract

Dagstuhl Seminar 19241 ("25 Years of the Burrows-Wheeler Transform") took place from June 10th to 14th, 2019, and was attended by 45 people from 13 countries and the three fields of Algorithms and Data Structures, Bioinformatics, and Combinatorics on Words. There were four talks and a panel session for each field. Feedback was generally positive and we are confident the seminar fostered interdisciplinary connections and will eventually result in noteworthy joint publications.

Cite as

Travis Gagie, Giovanni Manzini, Gonzalo Navarro, and Jens Stoye. 25 Years of the Burrows-Wheeler Transform (Dagstuhl Seminar 19241). In Dagstuhl Reports, Volume 9, Issue 6, pp. 55-68, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2019)

@Article{gagie_et_al:DagRep.9.6.55,
  author =	{Gagie, Travis and Manzini, Giovanni and Navarro, Gonzalo and Stoye, Jens},
  title =	{{25 Years of the Burrows-Wheeler Transform (Dagstuhl Seminar 19241)}},
  pages =	{55--68},
  journal =	{Dagstuhl Reports},
  ISSN =	{2192-5283},
  year =	{2019},
  volume =	{9},
  number =	{6},
  editor =	{Gagie, Travis and Manzini, Giovanni and Navarro, Gonzalo and Stoye, Jens},
  publisher =	{Schloss Dagstuhl -- Leibniz-Zentrum f{\"u}r Informatik},
  address =	{Dagstuhl, Germany},
  URL =		{https://drops.dagstuhl.de/entities/document/10.4230/DagRep.9.6.55},
  URN =		{urn:nbn:de:0030-drops-114874},
  doi =		{10.4230/DagRep.9.6.55},
  annote =	{Keywords: Bioinformatics, Burrows-Wheeler Transform, Combinatorics on Words, Data Compression, Data Structures, Indexing, Sequence Alignment}
}

Document

DOI: 10.4230/LIPIcs.WABI.2019.8

Finding All Maximal Perfect Haplotype Blocks in Linear Time

Authors: Jarno Alanko, Hideo Bannai, Bastien Cazaux, Pierre Peterlongo, and Jens Stoye

Published in: LIPIcs, Volume 143, 19th International Workshop on Algorithms in Bioinformatics (WABI 2019)

Abstract

Recent large-scale community sequencing efforts allow the identification, at an unprecedented level of detail, of genomic regions that show signatures of natural selection. Traditional methods for identifying such regions from individuals' haplotype data, however, require excessive computing times and therefore are not applicable to current datasets. In 2019, Cunha et al. (Proceedings of BSB 2019) suggested the maximal perfect haplotype block as a very simple combinatorial pattern, forming the basis of a new method to perform rapid genome-wide selection scans. The algorithm they presented for identifying these blocks, however, had a worst-case running time quadratic in the genome length. It was posed as an open problem whether an optimal, linear-time algorithm exists. In this paper we give two algorithms that achieve this time bound: a conceptually very simple one using suffix trees, and a second one using the positional Burrows-Wheeler Transform that is also very efficient in practice.

Cite as

Jarno Alanko, Hideo Bannai, Bastien Cazaux, Pierre Peterlongo, and Jens Stoye. Finding All Maximal Perfect Haplotype Blocks in Linear Time. In 19th International Workshop on Algorithms in Bioinformatics (WABI 2019). Leibniz International Proceedings in Informatics (LIPIcs), Volume 143, pp. 8:1-8:9, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2019)

@InProceedings{alanko_et_al:LIPIcs.WABI.2019.8,
  author =	{Alanko, Jarno and Bannai, Hideo and Cazaux, Bastien and Peterlongo, Pierre and Stoye, Jens},
  title =	{{Finding All Maximal Perfect Haplotype Blocks in Linear Time}},
  booktitle =	{19th International Workshop on Algorithms in Bioinformatics (WABI 2019)},
  pages =	{8:1--8:9},
  series =	{Leibniz International Proceedings in Informatics (LIPIcs)},
  ISBN =	{978-3-95977-123-8},
  ISSN =	{1868-8969},
  year =	{2019},
  volume =	{143},
  editor =	{Huber, Katharina T. and Gusfield, Dan},
  publisher =	{Schloss Dagstuhl -- Leibniz-Zentrum f{\"u}r Informatik},
  address =	{Dagstuhl, Germany},
  URL =		{https://drops.dagstuhl.de/entities/document/10.4230/LIPIcs.WABI.2019.8},
  URN =		{urn:nbn:de:0030-drops-110388},
  doi =		{10.4230/LIPIcs.WABI.2019.8},
  annote =	{Keywords: Population genomics, selection coefficient, haplotype block, positional Burrows-Wheeler Transform}
}

Document

DOI: 10.4230/LIPIcs.CPM.2017.19

Fast and Simple Jumbled Indexing for Binary Run-Length Encoded Strings

Authors: Luís Cunha, Simone Dantas, Travis Gagie, Roland Wittler, Luis Kowada, and Jens Stoye

Published in: LIPIcs, Volume 78, 28th Annual Symposium on Combinatorial Pattern Matching (CPM 2017)

Abstract

Important papers have appeared recently on the problem of indexing binary strings for jumbled pattern matching, and further lowering the time bounds in terms of the input size would now be a breakthrough with broad implications. We can still make progress on the problem, however, by considering other natural parameters. Badkobeh et al. (IPL, 2013) and Amir et al. (TCS, 2016) gave algorithms that index a binary string in O(n + r^2 log r) time, where n is the length and r is the number of runs, and Giaquinta and Grabowski (IPL, 2013) gave one that runs in O(n + r^2) time. In this paper we propose a new and very simple algorithm that also runs in O(n + r^2) time and can be extended either so that the index returns the position of a match (if there is one), or so that the algorithm uses only O(n) bits of space instead of O(n) words.

Cite as

Luís Cunha, Simone Dantas, Travis Gagie, Roland Wittler, Luis Kowada, and Jens Stoye. Fast and Simple Jumbled Indexing for Binary Run-Length Encoded Strings. In 28th Annual Symposium on Combinatorial Pattern Matching (CPM 2017). Leibniz International Proceedings in Informatics (LIPIcs), Volume 78, pp. 19:1-19:9, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2017)

@InProceedings{cunha_et_al:LIPIcs.CPM.2017.19,
  author =	{Cunha, Lu{\'\i}s and Dantas, Simone and Gagie, Travis and Wittler, Roland and Kowada, Luis and Stoye, Jens},
  title =	{{Fast and Simple Jumbled Indexing for Binary Run-Length Encoded Strings}},
  booktitle =	{28th Annual Symposium on Combinatorial Pattern Matching (CPM 2017)},
  pages =	{19:1--19:9},
  series =	{Leibniz International Proceedings in Informatics (LIPIcs)},
  ISBN =	{978-3-95977-039-2},
  ISSN =	{1868-8969},
  year =	{2017},
  volume =	{78},
  editor =	{K{\"a}rkk{\"a}inen, Juha and Radoszewski, Jakub and Rytter, Wojciech},
  publisher =	{Schloss Dagstuhl -- Leibniz-Zentrum f{\"u}r Informatik},
  address =	{Dagstuhl, Germany},
  URL =		{https://drops.dagstuhl.de/entities/document/10.4230/LIPIcs.CPM.2017.19},
  URN =		{urn:nbn:de:0030-drops-73418},
  doi =		{10.4230/LIPIcs.CPM.2017.19},
  annote =	{Keywords: string algorithms, indexing, jumbled pattern matching, run-length encoding}
}

Document

DOI: 10.4230/DagSemProc.10231.2

A New Linear Time Algorithm to Compute the Genomic Distance Via the Double Cut and Join Distance

Authors: Anne Bergeron, Julia Mixtacki, and Jens Stoye

Published in: Dagstuhl Seminar Proceedings, Volume 10231, Structure Discovery in Biology: Motifs, Networks & Phylogenies (2010)

Abstract

The genomic distance problem in the Hannenhalli-Pevzner (HP) theory is the following: Given two genomes whose chromosomes are linear, calculate the minimum number of translocations, fusions, fissions and inversions that transform one genome into the other. We will present a new distance formula based on a simple tree structure that captures all the delicate features of this problem in a unifying way, and a linear-time algorithm for computing this distance.

Cite as

Anne Bergeron, Julia Mixtacki, and Jens Stoye. A New Linear Time Algorithm to Compute the Genomic Distance Via the Double Cut and Join Distance. In Structure Discovery in Biology: Motifs, Networks & Phylogenies. Dagstuhl Seminar Proceedings, Volume 10231, pp. 1-25, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2010)

@InProceedings{bergeron_et_al:DagSemProc.10231.2,
  author =	{Bergeron, Anne and Mixtacki, Julia and Stoye, Jens},
  title =	{{A New Linear Time Algorithm to Compute the Genomic Distance Via the Double Cut and Join Distance}},
  booktitle =	{Structure Discovery in Biology: Motifs, Networks \& Phylogenies},
  pages =	{1--25},
  series =	{Dagstuhl Seminar Proceedings (DagSemProc)},
  ISSN =	{1862-4405},
  year =	{2010},
  volume =	{10231},
  editor =	{Apostolico, Alberto and Dress, Andreas and Parida, Laxmi},
  publisher =	{Schloss Dagstuhl -- Leibniz-Zentrum f{\"u}r Informatik},
  address =	{Dagstuhl, Germany},
  URL =		{https://drops.dagstuhl.de/entities/document/10.4230/DagSemProc.10231.2},
  URN =		{urn:nbn:de:0030-drops-26892},
  doi =		{10.4230/DagSemProc.10231.2},
  annote =	{Keywords: Comparative genomics, genomic distance computation, HP theory}
}

Search Results

Documents authored by Stoye, Jens
