4 Search Results for "Marcus, Shoshana"


Document
Double String Tandem Repeats

Authors: Amihood Amir, Ayelet Butman, Gad M. Landau, Shoshana Marcus, and Dina Sokol

Published in: LIPIcs, Volume 161, 31st Annual Symposium on Combinatorial Pattern Matching (CPM 2020)


Abstract
A tandem repeat is an occurrence of two adjacent identical substrings. In this paper, we introduce the notion of a double string, which consists of two parallel strings, and we study the problem of locating all tandem repeats in a double string. The problem introduced here has applications beyond actual double strings, as we illustrate by solving two different problems with the algorithm for the double string tandem repeats problem. The first problem is that of finding all corner-sharing tandems in a 2-dimensional text, defined by Apostolico and Brimkov. The second problem is that of finding all scaled tandem repeats in a 1-dimensional text, where a scaled tandem repeat is defined as a string UU' such that U' is a discrete scale of U. In addition to the algorithms for exact tandem repeats, we also present algorithms that solve the problem in the inexact sense, allowing up to k mismatches. We believe that this framework will open a new perspective for other problems in the future.
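
To make the opening definition concrete, here is a minimal brute-force sketch that lists tandem repeats in an ordinary (single) string. It is an illustration only and is not the algorithm of the paper; the function name `tandem_repeats` and the `(start, half_length)` output convention are assumptions made for this example.

```python
from typing import List, Tuple


def tandem_repeats(s: str) -> List[Tuple[int, int]]:
    """Return (start, half_length) for every tandem repeat UU in s.

    Brute force: try every half-length and every start position, and
    compare the two adjacent substrings directly. Cubic time in the
    worst case, so suitable only for small illustrative inputs.
    """
    n = len(s)
    found = []
    for half in range(1, n // 2 + 1):
        for start in range(n - 2 * half + 1):
            left = s[start:start + half]
            right = s[start + half:start + 2 * half]
            if left == right:
                found.append((start, half))
    return found


if __name__ == "__main__":
    # "abcabc" is the tandem repeat UU with U = "abc", starting at index 0.
    print(tandem_repeats("abcabc"))
```

For example, `tandem_repeats("aaa")` reports two overlapping repeats of half-length 1, at positions 0 and 1.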

Cite as

Amihood Amir, Ayelet Butman, Gad M. Landau, Shoshana Marcus, and Dina Sokol. Double String Tandem Repeats. In 31st Annual Symposium on Combinatorial Pattern Matching (CPM 2020). Leibniz International Proceedings in Informatics (LIPIcs), Volume 161, pp. 3:1-3:13, Schloss Dagstuhl - Leibniz-Zentrum für Informatik (2020)


@InProceedings{amir_et_al:LIPIcs.CPM.2020.3,
  author =	{Amir, Amihood and Butman, Ayelet and Landau, Gad M. and Marcus, Shoshana and Sokol, Dina},
  title =	{{Double String Tandem Repeats}},
  booktitle =	{31st Annual Symposium on Combinatorial Pattern Matching (CPM 2020)},
  pages =	{3:1--3:13},
  series =	{Leibniz International Proceedings in Informatics (LIPIcs)},
  ISBN =	{978-3-95977-149-8},
  ISSN =	{1868-8969},
  year =	{2020},
  volume =	{161},
  editor =	{G{\o}rtz, Inge Li and Weimann, Oren},
  publisher =	{Schloss Dagstuhl -- Leibniz-Zentrum f{\"u}r Informatik},
  address =	{Dagstuhl, Germany},
  URL =		{https://drops.dagstuhl.de/entities/document/10.4230/LIPIcs.CPM.2020.3},
  URN =		{urn:nbn:de:0030-drops-121283},
  doi =		{10.4230/LIPIcs.CPM.2020.3},
  annote =	{Keywords: double string, tandem repeat, 2-dimensional, scale}
}
Document
Read Mapping on Genome Variation Graphs

Authors: Kavya Vaddadi, Rajgopal Srinivasan, and Naveen Sivadasan

Published in: LIPIcs, Volume 143, 19th International Workshop on Algorithms in Bioinformatics (WABI 2019)


Abstract
Genome variation graphs are natural candidates to represent a pangenome collection. In such graphs, common subsequences are encoded as vertices and the genomic variations are captured by introducing additional labeled vertices and directed edges. Unlike a linear reference, a reference graph allows a rich representation of the genomic diversities and avoids reference bias. We address the fundamental problem of mapping reads to genome variation graphs. We give a novel mapping algorithm, V-MAP, for efficient identification of a small subgraph of the genome graph for optimal gapped alignment of the read. V-MAP creates a space-efficient index using locality-sensitive minimizer signatures computed using a novel graph winnowing and graph embedding onto a metric space for fast and accurate mapping. Experiments involving a graph constructed from the 1000 Genomes data and using both real and simulated reads show that V-MAP is fast and memory efficient, and can map short reads as well as PacBio/Nanopore long reads with high accuracy. V-MAP performance was significantly better than the state of the art, especially for long reads.

Cite as

Kavya Vaddadi, Rajgopal Srinivasan, and Naveen Sivadasan. Read Mapping on Genome Variation Graphs. In 19th International Workshop on Algorithms in Bioinformatics (WABI 2019). Leibniz International Proceedings in Informatics (LIPIcs), Volume 143, pp. 7:1-7:17, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2019)


@InProceedings{vaddadi_et_al:LIPIcs.WABI.2019.7,
  author =	{Vaddadi, Kavya and Srinivasan, Rajgopal and Sivadasan, Naveen},
  title =	{{Read Mapping on Genome Variation Graphs}},
  booktitle =	{19th International Workshop on Algorithms in Bioinformatics (WABI 2019)},
  pages =	{7:1--7:17},
  series =	{Leibniz International Proceedings in Informatics (LIPIcs)},
  ISBN =	{978-3-95977-123-8},
  ISSN =	{1868-8969},
  year =	{2019},
  volume =	{143},
  editor =	{Huber, Katharina T. and Gusfield, Dan},
  publisher =	{Schloss Dagstuhl -- Leibniz-Zentrum f{\"u}r Informatik},
  address =	{Dagstuhl, Germany},
  URL =		{https://drops.dagstuhl.de/entities/document/10.4230/LIPIcs.WABI.2019.7},
  URN =		{urn:nbn:de:0030-drops-110375},
  doi =		{10.4230/LIPIcs.WABI.2019.7},
  annote =	{Keywords: read mapping, pangenome, genome variation graphs, locality sensitive hashing}
}
Document
Validating Paired-End Read Alignments in Sequence Graphs

Authors: Chirag Jain, Haowen Zhang, Alexander Dilthey, and Srinivas Aluru

Published in: LIPIcs, Volume 143, 19th International Workshop on Algorithms in Bioinformatics (WABI 2019)


Abstract
Graph-based non-linear reference structures such as variation graphs and colored de Bruijn graphs enable incorporation of full genomic diversity within a population. However, transitioning from a simple string-based reference to graphs requires addressing many computational challenges, one of which concerns accurately mapping sequencing read sets to graphs. Paired-end Illumina sequencing is a commonly used sequencing platform in genomics, where the paired-end distance constraints allow disambiguation of repeats. Many recent works have explored provably good index-based and alignment-based strategies for mapping individual reads to graphs. However, validating distance constraints efficiently over graphs is not trivial, and existing sequence-to-graph mappers rely on heuristics. We introduce a mathematical formulation of the problem, and provide a new algorithm to solve it exactly. We take advantage of the high sparsity of reference graphs, and use sparse matrix-matrix multiplications (SpGEMM) to build an index which can be queried efficiently by a mapping algorithm for validating the distance constraints. Effectiveness of the algorithm is demonstrated using real reference graphs, including a human MHC variation graph, and a pan-genome de Bruijn graph built using genomes of 20 B. anthracis strains. While the one-time indexing time can vary from a few minutes to a few hours using our algorithm, answering a million distance queries takes less than a second.

Cite as

Chirag Jain, Haowen Zhang, Alexander Dilthey, and Srinivas Aluru. Validating Paired-End Read Alignments in Sequence Graphs. In 19th International Workshop on Algorithms in Bioinformatics (WABI 2019). Leibniz International Proceedings in Informatics (LIPIcs), Volume 143, pp. 17:1-17:13, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2019)


@InProceedings{jain_et_al:LIPIcs.WABI.2019.17,
  author =	{Jain, Chirag and Zhang, Haowen and Dilthey, Alexander and Aluru, Srinivas},
  title =	{{Validating Paired-End Read Alignments in Sequence Graphs}},
  booktitle =	{19th International Workshop on Algorithms in Bioinformatics (WABI 2019)},
  pages =	{17:1--17:13},
  series =	{Leibniz International Proceedings in Informatics (LIPIcs)},
  ISBN =	{978-3-95977-123-8},
  ISSN =	{1868-8969},
  year =	{2019},
  volume =	{143},
  editor =	{Huber, Katharina T. and Gusfield, Dan},
  publisher =	{Schloss Dagstuhl -- Leibniz-Zentrum f{\"u}r Informatik},
  address =	{Dagstuhl, Germany},
  URL =		{https://drops.dagstuhl.de/entities/document/10.4230/LIPIcs.WABI.2019.17},
  URN =		{urn:nbn:de:0030-drops-110470},
  doi =		{10.4230/LIPIcs.WABI.2019.17},
  annote =	{Keywords: Sequence graphs, read mapping, index, sparse matrix-matrix multiplication}
}
Document
Two-Dimensional Maximal Repetitions

Authors: Amihood Amir, Gad M. Landau, Shoshana Marcus, and Dina Sokol

Published in: LIPIcs, Volume 112, 26th Annual European Symposium on Algorithms (ESA 2018)


Abstract
Maximal repetitions, or runs, in strings have a wide array of applications and thus have been extensively studied. In this paper, we extend this notion to two dimensions, precisely defining a maximal 2D repetition. We provide initial bounds on the number of maximal 2D repetitions that can occur in a matrix. The main contribution of this paper is the presentation of the first algorithm for locating all maximal 2D repetitions in a matrix. The algorithm is efficient and straightforward, with runtime O(n^2 log n log log n + rho log n), where n^2 is the size of the input, and rho is the number of 2D repetitions in the output.

Cite as

Amihood Amir, Gad M. Landau, Shoshana Marcus, and Dina Sokol. Two-Dimensional Maximal Repetitions. In 26th Annual European Symposium on Algorithms (ESA 2018). Leibniz International Proceedings in Informatics (LIPIcs), Volume 112, pp. 2:1-2:14, Schloss Dagstuhl - Leibniz-Zentrum für Informatik (2018)


@InProceedings{amir_et_al:LIPIcs.ESA.2018.2,
  author =	{Amir, Amihood and Landau, Gad M. and Marcus, Shoshana and Sokol, Dina},
  title =	{{Two-Dimensional Maximal Repetitions}},
  booktitle =	{26th Annual European Symposium on Algorithms (ESA 2018)},
  pages =	{2:1--2:14},
  series =	{Leibniz International Proceedings in Informatics (LIPIcs)},
  ISBN =	{978-3-95977-081-1},
  ISSN =	{1868-8969},
  year =	{2018},
  volume =	{112},
  editor =	{Azar, Yossi and Bast, Hannah and Herman, Grzegorz},
  publisher =	{Schloss Dagstuhl -- Leibniz-Zentrum f{\"u}r Informatik},
  address =	{Dagstuhl, Germany},
  URL =		{https://drops.dagstuhl.de/entities/document/10.4230/LIPIcs.ESA.2018.2},
  URN =		{urn:nbn:de:0030-drops-94652},
  doi =		{10.4230/LIPIcs.ESA.2018.2},
  annote =	{Keywords: pattern matching algorithms, repetitions, periodicity, two-dimensional}
}