Search Results

Documents authored by Shen, Yihang


Document
Revisiting the Complexity of and Algorithms for the Graph Traversal Edit Distance and Its Variants

Authors: Yutong Qiu, Yihang Shen, and Carl Kingsford

Published in: LIPIcs, Volume 273, 23rd International Workshop on Algorithms in Bioinformatics (WABI 2023)


Abstract
The graph traversal edit distance (GTED), introduced by Ebrahimpour Boroojeny et al. (2018), is an elegant distance measure defined as the minimum edit distance between strings reconstructed from Eulerian trails in two edge-labeled graphs. GTED can be used to infer evolutionary relationships between species by comparing de Bruijn graphs directly without the computationally costly and error-prone process of genome assembly. Ebrahimpour Boroojeny et al. (2018) propose two ILP formulations for GTED and claim that GTED is polynomially solvable because the linear programming relaxation of one of the ILPs will always yield optimal integer solutions. The claim that GTED is polynomially solvable is contradictory to the complexity of existing string-to-graph matching problems. We resolve this conflict in complexity results by proving that GTED is NP-complete and showing that the ILPs proposed by Ebrahimpour Boroojeny et al. do not solve GTED but instead solve for a lower bound of GTED and are not solvable in polynomial time. In addition, we provide the first two correct ILP formulations of GTED and evaluate their empirical efficiency. These results provide solid algorithmic foundations for comparing genome graphs and point to the direction of heuristics that estimate GTED efficiently.

Cite as

Yutong Qiu, Yihang Shen, and Carl Kingsford. Revisiting the Complexity of and Algorithms for the Graph Traversal Edit Distance and Its Variants. In 23rd International Workshop on Algorithms in Bioinformatics (WABI 2023). Leibniz International Proceedings in Informatics (LIPIcs), Volume 273, pp. 11:1-11:22, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2023)


@InProceedings{qiu_et_al:LIPIcs.WABI.2023.11,
  author =	{Qiu, Yutong and Shen, Yihang and Kingsford, Carl},
  title =	{{Revisiting the Complexity of and Algorithms for the Graph Traversal Edit Distance and Its Variants}},
  booktitle =	{23rd International Workshop on Algorithms in Bioinformatics (WABI 2023)},
  pages =	{11:1--11:22},
  series =	{Leibniz International Proceedings in Informatics (LIPIcs)},
  ISBN =	{978-3-95977-294-5},
  ISSN =	{1868-8969},
  year =	{2023},
  volume =	{273},
  editor =	{Belazzougui, Djamal and Ouangraoua, A\"{i}da},
  publisher =	{Schloss Dagstuhl -- Leibniz-Zentrum f{\"u}r Informatik},
  address =	{Dagstuhl, Germany},
  URL =		{https://drops.dagstuhl.de/entities/document/10.4230/LIPIcs.WABI.2023.11},
  URN =		{urn:nbn:de:0030-drops-186374},
  doi =		{10.4230/LIPIcs.WABI.2023.11},
  annote =	{Keywords: Integer Linear Programming, Genome Graphs, Flow Network, Graph Comparison}
}


Document
Topological Data Analysis Reveals Principles of Chromosome Structure in Cellular Differentiation

Authors: Natalie Sauerwald, Yihang Shen, and Carl Kingsford

Published in: LIPIcs, Volume 143, 19th International Workshop on Algorithms in Bioinformatics (WABI 2019)


Abstract
Topological data analysis (TDA) is a mathematically well-founded set of methods to derive robust information about the structure and topology of data. It has been applied successfully in several biological contexts. Derived primarily from algebraic topology, TDA rigorously identifies persistent features in complex data, making it well-suited to better understand the key features of three-dimensional chromosome structure. Chromosome structure has a significant influence in many diverse genomic processes and has recently been shown to relate to cellular differentiation. While there exist many methods to study specific substructures of chromosomes, we are still missing a global view of all geometric features of chromosomes. By applying TDA to the study of chromosome structure through differentiation across three cell lines, we provide insight into principles of chromosome folding and looping. We identify persistent connected components and one-dimensional topological features of chromosomes and characterize them across cell types and stages of differentiation. Availability: Scripts to reproduce the results from this study can be found at https://github.com/Kingsford-Group/hictda

Cite as

Natalie Sauerwald, Yihang Shen, and Carl Kingsford. Topological Data Analysis Reveals Principles of Chromosome Structure in Cellular Differentiation. In 19th International Workshop on Algorithms in Bioinformatics (WABI 2019). Leibniz International Proceedings in Informatics (LIPIcs), Volume 143, pp. 23:1-23:16, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2019)


@InProceedings{sauerwald_et_al:LIPIcs.WABI.2019.23,
  author =	{Sauerwald, Natalie and Shen, Yihang and Kingsford, Carl},
  title =	{{Topological Data Analysis Reveals Principles of Chromosome Structure in Cellular Differentiation}},
  booktitle =	{19th International Workshop on Algorithms in Bioinformatics (WABI 2019)},
  pages =	{23:1--23:16},
  series =	{Leibniz International Proceedings in Informatics (LIPIcs)},
  ISBN =	{978-3-95977-123-8},
  ISSN =	{1868-8969},
  year =	{2019},
  volume =	{143},
  editor =	{Huber, Katharina T. and Gusfield, Dan},
  publisher =	{Schloss Dagstuhl -- Leibniz-Zentrum f{\"u}r Informatik},
  address =	{Dagstuhl, Germany},
  URL =		{https://drops.dagstuhl.de/entities/document/10.4230/LIPIcs.WABI.2019.23},
  URN =		{urn:nbn:de:0030-drops-110537},
  doi =		{10.4230/LIPIcs.WABI.2019.23},
  annote =	{Keywords: topological data analysis, chromosome structure, Hi-C, topologically associating domains}
}