Search Results

Documents authored by Baumbach, Jan


Document
GEDEVO: An Evolutionary Graph Edit Distance Algorithm for Biological Network Alignment

Authors: Rashid Ibragimov, Maximilian Malek, Jiong Guo, and Jan Baumbach

Published in: OASIcs, Volume 34, German Conference on Bioinformatics 2013


Abstract
Introduction: With the so-called OMICS technology the scientific community has generated huge amounts of data that allow us to reconstruct the interplay of all kinds of biological entities. The emerging interaction networks are usually modeled as graphs with thousands of nodes and tens of thousands of edges between them. In addition to sequence alignment, the comparison of biological networks has proven great potential to infer the biological function of proteins and genes. However, the corresponding network alignment problem is computationally hard and theoretically intractable for real world instances. Results: We therefore developed GEDEVO, a novel tool for efficient graph comparison dedicated to real-world size biological networks. Underlying our approach is the so-called Graph Edit Distance (GED) model, where one graph is to be transferred into another one, with a minimal number of (or more general: minimal costs for) edge insertions and deletions. We present a novel evolutionary algorithm aiming to minimize the GED, and we compare our implementation against state of the art tools: SPINAL, GHOST, \CGRAAL, and \MIGRAAL. On a set of protein-protein interaction networks from different organisms we demonstrate that GEDEVO outperforms the current methods. It thus refines the previously suggested alignments based on topological information only. Conclusion: With GEDEVO, we account for the constantly exploding number and size of available biological networks. The software as well as all used data sets are publicly available at http://gedevo.mpi-inf.mpg.de.

Cite as

Rashid Ibragimov, Maximilian Malek, Jiong Guo, and Jan Baumbach. GEDEVO: An Evolutionary Graph Edit Distance Algorithm for Biological Network Alignment. In German Conference on Bioinformatics 2013. Open Access Series in Informatics (OASIcs), Volume 34, pp. 68-79, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2013)


Copy BibTex To Clipboard

@InProceedings{ibragimov_et_al:OASIcs.GCB.2013.68,
  author =	{Ibragimov, Rashid and Malek, Maximilian and Guo, Jiong and Baumbach, Jan},
  title =	{{GEDEVO: An Evolutionary Graph Edit Distance Algorithm for Biological Network Alignment}},
  booktitle =	{German Conference on Bioinformatics 2013},
  pages =	{68--79},
  series =	{Open Access Series in Informatics (OASIcs)},
  ISBN =	{978-3-939897-59-0},
  ISSN =	{2190-6807},
  year =	{2013},
  volume =	{34},
  editor =	{Bei{\ss}barth, Tim and Kollmar, Martin and Leha, Andreas and Morgenstern, Burkhard and Schultz, Anne-Kathrin and Waack, Stephan and Wingender, Edgar},
  publisher =	{Schloss Dagstuhl -- Leibniz-Zentrum f{\"u}r Informatik},
  address =	{Dagstuhl, Germany},
  URL =		{https://drops.dagstuhl.de/entities/document/10.4230/OASIcs.GCB.2013.68},
  URN =		{urn:nbn:de:0030-drops-42298},
  doi =		{10.4230/OASIcs.GCB.2013.68},
  annote =	{Keywords: Network Alignment, Graph Edit Distance, Evolutionary Algorithm, Protein-Protein Interactions}
}
Document
Online Transitivity Clustering of Biological Data with Missing Values

Authors: Richard Röttger, Christoph Kreutzer, Thuy Duong Vu, Tobias Wittkop, and Jan Baumbach

Published in: OASIcs, Volume 26, German Conference on Bioinformatics 2012


Abstract
Motivation: Equipped with sophisticated biochemical measurement techniques we generate a massive amount of biomedical data that needs to be analyzed computationally. One long-standing challenge in automatic knowledge extraction is clustering. We seek to partition a set of objects into groups such that the objects within the clusters share common traits. Usually, we have given a similarity matrix computed from a pairwise similarity function. While many approaches for biomedical data clustering exist, most methods neglect two important problems: (1) Computing the similarity matrix might not be trivial but resource-intense. (2) A clustering algorithm itself is not sufficient for the biologist, who needs an integrated online system capable of performing preparative and follow-up tasks as well. Results: Here, we present a significantly extended version of Transitivity Clustering. Our first main contribution is its’ capability of dealing with missing values in the similarity matrix such that we save time and memory. Hence, we reduce one main bottleneck of computing all pairwise similarity values. We integrated this functionality into the Weighted Graph Cluster Editing model underlying Transitivity Clustering. By means of identifying protein (super)families from incomplete all-vs-all BLAST results we demonstrate the robustness of our approach. While most tools concentrate on the partitioning process itself, we present a new, intuitive web interface that aids with all important steps of a cluster analysis: (1) computing and post-processing of a similarity matrix, (2) estimation of a meaningful density parameter, (3) clustering, (4) comparison with given gold standards, and (5) fine-tuning of the clustering by varying the parameters. Availability: Transitivity Clustering, the new Cost Matrix Creator, all used data sets as well as an online documentation are online available at http://transclust.mmci.uni-saarland.de/.

Cite as

Richard Röttger, Christoph Kreutzer, Thuy Duong Vu, Tobias Wittkop, and Jan Baumbach. Online Transitivity Clustering of Biological Data with Missing Values. In German Conference on Bioinformatics 2012. Open Access Series in Informatics (OASIcs), Volume 26, pp. 57-68, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2012)


Copy BibTex To Clipboard

@InProceedings{rottger_et_al:OASIcs.GCB.2012.57,
  author =	{R\"{o}ttger, Richard and Kreutzer, Christoph and Vu, Thuy Duong and Wittkop, Tobias and Baumbach, Jan},
  title =	{{Online Transitivity Clustering of Biological Data with Missing Values}},
  booktitle =	{German Conference on Bioinformatics 2012},
  pages =	{57--68},
  series =	{Open Access Series in Informatics (OASIcs)},
  ISBN =	{978-3-939897-44-6},
  ISSN =	{2190-6807},
  year =	{2012},
  volume =	{26},
  editor =	{B\"{o}cker, Sebastian and Hufsky, Franziska and Scheubert, Kerstin and Schleicher, Jana and Schuster, Stefan},
  publisher =	{Schloss Dagstuhl -- Leibniz-Zentrum f{\"u}r Informatik},
  address =	{Dagstuhl, Germany},
  URL =		{https://drops.dagstuhl.de/entities/document/10.4230/OASIcs.GCB.2012.57},
  URN =		{urn:nbn:de:0030-drops-37189},
  doi =		{10.4230/OASIcs.GCB.2012.57},
  annote =	{Keywords: Transitivity Clustering, Large Scale clustering, Missing Values, Web Interface}
}
Questions / Remarks / Feedback
X

Feedback for Dagstuhl Publishing


Thanks for your feedback!

Feedback submitted

Could not send message

Please try again later or send an E-mail