A Linear Time Algorithm for an Extended Version of the Breakpoint Double Distance

Authors Marília D. V. Braga, Leonie R. Brockmann, Katharina Klerx, Jens Stoye



Author Details

Marília D. V. Braga
  • Faculty of Technology and Center for Biotechnology (CeBiTec), Bielefeld University, Germany
Leonie R. Brockmann
  • Faculty of Technology and Center for Biotechnology (CeBiTec), Bielefeld University, Germany
Katharina Klerx
  • Faculty of Technology and Center for Biotechnology (CeBiTec), Bielefeld University, Germany
Jens Stoye
  • Faculty of Technology and Center for Biotechnology (CeBiTec), Bielefeld University, Germany

Acknowledgements

We would like to thank Cedric Chauve for drawing our attention to the class of σ_k distances as a means of studying the hardness bound between the breakpoint distance and the DCJ distance in combinatorial problems related to genome evolution. Thanks also to Eloi Araujo, Daniel Doerr and Fábio H. V. Martinez for helping us study the median problem under this class.

Cite As

Marília D. V. Braga, Leonie R. Brockmann, Katharina Klerx, and Jens Stoye. A Linear Time Algorithm for an Extended Version of the Breakpoint Double Distance. In 22nd International Workshop on Algorithms in Bioinformatics (WABI 2022). Leibniz International Proceedings in Informatics (LIPIcs), Volume 242, pp. 13:1-13:16, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2022) https://doi.org/10.4230/LIPIcs.WABI.2022.13

Abstract

Two genomes over the same set of gene families form a canonical pair when each of them has exactly one gene from each family. A genome is circular when it contains only circular chromosomes. Several distances between canonical circular genomes can be derived from a structure called the breakpoint graph, which represents the relation between the two given genomes as a collection of cycles of even length. The breakpoint distance is equal to n-c_2, where n is the number of genes and c_2 is the number of cycles of length 2. Similarly, when the considered rearrangements are those modeled by the double-cut-and-join (DCJ) operation, the rearrangement distance is n-c, where c is the total number of cycles.
The distance problem is a basic unit for several other combinatorial problems related to genome evolution and ancestral reconstruction, such as the median and the double distance. Interestingly, both the median and the double distance problems can be solved in polynomial time for the breakpoint distance, while they are NP-hard for the rearrangement distance. One way of exploring the complexity space between these two extremes is to consider a σ_k distance, defined as n-(c_2+c_4+…+c_k), where c_i is the number of cycles of length i, and to investigate the complexities of the median and the double distance for increasing k: first for the σ_4 distance, then for the σ_6 distance, and so on. For the median, considerable effort in our and in other research groups has yielded no progress, even for the σ_4 distance. For the double distance under the σ_4 and σ_6 distances, however, we devised linear time algorithms, which we present here.
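The distance definitions above can be illustrated with a minimal Python sketch. It is not the double distance algorithm of the paper; it only computes the cycle lengths of the breakpoint graph for a canonical pair of circular genomes and derives the σ_k distance from them. The genome encoding (each chromosome a list of signed integers, read circularly) and the function names adjacencies, cycle_lengths and sigma_k are assumptions made for this illustration.

```python
from collections import Counter


def adjacencies(genome):
    """Map every gene extremity to its neighbour in a circular genome.

    Assumed encoding: a genome is a list of circular chromosomes, each a
    list of signed integers. An extremity is (gene, "t") or (gene, "h").
    """
    partner = {}
    for chromosome in genome:
        for i, gene in enumerate(chromosome):
            nxt = chromosome[(i + 1) % len(chromosome)]
            # A gene read forwards ends at its head, read backwards at its tail.
            right = (abs(gene), "h" if gene > 0 else "t")
            left = (abs(nxt), "t" if nxt > 0 else "h")
            partner[right] = left
            partner[left] = right
    return partner


def cycle_lengths(genome_a, genome_b):
    """Count breakpoint graph cycles by length (length = number of edges)."""
    adj_a, adj_b = adjacencies(genome_a), adjacencies(genome_b)
    seen = set()
    lengths = Counter()
    for start in adj_a:
        if start in seen:
            continue
        length, v = 0, start
        while v not in seen:
            seen.add(v)
            w = adj_a[v]   # follow an adjacency edge of genome A
            seen.add(w)
            v = adj_b[w]   # then an adjacency edge of genome B
            length += 2
        lengths[length] += 1
    return lengths


def sigma_k(genome_a, genome_b, k=None):
    """Return n - (c_2 + c_4 + ... + c_k); k=None counts all cycles (DCJ)."""
    n = sum(len(chromosome) for chromosome in genome_a)
    lengths = cycle_lengths(genome_a, genome_b)
    counted = sum(count for length, count in lengths.items()
                  if k is None or length <= k)
    return n - counted


if __name__ == "__main__":
    a, b = [[1, 2, 3, 4]], [[1, -3, -2, 4]]   # b differs from a by one inversion
    print(dict(cycle_lengths(a, b)))          # cycle counts keyed by length
    print(sigma_k(a, b, 2), sigma_k(a, b, 4), sigma_k(a, b))
```

In this sketch, sigma_k with k=2 gives the breakpoint distance and k=None gives the DCJ distance, so intermediate values of k interpolate between the two. Each extremity is visited once, so the computation is linear in the number of genes.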

Subject Classification

ACM Subject Classification
  • Applied computing → Bioinformatics
Keywords
  • Comparative genomics
  • genome rearrangement
  • breakpoint distance
  • double-cut-and-join (DCJ) distance
  • double distance
