Weighted Minimum-Length Rearrangement Scenarios

Authors: Pijus Simonaitis, Annie Chateau, Krister M. Swenson



Author Details

Pijus Simonaitis
  • LIRMM - Université Montpellier, France
Annie Chateau
  • LIRMM - Université Montpellier, France
Krister M. Swenson
  • LIRMM, CNRS - Université Montpellier, France

Cite As

Pijus Simonaitis, Annie Chateau, and Krister M. Swenson. Weighted Minimum-Length Rearrangement Scenarios. In 19th International Workshop on Algorithms in Bioinformatics (WABI 2019). Leibniz International Proceedings in Informatics (LIPIcs), Volume 143, pp. 13:1-13:17, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2019)
https://doi.org/10.4230/LIPIcs.WABI.2019.13

Abstract

We present the first known model of genome rearrangement with an arbitrary real-valued weight function on the rearrangements. It is based on the dominant model for the mathematical and algorithmic study of genome rearrangement, Double Cut and Join (DCJ). Our objective function is the sum or product of the weights of the DCJs in an evolutionary scenario, and the function can be minimized or maximized. If the likelihood of observing an independent DCJ was estimated based on biological conditions, for example, then this objective function could be the likelihood of observing the independent DCJs together in a scenario. We present an O(n⁴)-time dynamic programming algorithm solving the Minimum Cost Parsimonious Scenario (MCPS) problem for co-tailed genomes with n genes (or syntenic blocks). Combining this with our previous work on MCPS yields a polynomial-time algorithm for general genomes. The key theoretical contribution is a novel link between the parsimonious DCJ (or 2-break) scenarios and quadrangulations of a regular polygon. To demonstrate that our algorithm is fast enough to treat biological data, we run it on syntenic blocks constructed for Human paired with Chimpanzee, Gibbon, Mouse, and Chicken. We argue that the Human and Gibbon pair is a particularly interesting model for the study of weighted genome rearrangements.
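To make the link between dynamic programming and quadrangulations of a polygon concrete, here is a minimal sketch of a generic interval dynamic program for minimum-weight quadrangulation of a convex polygon. It is not the authors' MCPS algorithm: the function name `min_weight_quadrangulation` and the `weight` callback are hypothetical, and the mapping from DCJs to quadrilaterals is not modeled. It only illustrates why such a recurrence naturally has O(n²) states with O(n²) choices each, which gives O(n⁴) time overall.

```python
from functools import lru_cache
from math import inf
from typing import Callable

# Hypothetical weight callback: cost of the quadrilateral on vertices (i, a, b, j).
Weight = Callable[[int, int, int, int], float]


def min_weight_quadrangulation(m: int, weight: Weight) -> float:
    """Minimum total weight of a quadrangulation of a convex polygon.

    Vertices are labeled 0..m-1 in order around the polygon. A convex polygon
    can be cut into quadrilaterals only if m is even; m == 2 is treated as a
    degenerate polygon (a single edge) with cost 0.
    """
    if m < 2 or m % 2 != 0:
        raise ValueError("m must be an even integer >= 2")

    @lru_cache(maxsize=None)
    def best(i: int, j: int) -> float:
        # Sub-polygon on vertices i, i+1, ..., j, closed by the edge (i, j).
        if j - i == 1:
            return 0.0  # a single edge needs no quadrilateral
        result = inf
        # Choose the quadrilateral (i, a, b, j) that contains the edge (i, j).
        # Stepping by 2 keeps every remaining sub-polygon even-sized.
        for a in range(i + 1, j, 2):
            for b in range(a + 1, j, 2):
                cost = weight(i, a, b, j) + best(i, a) + best(a, b) + best(b, j)
                result = min(result, cost)
        return result

    return best(0, m - 1)


# Usage: with unit weights the optimum is the number of quadrilaterals.
if __name__ == "__main__":
    print(min_weight_quadrangulation(8, lambda i, a, b, j: 1.0))
```

Under the same assumptions, a product of strictly positive weights can be handled by summing logarithms of the weights, and maximization by negating them.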

Subject Classification

ACM Subject Classification
  • Applied computing → Bioinformatics
Keywords
  • Weighted genome rearrangement
  • Double cut and join (DCJ)
  • Edge switch
  • Minimum-weight quadrangulation
