Inferring Temporally Consistent Migration Histories

Authors: Mrinmoy Saha Roddur, Sagi Snir, Mohammed El-Kebir


  • Filesize: 1.31 MB
  • 22 pages

Author Details

Mrinmoy Saha Roddur
  • Department of Computer Science, University of Illinois at Urbana-Champaign, IL, USA
Sagi Snir
  • Department of Evolutionary Biology, University of Haifa, Israel
Mohammed El-Kebir
  • Department of Computer Science, University of Illinois at Urbana-Champaign, IL, USA
  • Cancer Center at Illinois, University of Illinois at Urbana-Champaign, IL, USA


This project started as a collaboration at the Computational Genomics Summer Institute 2022.

Cite As

Mrinmoy Saha Roddur, Sagi Snir, and Mohammed El-Kebir. Inferring Temporally Consistent Migration Histories. In 23rd International Workshop on Algorithms in Bioinformatics (WABI 2023). Leibniz International Proceedings in Informatics (LIPIcs), Volume 273, pp. 9:1-9:22, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2023)


Not only do many biological populations undergo evolution, but population members may also migrate from one location to another. For example, tumor cells may migrate from the primary tumor and seed a new metastasis, and pathogens may migrate from one host to another. One may represent a population’s migration history by labeling the vertices of a given phylogeny T with locations such that an edge incident to vertices with distinct locations represents a migration. Additionally, in some biological populations, taxa from distinct lineages may comigrate from one location to another in a single event, a phenomenon known as a comigration. Here, we show that a previous problem statement for inferring migration histories that are parsimonious in terms of migrations and comigrations may lead to temporally inconsistent solutions. To remedy this deficiency, we introduce precise definitions of temporal consistency of comigrations in a phylogeny, leading to three successive problems. First, we formulate the Temporally Consistent Comigrations (TCC) problem to check if a set of comigrations is temporally consistent and provide a linear time algorithm for solving this problem. Second, we formulate the Parsimonious Consistent Comigration (PCC) problem, which aims to find comigrations given a location labeling of a phylogeny. We show that PCC is NP-hard. Third, we formulate the Parsimonious Consistent Comigration History (PCCH) problem, which infers the migration history given a phylogeny and locations of its extant vertices only. We show that PCCH is NP-hard as well. On the positive side, we propose integer linear programming models to solve the PCC and PCCH problems. We apply our approach to real and simulated data.
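The labeling view of a migration history described above can be illustrated with a minimal sketch. It assumes a phylogeny given as a list of (parent, child) edges and a location label for every vertex, and it reports the edges whose endpoints have distinct locations, which the abstract calls migrations. The function name `migration_edges`, the toy tree, and the location names are hypothetical and chosen only for illustration; the sketch does not model comigrations or temporal consistency, whose precise definitions are given in the full paper.

```python
from typing import Dict, List, Tuple

Edge = Tuple[str, str]  # (parent, child) in the phylogeny T


def migration_edges(edges: List[Edge], location: Dict[str, str]) -> List[Edge]:
    """Return the edges whose two endpoints carry different location labels."""
    return [(u, v) for (u, v) in edges if location[u] != location[v]]


# Hypothetical toy phylogeny rooted at "r" (illustrative data, not from the paper).
edges = [("r", "a"), ("r", "b"), ("a", "c"), ("a", "d"), ("b", "e")]
location = {
    "r": "primary",
    "a": "primary",
    "b": "liver",
    "c": "lung",
    "d": "primary",
    "e": "liver",
}

migrations = migration_edges(edges, location)
print(migrations)       # the edges (r, b) and (a, c) cross between locations
print(len(migrations))  # number of migrations under this labeling
```

In the PCCH setting only the leaf locations would be given, so the labels of internal vertices such as `r`, `a`, and `b` would be unknowns to infer rather than inputs.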

Subject Classification

ACM Subject Classification
  • Applied computing → Computational biology
Keywords
  • Metastasis
  • Migration
  • Integer Linear Programming
  • Maximum parsimony



