Sequence Searching Allowing for Non-Overlapping Adjacent Unbalanced Translocations

Authors Domenico Cantone, Simone Faro, Arianna Pavone



PDF
Thumbnail PDF

File

LIPIcs.WABI.2020.19.pdf
  • Filesize: 0.65 MB
  • 14 pages

Document Identifiers

Author Details

Domenico Cantone
  • University of Catania, Italy
Simone Faro
  • University of Catania, Italy
Arianna Pavone
  • University of Messina, Italy

Cite AsGet BibTex

Domenico Cantone, Simone Faro, and Arianna Pavone. Sequence Searching Allowing for Non-Overlapping Adjacent Unbalanced Translocations. In 20th International Workshop on Algorithms in Bioinformatics (WABI 2020). Leibniz International Proceedings in Informatics (LIPIcs), Volume 172, pp. 19:1-19:14, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2020)
https://doi.org/10.4230/LIPIcs.WABI.2020.19

Abstract

Unbalanced translocations are among the most frequent chromosomal alterations, accounted for 30% of all losses of heterozygosity, a major genetic event causing inactivation of tumor suppressor genes. Despite of their central role in genomic sequence analysis, little attention has been devoted to the problem of matching sequences allowing for this kind of chromosomal alteration. In this paper we investigate the approximate string matching problem when the edit operations are non-overlapping unbalanced translocations of adjacent factors. In particular, we first present a 𝒪(nm³)-time and 𝒪(m²)-space algorithm based on the dynamic-programming approach. Then we improve our first result by designing a second solution which makes use of the Directed Acyclic Word Graph of the pattern. In particular, we show that under the assumptions of equiprobability and independence of characters, our algorithm has a 𝒪(nlog²_{σ} m) average time complexity, for an alphabet of size σ, still maintaining the 𝒪(nm³)-time and the 𝒪(m²)-space complexity in the worst case. To the best of our knowledge this is the first solution in literature for the approximate string matching problem allowing for unbalanced translocations of factors.

Subject Classification

ACM Subject Classification
  • Theory of computation → Pattern matching
Keywords
  • Text processing
  • approximate matching
  • inversions
  • sequence matching

Metrics

  • Access Statistics
  • Total Accesses (updated on a weekly basis)
    0
    PDF Downloads

References

  1. Carlos Eduardo Rodrigues Alves, Alair Pereira do Lago, and Augusto F. Vellozo. Alignment with non-overlapping inversions in o(n^3logn)-time. Electron. Notes Discret. Math., 19:365-371, 2005. URL: https://doi.org/10.1016/j.endm.2005.05.049.
  2. Domenico Cantone, Salvatore Cristofaro, and Simone Faro. Efficient matching of biological sequences allowing for non-overlapping inversions. In Raffaele Giancarlo and Giovanni Manzini, editors, Combinatorial Pattern Matching - 22nd Annual Symposium, CPM 2011, Palermo, Italy, June 27-29, 2011. Proceedings, volume 6661 of Lecture Notes in Computer Science, pages 364-375. Springer, 2011. URL: https://doi.org/10.1007/978-3-642-21458-5_31.
  3. Domenico Cantone, Salvatore Cristofaro, and Simone Faro. Efficient string-matching allowing for non-overlapping inversions. Theor. Comput. Sci., 483:85-95, 2013. URL: https://doi.org/10.1016/j.tcs.2012.06.009.
  4. Domenico Cantone, Simone Faro, and Emanuele Giaquinta. Approximate string matching allowing for inversions and translocations. In Jan Holub and Jan Zdárek, editors, Proceedings of the Prague Stringology Conference 2010, Prague, Czech Republic, August 30 - September 1, 2010, pages 37-51. Prague Stringology Club, Department of Theoretical Computer Science, Faculty of Information Technology, Czech Technical University in Prague, 2010. URL: http://www.stringology.org/event/2010/p04.html.
  5. Domenico Cantone, Simone Faro, and Emanuele Giaquinta. Text searching allowing for inversions and translocations of factors. Discret. Appl. Math., 163:247-257, 2014. URL: https://doi.org/10.1016/j.dam.2013.05.016.
  6. Domenico Cantone, Simone Faro, and M. Oguzhan Külekci. An efficient skip-search approach to the order-preserving pattern matching problem. In Jan Holub and Jan Zdárek, editors, Proceedings of the Prague Stringology Conference 2015, Prague, Czech Republic, August 24-26, 2015, pages 22-35. Department of Theoretical Computer Science, Faculty of Information Technology, Czech Technical University in Prague, 2015. URL: http://www.stringology.org/event/2015/p04.html.
  7. Zhi-Zhong Chen, Yong Gao, Guohui Lin, Robert Niewiadomski, Yang Wang, and Junfeng Wu. A space-efficient algorithm for sequence alignment with inversions and reversals. Theor. Comput. Sci., 325(3):361-372, 2004. URL: https://doi.org/10.1016/j.tcs.2004.02.040.
  8. Da-Jung Cho, Yo-Sub Han, and Hwee Kim. Alignment with non-overlapping inversions and translocations on two strings. Theor. Comput. Sci., 575:90-101, 2015. URL: https://doi.org/10.1016/j.tcs.2014.10.036.
  9. Maxime Crochemore and Wojciech Rytter. Text Algorithms. Oxford University Press, 1994. URL: http://www-igm.univ-mlv.fr/%7Emac/REC/B1.html.
  10. Paul Cull and Tai Hsu. Recent advances in the walking tree method for biological sequence alignment. In Roberto Moreno-Díaz and Franz Pichler, editors, Computer Aided Systems Theory - EUROCAST 2003, 9th International Workshop on Computer Aided Systems Theory, Las Palmas de Gran Canaria, Spain, February 24-28, 2003, Revised Selected Papers, volume 2809 of Lecture Notes in Computer Science, pages 349-359. Springer, 2003. URL: https://doi.org/10.1007/978-3-540-45210-2_32.
  11. Fred Damerau. A technique for computer detection and correction of spelling errors. Commun. ACM, 7(3):171-176, 1964. URL: https://doi.org/10.1145/363958.363994.
  12. Simone Faro and Arianna Pavone. An efficient skip-search approach to swap matching. Comput. J., 61(9):1351-1360, 2018. URL: https://doi.org/10.1093/comjnl/bxx123.
  13. Yong Gao, Junfeng Wu, Robert Niewiadomski, Yang Wang, Zhi-Zhong Chen, and Guohui Lin. A space efficient algorithm for sequence alignment with inversions. In Tandy J. Warnow and Binhai Zhu, editors, Computing and Combinatorics, 9th Annual International Conference, COCOON 2003, Big Sky, MT, USA, July 25-28, 2003, Proceedings, volume 2697 of Lecture Notes in Computer Science, pages 57-67. Springer, 2003. URL: https://doi.org/10.1007/3-540-45071-8_8.
  14. Szymon Grabowski, Simone Faro, and Emanuele Giaquinta. String matching with inversions and translocations in linear average time (most of the time). Inf. Process. Lett., 111(11):516-520, 2011. URL: https://doi.org/10.1016/j.ipl.2011.02.015.
  15. V. I. Levenshtein. Binary codes capable of correcting deletions, insertions and reversals. Soviet Physics Doklady, 10:707-710, 1966. Google Scholar
  16. J.R. Lupski. Genomic disorders: structural features of the genome can lead to dna rearrangements and human disease traits. Trends Genet., 14:417-422, 1998. Google Scholar
  17. M. Carrera andJ. Egozcue M. Oliver-Bonet andJ. Navarro and J. Benet. Aneuploid and unbalanced sperm in two translocation carriers: evaluation of the genetic risk. Mol. Hum. Reprod., 8:958-963, 2002. Google Scholar
  18. Gonzalo Navarro. A guided tour to approximate string matching. ACM Comput. Surv., 33(1):31-88, 2001. URL: https://doi.org/10.1145/375360.375365.
  19. H. Ogiwara, T. Kohno andH. Nakanishi, K. Nagayama, M. Sato, and J. Yokota. Unbalanced translocation, a major chromosome alteration causing loss of heterozygosity in human lung cancer. Oncogene, 27:4788-4797, 2008. Google Scholar
  20. M. Schöniger and M. Waterman. A local algorithm for dna sequence alignment with inversions. Bulletin of Mathematical Biology, 54:521-536, 1992. Google Scholar
  21. Augusto F. Vellozo, Carlos E. R. Alves, and Alair Pereira do Lago. Alignment with non-overlapping inversions in O(n^3)-time. In Philipp Bucher and Bernard M. E. Moret, editors, Algorithms in Bioinformatics, 6th International Workshop, WABI 2006, Zurich, Switzerland, September 11-13, 2006, Proceedings, volume 4175 of Lecture Notes in Computer Science, pages 186-196. Springer, 2006. URL: https://doi.org/10.1007/11851561_18.
  22. D. Warburton. De novo balanced chromosome rearrangements and extra marker chromosomes identified at prenatal diagnosis: clinical significance and distribution of breakpoints. Am J. Hum Genet, 49:995-1013, 1991. Google Scholar
  23. B. Weckselblatt, K. E. Hermetz, and M.K. Rudd. Unbalanced translocations arise from diverse mutational mechanisms including chromothripsis. Genome Research, 25:937-947, 2015. Google Scholar
Questions / Remarks / Feedback
X

Feedback for Dagstuhl Publishing


Thanks for your feedback!

Feedback submitted

Could not send message

Please try again later or send an E-mail