Beyond Adjacency Maximization: Scaffold Filling for New String Distances

Authors Laurent Bulteau, Guillaume Fertin, Christian Komusiewicz




File

LIPIcs.CPM.2017.27.pdf
  • Filesize: 0.53 MB
  • 17 pages

Cite As

Laurent Bulteau, Guillaume Fertin, and Christian Komusiewicz. Beyond Adjacency Maximization: Scaffold Filling for New String Distances. In 28th Annual Symposium on Combinatorial Pattern Matching (CPM 2017). Leibniz International Proceedings in Informatics (LIPIcs), Volume 78, pp. 27:1-27:17, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2017)
https://doi.org/10.4230/LIPIcs.CPM.2017.27

Abstract

In Genomic Scaffold Filling, one aims at polishing in silico a draft genome, called a scaffold. The scaffold is given in the form of an ordered set of gene sequences, called contigs. The polishing is done by comparing the scaffold with an already complete reference genome from a closely related species. More precisely, given a scaffold S, a reference genome G, and a score function f() between two genomes, the aim is to complete S by adding the missing genes from G so that the resulting complete genome S* optimizes f(S*, G). In this paper, we extend a model of Jiang et al. [CPM 2016] (i) by allowing the insertion of strings instead of single characters (i.e., some groups of genes may be forced to be inserted together) and (ii) by considering two alternative score functions: the first generalizes the notion of common adjacencies by maximizing the number of common k-mers between S* and G (k-Mer Scaffold Filling); the second minimizes the number of breakpoints between S* and G (Min-Breakpoint Scaffold Filling). We study these problems from the parameterized complexity point of view, providing fixed-parameter tractable (FPT) algorithms for both problems. In particular, we show that k-Mer Scaffold Filling is FPT with respect to the parameter l, the number of additional k-mers realized by the completion of S; this answers an open question of Jiang et al. [CPM 2016]. We also show that Min-Breakpoint Scaffold Filling is FPT with respect to a parameter combining the number of missing genes, the number of gene repetitions, and the target distance.
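
To make the k-mer objective concrete, the following is a minimal brute-force sketch, not the FPT algorithm of the paper. It assumes genomes are plain strings with one character per gene, that the number of common k-mers is the size of the multiset intersection of length-k substrings, and that missing genes are inserted one character at a time (the paper's model also allows inserting whole strings). The function names `common_kmers`, `fillings`, and `best_filling` are hypothetical and chosen only for illustration.

```python
from collections import Counter


def kmer_counts(s, k):
    """Multiset of all length-k substrings of s."""
    return Counter(s[i:i + k] for i in range(len(s) - k + 1))


def common_kmers(s, g, k):
    """Assumed score: size of the multiset intersection of the k-mers of s and g."""
    return sum((kmer_counts(s, k) & kmer_counts(g, k)).values())


def fillings(scaffold, missing):
    """Yield every string that keeps the scaffold characters in their
    original order and inserts all characters of `missing` somewhere.
    Duplicates may be yielded; the search is exponential."""
    if not scaffold and not missing:
        yield ""
        return
    if scaffold:
        # Next output character comes from the scaffold.
        for rest in fillings(scaffold[1:], missing):
            yield scaffold[0] + rest
    for c in sorted(set(missing)):
        # Next output character is one of the missing genes.
        for rest in fillings(scaffold, missing.replace(c, "", 1)):
            yield c + rest


def best_filling(scaffold, reference, k):
    """Exhaustively search for a completion S* maximizing common k-mers with G."""
    # Missing genes: multiset difference between reference and scaffold.
    missing = "".join((Counter(reference) - Counter(scaffold)).elements())
    best = max(fillings(scaffold, missing),
               key=lambda s: common_kmers(s, reference, k))
    return best, common_kmers(best, reference, k)


if __name__ == "__main__":
    print(common_kmers("abd", "abcd", 2))
    print(best_filling("abd", "abcd", 2))
```

In this toy instance the scaffold `abd` shares one 2-mer with the reference `abcd`, and the best completion shares three, so the number of additional k-mers (the parameter l in the abstract) would be 2 under these assumptions.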
Keywords
  • computational biology
  • strings
  • FPT algorithms
  • kernelization

References

  1. Noga Alon, Raphael Yuster, and Uri Zwick. Color-coding. J. ACM, 42(4):844-856, 1995. URL: http://dx.doi.org/10.1145/210332.210337.
  2. Markus Bläser. Computing small partial coverings. Inf. Process. Lett., 85(6):327-331, 2003. URL: http://dx.doi.org/10.1016/S0020-0190(02)00434-9.
  3. Laurent Bulteau, Anna Paola Carrieri, and Riccardo Dondi. Fixed-parameter algorithms for scaffold filling. Theor. Comput. Sci., 568:72-83, 2015. URL: http://dx.doi.org/10.1016/j.tcs.2014.12.005.
  4. Laurent Bulteau, Guillaume Fertin, Christian Komusiewicz, and Irena Rusu. A fixed-parameter algorithm for minimum common string partition with few duplications. In Aaron E. Darling and Jens Stoye, editors, Proceedings of the 13th International Workshop on Algorithms in Bioinformatics (WABI 2013), volume 8126 of LNCS, pages 244-258. Springer, 2013. URL: http://dx.doi.org/10.1007/978-3-642-40453-5_19.
  5. Laurent Bulteau, Falk Hüffner, Christian Komusiewicz, and Rolf Niedermeier. Multivariate algorithmics for NP-hard string problems. Bull. EATCS, 114, 2014. URL: http://eatcs.org/beatcs/index.php/beatcs/article/view/310.
  6. Laurent Bulteau and Christian Komusiewicz. Minimum common string partition parameterized by partition size is fixed-parameter tractable. In Chandra Chekuri, editor, Proceedings of the 25th Annual ACM-SIAM Symposium on Discrete Algorithms (SODA 2014), pages 102-121. SIAM, 2014. URL: http://dx.doi.org/10.1137/1.9781611973402.8.
  7. Xin Chen, Jie Zheng, Zheng Fu, Peng Nan, Yang Zhong, Stefano Lonardi, and Tao Jiang. Assignment of orthologous genes via genome rearrangement. IEEE/ACM Trans. Comput. Biol. Bioinform., 2(4):302-315, October 2005. URL: http://dx.doi.org/10.1109/TCBB.2005.48.
  8. Marek Cygan, Fedor V. Fomin, Łukasz Kowalik, Daniel Lokshtanov, Dániel Marx, Marcin Pilipczuk, Michał Pilipczuk, and Saket Saurabh. Parameterized Algorithms. Springer, 2015. URL: http://dx.doi.org/10.1007/978-3-319-21275-3.
  9. Michael Dom, Daniel Lokshtanov, and Saket Saurabh. Kernelization lower bounds through colors and IDs. ACM Trans. Algorithms, 11(2):13:1-13:20, 2014. URL: http://dx.doi.org/10.1145/2650261.
  10. Rodney G. Downey and Michael R. Fellows. Fundamentals of Parameterized Complexity. Texts in Computer Science. Springer, 2013. URL: http://dx.doi.org/10.1007/978-1-4471-5559-1.
  11. Avraham Goldstein, Petr Kolman, and Jie Zheng. Minimum common string partition problem: Hardness and approximations. Electron. J. Comb., 12, 2005. URL: http://www.combinatorics.org/Volume_12/Abstracts/v12i1r50.html.
  12. Jiong Guo, Rolf Niedermeier, and Sebastian Wernicke. Parameterized complexity of vertex cover variants. Theory Comput. Syst., 41(3):501-520, 2007. URL: http://dx.doi.org/10.1007/s00224-007-1309-3.
  13. Haitao Jiang, Chenglin Fan, Boting Yang, Farong Zhong, Daming Zhu, and Binhai Zhu. Genomic scaffold filling revisited. In Roberto Grossi and Moshe Lewenstein, editors, Proceedings of the 27th Annual Symposium on Combinatorial Pattern Matching (CPM 2016), volume 54 of LIPIcs, pages 15:1-15:13. Schloss Dagstuhl - Leibniz-Zentrum für Informatik, 2016. URL: http://dx.doi.org/10.4230/LIPIcs.CPM.2016.15.
  14. Haitao Jiang, Jingjing Ma, Junfeng Luan, and Daming Zhu. Approximation and nonapproximability for the one-sided scaffold filling problem. In Dachuan Xu, Donglei Du, and Ding-Zhu Du, editors, Proceedings of the 21st International Conference on Computing and Combinatorics (COCOON 2015), volume 9198 of LNCS, pages 251-263. Springer, 2015. URL: http://dx.doi.org/10.1007/978-3-319-21398-9_20.
  15. Haitao Jiang, Chunfang Zheng, David Sankoff, and Binhai Zhu. Scaffold filling under the breakpoint distance. In Eric Tannier, editor, Proceedings of the International Workshop on Comparative Genomics (RECOMB-CG 2010), volume 6398 of LNCS, pages 83-92. Springer, 2010. URL: http://dx.doi.org/10.1007/978-3-642-16181-0_8.
  16. Haitao Jiang, Chunfang Zheng, David Sankoff, and Binhai Zhu. Scaffold filling under the breakpoint and related distances. IEEE/ACM Trans. Comput. Biol. Bioinform., 9(4):1220-1229, 2012. URL: http://dx.doi.org/10.1109/TCBB.2012.57.
  17. Haitao Jiang, Farong Zhong, and Binhai Zhu. Filling scaffolds with gene repetitions: Maximizing the number of adjacencies. In Raffaele Giancarlo and Giovanni Manzini, editors, Proceedings of the 22nd Annual Symposium on Combinatorial Pattern Matching (CPM 2011), volume 6661 of LNCS, pages 55-64. Springer, 2011. URL: http://dx.doi.org/10.1007/978-3-642-21458-5_7.
  18. Haitao Jiang, Binhai Zhu, Daming Zhu, and Hong Zhu. Minimum common string partition revisited. J. Comb. Optim., 23(4):519-527, 2012. URL: http://dx.doi.org/10.1007/s10878-010-9370-2.
  19. Nan Liu, Peng Zou, and Binhai Zhu. A polynomial time solution for permutation scaffold filling. In T.-H. Hubert Chan, Minming Li, and Lusheng Wang, editors, Proceedings of the 10th International Conference on Combinatorial Optimization and Applications (COCOA 2016), volume 10043 of LNCS, pages 782-789. Springer, 2016. URL: http://dx.doi.org/10.1007/978-3-319-48749-6_60.
  20. Adriana Muñoz, Chunfang Zheng, Qian Zhu, Victor A. Albert, Steve Rounsley, and David Sankoff. Scaffold filling, contig fusion and comparative gene order inference. BMC Bioinformatics, 11:304, 2010. URL: http://dx.doi.org/10.1186/1471-2105-11-304.
  21. Binhai Zhu. Genomic scaffold filling: A progress report. In Daming Zhu and Sergey Bereg, editors, Proceedings of the 10th International Workshop on Frontiers in Algorithmics (FAW 2016), volume 9711 of LNCS, pages 8-16. Springer, 2016. URL: http://dx.doi.org/10.1007/978-3-319-39817-4_2.