Genomic Scaffold Filling Revisited

Jiang, Haitao; Fan, Chenglin; Yang, Boting; Zhong, Farong; Zhu, Daming; Zhu, Binhai

doi:10.4230/LIPIcs.CPM.2016.15

Cite As

Haitao Jiang, Chenglin Fan, Boting Yang, Farong Zhong, Daming Zhu, and Binhai Zhu. Genomic Scaffold Filling Revisited. In 27th Annual Symposium on Combinatorial Pattern Matching (CPM 2016). Leibniz International Proceedings in Informatics (LIPIcs), Volume 54, pp. 15:1-15:13, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2016) https://doi.org/10.4230/LIPIcs.CPM.2016.15

Abstract

The genomic scaffold filling problem has attracted considerable attention recently. The problem asks to fill an incomplete sequence (scaffold) I into I', with respect to a complete reference genome G, such that the number of adjacencies between G and I' is maximized. The problem is NP-complete and APX-hard, and admits a 1.2-approximation. However, a sequence input I is not very practical and does not fit most real datasets, where a scaffold is more often given as a list of contigs. In this paper, we revisit the genomic scaffold filling problem by considering two such cases: (1) a scaffold S is given, the missing genes X = c(G) - c(S) can be inserted only between the contigs, and the objective is to maximize the number of adjacencies between G and the filled S'; and (2) a scaffold S is given, a subset X' ⊆ X = c(G) - c(S) of the missing genes can be inserted only between the contigs, and the objective is still to maximize the number of adjacencies between G and the filled S''. For problem (1), we present a simple NP-completeness proof, then a factor-2 greedy approximation algorithm, and finally show that the problem is FPT when each gene appears at most d times in G. For problem (2), we prove that the problem is W[1]-hard and then present a factor-2 FPT-approximation for the case when each gene appears at most d times in G.
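
To make the objective concrete, the following is a minimal Python sketch of how the missing genes X = c(G) - c(S) and the number of adjacencies between G and a filled scaffold might be computed. It assumes the convention common in the scaffold filling literature, which the abstract itself does not spell out: c(.) is the multiset of genes, an adjacency is an unordered pair of neighboring genes, and the count is the size of the multiset intersection of the two adjacency multisets. The function names and the toy sequences are illustrative only and do not come from the paper.

```python
from collections import Counter

def adjacencies(seq):
    """Multiset of unordered pairs of neighboring genes (assumed convention)."""
    return Counter(tuple(sorted(pair)) for pair in zip(seq, seq[1:]))

def common_adjacencies(g, filled):
    """Size of the multiset intersection of the two adjacency multisets."""
    return sum((adjacencies(g) & adjacencies(filled)).values())

def missing_genes(g, contigs):
    """X = c(G) - c(S): genes of G, with multiplicity, absent from the contigs."""
    present = Counter(gene for contig in contigs for gene in contig)
    return Counter(g) - present

# Toy instance: genes are single characters, contigs are kept intact.
G = "abcdbe"
contigs = ["ab", "db"]
X = missing_genes(G, contigs)             # one 'c' and one 'e' are missing

# Two ways of inserting the missing genes between or after the contigs.
filled_good = "ab" + "c" + "db" + "e"     # reproduces G exactly
filled_poor = "ab" + "e" + "db" + "c"

print(common_adjacencies(G, filled_good))  # 5
print(common_adjacencies(G, filled_poor))  # 4
```

The sketch only evaluates a given filling; it does not search for an optimal one and is not the paper's greedy or FPT algorithm.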

Subject Classification

Keywords

Computational biology
Approximation algorithms
FPT algorithms
NP-completeness
