Search Results

Documents authored by Barak, Parvesh


Document
Haplotype-Aware Long-Read Error Correction

Authors: Parvesh Barak, Daniel Gibney, and Chirag Jain

Published in: LIPIcs, Volume 344, 25th International Conference on Algorithms for Bioinformatics (WABI 2025)


Abstract
Error correction of long reads is an important initial step in genome assembly workflows. For organisms with ploidy greater than one, it is important to preserve haplotype-specific variation during read correction. This challenge has driven the development of several haplotype-aware correction methods. However, existing methods are based on either ad-hoc heuristics or deep learning approaches. In this paper, we introduce a rigorous formulation for this problem. Our approach builds on the minimum error correction framework used in reference-based haplotype phasing. We prove that the proposed formulation for error correction of reads in de novo context, i.e., without using a reference genome, is NP-hard. To make our exact algorithm scale to large datasets, we introduce practical heuristics. Experiments using PacBio HiFi sequencing datasets from human and plant genomes show that our approach achieves accuracy comparable to state-of-the-art methods. The software is freely available at https://github.com/at-cg/HALE.

Cite as

Parvesh Barak, Daniel Gibney, and Chirag Jain. Haplotype-Aware Long-Read Error Correction. In 25th International Conference on Algorithms for Bioinformatics (WABI 2025). Leibniz International Proceedings in Informatics (LIPIcs), Volume 344, pp. 4:1-4:20, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2025)


Copy BibTex To Clipboard

@InProceedings{barak_et_al:LIPIcs.WABI.2025.4,
  author =	{Barak, Parvesh and Gibney, Daniel and Jain, Chirag},
  title =	{{Haplotype-Aware Long-Read Error Correction}},
  booktitle =	{25th International Conference on Algorithms for Bioinformatics (WABI 2025)},
  pages =	{4:1--4:20},
  series =	{Leibniz International Proceedings in Informatics (LIPIcs)},
  ISBN =	{978-3-95977-386-7},
  ISSN =	{1868-8969},
  year =	{2025},
  volume =	{344},
  editor =	{Brejov\'{a}, Bro\v{n}a and Patro, Rob},
  publisher =	{Schloss Dagstuhl -- Leibniz-Zentrum f{\"u}r Informatik},
  address =	{Dagstuhl, Germany},
  URL =		{https://drops.dagstuhl.de/entities/document/10.4230/LIPIcs.WABI.2025.4},
  URN =		{urn:nbn:de:0030-drops-239300},
  doi =		{10.4230/LIPIcs.WABI.2025.4},
  annote =	{Keywords: Genome assembly, phasing, clustering, overlap graph, consensus}
}
Questions / Remarks / Feedback
X

Feedback for Dagstuhl Publishing


Thanks for your feedback!

Feedback submitted

Could not send message

Please try again later or send an E-mail