3 Search Results for "Li, Bohan"


Document
Taxonomic Classification with Maximal Exact Matches in KATKA Kernels and Minimizer Digests

Authors: Dominika Draesslerová, Omar Ahmed, Travis Gagie, Jan Holub, Ben Langmead, Giovanni Manzini, and Gonzalo Navarro

Published in: LIPIcs, Volume 301, 22nd International Symposium on Experimental Algorithms (SEA 2024)


Abstract
For taxonomic classification, we are asked to index the genomes in a phylogenetic tree such that later, given a DNA read, we can quickly choose a small subtree likely to contain the genome from which that read was drawn. Although popular classifiers such as Kraken use k-mers, recent research indicates that using maximal exact matches (MEMs) can lead to better classifications. For example, we can - build an augmented FM-index over the the genomes in the tree concatenated in left-to-right order; - for each MEM in a read, find the interval in the suffix array containing the starting positions of that MEM’s occurrences in those genomes; - find the minimum and maximum values stored in that interval; - take the lowest common ancestor (LCA) of the genomes containing the characters at those positions. This solution is practical, however, only when the total size of the genomes in the tree is fairly small. In this paper we consider applying the same solution to three lossily compressed representations of the genomes' concatenation: - a KATKA kernel, which discards characters that are not in the first or last occurrence of any k_max-tuple, for a parameter k_max; - a minimizer digest; - a KATKA kernel of a minimizer digest. With a test dataset and these three representations of it, simulated reads and various parameter settings, we checked how many reads' longest MEMs occurred only in the sequences from which those reads were generated ("true positive" reads). For some parameter settings we achieved significant compression while only slightly decreasing the true-positive rate.

Cite as

Dominika Draesslerová, Omar Ahmed, Travis Gagie, Jan Holub, Ben Langmead, Giovanni Manzini, and Gonzalo Navarro. Taxonomic Classification with Maximal Exact Matches in KATKA Kernels and Minimizer Digests. In 22nd International Symposium on Experimental Algorithms (SEA 2024). Leibniz International Proceedings in Informatics (LIPIcs), Volume 301, pp. 10:1-10:13, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2024)


Copy BibTex To Clipboard

@InProceedings{draesslerova_et_al:LIPIcs.SEA.2024.10,
  author =	{Draesslerov\'{a}, Dominika and Ahmed, Omar and Gagie, Travis and Holub, Jan and Langmead, Ben and Manzini, Giovanni and Navarro, Gonzalo},
  title =	{{Taxonomic Classification with Maximal Exact Matches in KATKA Kernels and Minimizer Digests}},
  booktitle =	{22nd International Symposium on Experimental Algorithms (SEA 2024)},
  pages =	{10:1--10:13},
  series =	{Leibniz International Proceedings in Informatics (LIPIcs)},
  ISBN =	{978-3-95977-325-6},
  ISSN =	{1868-8969},
  year =	{2024},
  volume =	{301},
  editor =	{Liberti, Leo},
  publisher =	{Schloss Dagstuhl -- Leibniz-Zentrum f{\"u}r Informatik},
  address =	{Dagstuhl, Germany},
  URL =		{https://drops.dagstuhl.de/entities/document/10.4230/LIPIcs.SEA.2024.10},
  URN =		{urn:nbn:de:0030-drops-203756},
  doi =		{10.4230/LIPIcs.SEA.2024.10},
  annote =	{Keywords: Taxonomic classification, metagenomics, KATKA, maximal exact matches, string kernels, minimizer digests}
}
Document
Track A: Algorithms, Complexity and Games
Non-Linear Paging

Authors: Ilan Doron-Arad and Joseph (Seffi) Naor

Published in: LIPIcs, Volume 297, 51st International Colloquium on Automata, Languages, and Programming (ICALP 2024)


Abstract
We formulate and study non-linear paging - a broad model of online paging where the size of subsets of pages is determined by a monotone non-linear set function of the pages. This model captures the well-studied classic weighted paging and generalized paging problems, and also submodular and supermodular paging, studied here for the first time, that have a range of applications from virtual memory to machine learning. Unlike classic paging, the cache threshold parameter k does not yield good competitive ratios for non-linear paging. Instead, we introduce a novel parameter 𝓁 that generalizes the notion of cache size to the non-linear setting. We obtain a tight deterministic 𝓁-competitive algorithm for general non-linear paging and a o(log²𝓁)-competitive lower bound for randomized algorithms. Our algorithm is based on a new generic LP for the problem that captures both submodular and supermodular paging, in contrast to LPs used for submodular cover settings. We finally focus on the supermodular paging problem, which is a variant of online set cover and online submodular cover, where sets are repeatedly requested to be removed from the cover. We obtain polylogarithmic lower and upper bounds and an offline approximation algorithm.

Cite as

Ilan Doron-Arad and Joseph (Seffi) Naor. Non-Linear Paging. In 51st International Colloquium on Automata, Languages, and Programming (ICALP 2024). Leibniz International Proceedings in Informatics (LIPIcs), Volume 297, pp. 57:1-57:19, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2024)


Copy BibTex To Clipboard

@InProceedings{doronarad_et_al:LIPIcs.ICALP.2024.57,
  author =	{Doron-Arad, Ilan and Naor, Joseph (Seffi)},
  title =	{{Non-Linear Paging}},
  booktitle =	{51st International Colloquium on Automata, Languages, and Programming (ICALP 2024)},
  pages =	{57:1--57:19},
  series =	{Leibniz International Proceedings in Informatics (LIPIcs)},
  ISBN =	{978-3-95977-322-5},
  ISSN =	{1868-8969},
  year =	{2024},
  volume =	{297},
  editor =	{Bringmann, Karl and Grohe, Martin and Puppis, Gabriele and Svensson, Ola},
  publisher =	{Schloss Dagstuhl -- Leibniz-Zentrum f{\"u}r Informatik},
  address =	{Dagstuhl, Germany},
  URL =		{https://drops.dagstuhl.de/entities/document/10.4230/LIPIcs.ICALP.2024.57},
  URN =		{urn:nbn:de:0030-drops-202000},
  doi =		{10.4230/LIPIcs.ICALP.2024.57},
  annote =	{Keywords: paging, competitive analysis, non-linear paging, submodular and supermodular functions}
}
Document
Improving Local Search for Minimum Weighted Connected Dominating Set Problem by Inner-Layer Local Search

Authors: Bohan Li, Kai Wang, Yiyuan Wang, and Shaowei Cai

Published in: LIPIcs, Volume 210, 27th International Conference on Principles and Practice of Constraint Programming (CP 2021)


Abstract
The minimum weighted connected dominating set (MWCDS) problem is an important variant of connected dominating set problems with wide applications, especially in heterogenous networks and gene regulatory networks. In the paper, we develop a nested local search algorithm called NestedLS for solving MWCDS on classic benchmarks and massive graphs. In this local search framework, we propose two novel ideas to make it effective by utilizing previous search information. First, we design the restart based smoothing mechanism as a diversification method to escape from local optimal. Second, we propose a novel inner-layer local search method to enlarge the candidate removal set, which can be modelled as an optimized version of spanning tree problem. Moreover, inner-layer local search method is a general method for maintaining the connectivity constraint when dealing with massive graphs. Experimental results show that NestedLS outperforms state-of-the-art meta-heuristic algorithms on most instances.

Cite as

Bohan Li, Kai Wang, Yiyuan Wang, and Shaowei Cai. Improving Local Search for Minimum Weighted Connected Dominating Set Problem by Inner-Layer Local Search. In 27th International Conference on Principles and Practice of Constraint Programming (CP 2021). Leibniz International Proceedings in Informatics (LIPIcs), Volume 210, pp. 39:1-39:16, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2021)


Copy BibTex To Clipboard

@InProceedings{li_et_al:LIPIcs.CP.2021.39,
  author =	{Li, Bohan and Wang, Kai and Wang, Yiyuan and Cai, Shaowei},
  title =	{{Improving Local Search for Minimum Weighted Connected Dominating Set Problem by Inner-Layer Local Search}},
  booktitle =	{27th International Conference on Principles and Practice of Constraint Programming (CP 2021)},
  pages =	{39:1--39:16},
  series =	{Leibniz International Proceedings in Informatics (LIPIcs)},
  ISBN =	{978-3-95977-211-2},
  ISSN =	{1868-8969},
  year =	{2021},
  volume =	{210},
  editor =	{Michel, Laurent D.},
  publisher =	{Schloss Dagstuhl -- Leibniz-Zentrum f{\"u}r Informatik},
  address =	{Dagstuhl, Germany},
  URL =		{https://drops.dagstuhl.de/entities/document/10.4230/LIPIcs.CP.2021.39},
  URN =		{urn:nbn:de:0030-drops-153304},
  doi =		{10.4230/LIPIcs.CP.2021.39},
  annote =	{Keywords: Operations Research, NP-hard Problem, Local Search, Weighted Connected Dominating Set Problem}
}
  • Refine by Author
  • 1 Ahmed, Omar
  • 1 Cai, Shaowei
  • 1 Doron-Arad, Ilan
  • 1 Draesslerová, Dominika
  • 1 Gagie, Travis
  • Show More...

  • Refine by Classification
  • 1 Applied computing → Operations research
  • 1 Theory of computation
  • 1 Theory of computation → Pattern matching
  • 1 Theory of computation → Randomized local search

  • Refine by Keyword
  • 1 KATKA
  • 1 Local Search
  • 1 NP-hard Problem
  • 1 Operations Research
  • 1 Taxonomic classification
  • Show More...

  • Refine by Type
  • 3 document

  • Refine by Publication Year
  • 2 2024
  • 1 2021