4 Search Results for "Liu, Ding"


Document
Efficient Parallel Output-Sensitive Edit Distance

Authors: Xiangyun Ding, Xiaojun Dong, Yan Gu, Youzhe Liu, and Yihan Sun

Published in: LIPIcs, Volume 274, 31st Annual European Symposium on Algorithms (ESA 2023)


Abstract
In this paper, we study efficient parallel edit distance algorithms, both in theory and in practice. Given two strings A[1..n] and B[1..m], and a set of operations allowed to edit the strings, the edit distance between A and B is the minimum number of operations required to transform A into B. In this paper, we use edit distance to refer to the Levenshtein distance, which allows for unit-cost single-character edits (insertions, deletions, substitutions). Sequentially, a standard Dynamic Programming (DP) algorithm solves edit distance with Θ(nm) cost. In many real-world applications, the strings to be compared are similar to each other and have small edit distances. To achieve highly practical implementations, we focus on output-sensitive parallel edit-distance algorithms, i.e., to achieve asymptotically better cost bounds than the standard Θ(nm) algorithm when the edit distance is small. We study four algorithms in the paper, including three algorithms based on Breadth-First Search (BFS), and one algorithm based on Divide-and-Conquer (DaC). Our BFS-based solution is based on the Landau-Vishkin algorithm. We implement three different data structures for the longest common prefix (LCP) queries needed in the algorithm: the classic solution using parallel suffix array, and two hash-based solutions proposed in this paper. Our DaC-based solution is inspired by the output-insensitive solution proposed by Apostolico et al., and we propose a non-trivial adaption to make it output-sensitive. All of the algorithms studied in this paper have good theoretical guarantees, and they achieve different tradeoffs between work (total number of operations), span (longest dependence chain in the computation), and space. We test and compare our algorithms on both synthetic data and real-world data, including DNA sequences, Wikipedia texts, GitHub repositories, etc. Our BFS-based algorithms outperform the existing parallel edit-distance implementation in ParlayLib in all test cases. On cases with fewer than 10⁵ edits, our algorithm can process input sequences of size 10⁹ in about ten seconds, while ParlayLib can only process sequences of sizes up to 10⁶ in the same amount of time. By comparing our algorithms, we also provide a better understanding of the choice of algorithms for different input patterns. We believe that our paper is the first systematic study in the theory and practice of parallel edit distance.

Cite as

Xiangyun Ding, Xiaojun Dong, Yan Gu, Youzhe Liu, and Yihan Sun. Efficient Parallel Output-Sensitive Edit Distance. In 31st Annual European Symposium on Algorithms (ESA 2023). Leibniz International Proceedings in Informatics (LIPIcs), Volume 274, pp. 40:1-40:20, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2023)


Copy BibTex To Clipboard

@InProceedings{ding_et_al:LIPIcs.ESA.2023.40,
  author =	{Ding, Xiangyun and Dong, Xiaojun and Gu, Yan and Liu, Youzhe and Sun, Yihan},
  title =	{{Efficient Parallel Output-Sensitive Edit Distance}},
  booktitle =	{31st Annual European Symposium on Algorithms (ESA 2023)},
  pages =	{40:1--40:20},
  series =	{Leibniz International Proceedings in Informatics (LIPIcs)},
  ISBN =	{978-3-95977-295-2},
  ISSN =	{1868-8969},
  year =	{2023},
  volume =	{274},
  editor =	{G{\o}rtz, Inge Li and Farach-Colton, Martin and Puglisi, Simon J. and Herman, Grzegorz},
  publisher =	{Schloss Dagstuhl -- Leibniz-Zentrum f{\"u}r Informatik},
  address =	{Dagstuhl, Germany},
  URL =		{https://drops-dev.dagstuhl.de/entities/document/10.4230/LIPIcs.ESA.2023.40},
  URN =		{urn:nbn:de:0030-drops-186935},
  doi =		{10.4230/LIPIcs.ESA.2023.40},
  annote =	{Keywords: Edit Distance, Parallel Algorithms, String Algorithms, Dynamic Programming, Pattern Matching}
}
Document
On Geometric Prototype and Applications

Authors: Hu Ding and Manni Liu

Published in: LIPIcs, Volume 112, 26th Annual European Symposium on Algorithms (ESA 2018)


Abstract
In this paper, we propose to study a new geometric optimization problem called the "geometric prototype" in Euclidean space. Given a set of patterns, where each pattern is represented by a (weighted or unweighted) point set, the geometric prototype can be viewed as the "average pattern" minimizing the total matching cost to them. As a general model, the problem finds many applications in real-world, such as Wasserstein barycenter and ensemble clustering. The dimensionality could be either constant or high, depending on the applications. To our best knowledge, the general geometric prototype problem has yet to be seriously considered by the theory community. To bridge the gap between theory and practice, we first show that a small core-set can be obtained to substantially reduce the data size. Consequently, any existing heuristic or algorithm can run on the core-set to achieve a great improvement on the efficiency. As a new application of core-set, it needs to tackle a couple of challenges particularly in theory. Finally, we test our method on both image and high dimensional clustering datasets; the experimental results remain stable even if we run the algorithms on core-sets much smaller than the original datasets, while the running times are reduced significantly.

Cite as

Hu Ding and Manni Liu. On Geometric Prototype and Applications. In 26th Annual European Symposium on Algorithms (ESA 2018). Leibniz International Proceedings in Informatics (LIPIcs), Volume 112, pp. 23:1-23:15, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2018)


Copy BibTex To Clipboard

@InProceedings{ding_et_al:LIPIcs.ESA.2018.23,
  author =	{Ding, Hu and Liu, Manni},
  title =	{{On Geometric Prototype and Applications}},
  booktitle =	{26th Annual European Symposium on Algorithms (ESA 2018)},
  pages =	{23:1--23:15},
  series =	{Leibniz International Proceedings in Informatics (LIPIcs)},
  ISBN =	{978-3-95977-081-1},
  ISSN =	{1868-8969},
  year =	{2018},
  volume =	{112},
  editor =	{Azar, Yossi and Bast, Hannah and Herman, Grzegorz},
  publisher =	{Schloss Dagstuhl -- Leibniz-Zentrum f{\"u}r Informatik},
  address =	{Dagstuhl, Germany},
  URL =		{https://drops-dev.dagstuhl.de/entities/document/10.4230/LIPIcs.ESA.2018.23},
  URN =		{urn:nbn:de:0030-drops-94867},
  doi =		{10.4230/LIPIcs.ESA.2018.23},
  annote =	{Keywords: prototype, core-set, Wasserstein barycenter, ensemble clustering}
}
Document
Distributed and Robust Support Vector Machine

Authors: Yangwei Liu, Hu Ding, Ziyun Huang, and Jinhui Xu

Published in: LIPIcs, Volume 64, 27th International Symposium on Algorithms and Computation (ISAAC 2016)


Abstract
In this paper, we consider the distributed version of Support Vector Machine (SVM) under the coordinator model, where all input data (i.e., points in R^d space) of SVM are arbitrarily distributed among k nodes in some network with a coordinator which can communicate with all nodes. We investigate two variants of this problem, with and without outliers. For distributed SVM without outliers, we prove a lower bound on the communication complexity and give a distributed (1-epsilon)-approximation algorithm to reach this lower bound, where epsilon is a user specified small constant. For distributed SVM with outliers, we present a (1-epsilon)-approximation algorithm to explicitly remove the influence of outliers. Our algorithm is based on a deterministic distributed top t selection algorithm with communication complexity of O(k log (t)) in the coordinator model. Experimental results on benchmark datasets confirm the theoretical guarantees of our algorithms.

Cite as

Yangwei Liu, Hu Ding, Ziyun Huang, and Jinhui Xu. Distributed and Robust Support Vector Machine. In 27th International Symposium on Algorithms and Computation (ISAAC 2016). Leibniz International Proceedings in Informatics (LIPIcs), Volume 64, pp. 54:1-54:13, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2016)


Copy BibTex To Clipboard

@InProceedings{liu_et_al:LIPIcs.ISAAC.2016.54,
  author =	{Liu, Yangwei and Ding, Hu and Huang, Ziyun and Xu, Jinhui},
  title =	{{Distributed and Robust Support Vector Machine}},
  booktitle =	{27th International Symposium on Algorithms and Computation (ISAAC 2016)},
  pages =	{54:1--54:13},
  series =	{Leibniz International Proceedings in Informatics (LIPIcs)},
  ISBN =	{978-3-95977-026-2},
  ISSN =	{1868-8969},
  year =	{2016},
  volume =	{64},
  editor =	{Hong, Seok-Hee},
  publisher =	{Schloss Dagstuhl -- Leibniz-Zentrum f{\"u}r Informatik},
  address =	{Dagstuhl, Germany},
  URL =		{https://drops-dev.dagstuhl.de/entities/document/10.4230/LIPIcs.ISAAC.2016.54},
  URN =		{urn:nbn:de:0030-drops-68221},
  doi =		{10.4230/LIPIcs.ISAAC.2016.54},
  annote =	{Keywords: Distributed Algorithm, Communication Complexity, Robust Algorithm, SVM}
}
Document
Sublinear Geometric Algorithms

Authors: Bernard Chazelle, Ding Liu, and Avner Magen

Published in: Dagstuhl Seminar Proceedings, Volume 5291, Sublinear Algorithms (2006)


Abstract
We present sublinear algorithms to such problems as Detecting of Polytope intersection, Shortest Path on 3D convex Polytopes and volume approximation.

Cite as

Bernard Chazelle, Ding Liu, and Avner Magen. Sublinear Geometric Algorithms. In Sublinear Algorithms. Dagstuhl Seminar Proceedings, Volume 5291, pp. 1-18, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2006)


Copy BibTex To Clipboard

@InProceedings{chazelle_et_al:DagSemProc.05291.4,
  author =	{Chazelle, Bernard and Liu, Ding and Magen, Avner},
  title =	{{Sublinear Geometric Algorithms}},
  booktitle =	{Sublinear Algorithms},
  pages =	{1--18},
  series =	{Dagstuhl Seminar Proceedings (DagSemProc)},
  ISSN =	{1862-4405},
  year =	{2006},
  volume =	{5291},
  editor =	{Artur Czumaj and S. Muthu Muthukrishnan and Ronitt Rubinfeld and Christian Sohler},
  publisher =	{Schloss Dagstuhl -- Leibniz-Zentrum f{\"u}r Informatik},
  address =	{Dagstuhl, Germany},
  URL =		{https://drops-dev.dagstuhl.de/entities/document/10.4230/DagSemProc.05291.4},
  URN =		{urn:nbn:de:0030-drops-5548},
  doi =		{10.4230/DagSemProc.05291.4},
  annote =	{Keywords: Sublinear algorithms, computational geometry}
}
  • Refine by Author
  • 2 Ding, Hu
  • 1 Chazelle, Bernard
  • 1 Ding, Xiangyun
  • 1 Dong, Xiaojun
  • 1 Gu, Yan
  • Show More...

  • Refine by Classification
  • 1 Theory of computation → Computational geometry
  • 1 Theory of computation → Parallel algorithms

  • Refine by Keyword
  • 1 Communication Complexity
  • 1 Distributed Algorithm
  • 1 Dynamic Programming
  • 1 Edit Distance
  • 1 Parallel Algorithms
  • Show More...

  • Refine by Type
  • 4 document

  • Refine by Publication Year
  • 1 2006
  • 1 2016
  • 1 2018
  • 1 2023

Questions / Remarks / Feedback
X

Feedback for Dagstuhl Publishing


Thanks for your feedback!

Feedback submitted

Could not send message

Please try again later or send an E-mail