6 Search Results for "Gu, Yan"


Document
Efficient Parallel Output-Sensitive Edit Distance

Authors: Xiangyun Ding, Xiaojun Dong, Yan Gu, Youzhe Liu, and Yihan Sun

Published in: LIPIcs, Volume 274, 31st Annual European Symposium on Algorithms (ESA 2023)


Abstract
In this paper, we study efficient parallel edit distance algorithms, both in theory and in practice. Given two strings A[1..n] and B[1..m], and a set of operations allowed to edit the strings, the edit distance between A and B is the minimum number of operations required to transform A into B. In this paper, we use edit distance to refer to the Levenshtein distance, which allows for unit-cost single-character edits (insertions, deletions, substitutions). Sequentially, a standard Dynamic Programming (DP) algorithm solves edit distance with Θ(nm) cost. In many real-world applications, the strings to be compared are similar to each other and have small edit distances. To achieve highly practical implementations, we focus on output-sensitive parallel edit-distance algorithms, i.e., to achieve asymptotically better cost bounds than the standard Θ(nm) algorithm when the edit distance is small. We study four algorithms in the paper, including three algorithms based on Breadth-First Search (BFS), and one algorithm based on Divide-and-Conquer (DaC). Our BFS-based solution is based on the Landau-Vishkin algorithm. We implement three different data structures for the longest common prefix (LCP) queries needed in the algorithm: the classic solution using parallel suffix array, and two hash-based solutions proposed in this paper. Our DaC-based solution is inspired by the output-insensitive solution proposed by Apostolico et al., and we propose a non-trivial adaption to make it output-sensitive. All of the algorithms studied in this paper have good theoretical guarantees, and they achieve different tradeoffs between work (total number of operations), span (longest dependence chain in the computation), and space. We test and compare our algorithms on both synthetic data and real-world data, including DNA sequences, Wikipedia texts, GitHub repositories, etc. Our BFS-based algorithms outperform the existing parallel edit-distance implementation in ParlayLib in all test cases. On cases with fewer than 10⁵ edits, our algorithm can process input sequences of size 10⁹ in about ten seconds, while ParlayLib can only process sequences of sizes up to 10⁶ in the same amount of time. By comparing our algorithms, we also provide a better understanding of the choice of algorithms for different input patterns. We believe that our paper is the first systematic study in the theory and practice of parallel edit distance.

Cite as

Xiangyun Ding, Xiaojun Dong, Yan Gu, Youzhe Liu, and Yihan Sun. Efficient Parallel Output-Sensitive Edit Distance. In 31st Annual European Symposium on Algorithms (ESA 2023). Leibniz International Proceedings in Informatics (LIPIcs), Volume 274, pp. 40:1-40:20, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2023)


Copy BibTex To Clipboard

@InProceedings{ding_et_al:LIPIcs.ESA.2023.40,
  author =	{Ding, Xiangyun and Dong, Xiaojun and Gu, Yan and Liu, Youzhe and Sun, Yihan},
  title =	{{Efficient Parallel Output-Sensitive Edit Distance}},
  booktitle =	{31st Annual European Symposium on Algorithms (ESA 2023)},
  pages =	{40:1--40:20},
  series =	{Leibniz International Proceedings in Informatics (LIPIcs)},
  ISBN =	{978-3-95977-295-2},
  ISSN =	{1868-8969},
  year =	{2023},
  volume =	{274},
  editor =	{G{\o}rtz, Inge Li and Farach-Colton, Martin and Puglisi, Simon J. and Herman, Grzegorz},
  publisher =	{Schloss Dagstuhl -- Leibniz-Zentrum f{\"u}r Informatik},
  address =	{Dagstuhl, Germany},
  URL =		{https://drops-dev.dagstuhl.de/entities/document/10.4230/LIPIcs.ESA.2023.40},
  URN =		{urn:nbn:de:0030-drops-186935},
  doi =		{10.4230/LIPIcs.ESA.2023.40},
  annote =	{Keywords: Edit Distance, Parallel Algorithms, String Algorithms, Dynamic Programming, Pattern Matching}
}
Document
ParGeo: A Library for Parallel Computational Geometry

Authors: Yiqiu Wang, Rahul Yesantharao, Shangdi Yu, Laxman Dhulipala, Yan Gu, and Julian Shun

Published in: LIPIcs, Volume 244, 30th Annual European Symposium on Algorithms (ESA 2022)


Abstract
This paper presents ParGeo, a multicore library for computational geometry. ParGeo contains modules for fundamental tasks including kd-tree based spatial search, spatial graph generation, and algorithms in computational geometry. We focus on three new algorithmic contributions provided in the library. First, we present a new parallel convex hull algorithm based on a reservation technique to enable parallel modifications to the hull. We also provide the first parallel implementations of the randomized incremental convex hull algorithm as well as a divide-and-conquer convex hull algorithm in ℝ³. Second, for the smallest enclosing ball problem, we propose a new sampling-based algorithm to quickly reduce the size of the data set. We also provide the first parallel implementation of Welzl’s classic algorithm for smallest enclosing ball. Third, we present the BDL-tree, a parallel batch-dynamic kd-tree that allows for efficient parallel updates and k-NN queries over dynamically changing point sets. BDL-trees consist of a log-structured set of kd-trees which can be used to efficiently insert, delete, and query batches of points in parallel. On 36 cores with two-way hyper-threading, our fastest convex hull algorithm achieves up to 44.7x self-relative parallel speedup and up to 559x speedup against the best existing sequential implementation. Our smallest enclosing ball algorithm using our sampling-based algorithm achieves up to 27.1x self-relative parallel speedup and up to 178x speedup against the best existing sequential implementation. Our implementation of the BDL-tree achieves self-relative parallel speedup of up to 46.1x. Across all of the algorithms in ParGeo, we achieve self-relative parallel speedup of 8.1-46.61x.

Cite as

Yiqiu Wang, Rahul Yesantharao, Shangdi Yu, Laxman Dhulipala, Yan Gu, and Julian Shun. ParGeo: A Library for Parallel Computational Geometry. In 30th Annual European Symposium on Algorithms (ESA 2022). Leibniz International Proceedings in Informatics (LIPIcs), Volume 244, pp. 88:1-88:19, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2022)


Copy BibTex To Clipboard

@InProceedings{wang_et_al:LIPIcs.ESA.2022.88,
  author =	{Wang, Yiqiu and Yesantharao, Rahul and Yu, Shangdi and Dhulipala, Laxman and Gu, Yan and Shun, Julian},
  title =	{{ParGeo: A Library for Parallel Computational Geometry}},
  booktitle =	{30th Annual European Symposium on Algorithms (ESA 2022)},
  pages =	{88:1--88:19},
  series =	{Leibniz International Proceedings in Informatics (LIPIcs)},
  ISBN =	{978-3-95977-247-1},
  ISSN =	{1868-8969},
  year =	{2022},
  volume =	{244},
  editor =	{Chechik, Shiri and Navarro, Gonzalo and Rotenberg, Eva and Herman, Grzegorz},
  publisher =	{Schloss Dagstuhl -- Leibniz-Zentrum f{\"u}r Informatik},
  address =	{Dagstuhl, Germany},
  URL =		{https://drops-dev.dagstuhl.de/entities/document/10.4230/LIPIcs.ESA.2022.88},
  URN =		{urn:nbn:de:0030-drops-170265},
  doi =		{10.4230/LIPIcs.ESA.2022.88},
  annote =	{Keywords: Computational Geometry, Parallel Algorithms, Libraries}
}
Document
A Parallel Batch-Dynamic Data Structure for the Closest Pair Problem

Authors: Yiqiu Wang, Shangdi Yu, Yan Gu, and Julian Shun

Published in: LIPIcs, Volume 189, 37th International Symposium on Computational Geometry (SoCG 2021)


Abstract
We propose a theoretically-efficient and practical parallel batch-dynamic data structure for the closest pair problem. Our solution is based on a serial dynamic closest pair data structure by Golin et al., and supports batches of insertions and deletions in parallel. For a data set of size n, our data structure supports a batch of insertions or deletions of size m in O(m(1+log ((n+m)/m))) expected work and O(log (n+m)log^*(n+m)) depth with high probability, and takes linear space. The key techniques for achieving these bounds are a new work-efficient parallel batch-dynamic binary heap, and careful management of the computation across sets of points to minimize work and depth. We provide an optimized multicore implementation of our data structure using dynamic hash tables, parallel heaps, and dynamic k-d trees. Our experiments on a variety of synthetic and real-world data sets show that it achieves a parallel speedup of up to 38.57x (15.10x on average) on 48 cores with hyper-threading. In addition, we also implement and compare four parallel algorithms for static closest pair problem, for which we are not aware of any existing practical implementations. On 48 cores with hyper-threading, the static algorithms achieve up to 51.45x (29.42x on average) speedup, and Rabin’s algorithm performs the best on average. Comparing our dynamic algorithm to the fastest static algorithm, we find that it is advantageous to use the dynamic algorithm for batch sizes of up to 20% of the data set. As far as we know, our work is the first to experimentally evaluate parallel closest pair algorithms, in both the static and the dynamic settings.

Cite as

Yiqiu Wang, Shangdi Yu, Yan Gu, and Julian Shun. A Parallel Batch-Dynamic Data Structure for the Closest Pair Problem. In 37th International Symposium on Computational Geometry (SoCG 2021). Leibniz International Proceedings in Informatics (LIPIcs), Volume 189, pp. 60:1-60:16, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2021)


Copy BibTex To Clipboard

@InProceedings{wang_et_al:LIPIcs.SoCG.2021.60,
  author =	{Wang, Yiqiu and Yu, Shangdi and Gu, Yan and Shun, Julian},
  title =	{{A Parallel Batch-Dynamic Data Structure for the Closest Pair Problem}},
  booktitle =	{37th International Symposium on Computational Geometry (SoCG 2021)},
  pages =	{60:1--60:16},
  series =	{Leibniz International Proceedings in Informatics (LIPIcs)},
  ISBN =	{978-3-95977-184-9},
  ISSN =	{1868-8969},
  year =	{2021},
  volume =	{189},
  editor =	{Buchin, Kevin and Colin de Verdi\`{e}re, \'{E}ric},
  publisher =	{Schloss Dagstuhl -- Leibniz-Zentrum f{\"u}r Informatik},
  address =	{Dagstuhl, Germany},
  URL =		{https://drops-dev.dagstuhl.de/entities/document/10.4230/LIPIcs.SoCG.2021.60},
  URN =		{urn:nbn:de:0030-drops-138594},
  doi =		{10.4230/LIPIcs.SoCG.2021.60},
  annote =	{Keywords: Closest Pair, Parallel Algorithms, Dynamic Algorithms, Experimental Algorithms}
}
Document
Algorithmic Building Blocks for Asymmetric Memories

Authors: Yan Gu, Yihan Sun, and Guy E. Blelloch

Published in: LIPIcs, Volume 112, 26th Annual European Symposium on Algorithms (ESA 2018)


Abstract
The future of main memory appears to lie in the direction of new non-volatile memory technologies that provide strong capacity-to-performance ratios, but have write operations that are much more expensive than reads in terms of energy, bandwidth, and latency. This asymmetry can have a significant effect on algorithm design, and in many cases it is possible to reduce writes at the cost of more reads. This paper studies which algorithmic techniques are useful in designing practical write-efficient algorithms. We focus on several fundamental algorithmic building blocks including unordered set/map implemented using hash tables, comparison sort, and graph traversal algorithms including breadth-first search and Dijkstra's algorithm. We introduce new algorithms and implementations that can reduce writes, and analyze the performance experimentally using a software simulator. Finally, we summarize interesting lessons and directions in designing write-efficient algorithms that can be valuable to share.

Cite as

Yan Gu, Yihan Sun, and Guy E. Blelloch. Algorithmic Building Blocks for Asymmetric Memories. In 26th Annual European Symposium on Algorithms (ESA 2018). Leibniz International Proceedings in Informatics (LIPIcs), Volume 112, pp. 44:1-44:15, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2018)


Copy BibTex To Clipboard

@InProceedings{gu_et_al:LIPIcs.ESA.2018.44,
  author =	{Gu, Yan and Sun, Yihan and Blelloch, Guy E.},
  title =	{{Algorithmic Building Blocks for Asymmetric Memories}},
  booktitle =	{26th Annual European Symposium on Algorithms (ESA 2018)},
  pages =	{44:1--44:15},
  series =	{Leibniz International Proceedings in Informatics (LIPIcs)},
  ISBN =	{978-3-95977-081-1},
  ISSN =	{1868-8969},
  year =	{2018},
  volume =	{112},
  editor =	{Azar, Yossi and Bast, Hannah and Herman, Grzegorz},
  publisher =	{Schloss Dagstuhl -- Leibniz-Zentrum f{\"u}r Informatik},
  address =	{Dagstuhl, Germany},
  URL =		{https://drops-dev.dagstuhl.de/entities/document/10.4230/LIPIcs.ESA.2018.44},
  URN =		{urn:nbn:de:0030-drops-95070},
  doi =		{10.4230/LIPIcs.ESA.2018.44},
  annote =	{Keywords: Asymmetric Memory, I/O Cost, Write-Efficient Algorithms, Hash Tables, Graph-Traversal Algorithms}
}
Document
Efficient Construction of Probabilistic Tree Embeddings

Authors: Guy E. Blelloch, Yan Gu, and Yihan Sun

Published in: LIPIcs, Volume 80, 44th International Colloquium on Automata, Languages, and Programming (ICALP 2017)


Abstract
In this paper we describe an algorithm that embeds a graph metric (V,d_G) on an undirected weighted graph G=(V,E) into a distribution of tree metrics (T,D_T) such that for every pair u,v in V, d_G(u,v)<=d_T(u,v) and E_T[d_T(u,v)]<=O(log n)d_G(u,v). Such embeddings have proved highly useful in designing fast approximation algorithms, as many hard problems on graphs are easy to solve on tree instances. For a graph with n vertices and m edges, our algorithm runs in O(m log n) time with high probability, which improves the previous upper bound of O(m log^3 n) shown by Mendel et al. in 2009. The key component of our algorithm is a new approximate single-source shortest-path algorithm, which implements the priority queue with a new data structure, the bucket-tree structure. The algorithm has three properties: it only requires linear time in terms of the number of edges in the input graph; the computed distances have the distance preserving property; and when computing the shortest-paths to the k-nearest vertices from the source, it only requires to visit these vertices and their edge lists. These properties are essential to guarantee the correctness and the stated work bound. Using this shortest-path algorithm, we show how to generate an intermediate structure, the approximate dominance sequences of the input graph, in O(m log n) time, and further propose a simple yet efficient algorithm to converted this sequence to a tree embedding in O(n log n) time, both with high probability. Combining the three subroutines gives the stated work bound of the algorithm. We also show a new application of probabilistic tree embeddings: they can be used to accelerate the construction of a series of approximate distance oracles.

Cite as

Guy E. Blelloch, Yan Gu, and Yihan Sun. Efficient Construction of Probabilistic Tree Embeddings. In 44th International Colloquium on Automata, Languages, and Programming (ICALP 2017). Leibniz International Proceedings in Informatics (LIPIcs), Volume 80, pp. 26:1-26:14, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2017)


Copy BibTex To Clipboard

@InProceedings{blelloch_et_al:LIPIcs.ICALP.2017.26,
  author =	{Blelloch, Guy E. and Gu, Yan and Sun, Yihan},
  title =	{{Efficient Construction of Probabilistic Tree Embeddings}},
  booktitle =	{44th International Colloquium on Automata, Languages, and Programming (ICALP 2017)},
  pages =	{26:1--26:14},
  series =	{Leibniz International Proceedings in Informatics (LIPIcs)},
  ISBN =	{978-3-95977-041-5},
  ISSN =	{1868-8969},
  year =	{2017},
  volume =	{80},
  editor =	{Chatzigiannakis, Ioannis and Indyk, Piotr and Kuhn, Fabian and Muscholl, Anca},
  publisher =	{Schloss Dagstuhl -- Leibniz-Zentrum f{\"u}r Informatik},
  address =	{Dagstuhl, Germany},
  URL =		{https://drops-dev.dagstuhl.de/entities/document/10.4230/LIPIcs.ICALP.2017.26},
  URN =		{urn:nbn:de:0030-drops-75034},
  doi =		{10.4230/LIPIcs.ICALP.2017.26},
  annote =	{Keywords: Graph Algorithm, Metric Embeddings, Probabilistic Tree Embeddings, Single-source Shortest-paths}
}
Document
Efficient Algorithms with Asymmetric Read and Write Costs

Authors: Guy E. Blelloch, Jeremy T. Fineman, Phillip B. Gibbons, Yan Gu, and Julian Shun

Published in: LIPIcs, Volume 57, 24th Annual European Symposium on Algorithms (ESA 2016)


Abstract
In several emerging technologies for computer memory (main memory), the cost of reading is significantly cheaper than the cost of writing. Such asymmetry in memory costs poses a fundamentally different model from the RAM for algorithm design. In this paper we study lower and upper bounds for various problems under such asymmetric read and write costs. We consider both the case in which all but O(1) memory has asymmetric cost, and the case of a small cache of symmetric memory. We model both cases using the (M,omega)-ARAM, in which there is a small (symmetric) memory of size M and a large unbounded (asymmetric) memory, both random access, and where reading from the large memory has unit cost, but writing has cost omega >> 1. For FFT and sorting networks we show a lower bound cost of Omega(omega*n*log_{omega*M}(n)), which indicates that it is not possible to achieve asymptotic improvements with cheaper reads when omega is bounded by a polynomial in M. Moreover, there is an asymptotic gap (of min(omega,log(n)/log(omega*M)) between the cost of sorting networks and comparison sorting in the model. This contrasts with the RAM, and most other models, in which the asymptotic costs are the same. We also show a lower bound for computations on an n*n diamond DAG of Omega(omega*n^2/M) cost, which indicates no asymptotic improvement is achievable with fast reads. However, we show that for the minimum edit distance problem (and related problems), which would seem to be a diamond DAG, we can beat this lower bound with an algorithm with only O(omega*n^2/(M*min(omega^{1/3},M^{1/2}))) cost. To achieve this we make use of a "path sketch" technique that is forbidden in a strict DAG computation. Finally, we show several interesting upper bounds for shortest path problems, minimum spanning trees, and other problems. A common theme in many of the upper bounds is that they require redundant computation and a tradeoff between reads and writes.

Cite as

Guy E. Blelloch, Jeremy T. Fineman, Phillip B. Gibbons, Yan Gu, and Julian Shun. Efficient Algorithms with Asymmetric Read and Write Costs. In 24th Annual European Symposium on Algorithms (ESA 2016). Leibniz International Proceedings in Informatics (LIPIcs), Volume 57, pp. 14:1-14:18, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2016)


Copy BibTex To Clipboard

@InProceedings{blelloch_et_al:LIPIcs.ESA.2016.14,
  author =	{Blelloch, Guy E. and Fineman, Jeremy T. and Gibbons, Phillip B. and Gu, Yan and Shun, Julian},
  title =	{{Efficient Algorithms with Asymmetric Read and Write Costs}},
  booktitle =	{24th Annual European Symposium on Algorithms (ESA 2016)},
  pages =	{14:1--14:18},
  series =	{Leibniz International Proceedings in Informatics (LIPIcs)},
  ISBN =	{978-3-95977-015-6},
  ISSN =	{1868-8969},
  year =	{2016},
  volume =	{57},
  editor =	{Sankowski, Piotr and Zaroliagis, Christos},
  publisher =	{Schloss Dagstuhl -- Leibniz-Zentrum f{\"u}r Informatik},
  address =	{Dagstuhl, Germany},
  URL =		{https://drops-dev.dagstuhl.de/entities/document/10.4230/LIPIcs.ESA.2016.14},
  URN =		{urn:nbn:de:0030-drops-63656},
  doi =		{10.4230/LIPIcs.ESA.2016.14},
  annote =	{Keywords: Computational Model, Lower Bounds, Shortest-paths, Non-Volatile Memory, Sorting Networks, Fast Fourier Transform, Diamond DAG, Minimum Spanning Tree}
}
  • Refine by Author
  • 6 Gu, Yan
  • 3 Blelloch, Guy E.
  • 3 Shun, Julian
  • 3 Sun, Yihan
  • 2 Wang, Yiqiu
  • Show More...

  • Refine by Classification
  • 2 Computing methodologies → Shared memory algorithms
  • 1 Theory of computation → Computational geometry
  • 1 Theory of computation → Design and analysis of algorithms
  • 1 Theory of computation → Models of computation
  • 1 Theory of computation → Parallel algorithms
  • Show More...

  • Refine by Keyword
  • 3 Parallel Algorithms
  • 1 Asymmetric Memory
  • 1 Closest Pair
  • 1 Computational Geometry
  • 1 Computational Model
  • Show More...

  • Refine by Type
  • 6 document

  • Refine by Publication Year
  • 1 2016
  • 1 2017
  • 1 2018
  • 1 2021
  • 1 2022
  • Show More...

Questions / Remarks / Feedback
X

Feedback for Dagstuhl Publishing


Thanks for your feedback!

Feedback submitted

Could not send message

Please try again later or send an E-mail