4 Search Results for "Li, Yan"


Document
Efficient Parallel Output-Sensitive Edit Distance

Authors: Xiangyun Ding, Xiaojun Dong, Yan Gu, Youzhe Liu, and Yihan Sun

Published in: LIPIcs, Volume 274, 31st Annual European Symposium on Algorithms (ESA 2023)


Abstract
In this paper, we study efficient parallel edit distance algorithms, both in theory and in practice. Given two strings A[1..n] and B[1..m], and a set of operations allowed to edit the strings, the edit distance between A and B is the minimum number of operations required to transform A into B. In this paper, we use edit distance to refer to the Levenshtein distance, which allows for unit-cost single-character edits (insertions, deletions, substitutions). Sequentially, a standard Dynamic Programming (DP) algorithm solves edit distance with Θ(nm) cost. In many real-world applications, the strings to be compared are similar to each other and have small edit distances. To achieve highly practical implementations, we focus on output-sensitive parallel edit-distance algorithms, i.e., to achieve asymptotically better cost bounds than the standard Θ(nm) algorithm when the edit distance is small. We study four algorithms in the paper, including three algorithms based on Breadth-First Search (BFS), and one algorithm based on Divide-and-Conquer (DaC). Our BFS-based solution is based on the Landau-Vishkin algorithm. We implement three different data structures for the longest common prefix (LCP) queries needed in the algorithm: the classic solution using parallel suffix array, and two hash-based solutions proposed in this paper. Our DaC-based solution is inspired by the output-insensitive solution proposed by Apostolico et al., and we propose a non-trivial adaption to make it output-sensitive. All of the algorithms studied in this paper have good theoretical guarantees, and they achieve different tradeoffs between work (total number of operations), span (longest dependence chain in the computation), and space. We test and compare our algorithms on both synthetic data and real-world data, including DNA sequences, Wikipedia texts, GitHub repositories, etc. Our BFS-based algorithms outperform the existing parallel edit-distance implementation in ParlayLib in all test cases. On cases with fewer than 10⁵ edits, our algorithm can process input sequences of size 10⁹ in about ten seconds, while ParlayLib can only process sequences of sizes up to 10⁶ in the same amount of time. By comparing our algorithms, we also provide a better understanding of the choice of algorithms for different input patterns. We believe that our paper is the first systematic study in the theory and practice of parallel edit distance.

Cite as

Xiangyun Ding, Xiaojun Dong, Yan Gu, Youzhe Liu, and Yihan Sun. Efficient Parallel Output-Sensitive Edit Distance. In 31st Annual European Symposium on Algorithms (ESA 2023). Leibniz International Proceedings in Informatics (LIPIcs), Volume 274, pp. 40:1-40:20, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2023)


Copy BibTex To Clipboard

@InProceedings{ding_et_al:LIPIcs.ESA.2023.40,
  author =	{Ding, Xiangyun and Dong, Xiaojun and Gu, Yan and Liu, Youzhe and Sun, Yihan},
  title =	{{Efficient Parallel Output-Sensitive Edit Distance}},
  booktitle =	{31st Annual European Symposium on Algorithms (ESA 2023)},
  pages =	{40:1--40:20},
  series =	{Leibniz International Proceedings in Informatics (LIPIcs)},
  ISBN =	{978-3-95977-295-2},
  ISSN =	{1868-8969},
  year =	{2023},
  volume =	{274},
  editor =	{G{\o}rtz, Inge Li and Farach-Colton, Martin and Puglisi, Simon J. and Herman, Grzegorz},
  publisher =	{Schloss Dagstuhl -- Leibniz-Zentrum f{\"u}r Informatik},
  address =	{Dagstuhl, Germany},
  URL =		{https://drops-dev.dagstuhl.de/entities/document/10.4230/LIPIcs.ESA.2023.40},
  URN =		{urn:nbn:de:0030-drops-186935},
  doi =		{10.4230/LIPIcs.ESA.2023.40},
  annote =	{Keywords: Edit Distance, Parallel Algorithms, String Algorithms, Dynamic Programming, Pattern Matching}
}
Document
Local Co-location Pattern Detection: A Summary of Results

Authors: Yan Li and Shashi Shekhar

Published in: LIPIcs, Volume 114, 10th International Conference on Geographic Information Science (GIScience 2018)


Abstract
Given a set of spatial objects of different features (e.g., mall, hospital) and a spatial relation (e.g., geographic proximity), the problem of local co-location pattern detection (LCPD) pairs co-location patterns and localities such that the co-location patterns tend to exist inside the paired localities. A co-location pattern is a set of spatial features, the objects of which are often related to each other. Local co-location patterns are common in many fields, such as public security, and public health. For example, assault crimes and drunk driving events co-locate near bars. The problem is computationally challenging because of the exponential number of potential co-location patterns and candidate localities. The related work applies data-unaware or clustering heuristics to partition the study area, which results in incomplete enumeration of possible localities. In this study, we formally defined the LCPD problem where the candidate locality was defined using minimum orthogonal bounding rectangles (MOBRs). Then, we proposed a Quadruplet & Grid Filter-Refine (QGFR) algorithm that leveraged an MOBR enumeration lemma, and a novel upper bound on the participation index to efficiently prune the search space. The experimental evaluation showed that the QGFR algorithm reduced the computation cost substantially. One case study using the North American Atlas-Hydrography and U.S. Major City Datasets was conducted to discover local co-location patterns which would be missed if the entire dataset was analyzed or methods proposed by the related work were applied.

Cite as

Yan Li and Shashi Shekhar. Local Co-location Pattern Detection: A Summary of Results. In 10th International Conference on Geographic Information Science (GIScience 2018). Leibniz International Proceedings in Informatics (LIPIcs), Volume 114, pp. 10:1-10:15, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2018)


Copy BibTex To Clipboard

@InProceedings{li_et_al:LIPIcs.GISCIENCE.2018.10,
  author =	{Li, Yan and Shekhar, Shashi},
  title =	{{Local Co-location Pattern Detection: A Summary of Results}},
  booktitle =	{10th International Conference on Geographic Information Science (GIScience 2018)},
  pages =	{10:1--10:15},
  series =	{Leibniz International Proceedings in Informatics (LIPIcs)},
  ISBN =	{978-3-95977-083-5},
  ISSN =	{1868-8969},
  year =	{2018},
  volume =	{114},
  editor =	{Winter, Stephan and Griffin, Amy and Sester, Monika},
  publisher =	{Schloss Dagstuhl -- Leibniz-Zentrum f{\"u}r Informatik},
  address =	{Dagstuhl, Germany},
  URL =		{https://drops-dev.dagstuhl.de/entities/document/10.4230/LIPIcs.GISCIENCE.2018.10},
  URN =		{urn:nbn:de:0030-drops-93387},
  doi =		{10.4230/LIPIcs.GISCIENCE.2018.10},
  annote =	{Keywords: Co-location pattern, Participation index, Spatial heterogeneity}
}
Document
Short Paper
Estimating Building Age from Google Street View Images Using Deep Learning (Short Paper)

Authors: Yan Li, Yiqun Chen, Abbas Rajabifard, Kourosh Khoshelham, and Mitko Aleksandrov

Published in: LIPIcs, Volume 114, 10th International Conference on Geographic Information Science (GIScience 2018)


Abstract
Building databases are a fundamental component of urban analysis. However such databases usually lack detailed attributes such as building age. With a large volume of building images being accessible online via API (such as Google Street View), as well as the fast development of image processing techniques such as deep learning, it becomes feasible to extract information from images to enrich building databases. This paper proposes a novel method to estimate building age based on the convolutional neural network for image features extraction and support vector machine for construction year regression. The contributions of this paper are two-fold: First, to our knowledge, this is the first attempt for estimating building age from images by using deep learning techniques. It provides new insight for planners to apply image processing and deep learning techniques for building database enrichment. Second, an image-base building age estimation framework is proposed which doesn't require information on building height, floor area, construction materials and therefore makes the analysis process simpler and more efficient.

Cite as

Yan Li, Yiqun Chen, Abbas Rajabifard, Kourosh Khoshelham, and Mitko Aleksandrov. Estimating Building Age from Google Street View Images Using Deep Learning (Short Paper). In 10th International Conference on Geographic Information Science (GIScience 2018). Leibniz International Proceedings in Informatics (LIPIcs), Volume 114, pp. 40:1-40:7, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2018)


Copy BibTex To Clipboard

@InProceedings{li_et_al:LIPIcs.GISCIENCE.2018.40,
  author =	{Li, Yan and Chen, Yiqun and Rajabifard, Abbas and Khoshelham, Kourosh and Aleksandrov, Mitko},
  title =	{{Estimating Building Age from Google Street View Images Using Deep Learning}},
  booktitle =	{10th International Conference on Geographic Information Science (GIScience 2018)},
  pages =	{40:1--40:7},
  series =	{Leibniz International Proceedings in Informatics (LIPIcs)},
  ISBN =	{978-3-95977-083-5},
  ISSN =	{1868-8969},
  year =	{2018},
  volume =	{114},
  editor =	{Winter, Stephan and Griffin, Amy and Sester, Monika},
  publisher =	{Schloss Dagstuhl -- Leibniz-Zentrum f{\"u}r Informatik},
  address =	{Dagstuhl, Germany},
  URL =		{https://drops-dev.dagstuhl.de/entities/document/10.4230/LIPIcs.GISCIENCE.2018.40},
  URN =		{urn:nbn:de:0030-drops-93682},
  doi =		{10.4230/LIPIcs.GISCIENCE.2018.40},
  annote =	{Keywords: Building database, deep learning, CNN, SVM, Google Street View}
}
Document
Social Comparisons and Contributions to Online Communities: A Field Experiment on MovieLens

Authors: Yan Chen, Maxwell Harper, Joseph Konstan, and Sherry Li

Published in: Dagstuhl Seminar Proceedings, Volume 7271, Computational Social Systems and the Internet (2007)


Abstract
We explore the use of social comparison theory as a natural mechanism to increase contributions to an online movie recommendation community by investigating the effects of social information on user behavior in an online field experiment. We find that, after receiving behavioral information about the median user's total number of movie ratings, users below the median demonstrate a 530% increase in the number of monthly movie ratings, while those above the median decrease their monthly ratings by 62%. Movements from both ends converge towards the median, indicating conformity towards a newly-established social norm in a community where such a norm had been absent. Furthermore, the social information has a more dramatic effect on those below the median, suggesting an interaction between conformity and competitive preferences. When given outcome information about the average user's net benefit score from the system, consistent with social preference theory, users with net benefit scores above average contribute 94% of the new updates in the database. In both treatments, we find a highly significant Red Queen Effect.

Cite as

Yan Chen, Maxwell Harper, Joseph Konstan, and Sherry Li. Social Comparisons and Contributions to Online Communities: A Field Experiment on MovieLens. In Computational Social Systems and the Internet. Dagstuhl Seminar Proceedings, Volume 7271, pp. 1-7, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2007)


Copy BibTex To Clipboard

@InProceedings{chen_et_al:DagSemProc.07271.14,
  author =	{Chen, Yan and Harper, Maxwell and Konstan, Joseph and Li, Sherry},
  title =	{{Social Comparisons and Contributions to Online Communities: A Field Experiment on MovieLens}},
  booktitle =	{Computational Social Systems and the Internet},
  pages =	{1--7},
  series =	{Dagstuhl Seminar Proceedings (DagSemProc)},
  ISSN =	{1862-4405},
  year =	{2007},
  volume =	{7271},
  editor =	{Peter Cramton and Rudolf M\"{u}ller and Eva Tardos and Moshe Tennenholtz},
  publisher =	{Schloss Dagstuhl -- Leibniz-Zentrum f{\"u}r Informatik},
  address =	{Dagstuhl, Germany},
  URL =		{https://drops-dev.dagstuhl.de/entities/document/10.4230/DagSemProc.07271.14},
  URN =		{urn:nbn:de:0030-drops-11550},
  doi =		{10.4230/DagSemProc.07271.14},
  annote =	{Keywords: Social comparison, conformity, public goods, embedded online field experiment}
}
  • Refine by Author
  • 2 Li, Yan
  • 1 Aleksandrov, Mitko
  • 1 Chen, Yan
  • 1 Chen, Yiqun
  • 1 Ding, Xiangyun
  • Show More...

  • Refine by Classification
  • 1 Computing methodologies → Supervised learning by regression
  • 1 Information systems → Data mining
  • 1 Information systems → Geographic information systems
  • 1 Theory of computation → Parallel algorithms

  • Refine by Keyword
  • 1 Building database
  • 1 CNN
  • 1 Co-location pattern
  • 1 Dynamic Programming
  • 1 Edit Distance
  • Show More...

  • Refine by Type
  • 4 document

  • Refine by Publication Year
  • 2 2018
  • 1 2007
  • 1 2023

Questions / Remarks / Feedback
X

Feedback for Dagstuhl Publishing


Thanks for your feedback!

Feedback submitted

Could not send message

Please try again later or send an E-mail