DROPS

Document

Survey

Resilience in Knowledge Graph Embeddings

Authors: Arnab Sharma, N'Dah Jean Kouagou, and Axel-Cyrille Ngonga Ngomo

Published in: Transactions on Graph Data and Knowledge (TGDK), Volume 3, Issue 2 (2025)

Abstract

In recent years, knowledge graphs have gained interest and witnessed widespread applications in various domains, such as information retrieval, question-answering, recommendation systems, amongst others. Large-scale knowledge graphs to this end have demonstrated their utility in effectively representing structured knowledge. To further facilitate the application of machine learning techniques, knowledge graph embedding models have been developed. Such models can transform entities and relationships within knowledge graphs into vectors. However, these embedding models often face challenges related to noise, missing information, distribution shift, adversarial attacks, etc. This can lead to sub-optimal embeddings and incorrect inferences, thereby negatively impacting downstream applications. While the existing literature has focused so far on adversarial attacks on KGE models, the challenges related to the other critical aspects remain unexplored. In this paper, we, first of all, give a unified definition of resilience, encompassing several factors such as generalisation, in-distribution generalization, distribution adaption, and robustness. After formalizing these concepts for machine learning in general, we define them in the context of knowledge graphs. To find the gap in the existing works on resilience in the context of knowledge graphs, we perform a systematic survey, taking into account all these aspects mentioned previously. Our survey results show that most of the existing works focus on a specific aspect of resilience, namely robustness. After categorizing such works based on their respective aspects of resilience, we discuss the challenges and future research directions.

Cite as

Arnab Sharma, N'Dah Jean Kouagou, and Axel-Cyrille Ngonga Ngomo. Resilience in Knowledge Graph Embeddings. In Transactions on Graph Data and Knowledge (TGDK), Volume 3, Issue 2, pp. 1:1-1:38, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2025)

@Article{sharma_et_al:TGDK.3.2.1,
  author =	{Sharma, Arnab and Kouagou, N'Dah Jean and Ngomo, Axel-Cyrille Ngonga},
  title =	{{Resilience in Knowledge Graph Embeddings}},
  journal =	{Transactions on Graph Data and Knowledge},
  pages =	{1:1--1:38},
  ISSN =	{2942-7517},
  year =	{2025},
  volume =	{3},
  number =	{2},
  publisher =	{Schloss Dagstuhl -- Leibniz-Zentrum f{\"u}r Informatik},
  address =	{Dagstuhl, Germany},
  URL =		{https://drops.dagstuhl.de/entities/document/10.4230/TGDK.3.2.1},
  URN =		{urn:nbn:de:0030-drops-248117},
  doi =		{10.4230/TGDK.3.2.1},
  annote =	{Keywords: Knowledge graphs, Resilience, Robustness}
}

Document

Track A: Algorithms, Complexity and Games

DOI: 10.4230/LIPIcs.ICALP.2025.36

Faster Fréchet Distance Under Transformations

Authors: Kevin Buchin, Maike Buchin, Zijin Huang, André Nusser, and Sampson Wong

Published in: LIPIcs, Volume 334, 52nd International Colloquium on Automata, Languages, and Programming (ICALP 2025)

Abstract

We study the problem of computing the Fréchet distance between two polygonal curves under transformations. First, we consider translations in the Euclidean plane. Given two curves π and σ of total complexity n and a threshold δ ≥ 0, we present an 𝒪̃(n^{7 + 1/3}) time algorithm to determine whether there exists a translation t ∈ ℝ² such that the Fréchet distance between π and σ + t is at most δ. This improves on the previous best result, which is an 𝒪(n⁸) time algorithm. We then generalize this result to any class of rationally parameterized transformations, which includes translation, rotation, scaling, and arbitrary affine transformations. For a class T of rationally parametrized transformations with k degrees of freedom, we show that one can determine whether there is a transformation τ ∈ T such that the Fréchet distance between π and τ(σ) is at most δ in 𝒪̃(n^{3k+4/3}) time.

Cite as

Kevin Buchin, Maike Buchin, Zijin Huang, André Nusser, and Sampson Wong. Faster Fréchet Distance Under Transformations. In 52nd International Colloquium on Automata, Languages, and Programming (ICALP 2025). Leibniz International Proceedings in Informatics (LIPIcs), Volume 334, pp. 36:1-36:20, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2025)

@InProceedings{buchin_et_al:LIPIcs.ICALP.2025.36,
  author =	{Buchin, Kevin and Buchin, Maike and Huang, Zijin and Nusser, Andr\'{e} and Wong, Sampson},
  title =	{{Faster Fr\'{e}chet Distance Under Transformations}},
  booktitle =	{52nd International Colloquium on Automata, Languages, and Programming (ICALP 2025)},
  pages =	{36:1--36:20},
  series =	{Leibniz International Proceedings in Informatics (LIPIcs)},
  ISBN =	{978-3-95977-372-0},
  ISSN =	{1868-8969},
  year =	{2025},
  volume =	{334},
  editor =	{Censor-Hillel, Keren and Grandoni, Fabrizio and Ouaknine, Jo\"{e}l and Puppis, Gabriele},
  publisher =	{Schloss Dagstuhl -- Leibniz-Zentrum f{\"u}r Informatik},
  address =	{Dagstuhl, Germany},
  URL =		{https://drops.dagstuhl.de/entities/document/10.4230/LIPIcs.ICALP.2025.36},
  URN =		{urn:nbn:de:0030-drops-234137},
  doi =		{10.4230/LIPIcs.ICALP.2025.36},
  annote =	{Keywords: Fr\'{e}chet distance, curve similarity, shape matching}
}

Document

Track A: Algorithms, Complexity and Games

DOI: 10.4230/LIPIcs.ICALP.2025.42

Fitting Tree Metrics and Ultrametrics in Data Streams

Authors: Amir Carmel, Debarati Das, Evangelos Kipouridis, and Evangelos Pipis

Published in: LIPIcs, Volume 334, 52nd International Colloquium on Automata, Languages, and Programming (ICALP 2025)

Abstract

Fitting distances to tree metrics and ultrametrics are two widely used methods in hierarchical clustering, primarily explored within the context of numerical taxonomy. Formally, given a positive distance function D: \binom{V}{2} → ℝ_{>0}, the goal is to find a tree (or an ultrametric) T including all elements of set V, such that the difference between the distances among vertices in T and those specified by D is minimized. Numerical taxonomy was first introduced by Sneath and Sokal [Nature 1962], and since then it has been studied extensively in both biology and computer science. In this paper, we initiate the study of ultrametric and tree metric fitting problems in the semi-streaming model, where the distances between pairs of elements from V (with |V| = n), defined by the function D, can arrive in an arbitrary order. We study these problems under various distance norms; namely the 𝓁₀ objective, which aims to minimize the number of modified entries in D to fit a tree-metric or an ultrametric; the 𝓁₁ objective, which seeks to minimize the total sum of distance errors across all pairs of points in V; and the 𝓁_∞ objective, which focuses on minimizing the maximum error incurred by any entries in D.

- Our first result addresses the 𝓁₀ objective. We provide a single-pass polynomial-time Õ(n)-space O(1) approximation algorithm for ultrametrics and prove that no single-pass exact algorithm exists, even with exponential time.
- Next, we show that the algorithm for 𝓁₀ implies an O(Δ/δ) approximation for the 𝓁₁ objective, where Δ is the maximum, and δ is the minimum absolute difference between distances in the input. This bound matches the best-known approximation for the RAM model using a combinatorial algorithm when Δ/δ = O(n).
- For the 𝓁_∞ objective, we provide a complete characterization of the ultrametric fitting problem. First, we present a single-pass polynomial-time Õ(n)-space 2-approximation algorithm and show that no better than 2-approximation is possible, even with exponential time. Furthermore, we show that with an additional pass, it is possible to achieve a polynomial-time exact algorithm for ultrametrics.
- Finally, we extend all these results to tree metrics by using only one additional pass through the stream and without asymptotically increasing the approximation factor.
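
To make the three objectives in this abstract concrete, the following is a minimal sketch, not the paper's streaming algorithm. It checks the strong triangle inequality that characterizes an ultrametric and computes the 𝓁₀, 𝓁₁, and 𝓁_∞ errors between an input distance function and a candidate fit. The function names, the dictionary-based representation, and the toy instance are assumptions made for illustration only.

```python
from itertools import combinations, permutations

def is_ultrametric(points, dist):
    """Check d(x, z) <= max(d(x, y), d(y, z)) for every ordered triple."""
    for x, y, z in permutations(points, 3):
        if dist[frozenset((x, z))] > max(dist[frozenset((x, y))],
                                         dist[frozenset((y, z))]):
            return False
    return True

def fitting_errors(points, d_input, d_fit):
    """Return the (l0, l1, l_inf) errors between the input and a candidate fit."""
    diffs = [abs(d_input[frozenset(p)] - d_fit[frozenset(p)])
             for p in combinations(points, 2)]
    l0 = sum(1 for e in diffs if e > 0)   # number of modified entries
    l1 = sum(diffs)                       # total distance error
    l_inf = max(diffs, default=0)         # largest single error
    return l0, l1, l_inf

# Hypothetical toy instance on three points (not taken from the paper).
points = ["a", "b", "c"]
d_input = {frozenset("ab"): 1, frozenset("bc"): 2, frozenset("ac"): 4}
d_fit = {frozenset("ab"): 1, frozenset("bc"): 4, frozenset("ac"): 4}

print(is_ultrametric(points, d_input))          # the input violates the inequality
print(is_ultrametric(points, d_fit))            # the candidate fit satisfies it
print(fitting_errors(points, d_input, d_fit))   # errors under the three norms
```

The brute-force check over all triples takes cubic time and stores every distance, so it only illustrates the definitions; it says nothing about the Õ(n)-space bounds stated in the abstract.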

Cite as

Amir Carmel, Debarati Das, Evangelos Kipouridis, and Evangelos Pipis. Fitting Tree Metrics and Ultrametrics in Data Streams. In 52nd International Colloquium on Automata, Languages, and Programming (ICALP 2025). Leibniz International Proceedings in Informatics (LIPIcs), Volume 334, pp. 42:1-42:21, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2025)

@InProceedings{carmel_et_al:LIPIcs.ICALP.2025.42,
  author =	{Carmel, Amir and Das, Debarati and Kipouridis, Evangelos and Pipis, Evangelos},
  title =	{{Fitting Tree Metrics and Ultrametrics in Data Streams}},
  booktitle =	{52nd International Colloquium on Automata, Languages, and Programming (ICALP 2025)},
  pages =	{42:1--42:21},
  series =	{Leibniz International Proceedings in Informatics (LIPIcs)},
  ISBN =	{978-3-95977-372-0},
  ISSN =	{1868-8969},
  year =	{2025},
  volume =	{334},
  editor =	{Censor-Hillel, Keren and Grandoni, Fabrizio and Ouaknine, Jo\"{e}l and Puppis, Gabriele},
  publisher =	{Schloss Dagstuhl -- Leibniz-Zentrum f{\"u}r Informatik},
  address =	{Dagstuhl, Germany},
  URL =		{https://drops.dagstuhl.de/entities/document/10.4230/LIPIcs.ICALP.2025.42},
  URN =		{urn:nbn:de:0030-drops-234197},
  doi =		{10.4230/LIPIcs.ICALP.2025.42},
  annote =	{Keywords: Streaming, Clustering, Ultrametrics, Tree metrics, Distance fitting}
}

Document

DOI: 10.4230/LIPIcs.SoCG.2025.15

When Distances Lie: Euclidean Embeddings in the Presence of Outliers and Distance Violations

Authors: Matthias Bentert, Fedor V. Fomin, Petr A. Golovach, M. S. Ramanujan, and Saket Saurabh

Published in: LIPIcs, Volume 332, 41st International Symposium on Computational Geometry (SoCG 2025)

Abstract

Distance geometry explores the properties of distance spaces that can be exactly represented as the pairwise Euclidean distances between points in ℝ^d (d ≥ 1), or equivalently, distance spaces that can be isometrically embedded in ℝ^d. In this work, we investigate whether a distance space can be isometrically embedded in ℝ^d after applying a limited number of modifications. Specifically, we focus on two types of modifications: outlier deletion (removing points) and distance modification (adjusting distances between points). The central problem, Euclidean Embedding Editing, asks whether an input distance space on n points can be transformed, using at most k modifications, into a space that is isometrically embeddable in ℝ^d. We present several fixed-parameter tractable (FPT) and approximation algorithms for this problem. Our first result is an algorithm that solves Euclidean Embedding Editing in time (dk)^{𝒪(d+k)} + n^{𝒪(1)}. The core subroutine of this algorithm, which is of independent interest, is a polynomial-time method for compressing the input distance space into an equivalent instance of Euclidean Embedding Editing with 𝒪((dk)²) points. For the special but important case of Euclidean Embedding Editing where only outlier deletions are allowed, we improve the parameter dependence of the FPT algorithm and obtain a running time of min{(d+3)^k, 2^{d+k}} ⋅ n^{𝒪(1)}. Additionally, we provide an FPT-approximation algorithm for this problem, which outputs a set of at most 2 ⋅ Opt outliers in time 2^d ⋅ n^{𝒪(1)}. This 2-approximation algorithm improves upon the previous (3+ε)-approximation algorithm by Sidiropoulos, Wang, and Wang [SODA '17]. Furthermore, we complement our algorithms with hardness results motivating our choice of parameterizations.

Cite as

Matthias Bentert, Fedor V. Fomin, Petr A. Golovach, M. S. Ramanujan, and Saket Saurabh. When Distances Lie: Euclidean Embeddings in the Presence of Outliers and Distance Violations. In 41st International Symposium on Computational Geometry (SoCG 2025). Leibniz International Proceedings in Informatics (LIPIcs), Volume 332, pp. 15:1-15:16, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2025)

@InProceedings{bentert_et_al:LIPIcs.SoCG.2025.15,
  author =	{Bentert, Matthias and Fomin, Fedor V. and Golovach, Petr A. and Ramanujan, M. S. and Saurabh, Saket},
  title =	{{When Distances Lie: Euclidean Embeddings in the Presence of Outliers and Distance Violations}},
  booktitle =	{41st International Symposium on Computational Geometry (SoCG 2025)},
  pages =	{15:1--15:16},
  series =	{Leibniz International Proceedings in Informatics (LIPIcs)},
  ISBN =	{978-3-95977-370-6},
  ISSN =	{1868-8969},
  year =	{2025},
  volume =	{332},
  editor =	{Aichholzer, Oswin and Wang, Haitao},
  publisher =	{Schloss Dagstuhl -- Leibniz-Zentrum f{\"u}r Informatik},
  address =	{Dagstuhl, Germany},
  URL =		{https://drops.dagstuhl.de/entities/document/10.4230/LIPIcs.SoCG.2025.15},
  URN =		{urn:nbn:de:0030-drops-231672},
  doi =		{10.4230/LIPIcs.SoCG.2025.15},
  annote =	{Keywords: Parameterized Complexity, Euclidean Embedding, FPT-approximation}
}

Document

DOI: 10.4230/LIPIcs.FORC.2025.6

Near-Universally-Optimal Differentially Private Minimum Spanning Trees

Authors: Richard Hladík and Jakub Tětek

Published in: LIPIcs, Volume 329, 6th Symposium on Foundations of Responsible Computing (FORC 2025)

Abstract

Devising mechanisms with good beyond-worst-case input-dependent performance has been an important focus of differential privacy, with techniques such as smooth sensitivity, propose-test-release, or inverse sensitivity mechanism being developed to achieve this goal. This makes it very natural to use the notion of universal optimality in differential privacy. Universal optimality is a strong instance-specific optimality guarantee for problems on weighted graphs, which roughly states that for any fixed underlying (unweighted) graph, the algorithm is optimal in the worst-case sense, with respect to the possible setting of the edge weights. In this paper, we give the first such result in differential privacy. Namely, we prove that a simple differentially private mechanism for approximately releasing the minimum spanning tree is near-optimal in the sense of universal optimality for the 𝓁₁ neighbor relation. Previously, it was only known that this mechanism is nearly optimal in the worst case. We then focus on the 𝓁_∞ neighbor relation, for which the described mechanism is not optimal. We show that one may implement the exponential mechanism for MST in polynomial time, and that this results in universal near-optimality for both the 𝓁₁ and the 𝓁_∞ neighbor relations.

Cite as

Richard Hladík and Jakub Tětek. Near-Universally-Optimal Differentially Private Minimum Spanning Trees. In 6th Symposium on Foundations of Responsible Computing (FORC 2025). Leibniz International Proceedings in Informatics (LIPIcs), Volume 329, pp. 6:1-6:19, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2025)

@InProceedings{hladik_et_al:LIPIcs.FORC.2025.6,
  author =	{Hlad{\'\i}k, Richard and T\v{e}tek, Jakub},
  title =	{{Near-Universally-Optimal Differentially Private Minimum Spanning Trees}},
  booktitle =	{6th Symposium on Foundations of Responsible Computing (FORC 2025)},
  pages =	{6:1--6:19},
  series =	{Leibniz International Proceedings in Informatics (LIPIcs)},
  ISBN =	{978-3-95977-367-6},
  ISSN =	{1868-8969},
  year =	{2025},
  volume =	{329},
  editor =	{Bun, Mark},
  publisher =	{Schloss Dagstuhl -- Leibniz-Zentrum f{\"u}r Informatik},
  address =	{Dagstuhl, Germany},
  URL =		{https://drops.dagstuhl.de/entities/document/10.4230/LIPIcs.FORC.2025.6},
  URN =		{urn:nbn:de:0030-drops-231337},
  doi =		{10.4230/LIPIcs.FORC.2025.6},
  annote =	{Keywords: differential privacy, universal optimality, minimum spanning trees}
}

Document

DOI: 10.4230/LIPIcs.STACS.2025.32

A Faster Algorithm for Constrained Correlation Clustering

Authors: Nick Fischer, Evangelos Kipouridis, Jonas Klausen, and Mikkel Thorup

Published in: LIPIcs, Volume 327, 42nd International Symposium on Theoretical Aspects of Computer Science (STACS 2025)

Abstract

In the Correlation Clustering problem we are given n nodes, and a preference for each pair of nodes indicating whether we prefer the two endpoints to be in the same cluster or not. The output is a clustering inducing the minimum number of violated preferences. In certain cases, however, the preference between some pairs may be too important to be violated. The constrained version of this problem specifies pairs of nodes that must be in the same cluster as well as pairs that must not be in the same cluster (hard constraints). The output clustering has to satisfy all hard constraints while minimizing the number of violated preferences. Constrained Correlation Clustering is APX-Hard and has been approximated within a factor 3 by van Zuylen et al. [SODA '07]. Their algorithm is based on rounding an LP with Θ(n³) constraints, resulting in an Ω(n^{3ω}) running time. In this work, using a more combinatorial approach, we show how to approximate this problem significantly faster at the cost of a slightly weaker approximation factor. In particular, our algorithm runs in Õ(n³) time (notice that the input size is Θ(n²)) and approximates Constrained Correlation Clustering within a factor 16. To achieve our result we need properties guaranteed by a particular influential algorithm for (unconstrained) Correlation Clustering, the CC-PIVOT algorithm. This algorithm chooses a pivot node u, creates a cluster containing u and all its preferred nodes, and recursively solves the rest of the problem. It is known that selecting pivots at random gives a 3-approximation. As a byproduct of our work, we provide a derandomization of the CC-PIVOT algorithm that still achieves the 3-approximation; furthermore, we show that there exist instances where no ordering of the pivots can give a (3-ε)-approximation, for any constant ε. 
Finally, we introduce a node-weighted version of Correlation Clustering, which can be approximated within factor 3 using our insights on Constrained Correlation Clustering. As the general weighted version of Correlation Clustering would require a major breakthrough to approximate within a factor o(log n), Node-Weighted Correlation Clustering may be a practical alternative.
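
The pivot procedure described in this abstract can be sketched as follows. This is a minimal illustration of the randomized, unconstrained CC-PIVOT idea only; it is neither the paper's derandomization nor its constrained algorithm. The function names, the `prefer_same` predicate, and the toy instance are assumptions, and the recursion is written as a loop.

```python
import random

def cc_pivot(nodes, prefer_same, rng=None):
    """Pick a random pivot, cluster it with the nodes it prefers, repeat on the rest."""
    if rng is None:
        rng = random.Random(0)
    remaining = list(nodes)
    clusters = []
    while remaining:
        pivot = rng.choice(remaining)
        cluster = [pivot] + [v for v in remaining
                             if v != pivot and prefer_same(pivot, v)]
        clusters.append(cluster)
        members = set(cluster)
        remaining = [v for v in remaining if v not in members]
    return clusters

def violated_preferences(clusters, nodes, prefer_same):
    """Count pairs whose placement disagrees with their stated preference."""
    label = {v: i for i, c in enumerate(clusters) for v in c}
    nodes = list(nodes)
    return sum(
        1
        for i, u in enumerate(nodes)
        for v in nodes[i + 1:]
        if (label[u] == label[v]) != prefer_same(u, v)
    )

# Hypothetical toy instance: two groups whose members all prefer each other.
positive = {frozenset(p) for p in [(0, 1), (0, 2), (1, 2), (3, 4)]}
prefer_same = lambda u, v: frozenset((u, v)) in positive
nodes = range(5)

clusters = cc_pivot(nodes, prefer_same, random.Random(1))
print(clusters, violated_preferences(clusters, nodes, prefer_same))
```

On this instance every pivot order recovers the two groups, so no preference is violated. On general inputs the result depends on the pivot order, which is what the 3-approximation statement in the abstract is about.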

Cite as

Nick Fischer, Evangelos Kipouridis, Jonas Klausen, and Mikkel Thorup. A Faster Algorithm for Constrained Correlation Clustering. In 42nd International Symposium on Theoretical Aspects of Computer Science (STACS 2025). Leibniz International Proceedings in Informatics (LIPIcs), Volume 327, pp. 32:1-32:18, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2025)

@InProceedings{fischer_et_al:LIPIcs.STACS.2025.32,
  author =	{Fischer, Nick and Kipouridis, Evangelos and Klausen, Jonas and Thorup, Mikkel},
  title =	{{A Faster Algorithm for Constrained Correlation Clustering}},
  booktitle =	{42nd International Symposium on Theoretical Aspects of Computer Science (STACS 2025)},
  pages =	{32:1--32:18},
  series =	{Leibniz International Proceedings in Informatics (LIPIcs)},
  ISBN =	{978-3-95977-365-2},
  ISSN =	{1868-8969},
  year =	{2025},
  volume =	{327},
  editor =	{Beyersdorff, Olaf and Pilipczuk, Micha{\l} and Pimentel, Elaine and Thắng, Nguyễn Kim},
  publisher =	{Schloss Dagstuhl -- Leibniz-Zentrum f{\"u}r Informatik},
  address =	{Dagstuhl, Germany},
  URL =		{https://drops.dagstuhl.de/entities/document/10.4230/LIPIcs.STACS.2025.32},
  URN =		{urn:nbn:de:0030-drops-228585},
  doi =		{10.4230/LIPIcs.STACS.2025.32},
  annote =	{Keywords: Clustering, Constrained Correlation Clustering, Approximation}
}

Document

DOI: 10.4230/LIPIcs.ESA.2020.45

Linear Expected Complexity for Directional and Multiplicative Voronoi Diagrams

Authors: Chenglin Fan and Benjamin Raichel

Published in: LIPIcs, Volume 173, 28th Annual European Symposium on Algorithms (ESA 2020)

Abstract

While the standard unweighted Voronoi diagram in the plane has linear worst-case complexity, many of its natural generalizations do not. This paper considers two such previously studied generalizations, namely multiplicative and semi Voronoi diagrams. These diagrams both have quadratic worst-case complexity, though here we show that their expected complexity is linear for certain natural randomized inputs. Specifically, we argue that the expected complexity is linear for: (1) semi Voronoi diagrams when the visible direction is randomly sampled, and (2) for multiplicative diagrams when either weights are sampled from a constant-sized set, or the more challenging case when weights are arbitrary but locations are sampled from a square.

Cite as

Chenglin Fan and Benjamin Raichel. Linear Expected Complexity for Directional and Multiplicative Voronoi Diagrams. In 28th Annual European Symposium on Algorithms (ESA 2020). Leibniz International Proceedings in Informatics (LIPIcs), Volume 173, pp. 45:1-45:18, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2020)

@InProceedings{fan_et_al:LIPIcs.ESA.2020.45,
  author =	{Fan, Chenglin and Raichel, Benjamin},
  title =	{{Linear Expected Complexity for Directional and Multiplicative Voronoi Diagrams}},
  booktitle =	{28th Annual European Symposium on Algorithms (ESA 2020)},
  pages =	{45:1--45:18},
  series =	{Leibniz International Proceedings in Informatics (LIPIcs)},
  ISBN =	{978-3-95977-162-7},
  ISSN =	{1868-8969},
  year =	{2020},
  volume =	{173},
  editor =	{Grandoni, Fabrizio and Herman, Grzegorz and Sanders, Peter},
  publisher =	{Schloss Dagstuhl -- Leibniz-Zentrum f{\"u}r Informatik},
  address =	{Dagstuhl, Germany},
  URL =		{https://drops.dagstuhl.de/entities/document/10.4230/LIPIcs.ESA.2020.45},
  URN =		{urn:nbn:de:0030-drops-129111},
  doi =		{10.4230/LIPIcs.ESA.2020.45},
  annote =	{Keywords: Voronoi Diagrams, Expected Complexity, Computational Geometry}
}

Document

Track A: Algorithms, Complexity and Games

DOI: 10.4230/LIPIcs.ICALP.2020.20

Fréchet Distance for Uncertain Curves

Authors: Kevin Buchin, Chenglin Fan, Maarten Löffler, Aleksandr Popov, Benjamin Raichel, and Marcel Roeloffzen

Published in: LIPIcs, Volume 168, 47th International Colloquium on Automata, Languages, and Programming (ICALP 2020)

Abstract

In this paper we study a wide range of variants for computing the (discrete and continuous) Fréchet distance between uncertain curves. We define an uncertain curve as a sequence of uncertainty regions, where each region is a disk, a line segment, or a set of points. A realisation of a curve is a polyline connecting one point from each region. Given an uncertain curve and a second (certain or uncertain) curve, we seek to compute the lower and upper bound Fréchet distance, which are the minimum and maximum Fréchet distance for any realisations of the curves. We prove that both problems are NP-hard for the continuous Fréchet distance, and the upper bound problem remains hard for the discrete Fréchet distance. In contrast, the lower bound discrete Fréchet distance can be computed in polynomial time using dynamic programming. Furthermore, we show that computing the expected discrete or continuous Fréchet distance is #P-hard when the uncertainty regions are modelled as point sets or line segments. On the positive side, we argue that in any constant dimension there is an FPTAS for the lower bound problem when Δ/δ is polynomially bounded, where δ is the Fréchet distance and Δ bounds the diameter of the regions. We then argue there is a near-linear-time 3-approximation for the decision problem when the regions are convex and roughly δ-separated. Finally, we study the setting with Sakoe-Chiba bands, restricting the alignment of the two curves, and give polynomial-time algorithms for upper bound and expected (discrete) Fréchet distance for point-set-modelled uncertainty regions.

Cite as

Kevin Buchin, Chenglin Fan, Maarten Löffler, Aleksandr Popov, Benjamin Raichel, and Marcel Roeloffzen. Fréchet Distance for Uncertain Curves. In 47th International Colloquium on Automata, Languages, and Programming (ICALP 2020). Leibniz International Proceedings in Informatics (LIPIcs), Volume 168, pp. 20:1-20:20, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2020)

@InProceedings{buchin_et_al:LIPIcs.ICALP.2020.20,
  author =	{Buchin, Kevin and Fan, Chenglin and L\"{o}ffler, Maarten and Popov, Aleksandr and Raichel, Benjamin and Roeloffzen, Marcel},
  title =	{{Fr\'{e}chet Distance for Uncertain Curves}},
  booktitle =	{47th International Colloquium on Automata, Languages, and Programming (ICALP 2020)},
  pages =	{20:1--20:20},
  series =	{Leibniz International Proceedings in Informatics (LIPIcs)},
  ISBN =	{978-3-95977-138-2},
  ISSN =	{1868-8969},
  year =	{2020},
  volume =	{168},
  editor =	{Czumaj, Artur and Dawar, Anuj and Merelli, Emanuela},
  publisher =	{Schloss Dagstuhl -- Leibniz-Zentrum f{\"u}r Informatik},
  address =	{Dagstuhl, Germany},
  URL =		{https://drops.dagstuhl.de/entities/document/10.4230/LIPIcs.ICALP.2020.20},
  URN =		{urn:nbn:de:0030-drops-124276},
  doi =		{10.4230/LIPIcs.ICALP.2020.20},
  annote =	{Keywords: Curves, Uncertainty, Fr\'{e}chet Distance, Hardness}
}

Document

DOI: 10.4230/LIPIcs.SWAT.2020.25

Generalized Metric Repair on Graphs

Authors: Chenglin Fan, Anna C. Gilbert, Benjamin Raichel, Rishi Sonthalia, and Gregory Van Buskirk

Published in: LIPIcs, Volume 162, 17th Scandinavian Symposium and Workshops on Algorithm Theory (SWAT 2020)

Abstract

Many modern data analysis algorithms either assume or are considerably more efficient if the distances between the data points satisfy a metric. However, as real data sets are noisy, they often do not possess this fundamental property. For this reason, Gilbert and Jain [A. Gilbert and L. Jain, 2017] and Fan et al. [C. Fan et al., 2018] introduced the closely related sparse metric repair and metric violation distance problems. Given a matrix, representing all distances, the goal is to repair as few entries as possible to ensure they satisfy a metric. This problem was shown to be APX-hard, and an O(OPT^{1/3})-approximation was given, where OPT is the optimal solution size. In this paper, we generalize the problem, by describing distances by a possibly incomplete positively weighted graph, where again our goal is to find the smallest number of weight modifications so that they satisfy a metric. This natural generalization is more flexible as it takes into account different relationships among the data points. We demonstrate the inherent combinatorial structure of the problem, and give an approximation-preserving reduction from MULTICUT, which is hard to approximate within any constant factor assuming UGC. Conversely, we show that for any fixed constant ς, for the large class of ς-chordal graphs, the problem is fixed parameter tractable, answering an open question from previous work. Call a cycle broken if it contains an edge whose weight is larger than the sum of all its other edges, and call the amount of this difference its deficit. We present approximation algorithms, one depending on the maximum number of edges in a broken cycle, and one depending on the number of distinct deficit values, both quantities which may naturally be small. Finally, we give improved analysis of previous algorithms for complete graphs.
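
The notion of a broken cycle defined in this abstract can be illustrated with a short sketch. With positive weights, an edge is the heavy edge of some broken cycle exactly when its weight exceeds the shortest-path distance between its endpoints, so an all-pairs shortest-path computation (Floyd-Warshall here) finds such edges. This only detects violations and is not one of the paper's repair algorithms; the function name, the edge-dictionary input format, and the toy triangle are assumptions.

```python
def heavy_edges_of_broken_cycles(n, edges):
    """edges maps (u, v) to a positive weight on vertices 0..n-1.

    Returns each edge whose weight exceeds the shortest-path distance between
    its endpoints, mapped to the largest deficit of a broken cycle through it.
    """
    inf = float("inf")
    dist = [[0 if i == j else inf for j in range(n)] for i in range(n)]
    for (u, v), w in edges.items():
        dist[u][v] = min(dist[u][v], w)
        dist[v][u] = min(dist[v][u], w)
    # Floyd-Warshall all-pairs shortest paths.
    for k in range(n):
        for i in range(n):
            for j in range(n):
                if dist[i][k] + dist[k][j] < dist[i][j]:
                    dist[i][j] = dist[i][k] + dist[k][j]
    return {(u, v): w - dist[u][v]
            for (u, v), w in edges.items() if w > dist[u][v]}

# Hypothetical toy triangle: the edge (0, 2) is longer than the detour via 1.
broken = heavy_edges_of_broken_cycles(3, {(0, 1): 1, (1, 2): 1, (0, 2): 5})
repaired = heavy_edges_of_broken_cycles(3, {(0, 1): 1, (1, 2): 1, (0, 2): 2})
print(broken, repaired)
```

Lowering the weight of the single edge (0, 2) removes the violation in this example; finding the smallest set of such modifications in general is the hard problem the abstract studies.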

Cite as

Chenglin Fan, Anna C. Gilbert, Benjamin Raichel, Rishi Sonthalia, and Gregory Van Buskirk. Generalized Metric Repair on Graphs. In 17th Scandinavian Symposium and Workshops on Algorithm Theory (SWAT 2020). Leibniz International Proceedings in Informatics (LIPIcs), Volume 162, pp. 25:1-25:22, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2020)

@InProceedings{fan_et_al:LIPIcs.SWAT.2020.25,
  author =	{Fan, Chenglin and Gilbert, Anna C. and Raichel, Benjamin and Sonthalia, Rishi and Van Buskirk, Gregory},
  title =	{{Generalized Metric Repair on Graphs}},
  booktitle =	{17th Scandinavian Symposium and Workshops on Algorithm Theory (SWAT 2020)},
  pages =	{25:1--25:22},
  series =	{Leibniz International Proceedings in Informatics (LIPIcs)},
  ISBN =	{978-3-95977-150-4},
  ISSN =	{1868-8969},
  year =	{2020},
  volume =	{162},
  editor =	{Albers, Susanne},
  publisher =	{Schloss Dagstuhl -- Leibniz-Zentrum f{\"u}r Informatik},
  address =	{Dagstuhl, Germany},
  URL =		{https://drops.dagstuhl.de/entities/document/10.4230/LIPIcs.SWAT.2020.25},
  URN =		{urn:nbn:de:0030-drops-122727},
  doi =		{10.4230/LIPIcs.SWAT.2020.25},
  annote =	{Keywords: Approximation, FPT, Hardness, Metric Spaces}
}

Document

DOI: 10.4230/LIPIcs.SoCG.2017.42

Computing the Fréchet Gap Distance

Authors: Chenglin Fan and Benjamin Raichel

Published in: LIPIcs, Volume 77, 33rd International Symposium on Computational Geometry (SoCG 2017)

Abstract

Measuring the similarity of two polygonal curves is a fundamental computational task. Among alternatives, the Fréchet distance is one of the most well studied similarity measures. Informally, the Fréchet distance is described as the minimum leash length required for a man on one of the curves to walk a dog on the other curve continuously from the starting to the ending points. In this paper we study a variant called the Fréchet gap distance. In the man and dog analogy, the Fréchet gap distance minimizes the difference of the longest and smallest leash lengths used over the entire walk. This measure in some ways better captures our intuitive notions of curve similarity, for example giving distance zero to translated copies of the same curve. The Fréchet gap distance was originally introduced by Filtser and Katz (2015) in the context of the discrete Fréchet distance. Here we study the continuous version, which presents a number of additional challenges not present in the discrete case. In particular, the continuous nature makes bounding and searching over the critical events a rather difficult task. For this problem we give an O(n^5 log(n)) time exact algorithm and a more efficient O(n^2 log(n) + (n^2/ε) log(1/ε)) time (1+ε)-approximation algorithm, where n is the total number of vertices of the input curves. Note that for (small enough) constant ε and ignoring logarithmic factors, our approximation has quadratic running time, matching the lower bound, assuming SETH (Bringmann 2014), for approximating the standard Fréchet distance for general curves.
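
As a point of comparison for the leash analogy, here is a minimal sketch of the standard discrete Fréchet distance computed by the usual quadratic dynamic program. It is not the gap variant studied in the paper and not the paper's algorithm; the function name and the example curves are assumptions chosen for illustration.

```python
from math import dist

def discrete_frechet(p, q):
    """Standard discrete Fréchet distance between two point sequences."""
    n, m = len(p), len(q)
    # ca[i][j]: smallest leash length needed to reach the pair (p[i], q[j]).
    ca = [[0.0] * m for _ in range(n)]
    for i in range(n):
        for j in range(m):
            d = dist(p[i], q[j])
            if i == 0 and j == 0:
                ca[i][j] = d
            elif i == 0:
                ca[i][j] = max(ca[0][j - 1], d)
            elif j == 0:
                ca[i][j] = max(ca[i - 1][0], d)
            else:
                best_prev = min(ca[i - 1][j], ca[i][j - 1], ca[i - 1][j - 1])
                ca[i][j] = max(best_prev, d)
    return ca[n - 1][m - 1]

# A curve and a copy translated by one unit in y (hypothetical example).
curve = [(0, 0), (1, 0), (2, 0)]
shifted = [(0, 1), (1, 1), (2, 1)]
print(discrete_frechet(curve, shifted))
```

The translated copy is at standard discrete distance equal to the translation length, whereas the abstract notes that the gap measure assigns distance zero to translated copies of the same curve.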

Cite as

Chenglin Fan and Benjamin Raichel. Computing the Fréchet Gap Distance. In 33rd International Symposium on Computational Geometry (SoCG 2017). Leibniz International Proceedings in Informatics (LIPIcs), Volume 77, pp. 42:1-42:16, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2017)

@InProceedings{fan_et_al:LIPIcs.SoCG.2017.42,
  author =	{Fan, Chenglin and Raichel, Benjamin},
  title =	{{Computing the Fr\'{e}chet Gap Distance}},
  booktitle =	{33rd International Symposium on Computational Geometry (SoCG 2017)},
  pages =	{42:1--42:16},
  series =	{Leibniz International Proceedings in Informatics (LIPIcs)},
  ISBN =	{978-3-95977-038-5},
  ISSN =	{1868-8969},
  year =	{2017},
  volume =	{77},
  editor =	{Aronov, Boris and Katz, Matthew J.},
  publisher =	{Schloss Dagstuhl -- Leibniz-Zentrum f{\"u}r Informatik},
  address =	{Dagstuhl, Germany},
  URL =		{https://drops.dagstuhl.de/entities/document/10.4230/LIPIcs.SoCG.2017.42},
  URN =		{urn:nbn:de:0030-drops-71849},
  doi =		{10.4230/LIPIcs.SoCG.2017.42},
  annote =	{Keywords: Frechet Distance, Approximation, Polygonal Curves}
}

Document

DOI: 10.4230/LIPIcs.MFCS.2016.37

On the General Chain Pair Simplification Problem

Authors: Chenglin Fan, Omrit Filtser, Matthew J. Katz, and Binhai Zhu

Published in: LIPIcs, Volume 58, 41st International Symposium on Mathematical Foundations of Computer Science (MFCS 2016)

Abstract

The Chain Pair Simplification problem (CPS) was posed by Bereg et al. who were motivated by the problem of efficiently computing and visualizing the structural resemblance between a pair of protein backbones. In this problem, given two polygonal chains of lengths n and m, the goal is to simplify both of them simultaneously, so that the lengths of the resulting simplifications as well as the discrete Fréchet distance between them are bounded. When the vertices of the simplifications are arbitrary (i.e., not necessarily from the original chains), the problem is called General CPS (GCPS). In this paper we consider for the first time the complexity of GCPS under both the discrete Fréchet distance (GCPS-3F) and the Hausdorff distance (GCPS-2H). (In the former version, the quality of the two simplifications is measured by the discrete Fréchet distance, and in the latter version it is measured by the Hausdorff distance.) We prove that GCPS-3F is polynomially solvable, by presenting an Õ((n+m)^6 min{n,m}) time algorithm for the corresponding minimization problem. We also present an O((n+m)^4) 2-approximation algorithm for the problem. On the other hand, we show that GCPS-2H is NP-complete, and present an approximation algorithm for the problem.

Cite as

Chenglin Fan, Omrit Filtser, Matthew J. Katz, and Binhai Zhu. On the General Chain Pair Simplification Problem. In 41st International Symposium on Mathematical Foundations of Computer Science (MFCS 2016). Leibniz International Proceedings in Informatics (LIPIcs), Volume 58, pp. 37:1-37:14, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2016)

@InProceedings{fan_et_al:LIPIcs.MFCS.2016.37,
  author =	{Fan, Chenglin and Filtser, Omrit and Katz, Matthew J. and Zhu, Binhai},
  title =	{{On the General Chain Pair Simplification Problem}},
  booktitle =	{41st International Symposium on Mathematical Foundations of Computer Science (MFCS 2016)},
  pages =	{37:1--37:14},
  series =	{Leibniz International Proceedings in Informatics (LIPIcs)},
  ISBN =	{978-3-95977-016-3},
  ISSN =	{1868-8969},
  year =	{2016},
  volume =	{58},
  editor =	{Faliszewski, Piotr and Muscholl, Anca and Niedermeier, Rolf},
  publisher =	{Schloss Dagstuhl -- Leibniz-Zentrum f{\"u}r Informatik},
  address =	{Dagstuhl, Germany},
  URL =		{https://drops.dagstuhl.de/entities/document/10.4230/LIPIcs.MFCS.2016.37},
  URN =		{urn:nbn:de:0030-drops-64510},
  doi =		{10.4230/LIPIcs.MFCS.2016.37},
  annote =	{Keywords: chain simplification, discrete Frechet distance, dynamic programming, geometric arrangements, protein structural resemblance}
}

Document

DOI: 10.4230/LIPIcs.CPM.2016.15

Genomic Scaffold Filling Revisited

Authors: Haitao Jiang, Chenglin Fan, Boting Yang, Farong Zhong, Daming Zhu, and Binhai Zhu

Published in: LIPIcs, Volume 54, 27th Annual Symposium on Combinatorial Pattern Matching (CPM 2016)

Abstract

The genomic scaffold filling problem has attracted a lot of attention recently. The problem is to fill an incomplete sequence (scaffold) I into I', with respect to a complete reference genome G, such that the number of adjacencies between G and I' is maximized. The problem is NP-complete and APX-hard, and admits a 1.2-approximation. However, the sequence input I is not quite practical and does not fit most real datasets (where a scaffold is more often given as a list of contigs). In this paper, we revisit the genomic scaffold filling problem by considering two versions of this important case: (1) a scaffold S is given, the missing genes X = c(G) - c(S) can only be inserted in between the contigs, and the objective is to maximize the number of adjacencies between G and the filled S'; and (2) a scaffold S is given, a subset of the missing genes X' ⊂ X = c(G) - c(S) can only be inserted in between the contigs, and the objective is still to maximize the number of adjacencies between G and the filled S''. For problem (1), we present a simple NP-completeness proof, then a factor-2 greedy approximation algorithm, and finally we show that the problem is FPT when each gene appears at most d times in G. For problem (2), we prove that the problem is W[1]-hard and then present a factor-2 FPT-approximation for the case when each gene appears at most d times in G.

Cite as

Haitao Jiang, Chenglin Fan, Boting Yang, Farong Zhong, Daming Zhu, and Binhai Zhu. Genomic Scaffold Filling Revisited. In 27th Annual Symposium on Combinatorial Pattern Matching (CPM 2016). Leibniz International Proceedings in Informatics (LIPIcs), Volume 54, pp. 15:1-15:13, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2016)

@InProceedings{jiang_et_al:LIPIcs.CPM.2016.15,
  author =	{Jiang, Haitao and Fan, Chenglin and Yang, Boting and Zhong, Farong and Zhu, Daming and Zhu, Binhai},
  title =	{{Genomic Scaffold Filling Revisited}},
  booktitle =	{27th Annual Symposium on Combinatorial Pattern Matching (CPM 2016)},
  pages =	{15:1--15:13},
  series =	{Leibniz International Proceedings in Informatics (LIPIcs)},
  ISBN =	{978-3-95977-012-5},
  ISSN =	{1868-8969},
  year =	{2016},
  volume =	{54},
  editor =	{Grossi, Roberto and Lewenstein, Moshe},
  publisher =	{Schloss Dagstuhl -- Leibniz-Zentrum f{\"u}r Informatik},
  address =	{Dagstuhl, Germany},
  URL =		{https://drops.dagstuhl.de/entities/document/10.4230/LIPIcs.CPM.2016.15},
  URN =		{urn:nbn:de:0030-drops-60791},
  doi =		{10.4230/LIPIcs.CPM.2016.15},
  annote =	{Keywords: Computational biology, Approximation algorithms, FPT algorithms, NP-completeness}
}

