DROPS

Document

DOI: 10.4230/LIPIcs.ESA.2024.29

A Euclidean Embedding for Computing Persistent Homology with Gaussian Kernels

Authors: Jean-Daniel Boissonnat and Kunal Dutta

Published in: LIPIcs, Volume 308, 32nd Annual European Symposium on Algorithms (ESA 2024)

Abstract

Computing persistent homology of large datasets using Gaussian kernels is useful in the domains of topological data analysis and machine learning as shown by Phillips, Wang and Zheng [SoCG 2015]. However, unlike in the case of persistent homology computation using the Euclidean distance or the k-distance, using Gaussian kernels involves significantly higher overhead, as all distance computations are in terms of the Gaussian kernel distance which is computationally more expensive. Further, most algorithmic implementations (e.g. Gudhi, Ripser, etc.) are based on Euclidean distances, so the question of finding a Euclidean embedding - preferably low-dimensional - that preserves the persistent homology computed with Gaussian kernels, is quite important. We consider the Gaussian kernel power distance (GKPD) given by Phillips, Wang and Zheng. Given an n-point dataset and a relative error parameter ε ∈ (0,1], we show that the persistent homology of the Čech filtration of the dataset computed using the GKPD can be approximately preserved using O(ε^{-2} log n) dimensions, under a high stable rank condition. Our results also extend to the Delaunay filtration and the (simpler) case of the weighted Rips filtrations constructed using the GKPD. Compared to the Euclidean embedding for the Gaussian kernel function in ∼ n dimensions, which uses the Cholesky decomposition of the matrix of the kernel function applied to all pairs of data points, our embedding may also be viewed as dimensionality reduction - reducing the dimensionality from n to ∼ log n dimensions. Our proof utilizes the embedding of Chen and Phillips [ALT 2017], based on the Random Fourier Features of Rahimi and Recht [NeurIPS 2007], together with two novel ingredients. The first one is a new decomposition of the squared radii of Čech simplices computed using the GKPD, in terms of the pairwise GKPDs between the vertices, which we state and prove. The second is a new concentration inequality for sums of cosine functions of Gaussian random vectors, which we call Gaussian cosine chaoses. We believe these are of independent interest and will find other applications in future.

Cite as

Jean-Daniel Boissonnat and Kunal Dutta. A Euclidean Embedding for Computing Persistent Homology with Gaussian Kernels. In 32nd Annual European Symposium on Algorithms (ESA 2024). Leibniz International Proceedings in Informatics (LIPIcs), Volume 308, pp. 29:1-29:18, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2024)

@InProceedings{boissonnat_et_al:LIPIcs.ESA.2024.29,
  author =	{Boissonnat, Jean-Daniel and Dutta, Kunal},
  title =	{{A Euclidean Embedding for Computing Persistent Homology with Gaussian Kernels}},
  booktitle =	{32nd Annual European Symposium on Algorithms (ESA 2024)},
  pages =	{29:1--29:18},
  series =	{Leibniz International Proceedings in Informatics (LIPIcs)},
  ISBN =	{978-3-95977-338-6},
  ISSN =	{1868-8969},
  year =	{2024},
  volume =	{308},
  editor =	{Chan, Timothy and Fischer, Johannes and Iacono, John and Herman, Grzegorz},
  publisher =	{Schloss Dagstuhl -- Leibniz-Zentrum f{\"u}r Informatik},
  address =	{Dagstuhl, Germany},
  URL =		{https://drops.dagstuhl.de/entities/document/10.4230/LIPIcs.ESA.2024.29},
  URN =		{urn:nbn:de:0030-drops-211009},
  doi =		{10.4230/LIPIcs.ESA.2024.29},
  annote =	{Keywords: Persistent homology, Gaussian kernels, Random Fourier Features, Euclidean embedding}
}

Document

Track A: Algorithms, Complexity and Games

DOI: 10.4230/LIPIcs.ICALP.2024.57

Non-Linear Paging

Authors: Ilan Doron-Arad and Joseph (Seffi) Naor

Published in: LIPIcs, Volume 297, 51st International Colloquium on Automata, Languages, and Programming (ICALP 2024)

Abstract

We formulate and study non-linear paging - a broad model of online paging where the size of subsets of pages is determined by a monotone non-linear set function of the pages. This model captures the well-studied classic weighted paging and generalized paging problems, and also submodular and supermodular paging, studied here for the first time, that have a range of applications from virtual memory to machine learning. Unlike classic paging, the cache threshold parameter k does not yield good competitive ratios for non-linear paging. Instead, we introduce a novel parameter 𝓁 that generalizes the notion of cache size to the non-linear setting. We obtain a tight deterministic 𝓁-competitive algorithm for general non-linear paging and a o(log²𝓁)-competitive lower bound for randomized algorithms. Our algorithm is based on a new generic LP for the problem that captures both submodular and supermodular paging, in contrast to LPs used for submodular cover settings. We finally focus on the supermodular paging problem, which is a variant of online set cover and online submodular cover, where sets are repeatedly requested to be removed from the cover. We obtain polylogarithmic lower and upper bounds and an offline approximation algorithm.

Cite as

Ilan Doron-Arad and Joseph (Seffi) Naor. Non-Linear Paging. In 51st International Colloquium on Automata, Languages, and Programming (ICALP 2024). Leibniz International Proceedings in Informatics (LIPIcs), Volume 297, pp. 57:1-57:19, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2024)

@InProceedings{doronarad_et_al:LIPIcs.ICALP.2024.57,
  author =	{Doron-Arad, Ilan and Naor, Joseph (Seffi)},
  title =	{{Non-Linear Paging}},
  booktitle =	{51st International Colloquium on Automata, Languages, and Programming (ICALP 2024)},
  pages =	{57:1--57:19},
  series =	{Leibniz International Proceedings in Informatics (LIPIcs)},
  ISBN =	{978-3-95977-322-5},
  ISSN =	{1868-8969},
  year =	{2024},
  volume =	{297},
  editor =	{Bringmann, Karl and Grohe, Martin and Puppis, Gabriele and Svensson, Ola},
  publisher =	{Schloss Dagstuhl -- Leibniz-Zentrum f{\"u}r Informatik},
  address =	{Dagstuhl, Germany},
  URL =		{https://drops.dagstuhl.de/entities/document/10.4230/LIPIcs.ICALP.2024.57},
  URN =		{urn:nbn:de:0030-drops-202000},
  doi =		{10.4230/LIPIcs.ICALP.2024.57},
  annote =	{Keywords: paging, competitive analysis, non-linear paging, submodular and supermodular functions}
}

Document

DOI: 10.4230/LIPIcs.SoCG.2018.31

Graph Reconstruction by Discrete Morse Theory

Authors: Tamal K. Dey, Jiayuan Wang, and Yusu Wang

Published in: LIPIcs, Volume 99, 34th International Symposium on Computational Geometry (SoCG 2018)

Abstract

Recovering hidden graph-like structures from potentially noisy data is a fundamental task in modern data analysis. Recently, a persistence-guided discrete Morse-based framework to extract a geometric graph from low-dimensional data has become popular. However, to date, there is very limited theoretical understanding of this framework in terms of graph reconstruction. This paper makes a first step towards closing this gap. Specifically, first, leveraging existing theoretical understanding of persistence-guided discrete Morse cancellation, we provide a simplified version of the existing discrete Morse-based graph reconstruction algorithm. We then introduce a simple and natural noise model and show that the aforementioned framework can correctly reconstruct a graph under this noise model, in the sense that it has the same loop structure as the hidden ground-truth graph, and is also geometrically close. We also provide some experimental results for our simplified graph-reconstruction algorithm.

Cite as

Tamal K. Dey, Jiayuan Wang, and Yusu Wang. Graph Reconstruction by Discrete Morse Theory. In 34th International Symposium on Computational Geometry (SoCG 2018). Leibniz International Proceedings in Informatics (LIPIcs), Volume 99, pp. 31:1-31:15, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2018)

@InProceedings{dey_et_al:LIPIcs.SoCG.2018.31,
  author =	{Dey, Tamal K. and Wang, Jiayuan and Wang, Yusu},
  title =	{{Graph Reconstruction by Discrete Morse Theory}},
  booktitle =	{34th International Symposium on Computational Geometry (SoCG 2018)},
  pages =	{31:1--31:15},
  series =	{Leibniz International Proceedings in Informatics (LIPIcs)},
  ISBN =	{978-3-95977-066-8},
  ISSN =	{1868-8969},
  year =	{2018},
  volume =	{99},
  editor =	{Speckmann, Bettina and T\'{o}th, Csaba D.},
  publisher =	{Schloss Dagstuhl -- Leibniz-Zentrum f{\"u}r Informatik},
  address =	{Dagstuhl, Germany},
  URL =		{https://drops.dagstuhl.de/entities/document/10.4230/LIPIcs.SoCG.2018.31},
  URN =		{urn:nbn:de:0030-drops-87443},
  doi =		{10.4230/LIPIcs.SoCG.2018.31},
  annote =	{Keywords: graph reconstruction, discrete Morse theory, persistence}
}

Document

DOI: 10.4230/LIPIcs.SoCG.2017.23

Declutter and Resample: Towards Parameter Free Denoising

Authors: Mickael Buchet, Tamal K. Dey, Jiayuan Wang, and Yusu Wang

Published in: LIPIcs, Volume 77, 33rd International Symposium on Computational Geometry (SoCG 2017)

Abstract

In many data analysis applications the following scenario is commonplace: we are given a point set that is supposed to sample a hidden ground truth K in a metric space, but it got corrupted with noise so that some of the data points lie far away from K creating outliers also termed as ambient noise. One of the main goals of denoising algorithms is to eliminate such noise so that the curated data lie within a bounded Hausdorff distance of K. Popular denoising approaches such as deconvolution and thresholding often require the user to set several parameters and/or to choose an appropriate noise model while guaranteeing only asymptotic convergence. Our goal is to lighten this burden as much as possible while ensuring theoretical guarantees in all cases. Specifically, first, we propose a simple denoising algorithm that requires only a single parameter but provides a theoretical guarantee on the quality of the output on general input points. We argue that this single parameter cannot be avoided. We next present a simple algorithm that avoids even this parameter by paying for it with a slight strengthening of the sampling condition on the input points which is not unrealistic. We also provide some preliminary empirical evidence that our algorithms are effective in practice.

Cite as

Mickael Buchet, Tamal K. Dey, Jiayuan Wang, and Yusu Wang. Declutter and Resample: Towards Parameter Free Denoising. In 33rd International Symposium on Computational Geometry (SoCG 2017). Leibniz International Proceedings in Informatics (LIPIcs), Volume 77, pp. 23:1-23:16, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2017)

@InProceedings{buchet_et_al:LIPIcs.SoCG.2017.23,
  author =	{Buchet, Mickael and Dey, Tamal K. and Wang, Jiayuan and Wang, Yusu},
  title =	{{Declutter and Resample: Towards Parameter Free Denoising}},
  booktitle =	{33rd International Symposium on Computational Geometry (SoCG 2017)},
  pages =	{23:1--23:16},
  series =	{Leibniz International Proceedings in Informatics (LIPIcs)},
  ISBN =	{978-3-95977-038-5},
  ISSN =	{1868-8969},
  year =	{2017},
  volume =	{77},
  editor =	{Aronov, Boris and Katz, Matthew J.},
  publisher =	{Schloss Dagstuhl -- Leibniz-Zentrum f{\"u}r Informatik},
  address =	{Dagstuhl, Germany},
  URL =		{https://drops.dagstuhl.de/entities/document/10.4230/LIPIcs.SoCG.2017.23},
  URN =		{urn:nbn:de:0030-drops-72133},
  doi =		{10.4230/LIPIcs.SoCG.2017.23},
  annote =	{Keywords: denoising, parameter free, k-distance, compact sets}
}

4 Search Results for "Wang, Jiayuan"
