5 Search Results for "Jacquet, Philippe"


Document
Bit-Array-Based Alternatives to HyperLogLog

Authors: Svante Janson, Jérémie Lumbroso, and Robert Sedgewick

Published in: LIPIcs, Volume 302, 35th International Conference on Probabilistic, Combinatorial and Asymptotic Methods for the Analysis of Algorithms (AofA 2024)


Abstract
We present a family of algorithms for the problem of estimating the number of distinct items in an input stream that are simple to implement and are appropriate for practical applications. Our algorithms are a logical extension of the series of algorithms developed by Flajolet and his coauthors starting in 1983 that culminated in the widely used HyperLogLog algorithm. These algorithms divide the input stream into M substreams and lead to a time-accuracy tradeoff where a constant number of bits per substream are saved to achieve a relative accuracy proportional to 1/√M. Our algorithms use just one or two bits per substream. Their effectiveness is demonstrated by a proof of approximate normality, with explicit expressions for standard errors that inform parameter settings and allow proper quantitative comparisons with other methods. Hypotheses about performance are validated through experiments using a realistic input stream, with the conclusion that our algorithms are more accurate than HyperLogLog when using the same amount of memory, and they use two-thirds as much memory as HyperLogLog to achieve a given accuracy.

Cite as

Svante Janson, Jérémie Lumbroso, and Robert Sedgewick. Bit-Array-Based Alternatives to HyperLogLog. In 35th International Conference on Probabilistic, Combinatorial and Asymptotic Methods for the Analysis of Algorithms (AofA 2024). Leibniz International Proceedings in Informatics (LIPIcs), Volume 302, pp. 5:1-5:19, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2024)


Copy BibTex To Clipboard

@InProceedings{janson_et_al:LIPIcs.AofA.2024.5,
  author =	{Janson, Svante and Lumbroso, J\'{e}r\'{e}mie and Sedgewick, Robert},
  title =	{{Bit-Array-Based Alternatives to HyperLogLog}},
  booktitle =	{35th International Conference on Probabilistic, Combinatorial and Asymptotic Methods for the Analysis of Algorithms (AofA 2024)},
  pages =	{5:1--5:19},
  series =	{Leibniz International Proceedings in Informatics (LIPIcs)},
  ISBN =	{978-3-95977-329-4},
  ISSN =	{1868-8969},
  year =	{2024},
  volume =	{302},
  editor =	{Mailler, C\'{e}cile and Wild, Sebastian},
  publisher =	{Schloss Dagstuhl -- Leibniz-Zentrum f{\"u}r Informatik},
  address =	{Dagstuhl, Germany},
  URL =		{https://drops.dagstuhl.de/entities/document/10.4230/LIPIcs.AofA.2024.5},
  URN =		{urn:nbn:de:0030-drops-204402},
  doi =		{10.4230/LIPIcs.AofA.2024.5},
  annote =	{Keywords: Cardinality estimation, sketching, Hyperloglog}
}
Document
Depth-First Search Performance in Random Digraphs

Authors: Philippe Jacquet and Svante Janson

Published in: LIPIcs, Volume 302, 35th International Conference on Probabilistic, Combinatorial and Asymptotic Methods for the Analysis of Algorithms (AofA 2024)


Abstract
We present an analysis of the depth-first search algorithm in a random digraph model with independent outdegrees having an arbitrary distribution with finite variance. The results include asymptotics for the distribution of the stack index and depths of the search. The search yields a series of trees of finite size before and after the exploration of a giant tree. Our analysis mainly concerns the giant tree. Most results are first order. This analysis proposed by Donald Knuth in his next to appear volume of The Art of Computer Programming gives interesting insight in one of the most elegant and efficient algorithm for graph analysis due to Tarjan.

Cite as

Philippe Jacquet and Svante Janson. Depth-First Search Performance in Random Digraphs. In 35th International Conference on Probabilistic, Combinatorial and Asymptotic Methods for the Analysis of Algorithms (AofA 2024). Leibniz International Proceedings in Informatics (LIPIcs), Volume 302, pp. 30:1-30:15, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2024)


Copy BibTex To Clipboard

@InProceedings{jacquet_et_al:LIPIcs.AofA.2024.30,
  author =	{Jacquet, Philippe and Janson, Svante},
  title =	{{Depth-First Search Performance in Random Digraphs}},
  booktitle =	{35th International Conference on Probabilistic, Combinatorial and Asymptotic Methods for the Analysis of Algorithms (AofA 2024)},
  pages =	{30:1--30:15},
  series =	{Leibniz International Proceedings in Informatics (LIPIcs)},
  ISBN =	{978-3-95977-329-4},
  ISSN =	{1868-8969},
  year =	{2024},
  volume =	{302},
  editor =	{Mailler, C\'{e}cile and Wild, Sebastian},
  publisher =	{Schloss Dagstuhl -- Leibniz-Zentrum f{\"u}r Informatik},
  address =	{Dagstuhl, Germany},
  URL =		{https://drops.dagstuhl.de/entities/document/10.4230/LIPIcs.AofA.2024.30},
  URN =		{urn:nbn:de:0030-drops-204655},
  doi =		{10.4230/LIPIcs.AofA.2024.30},
  annote =	{Keywords: Depth First Search, random digraph, Analysis of Algorithms}
}
Document
Depth-First Search Performance in a Random Digraph with Geometric Degree Distribution

Authors: Philippe Jacquet and Svante Janson

Published in: LIPIcs, Volume 225, 33rd International Conference on Probabilistic, Combinatorial and Asymptotic Methods for the Analysis of Algorithms (AofA 2022)


Abstract
We present an analysis of the depth-first search algorithm in a random digraph model with geometric outdegree distribution. We give also some extensions to general outdegree distributions. This problem posed by Donald Knuth in his next to appear volume of The Art of Computer Programming gives interesting insight in one of the most elegant and efficient algorithm for graph analysis due to Tarjan.

Cite as

Philippe Jacquet and Svante Janson. Depth-First Search Performance in a Random Digraph with Geometric Degree Distribution. In 33rd International Conference on Probabilistic, Combinatorial and Asymptotic Methods for the Analysis of Algorithms (AofA 2022). Leibniz International Proceedings in Informatics (LIPIcs), Volume 225, pp. 11:1-11:15, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2022)


Copy BibTex To Clipboard

@InProceedings{jacquet_et_al:LIPIcs.AofA.2022.11,
  author =	{Jacquet, Philippe and Janson, Svante},
  title =	{{Depth-First Search Performance in a Random Digraph with Geometric Degree Distribution}},
  booktitle =	{33rd International Conference on Probabilistic, Combinatorial and Asymptotic Methods for the Analysis of Algorithms (AofA 2022)},
  pages =	{11:1--11:15},
  series =	{Leibniz International Proceedings in Informatics (LIPIcs)},
  ISBN =	{978-3-95977-230-3},
  ISSN =	{1868-8969},
  year =	{2022},
  volume =	{225},
  editor =	{Ward, Mark Daniel},
  publisher =	{Schloss Dagstuhl -- Leibniz-Zentrum f{\"u}r Informatik},
  address =	{Dagstuhl, Germany},
  URL =		{https://drops.dagstuhl.de/entities/document/10.4230/LIPIcs.AofA.2022.11},
  URN =		{urn:nbn:de:0030-drops-160978},
  doi =		{10.4230/LIPIcs.AofA.2022.11},
  annote =	{Keywords: Combinatorics, Depth-First Search, Random Digraphs}
}
Document
Analysis of Lempel-Ziv'78 for Markov Sources

Authors: Philippe Jacquet and Wojciech Szpankowski

Published in: LIPIcs, Volume 159, 31st International Conference on Probabilistic, Combinatorial and Asymptotic Methods for the Analysis of Algorithms (AofA 2020)


Abstract
Lempel-Ziv'78 is one of the most popular data compression algorithms. Over the last few decades fascinating properties of LZ78 were uncovered. Among others, in 1995 we settled the Ziv conjecture by proving that for a memoryless source the number of LZ78 phrases satisfies the Central Limit Theorem (CLT). Since then the quest commenced to extend it to Markov sources. However, despite several attempts this problem is still open. The 1995 proof of the Ziv conjecture was based on two models: In the DST-model, the associated digital search tree (DST) is built over m independent strings. In the LZ-model a single string of length n is partitioned into variable length phrases such that the next phrase is not seen in the past as a phrase. The Ziv conjecture for memoryless source was settled by proving that both DST-model and the LZ-model are asymptotically equivalent. The main result of this paper shows that this is not the case for the LZ78 algorithm over Markov sources. In addition, we develop here a large deviation for the number of phrases in the LZ78 and give a precise asymptotic expression for the redundancy which is the excess of LZ78 code over the entropy of the source. We establish these findings using a combination of combinatorial and analytic tools. In particular, to handle the strong dependency between Markov phrases, we introduce and precisely analyze the so called tail symbol which is the first symbol of the next phrase in the LZ78 parsing.

Cite as

Philippe Jacquet and Wojciech Szpankowski. Analysis of Lempel-Ziv'78 for Markov Sources. In 31st International Conference on Probabilistic, Combinatorial and Asymptotic Methods for the Analysis of Algorithms (AofA 2020). Leibniz International Proceedings in Informatics (LIPIcs), Volume 159, pp. 15:1-15:19, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2020)


Copy BibTex To Clipboard

@InProceedings{jacquet_et_al:LIPIcs.AofA.2020.15,
  author =	{Jacquet, Philippe and Szpankowski, Wojciech},
  title =	{{Analysis of Lempel-Ziv'78 for Markov Sources}},
  booktitle =	{31st International Conference on Probabilistic, Combinatorial and Asymptotic Methods for the Analysis of Algorithms (AofA 2020)},
  pages =	{15:1--15:19},
  series =	{Leibniz International Proceedings in Informatics (LIPIcs)},
  ISBN =	{978-3-95977-147-4},
  ISSN =	{1868-8969},
  year =	{2020},
  volume =	{159},
  editor =	{Drmota, Michael and Heuberger, Clemens},
  publisher =	{Schloss Dagstuhl -- Leibniz-Zentrum f{\"u}r Informatik},
  address =	{Dagstuhl, Germany},
  URL =		{https://drops.dagstuhl.de/entities/document/10.4230/LIPIcs.AofA.2020.15},
  URN =		{urn:nbn:de:0030-drops-120459},
  doi =		{10.4230/LIPIcs.AofA.2020.15},
  annote =	{Keywords: Lempel-Ziv algorithm, digital search trees, depoissonization, analytic combinatorics, large deviations}
}
Document
Power-Law Degree Distribution in the Connected Component of a Duplication Graph

Authors: Philippe Jacquet, Krzysztof Turowski, and Wojciech Szpankowski

Published in: LIPIcs, Volume 159, 31st International Conference on Probabilistic, Combinatorial and Asymptotic Methods for the Analysis of Algorithms (AofA 2020)


Abstract
We study the partial duplication dynamic graph model, introduced by Bhan et al. in [Bhan et al., 2002] in which a newly arrived node selects randomly an existing node and connects with probability p to its neighbors. Such a dynamic network is widely considered to be a good model for various biological networks such as protein-protein interaction networks. This model is discussed in numerous publications with only a few recent rigorous results, especially for the degree distribution. Recently Jordan [Jordan, 2018] proved that for 0 < p < 1/e the degree distribution of the connected component is stationary with approximately a power law. In this paper we rigorously prove that the tail is indeed a true power law, that is, we show that the degree of a randomly selected node in the connected component decays like C/k^β where C an explicit constant and β ≠ 2 is a non-trivial solution of p^(β-2) + β - 3 = 0. This holds regardless of the structure of the initial graph, as long as it is connected and has at least two vertices. To establish this finding we apply analytic combinatorics tools, in particular Mellin transform and singularity analysis.

Cite as

Philippe Jacquet, Krzysztof Turowski, and Wojciech Szpankowski. Power-Law Degree Distribution in the Connected Component of a Duplication Graph. In 31st International Conference on Probabilistic, Combinatorial and Asymptotic Methods for the Analysis of Algorithms (AofA 2020). Leibniz International Proceedings in Informatics (LIPIcs), Volume 159, pp. 16:1-16:14, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2020)


Copy BibTex To Clipboard

@InProceedings{jacquet_et_al:LIPIcs.AofA.2020.16,
  author =	{Jacquet, Philippe and Turowski, Krzysztof and Szpankowski, Wojciech},
  title =	{{Power-Law Degree Distribution in the Connected Component of a Duplication Graph}},
  booktitle =	{31st International Conference on Probabilistic, Combinatorial and Asymptotic Methods for the Analysis of Algorithms (AofA 2020)},
  pages =	{16:1--16:14},
  series =	{Leibniz International Proceedings in Informatics (LIPIcs)},
  ISBN =	{978-3-95977-147-4},
  ISSN =	{1868-8969},
  year =	{2020},
  volume =	{159},
  editor =	{Drmota, Michael and Heuberger, Clemens},
  publisher =	{Schloss Dagstuhl -- Leibniz-Zentrum f{\"u}r Informatik},
  address =	{Dagstuhl, Germany},
  URL =		{https://drops.dagstuhl.de/entities/document/10.4230/LIPIcs.AofA.2020.16},
  URN =		{urn:nbn:de:0030-drops-120467},
  doi =		{10.4230/LIPIcs.AofA.2020.16},
  annote =	{Keywords: random graphs, pure duplication model, degree distribution, tail exponent, analytic combinatorics}
}
  • Refine by Author
  • 4 Jacquet, Philippe
  • 3 Janson, Svante
  • 2 Szpankowski, Wojciech
  • 1 Lumbroso, Jérémie
  • 1 Sedgewick, Robert
  • Show More...

  • Refine by Classification
  • 1 Mathematics of computing → Information theory
  • 1 Mathematics of computing → Random graphs
  • 1 Mathematics of computing → Trees
  • 1 Theory of computation → Graph algorithms analysis
  • 1 Theory of computation → Random network models
  • Show More...

  • Refine by Keyword
  • 2 analytic combinatorics
  • 1 Analysis of Algorithms
  • 1 Cardinality estimation
  • 1 Combinatorics
  • 1 Depth First Search
  • Show More...

  • Refine by Type
  • 5 document

  • Refine by Publication Year
  • 2 2020
  • 2 2024
  • 1 2022