2 Search Results for "Santini, Massimo"


Document
Maximum Coverage in Sublinear Space, Faster

Authors: Stephen Jaud, Anthony Wirth, and Farhana Choudhury

Published in: LIPIcs, Volume 265, 21st International Symposium on Experimental Algorithms (SEA 2023)


Abstract
Given a collection of m sets from a universe 𝒰, the Maximum Set Coverage problem consists of finding k sets whose union has the largest cardinality. This problem is NP-hard, but the solution can be approximated by a polynomial-time algorithm up to a factor 1-1/e. However, this algorithm does not scale well with the input size. In a streaming context, practical high-quality solutions are found, but with space complexity that scales linearly with respect to the size of the universe n = |𝒰|. However, one randomized streaming algorithm has been shown to produce a 1-1/e-ε approximation of the optimal solution with a space complexity that scales only poly-logarithmically with respect to m and n. In order to achieve such a low space complexity, the authors used two techniques in their multi-pass approach:
- F₀-sketching, which makes it possible to determine with great accuracy the number of distinct elements in a set using less space than the set itself.
- Subsampling, which consists of solving the problem only on a subspace of the universe. It is implemented using γ-independent hash functions.

This article focuses on the sublinear-space algorithm and highlights the time cost of these two techniques, especially subsampling. We present optimizations that significantly reduce the time complexity of the algorithm. Firstly, we give some optimizations that do not alter the space complexity, number of passes and approximation quality of the original algorithm. In particular, we reanalyze the error bounds to show that the original independence factor of Ω(ε^{-2} k log m) can be fine-tuned to Ω(k log m); we also show how F₀-sketching can be removed. Secondly, we derive a new lower bound for the probability of producing a 1-1/e-ε approximation using only pairwise independence: 1-(4/(c k log m)) compared to 1-(2e/(m^{ck/6})) with Ω(k log m)-independence. Although the theoretical guarantees are weaker, suggesting the approximation quality would suffer, for large streams, our algorithms perform well in practice. Finally, our experimental results show that even a pairwise-independent hash-function sampler does not produce worse solutions than the original algorithm, while running faster by several orders of magnitude.
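The two ingredients named in the abstract, a greedy 1-1/e approximation and hash-based subsampling of the universe, can be illustrated with a minimal in-memory sketch. This is not the authors' streaming algorithm: it assumes the sets fit in memory, that elements are non-negative integers, and that a simple (a·x + b) mod p family stands in for the pairwise-independent sampler. The function names, the prime `p`, and the `keep_fraction` parameter are all hypothetical choices made for illustration.

```python
import random


def greedy_max_coverage(sets, k):
    """Classic greedy heuristic: repeatedly pick the set that covers
    the most still-uncovered elements (a 1-1/e approximation)."""
    covered = set()
    chosen = []
    for _ in range(min(k, len(sets))):
        # Index of the set with the largest marginal gain (first one on ties).
        best = max(range(len(sets)), key=lambda i: len(sets[i] - covered))
        if not sets[best] - covered:
            break  # no remaining set adds anything new
        chosen.append(best)
        covered |= sets[best]
    return chosen, covered


def pairwise_hash(p, rng):
    """Draw h(x) = (a*x + b) mod p from a standard pairwise-independent
    family over integers in [0, p), with p prime (an assumption here)."""
    a = rng.randrange(1, p)
    b = rng.randrange(0, p)
    return lambda x: (a * x + b) % p


def subsample(sets, keep_fraction, p=2_147_483_647, seed=0):
    """Keep only elements whose hash falls below a threshold, so every
    set is restricted to the same random sub-universe."""
    rng = random.Random(seed)
    h = pairwise_hash(p, rng)
    threshold = int(keep_fraction * p)
    return [{x for x in s if h(x) < threshold} for s in sets]


if __name__ == "__main__":
    example = [{1, 2, 3}, {3, 4}, {4, 5, 6, 7}, {1}]
    # Solve on the subsampled instance, then report the chosen set indices.
    reduced = subsample(example, keep_fraction=0.5)
    print(greedy_max_coverage(reduced, k=2)[0])
```

Because the same hash function is applied to every set, an element is either kept everywhere or dropped everywhere, which is what lets coverage on the sub-universe act as a proxy for coverage on the full one.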

Cite as

Stephen Jaud, Anthony Wirth, and Farhana Choudhury. Maximum Coverage in Sublinear Space, Faster. In 21st International Symposium on Experimental Algorithms (SEA 2023). Leibniz International Proceedings in Informatics (LIPIcs), Volume 265, pp. 21:1-21:20, Schloss Dagstuhl - Leibniz-Zentrum für Informatik (2023)



@InProceedings{jaud_et_al:LIPIcs.SEA.2023.21,
  author =	{Jaud, Stephen and Wirth, Anthony and Choudhury, Farhana},
  title =	{{Maximum Coverage in Sublinear Space, Faster}},
  booktitle =	{21st International Symposium on Experimental Algorithms (SEA 2023)},
  pages =	{21:1--21:20},
  series =	{Leibniz International Proceedings in Informatics (LIPIcs)},
  ISBN =	{978-3-95977-279-2},
  ISSN =	{1868-8969},
  year =	{2023},
  volume =	{265},
  editor =	{Georgiadis, Loukas},
  publisher =	{Schloss Dagstuhl -- Leibniz-Zentrum f{\"u}r Informatik},
  address =	{Dagstuhl, Germany},
  URL =		{https://drops.dagstuhl.de/entities/document/10.4230/LIPIcs.SEA.2023.21},
  URN =		{urn:nbn:de:0030-drops-183715},
  doi =		{10.4230/LIPIcs.SEA.2023.21},
  annote =	{Keywords: streaming algorithms, subsampling, maximum set cover, k-wise independent hash functions}
}
Document
A Deeper Investigation of PageRank as a Function of the Damping Factor

Authors: Paolo Boldi, Massimo Santini, and Sebastiano Vigna

Published in: Dagstuhl Seminar Proceedings, Volume 7071, Web Information Retrieval and Linear Algebra Algorithms (2007)


Abstract
PageRank is defined as the stationary state of a Markov chain. The chain is obtained by perturbing the transition matrix induced by a web graph with a damping factor $\alpha$ that spreads uniformly part of the rank. The choice of $\alpha$ is eminently empirical, and in most cases the original suggestion $\alpha=0.85$ by Brin and Page is still used. In this paper, we give a mathematical analysis of PageRank when $\alpha$ changes. In particular, we show that, contrary to popular belief, for real-world graphs values of $\alpha$ close to $1$ do not give a more meaningful ranking. Then, we give closed-form formulae for PageRank derivatives of any order, and by proving that the $k$-th iteration of the Power Method gives exactly the PageRank value obtained using a Maclaurin polynomial of degree $k$, we show how to obtain an approximation of the derivatives. Finally, we view PageRank as a linear operator acting on the preference vector and show a tight connection between iterated computation and derivation.
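The claimed equality between the $k$-th Power Method iterate and the degree-$k$ Maclaurin polynomial can be checked numerically on a toy graph. The sketch below rests on assumptions not stated in the abstract: a row-stochastic matrix `P` with no dangling nodes, the iteration started from the preference vector `v`, and the series form $r(\alpha) = (1-\alpha)\, v \sum_{n \ge 0} \alpha^n P^n$, whose Maclaurin coefficients are $v$ for $n = 0$ and $vP^n - vP^{n-1}$ for $n \ge 1$. The function names are hypothetical.

```python
def vec_mat(x, P):
    """Row vector times matrix: (xP)_j = sum_i x_i * P[i][j]."""
    n = len(P)
    return [sum(x[i] * P[i][j] for i in range(n)) for j in range(n)]


def power_method(P, v, alpha, k):
    """k iterations of x <- alpha * x P + (1 - alpha) * v, starting at x = v."""
    x = list(v)
    for _ in range(k):
        xP = vec_mat(x, P)
        x = [alpha * xp + (1 - alpha) * vi for xp, vi in zip(xP, v)]
    return x


def maclaurin_pagerank(P, v, alpha, k):
    """Degree-k Maclaurin polynomial of PageRank in alpha, using
    coefficients c_0 = v and c_n = v P^n - v P^(n-1) for n >= 1."""
    result = list(v)
    prev = list(v)  # holds v P^(n-1)
    for n in range(1, k + 1):
        cur = vec_mat(prev, P)  # v P^n
        result = [r + alpha**n * (c - q) for r, c, q in zip(result, cur, prev)]
        prev = cur
    return result


if __name__ == "__main__":
    # Toy 3-node graph; every row sums to 1 (no dangling nodes).
    P = [[0.0, 0.5, 0.5],
         [1.0, 0.0, 0.0],
         [0.5, 0.5, 0.0]]
    v = [1 / 3, 1 / 3, 1 / 3]
    a = power_method(P, v, alpha=0.85, k=5)
    b = maclaurin_pagerank(P, v, alpha=0.85, k=5)
    print(max(abs(x - y) for x, y in zip(a, b)))
```

Under these assumptions the two vectors agree up to floating-point rounding for every `k`, since both equal $(1-\alpha) \sum_{n<k} \alpha^n vP^n + \alpha^k vP^k$.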

Cite as

Paolo Boldi, Massimo Santini, and Sebastiano Vigna. A Deeper Investigation of PageRank as a Function of the Damping Factor. In Web Information Retrieval and Linear Algebra Algorithms. Dagstuhl Seminar Proceedings, Volume 7071, pp. 1-19, Schloss Dagstuhl - Leibniz-Zentrum für Informatik (2007)



@InProceedings{boldi_et_al:DagSemProc.07071.3,
  author =	{Boldi, Paolo and Santini, Massimo and Vigna, Sebastiano},
  title =	{{A Deeper Investigation of PageRank as a Function of the Damping Factor}},
  booktitle =	{Web Information Retrieval and Linear Algebra Algorithms},
  pages =	{1--19},
  series =	{Dagstuhl Seminar Proceedings (DagSemProc)},
  ISSN =	{1862-4405},
  year =	{2007},
  volume =	{7071},
  editor =	{Andreas Frommer and Michael W. Mahoney and Daniel B. Szyld},
  publisher =	{Schloss Dagstuhl -- Leibniz-Zentrum f{\"u}r Informatik},
  address =	{Dagstuhl, Germany},
  URL =		{https://drops.dagstuhl.de/entities/document/10.4230/DagSemProc.07071.3},
  URN =		{urn:nbn:de:0030-drops-10722},
  doi =		{10.4230/DagSemProc.07071.3},
  annote =	{Keywords: PageRank, damping factor, Markov chains}
}