On the External Validity of Average-Case Analyses of Graph Algorithms

Authors Thomas Bläsius , Philipp Fischbeck



PDF
Thumbnail PDF

File

LIPIcs.ESA.2022.21.pdf
  • Filesize: 1.13 MB
  • 14 pages

Document Identifiers

Author Details

Thomas Bläsius
  • Karlsruhe Institute of Technology (KIT), Germany
Philipp Fischbeck
  • Hasso Plattner Institute (HPI), University of Potsdam, Germany

Cite AsGet BibTex

Thomas Bläsius and Philipp Fischbeck. On the External Validity of Average-Case Analyses of Graph Algorithms. In 30th Annual European Symposium on Algorithms (ESA 2022). Leibniz International Proceedings in Informatics (LIPIcs), Volume 244, pp. 21:1-21:14, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2022)
https://doi.org/10.4230/LIPIcs.ESA.2022.21

Abstract

The number one criticism of average-case analysis is that we do not actually know the probability distribution of real-world inputs. Thus, analyzing an algorithm on some random model has no implications for practical performance. At its core, this criticism doubts the existence of external validity, i.e., it assumes that algorithmic behavior on the somewhat simple and clean models does not translate beyond the models to practical performance real-world input. With this paper, we provide a first step towards studying the question of external validity systematically. To this end, we evaluate the performance of six graph algorithms on a collection of 2751 sparse real-world networks depending on two properties; the heterogeneity (variance in the degree distribution) and locality (tendency of edges to connect vertices that are already close). We compare this with the performance on generated networks with varying locality and heterogeneity. We find that the performance in the idealized setting of network models translates surprisingly well to real-world networks. Moreover, heterogeneity and locality appear to be the core properties impacting the performance of many graph algorithms.

Subject Classification

ACM Subject Classification
  • Theory of computation → Graph algorithms analysis
  • Theory of computation → Randomness, geometry and discrete structures
Keywords
  • Average Case
  • Network Models
  • Empirical Evaluation

Metrics

  • Access Statistics
  • Total Accesses (updated on a weekly basis)
    0
    PDF Downloads

References

  1. Takuya Akiba and Yoichi Iwata. Branch-and-reduce exponential/FPT algorithms in practice: A case study of vertex cover. Theoretical Computer Science, 609:211-225, 2016. URL: https://doi.org/10.1016/j.tcs.2015.09.023.
  2. Albert-László Barabási and Réka Albert. Emergence of scaling in random networks. Science, 286(5439):509-512, 1999. URL: https://doi.org/10.1126/science.286.5439.509.
  3. Vincent D Blondel, Jean-Loup Guillaume, Renaud Lambiotte, and Etienne Lefebvre. Fast unfolding of communities in large networks. Journal of Statistical Mechanics: Theory and Experiment, 2008(10):P10008, 2008. URL: https://doi.org/10.1088/1742-5468/2008/10/p10008.
  4. Thomas Bläsius and Philipp Fischbeck. On the external validity of average-case analyses of graph algorithms. Computing Research Repository (CoRR), abs/2205.15066, 2022. URL: https://doi.org/10.48550/arXiv.2205.15066.
  5. Thomas Bläsius, Philipp Fischbeck, Tobias Friedrich, and Maximilian Katzmann. Solving vertex cover in polynomial time on hyperbolic random graphs. Theory of Computing Systems, 2021. URL: https://doi.org/10.1007/s00224-021-10062-9.
  6. Thomas Bläsius, Philipp Fischbeck, Tobias Friedrich, and Martin Schirneck. Understanding the effectiveness of data reduction in public transportation networks. In Workshop on Algorithms and Models for the Web-Graph (WAW), pages 87-101, 2019. URL: https://doi.org/10.1007/978-3-030-25070-6_7.
  7. Thomas Bläsius, Cedric Freiberger, Tobias Friedrich, Maximilian Katzmann, Felix Montenegro-Retana, and Marianne Thieffry. Efficient shortest paths in scale-free networks with underlying hyperbolic geometry. ACM Transactions on Algorithms (TALG), 18(2):19:1-19:32, 2022. URL: https://doi.org/10.1145/3516483.
  8. Thomas Bläsius, Tobias Friedrich, Maximilian Katzmann, Ulrich Meyer, Manuel Penschuck, and Christopher Weyand. Efficiently generating geometric inhomogeneous and hyperbolic random graphs. In European Symposium on Algorithms (ESA), pages 21:1-21:14, 2019. URL: https://doi.org/10.4230/LIPIcs.ESA.2019.21.
  9. Michele Borassi, Pierluigi Crescenzi, and Luca Trevisan. An axiomatic and an average-case analysis of algorithms and heuristics for metric properties of graphs. In ACM-SIAM Symposium on Discrete Algorithms (SODA), pages 920-939, 2017. URL: https://doi.org/10.1137/1.9781611974782.58.
  10. Michele Borassi and Emanuele Natale. KADABRA is an ADaptive algorithm for betweenness via random approximation. ACM Journal of Experimental Algorithmics (JEA), 24(1):1.2:1-1.2:35, 2019. URL: https://doi.org/10.1145/3284359.
  11. Karl Bringmann, Ralph Keusch, and Johannes Lengler. Geometric inhomogeneous random graphs. Theoretical Computer Science, 760:35-54, 2019. URL: https://doi.org/10.1016/j.tcs.2018.08.014.
  12. Shiri Chechik, Edith Cohen, and Haim Kaplan. Average distance queries through weighted samples in graphs and metric spaces: High scalability with tight statistical guarantees. In International Workshop on Approximation Algorithms for Combinatorial Optimization (APPROX), pages 659-679, 2015. URL: https://doi.org/10.4230/LIPIcs.APPROX-RANDOM.2015.659.
  13. Jianer Chen, Iyad A. Kanj, and Ge Xia. Improved upper bounds for vertex cover. Theoretical Computer Science, 411(40-42):3736-3756, 2010. URL: https://doi.org/10.1016/j.tcs.2010.06.026.
  14. Fan Chung and Linyuan Lu. The average distances in random graphs with given expected degrees. Proceedings of the National Academy of Sciences, 99(25):15879-15882, 2002. URL: https://doi.org/10.1073/pnas.252631999.
  15. Fan Chung and Linyuan Lu. Connected components in random graphs with given expected degree sequences. Annals of Combinatorics, 6(2):125-145, 2002. URL: https://doi.org/10.1007/PL00012580.
  16. Stephen A. Cook. The complexity of theorem-proving procedures. In Symposium on the Theory of Computing (STOC), pages 151-158, 1971. URL: https://doi.org/10.1145/800157.805047.
  17. Pilu Crescenzi, Roberto Grossi, Michel Habib, Leonardo Lanzi, and Andrea Marino. On computing the diameter of real-world undirected graphs. Theoretical Computer Science, 514:84-95, 2013. URL: https://doi.org/10.1016/j.tcs.2012.09.018.
  18. Marek Cygan, Fedor V. Fomin, Lukasz Kowalik, Daniel Lokshtanov, Dániel Marx, Marcin Pilipczuk, Michal Pilipczuk, and Saket Saurabh. Parameterized Algorithms. Springer, 2015. URL: https://doi.org/10.1007/978-3-319-21275-3.
  19. Derek J. de Solla Price. A general theory of bibliometric and other cumulative advantage processes. Journal of the Association for Information Science and Technology (JASIST), 27(5):292-306, 1976. URL: https://doi.org/10.1002/asi.4630270505.
  20. David Eppstein, Maarten Löffler, and Darren Strash. Listing all maximal cliques in sparse graphs in near-optimal time. In International Symposium on Algorithms and Computation (ISAAC), pages 403-414, 2010. URL: https://doi.org/10.1007/978-3-642-17517-6_36.
  21. David Eppstein, Maarten Löffler, and Darren Strash. Listing all maximal cliques in large sparse real-world graphs. ACM Journal of Experimental Algorithmics (JEA), 18, 2013. URL: https://doi.org/10.1145/2543629.
  22. David Eppstein and Darren Strash. Listing all maximal cliques in large sparse real-world graphs. In Symposium on Experimental and Efficient Algorithms (SEA), pages 364-375, 2011. URL: https://doi.org/10.1007/978-3-642-20662-7_31.
  23. P. Erdős and A. Rényi. On random graphs I. Publicationes Mathematicae, 6:290-297, 1959. URL: https://www.renyi.hu/~p_erdos/1959-11.pdf.
  24. Jacob Fox, Tim Roughgarden, C. Seshadhri, Fan Wei, and Nicole Wein. Finding cliques in social networks: A new distribution-free model. SIAM Journal on Computing, 49(2):448-464, 2020. URL: https://doi.org/10.1137/18M1210459.
  25. Andrea Kappes. Engineering Graph Clustering Algorithms. PhD thesis, Karlsruhe Institute of Technology, 2015. URL: http://digbib.ubka.uni-karlsruhe.de/volltexte/1000049269.
  26. Richard M. Karp. Reducibility among combinatorial problems. In Computational Complexity Conference (CCC), pages 85-103, 1972. URL: https://doi.org/10.1007/978-1-4684-2001-2_9.
  27. Richard M. Karp. The probabilistic analysis of combinatorial optimization algorithms. In International Congress of Mathematicians (ICM), volume 2, pages 1601-1609, 1983. URL: https://www.mathunion.org/fileadmin/ICM/Proceedings/ICM1983.2/ICM1983.2.ocr.pdf.
  28. Clémence Magnien, Matthieu Latapy, and Michel Habib. Fast computation of empirically tight bounds for the diameter of massive graphs. ACM Journal of Experimental Algorithmics (JEA), 13, 2008. URL: https://doi.org/10.1145/1412228.1455266.
  29. Ryan A. Rossi and Nesreen K. Ahmed. The network data repository with interactive graph analytics and visualization. In AAAI Conference on Artificial Intelligence (AAAI), pages 4292-4293, 2015. URL: http://www.aaai.org/ocs/index.php/AAAI/AAAI15/paper/view/9553.
  30. Shuji Tsukiyama, Mikio Ide, Hiromu Ariyoshi, and Isao Shirakawa. A new algorithm for generating all the maximal independent sets. SIAM Journal on Computing, 6(3):505-517, 1977. URL: https://doi.org/10.1137/0206036.
  31. Anurag Verma, Austin Buchanan, and Sergiy Butenko. Solving the maximum clique and vertex coloring problems on very large sparse networks. INFORMS Journal on Computing, 27(1):164-177, 2015. URL: https://doi.org/10.1287/ijoc.2014.0618.
  32. Duncan J. Watts and Steven H. Strogatz. Collective dynamics of small-world networks. Nature, 393:440-442, 1998. URL: https://doi.org/10.1038/30918.
  33. Karsten Weihe. Covering trains by stations or the power of data reduction. In Algorithms and Experiments (ALEX), pages 1-8, 1998. Google Scholar
  34. Mingyu Xiao and Hiroshi Nagamochi. Exact algorithms for maximum independent set. Information and Computation, 255:126-146, 2017. URL: https://doi.org/10.1016/j.ic.2017.06.001.
Questions / Remarks / Feedback
X

Feedback for Dagstuhl Publishing


Thanks for your feedback!

Feedback submitted

Could not send message

Please try again later or send an E-mail