The Minimization of Random Hypergraphs

Authors Thomas Bläsius, Tobias Friedrich , Martin Schirneck



PDF
Thumbnail PDF

File

LIPIcs.ESA.2020.21.pdf
  • Filesize: 0.63 MB
  • 15 pages

Document Identifiers

Author Details

Thomas Bläsius
  • Hasso Plattner Institute, University of Potsdam, Germany
Tobias Friedrich
  • Hasso Plattner Institute, University of Potsdam, Germany
Martin Schirneck
  • Hasso Plattner Institute, University of Potsdam, Germany

Acknowledgements

The authors thank Benjamin Doerr, Timo Kötzing, and Martin Krejca for the fruitful discussions on the Chernoff-Hoeffding theorem, including valuable pointers to the literature.

Cite AsGet BibTex

Thomas Bläsius, Tobias Friedrich, and Martin Schirneck. The Minimization of Random Hypergraphs. In 28th Annual European Symposium on Algorithms (ESA 2020). Leibniz International Proceedings in Informatics (LIPIcs), Volume 173, pp. 21:1-21:15, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2020)
https://doi.org/10.4230/LIPIcs.ESA.2020.21

Abstract

We investigate the maximum-entropy model B_{n,m,p} for random n-vertex, m-edge multi-hypergraphs with expected edge size pn. We show that the expected size of the minimization min(B_{n,m,p}), i.e., the number of inclusion-wise minimal edges of B_{n,m,p}, undergoes a phase transition with respect to m. If m is at most 1/(1-p)^{(1-p)n}, then E[|min(B_{n,m,p})|] is of order Θ(m), while for m ≥ 1/(1-p)^{(1-p+ε)n} for any ε > 0, it is Θ(2^{(H(α) + (1-α) log₂ p) n}/√n). Here, H denotes the binary entropy function and α = - (log_{1-p} m)/n. The result implies that the maximum expected number of minimal edges over all m is Θ((1+p)ⁿ/√n). Our structural findings have algorithmic implications for minimizing an input hypergraph. This has applications in the profiling of relational databases as well as for the Orthogonal Vectors problem studied in fine-grained complexity. We make several technical contributions that are of independent interest in probability. First, we improve the Chernoff-Hoeffding theorem on the tail of the binomial distribution. In detail, we show that for a binomial variable Y ∼ Bin(n,p) and any 0 < x < p, it holds that P[Y ≤ xn] = Θ(2^{-D(x‖p) n}/√n), where D is the binary Kullback-Leibler divergence between Bernoulli distributions. We give explicit upper and lower bounds on the constants hidden in the big-O notation that hold for all n. Secondly, we establish the fact that the probability of a set of cardinality i being minimal after m i.i.d. maximum-entropy trials exhibits a sharp threshold behavior at i^* = n + log_{1-p} m.

Subject Classification

ACM Subject Classification
  • Mathematics of computing → Information theory
  • Mathematics of computing → Hypergraphs
  • Theory of computation → Random network models
  • Mathematics of computing → Random graphs
Keywords
  • Chernoff-Hoeffding theorem
  • maximum entropy
  • maximization
  • minimization
  • phase transition
  • random hypergraphs

Metrics

  • Access Statistics
  • Total Accesses (updated on a weekly basis)
    0
    PDF Downloads

References

  1. Ziawasch Abedjan, Lukasz Golab, Felix Naumann, and Thorsten Papenbrock. Data Profiling. Synthesis Lectures on Data Management. Morgan & Claypool Publishers, San Rafael, CA, USA, 2018. URL: https://doi.org/10.2200/S00878ED1V01Y201810DTM052.
  2. Kartik Anand and Ginestra Bianconi. Entropy Measures for Networks: Toward an Information Theory of Complex Topologies. Physical Review E, 80:045102, 2009. URL: https://doi.org/10.1103/PhysRevE.80.045102.
  3. Robert B. Ash. Information Theory. Dover Books on Mathematics. Dover Publications, Mineola, NY, USA, 1990. Reprint of the Interscience Publishers 1965 edition. Google Scholar
  4. Golnaz Badkobeh, Per Kristian Lehre, and Dirk Sudholt. Black-box Complexity of Parallel Search with Distributed Populations. In Proceedings of the 2015 Conference on Foundations of Genetic Algorithms (FOGA), pages 3-15, 2015. URL: https://doi.org/10.1145/2725494.2725504.
  5. Michael Behrisch, Amin Coja-Oghlan, and Mihyun Kang. The Order of the Giant Component of Random Hypergraphs. Random Structures and Algorithms, 36:149-184, 2010. URL: https://doi.org/10.1002/rsa.v36:2.
  6. Michael Behrisch, Amin Coja-Oghlan, and Mihyun Kang. Local Limit Theorems for the Giant Component of Random Hypergraphs. Combinatorics, Probability and Computing, 23:331-–366, 2014. URL: https://doi.org/10.1017/S0963548314000017.
  7. Claude Berge. Hypergraphs - Combinatorics of Finite Sets, volume 45 of North-Holland Mathematical Library. North-Holland, Amsterdam, Netherlands, 1989. Google Scholar
  8. Ginestra Bianconi. The Entropy of Randomized Network Ensembles. Europhysics Letters, 81:28005, 2007. URL: https://doi.org/10.1209/0295-5075/81/28005.
  9. Thomas Bläsius, Tobias Friedrich, and Martin Schirneck. The Parameterized Complexity of Dependency Detection in Relational Databases. In Proceedings of the 11th International Symposium on Parameterized and Exact Computation (IPEC), pages 6:1-6:13, 2016. URL: https://doi.org/10.4230/LIPIcs.IPEC.2016.6.
  10. Tobias Bleifuß, Sebastian Kruse, and Felix Naumann. Efficient Denial Constraint Discovery with Hydra. Proceedings of the VLDB Endowment, 11:311–323, 2017. URL: https://doi.org/10.14778/3157794.3157800.
  11. Thomas Bläsius, Tobias Friedrich, Julius Lischeid, Kitty Meeks, and Martin Schirneck. Efficiently Enumerating Hitting Sets of Hypergraphs Arising in Data Profiling. In Proceedings of the 21st Meeting on Algorithm Engineering and Experiments (ALENEX), pages 130-143, 2019. URL: https://doi.org/10.1137/1.9781611975499.11.
  12. Béla Bollobás. Random Graphs. Studies in Advanced Mathematics. Cambridge University Press, Cambridge, UK, 2 edition, 2001. URL: https://doi.org/10.1017/CBO9780511814068.
  13. Béla Bollobás. A Probabilistic Proof of an Asymptotic Formula for the Number of Labelled Regular Graphs. European Journal of Combinatorics, 1:311-316, 1980. URL: https://doi.org/10.1016/S0195-6698(80)80030-8.
  14. Michele Borassi, Pierluigi Crescenzi, and Michel Habib. Into the Square: On the Complexity of Some Quadratic-time Solvable Problems. Electronic Notes in Theoretical Computer Science, 322:51-67, 2016. URL: https://doi.org/10.1016/j.entcs.2016.03.005.
  15. Aiden A. Bruen and Mario A. Forcinito. Cryptography, Information Theory, and Error-Correction: A Handbook for the 21st Century. Wiley-Interscience, New York, NY, USA, 2004. URL: https://doi.org/10.1002/9781118033296.
  16. Philip Samuel Chodrow. Configuration Models of Random Hypergraphs. ArXiv e-prints, 2019. URL: http://arxiv.org/abs/1902.09302.
  17. Colin Cooper, Alan M. Frieze, and Wesley Pegden. On the Rank of a Random Binary Matrix. Electronic Journal of Combinatorics, 26:P4.12, 2019. URL: https://doi.org/10.37236/8092.
  18. Thomas M. Cover and Joy A. Thomas. Elements of Information Theory. Wiley Series in Telecommunications and Signal Processing. Wiley-Interscience, New York, NY, USA, 2nd edition, 2006. Google Scholar
  19. Harald Cramér. Sur un nouveau théorème-limite de la théorie des probabilités. In Actualités scientifiques et industrielles, volume 763, pages 5-23, 1938. Colloque consacré à la théorie des probabilités. (On a New Limit Theorem in Probability.) In French. Google Scholar
  20. J. Demetrovics, Gyula O. H. Katona, D. Miklos, O. Seleznjev, and B. Thalheim. Asymptotic Properties of Keys and Functional Dependencies in Random Databases. Theoretical Computer Science, 190:151-166, 1998. URL: https://doi.org/10.1016/S0304-3975(97)00089-3.
  21. Benajmin Doerr. Probabilistic Tools for the Analysis of Randomized Optimization Heuristics. In Benjamin Doerr and Frank Neumann, editors, Theory of Evolutionary Computation, chapter 1. Springer International, Basel, Switzerland, 2020. URL: https://doi.org/10.1007/978-3-030-29414-4.
  22. Devdatt Dubhashi and Alessandro Panconesi. Concentration of Measure for the Analysis of Randomized Algorithms. Cambridge University Press, New York, NY, USA, 2009. Google Scholar
  23. Paul Erdős and Alfréd Rényi. On Random Graphs I. Publicationes Mathematicae Debrecen, 6:290-297, 1959. Google Scholar
  24. Vincent Froese, René van Bevern, Rolf Niedermeier, and Manuel Sorge. Exploiting Hidden Structure in Selecting Dimensions That Distinguish Vectors. Journal of Computer and System Sciences, 82:521-535, 2016. URL: https://doi.org/10.1016/j.jcss.2015.11.011.
  25. Jiawei Gao, Russell Impagliazzo, Antonina Kolokolova, and Ryan Williams. Completeness for First-Order Properties on Sparse Structures With Algorithmic Applications. Transactions on Algorithms, 15:23:1-23:35, 2018. URL: https://doi.org/10.1145/3196275.
  26. Diego Garlaschelli and Maria I. Loffredo. Maximum Likelihood: Extracting Unbiased Information from Complex Networks. Physical Review E, 78:015101, 2008. URL: https://doi.org/10.1103/PhysRevE.78.015101.
  27. Edgar N. Gilbert. Random Graphs. Annals of Mathematical Statistics, 30:1141-1144, 1959. URL: https://doi.org/10.1214/aoms/1177706098.
  28. Yves Grandvalet and Yoshua Bengio. Entropy Regularization. In Olivier Chapelle, Bernhard Schölkopf, and Alexander Zien, editors, Semi-Supervised Learning, chapter 9. MIT Press, Cambridge, MA, USA, 2006. URL: https://doi.org/10.7551/mitpress/9780262033589.001.0001.
  29. Peter Harremoës. Binomial and Poisson Distributions as Maximum Entropy Distributions. Transactions on Information Theory, 47:2039-2041, 2001. URL: https://doi.org/10.1109/18.930936.
  30. Wassily Hoeffding. Probability Inequalities for Sums of Bounded Random Variables. Journal of the American Statistical Association, 58:13-30, 1963. Google Scholar
  31. Remco van der Hofstad. Random Graphs and Complex Networks, volume 1 of Cambridge Series in Statistical and Probabilistic Mathematics. Cambridge University Press, Cambridge, UK, 2016. URL: https://doi.org/10.1017/9781316779422.
  32. Edwin Thompson Jaynes. Information Theory and Statistical Mechanics. Phyical Review Series II, 106:620-630, 1957. URL: https://doi.org/10.1103/PhysRev.106.620.
  33. Edwin Thompson Jaynes. Information Theory and Statistical Mechanics II. Phyical Review Series II, 108:171-190, 1957. URL: https://doi.org/10.1103/PhysRev.108.171.
  34. Michał Karoński and Tomasz Łuczak. The Phase Transition in a Random Hypergraph. Journal of Computational and Applied Mathematics, 142:125-135, 2002. URL: https://doi.org/10.1016/S0377-0427(01)00464-2.
  35. Gyula O. H. Katona. Random Databases with Correlated Data. In Antje Düsterhöft, Meike Klettke, and Klaus-Dieter Schewe, editors, Conceptual Modelling and Its Theoretical Foundations: Essays Dedicated to Bernhard Thalheim on the Occasion of His 60th Birthday, pages 29-35. Springer, Berlin and Heidelberg, Germany, 2012. Festschrift. URL: https://doi.org/10.1007/978-3-642-28279-9_4.
  36. Gyula O. H. Katona. Testing Functional Connection Between Two Random Variables. In Albert N. Shiryaev, S. R. S. Varadhan, and Ernst L. Presman, editors, Prokhorov and Contemporary Probability Theory, pages 335-348. Springer, Berlin and Heidelberg, Germany, 2013. Festschrift. URL: https://doi.org/10.1007/978-3-642-33549-5_20.
  37. Hiremagalur Krishnaswamy Kesavan. Jaynes' Maximum Entropy Principle. In Christodoulos A. Floudas and Panos M. Pardalos, editors, Encyclopedia of Optimization, pages 1779-1782. Springer, Boston, MA, USA, 2009. URL: https://doi.org/10.1007/978-0-387-74759-0_312.
  38. Bernhard Klar. Bounds on Tail Probabilities of Discrete Distributions. Probability in the Engineering and Informational Sciences, 14:161-171, 2000. Google Scholar
  39. Elliott H. Lieb and Jakob Yngvason. The Physics and Mathematics of the Second Law of Thermodynamics. Physics Reports, 310:1-96, 1999. URL: https://doi.org/10.1016/S0370-1573(98)00082-9.
  40. Ester Livshits, Alireza Heidari, Ihab F. Ilyas, and Benny Kimelfeld. Approximate Denial Constraints. CoRR, abs/2005.08540, 2020. Arxiv preprint. To apprear in PVLDB 13. URL: http://arxiv.org/abs/2005.08540.
  41. Michael Mitzenmacher and Eli Upfal. Probability and Computing. Cambridge University Press, New York, NY, USA, 2nd edition, 2017. Google Scholar
  42. Mark E. J. Newman. Scientific Collaboration Networks. I. Network Construction and Fundamental Results. Physical Review E, 64:016131, 2001. URL: https://doi.org/10.1103/PhysRevE.64.016131.
  43. Mark E. J. Newman. Networks: An Introduction. Oxford University Press, New York, NY, USA, 2010. URL: https://doi.org/10.1093/acprof:oso/9780199206650.001.0001.
  44. Michael A. Nielsen and Isaac L. Chuang. Quantum Computation and Quantum Information: 10th Anniversary Edition. Cambridge University Press, Cambridge, UK, 2010. URL: https://doi.org/10.1017/CBO9780511976667.
  45. Thorsten Papenbrock, Jens Ehrlich, Jannik Marten, Tommy Neubert, Jan-Peer Rudolph, Martin Schönberg, Jakob Zwiener, and Felix Naumann. Functional Dependency Discovery: An Experimental Evaluation of Seven Algorithms. Proceedings of the VLDB Endowment, 8:1082-1093, 2015. URL: https://doi.org/10.14778/2794367.2794377.
  46. Juyong Park and Mark E. J. Newman. The Statistical Mechanics of Networks. Physical Review E, 70:066117, 2004. URL: https://doi.org/10.1103/PhysRevE.70.066117.
  47. Yuri Vasilyevich Prokhorov. Asymptotic Behavior of the Binomial Distribution. Uspekhi Matematicheskikh Nauk, 8:135-142, 1953. In Russian. Google Scholar
  48. Fabio Saracco, Riccardo Di Clemente, Andrea Gabrielli, and Tiziano Squartini. Randomizing Bipartite Networks: The Case of the World Trade Web. Scientific Reports, 5:10595, 2015. URL: https://doi.org/10.1038/srep10595.
  49. Jeanette Schmidt-Pruzan and Eli Shamir. Component Structure in the Evolution of Random Hypergraphs. Combinatorica, 5:81-94, 1985. URL: https://doi.org/10.1007/BF02579445.
  50. Claude Elwood Shannon. A Mathematical Theory of Communication. The Bell System Technical Journal, 27:379-423, 1948. URL: https://doi.org/10.1002/j.1538-7305.1948.tb01338.x.
  51. Eric V. Slud. Distribution Inequalities for the Binomial Law. The Annals of Probability, 5:404-412, 1977. URL: https://projecteuclid.org/euclid.aop/1176995801.
  52. Ryan Williams. A New Algorithm for Optimal 2-Constraint Satisfaction and Its Implications. Theoretical Computer Science, 348:357-365, 2005. URL: https://doi.org/10.1016/j.tcs.2005.09.023.
  53. Katharina Anna Zweig. Network Analysis Literacy: A Practical Approach to the Analysis of Networks. Lecture Notes in Social Networks. Springer, Vienna, Austria, 2014. URL: https://doi.org/10.1007/978-3-7091-0741-6.
Questions / Remarks / Feedback
X

Feedback for Dagstuhl Publishing


Thanks for your feedback!

Feedback submitted

Could not send message

Please try again later or send an E-mail