Exact Recovery Algorithm for Planted Bipartite Graph in Semi-Random Graphs

Authors Akash Kumar, Anand Louis, Rameesh Paul



PDF
Thumbnail PDF

File

LIPIcs.ICALP.2022.84.pdf
  • Filesize: 0.8 MB
  • 20 pages

Document Identifiers

Author Details

Akash Kumar
  • School of Computer and Communication Sciences, EPFL, Lausanne, Switzerland
Anand Louis
  • Computer Science and Automation Department, IISc, Bangalore, India
Rameesh Paul
  • Computer Science and Automation Department, IISc, Bangalore, India

Cite As Get BibTex

Akash Kumar, Anand Louis, and Rameesh Paul. Exact Recovery Algorithm for Planted Bipartite Graph in Semi-Random Graphs. In 49th International Colloquium on Automata, Languages, and Programming (ICALP 2022). Leibniz International Proceedings in Informatics (LIPIcs), Volume 229, pp. 84:1-84:20, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2022) https://doi.org/10.4230/LIPIcs.ICALP.2022.84

Abstract

The problem of finding the largest induced balanced bipartite subgraph in a given graph is NP-hard. This problem is closely related to the problem of finding the smallest Odd Cycle Transversal. 
In this work, we consider the following model of instances: starting with a set of vertices V, a set S ⊆ V of k vertices is chosen and an arbitrary d-regular bipartite graph is added on it; edges between pairs of vertices in S× (V⧵S) and (V⧵S) × (V⧵S) are added with probability p. Since for d = 0, the problem reduces to recovering a planted independent set, we don't expect efficient algorithms for k = o(√n). This problem is a generalization of the planted balanced biclique problem where the bipartite graph induced on S is a complete bipartite graph; [Yevgeny Levanzov, 2018] gave an algorithm for recovering S in this problem when k = Ω(√n).
Our main result is an efficient algorithm that recovers (w.h.p.) the planted bipartite graph when k = Ω_p(√{n log n}) for a large range of parameters. Our results also hold for a natural semi-random model of instances, which involve the presence of a monotone adversary. Our proof shows that a natural SDP relaxation for the problem is integral by constructing an appropriate solution to it’s dual formulation. Our main technical contribution is a new approach for construction the dual solution where we calibrate the eigenvectors of the adjacency matrix to be the eigenvectors of the dual matrix. We believe that this approach may have applications to other recovery problems in semi-random models as well.
When k = Ω(√n), we give an algorithm for recovering S whose running time is exponential in the number of small eigenvalues in graph induced on S; this algorithm is based on subspace enumeration techniques due to the works of [Alexandra Kolla and Madhur Tulsiani, 2007; Arora et al., 2010; Kolla, 2011].

Subject Classification

ACM Subject Classification
  • Theory of computation → Semidefinite programming
  • Theory of computation → Graph algorithms analysis
  • Theory of computation → Approximation algorithms analysis
Keywords
  • SDP duality
  • Planted models
  • Semi-random models
  • Exact recovery
  • Threshold rank
  • Spectral embedding
  • Subspace enumeration

Metrics

  • Access Statistics
  • Total Accesses (updated on a weekly basis)
    0
    PDF Downloads

References

  1. Emmanuel Abbe, Afonso S. Bandeira, Annina Bracher, and Amit Singer. Decoding binary node labels from censored edge measurements: phase transition and efficient recovery. IEEE Trans. Network Sci. Eng., 1(1):10-22, 2014. URL: https://doi.org/10.1109/TNSE.2014.2368716.
  2. Emmanuel Abbe, Afonso S. Bandeira, and Georgina Hall. Exact recovery in the stochastic block model. IEEE Trans. Inform. Theory, 62(1):471-487, 2016. URL: https://doi.org/10.1109/TIT.2015.2490670.
  3. Amit Agarwal, Moses Charikar, Konstantin Makarychev, and Yury Makarychev. O(√log n) approximation algorithms for Min UnCut, Min 2CNF deletion, and directed cut problems. In STOC'05: Proceedings of the 37th Annual ACM Symposium on Theory of Computing, pages 573-581. ACM, New York, 2005. URL: https://doi.org/10.1145/1060590.1060675.
  4. A. Agrawal, P. Klein, S. Rao, and R. Ravi. Approximation through multicommodity flow. In 2013 IEEE 54th Annual Symposium on Foundations of Computer Science, pages 726-737 vol.2, Los Alamitos, CA, USA, October 1990. IEEE Computer Society. URL: https://doi.org/10.1109/FSCS.1990.89595.
  5. Noga Alon and Nabil Kahalé. A spectral technique for coloring random 3-colorable graphs. SIAM J. Comput., 26(6):1733-1748, 1997. Google Scholar
  6. Noga Alon, Michael Krivelevich, and Benny Sudakov. Finding a large hidden clique in a random graph. In Proceedings of the Eighth International Conference "Random Structures and Algorithms" (Poznan, 1997), volume 13, pages 457-466, 1998. URL: https://doi.org/10.1002/(SICI)1098-2418(199810/12)13:3/4<457::AID-RSA14>3.3.CO;2-K.
  7. Claudio Arbib and Raffaele Mosca. Polynomial algorithms for special cases of the balanced complete bipartite subgraph problem. JCMCC. The Journal of Combinatorial Mathematics and Combinatorial Computing, 30, January 1999. Google Scholar
  8. Sanjeev Arora, Boaz Barak, and David Steurer. Subexponential algorithms for unique games and related problems. In 2010 IEEE 51st Annual Symposium on Foundations of Computer Science - FOCS 2010, pages 563-572. IEEE Computer Soc., Los Alamitos, CA, 2010. Google Scholar
  9. Sanjeev Arora and Rong Ge. New tools for graph coloring. In Approximation, randomization, and combinatorial optimization, volume 6845 of Lecture Notes in Comput. Sci., pages 1-12. Springer, Heidelberg, 2011. URL: https://doi.org/10.1007/978-3-642-22935-0_1.
  10. Nikhil Bansal and Subhash Khot. Optimal long code test with one free bit. In 2009 50th Annual IEEE Symposium on Foundations of Computer Science - FOCS 2009, pages 453-462. IEEE Computer Soc., Los Alamitos, CA, 2009. URL: https://doi.org/10.1109/FOCS.2009.23.
  11. Boaz Barak, Samuel B. Hopkins, Jonathan Kelner, Pravesh Kothari, Ankur Moitra, and Aaron Potechin. A nearly tight sum-of-squares lower bound for the planted clique problem. In 57th Annual IEEE Symposium on Foundations of Computer Science - FOCS 2016, pages 428-437. IEEE Computer Soc., Los Alamitos, CA, 2016. Google Scholar
  12. Boaz Barak, Prasad Raghavendra, and David Steurer. Rounding semidefinite programming hierarchies via global correlation. In 2011 IEEE 52nd Annual Symposium on Foundations of Computer Science - FOCS 2011, pages 472-481. IEEE Computer Soc., Los Alamitos, CA, 2011. URL: https://doi.org/10.1109/FOCS.2011.95.
  13. Aditya Bhaskara, Moses Charikar, Eden Chlamtac, Uriel Feige, and Aravindan Vijayaraghavan. Detecting high log-densities - an O(n^1/4) approximation for densest k-subgraph. In STOC'10 - Proceedings of the 2010 ACM International Symposium on Theory of Computing, pages 201-210. ACM, New York, 2010. Google Scholar
  14. Avrim Blum and Joel Spencer. Coloring random and semi-random k-colorable graphs. J. Algorithms, 19(2):204-234, 1995. Google Scholar
  15. Ravi B. Boppana. Eigenvalues and graph bisection: An average-case analysis. In 28th Annual Symposium on Foundations of Computer Science (sfcs 1987), pages 280-285, 1987. URL: https://doi.org/10.1109/SFCS.1987.22.
  16. T. N. Bui, S. Chaudhuri, F. T. Leighton, and M. Sipser. Graph bisection algorithms with good average case behavior. Combinatorica, 7(2):171-191, 1987. URL: https://doi.org/10.1007/BF02579448.
  17. Ted Carson and Russell Impagliazzo. Hill-climbing finds random planted bisections. In Proceedings of the Twelfth Annual ACM-SIAM Symposium on Discrete Algorithms (Washington, DC, 2001), pages 903-909. SIAM, Philadelphia, PA, 2001. Google Scholar
  18. Yudong Chen and Jiaming Xu. Statistical-computational tradeoffs in planted problems and submatrix localization with a growing number of clusters and submatrices. J. Mach. Learn. Res., 17:Paper No. 27, 57, 2016. Google Scholar
  19. Yizong Cheng and George M. Church. Biclustering of expression data. In Philip E. Bourne, Michael Gribskov, Russ B. Altman, Nancy Jensen, Debra A. Hope, Thomas Lengauer, Julie C. Mitchell, Eric D. Scheeff, Chris Smith, Shawn Strande, and Helge Weissig, editors, ISMB, pages 93-103. AAAI, 2000. Google Scholar
  20. Amin Coja-Oghlan. Colouring semirandom graphs. Combin. Probab. Comput., 16(4):515-552, 2007. URL: https://doi.org/10.1017/S0963548306007917.
  21. Anne Condon and Richard M. Karp. Algorithms for graph partitioning on the planted partition model. Random Structures Algorithms, 18(2):116-140, 2001. URL: https://doi.org/10.1002/1098-2418(200103)18:2<116::AID-RSA1001>3.0.CO;2-2.
  22. M. E. Dyer and A. M. Frieze. The solution of some random NP-hard problems in polynomial expected time. J. Algorithms, 10(4):451-489, 1989. URL: https://doi.org/10.1016/0196-6774(89)90001-1.
  23. David Eppstein. Arboricity and bipartite subgraph listing algorithms. Inform. Process. Lett., 51(4):207-211, 1994. URL: https://doi.org/10.1016/0020-0190(94)90121-X.
  24. Uriel Feige and Joe Kilian. Heuristics for semirandom graph problems. Journal of Computing and System Sciences, 63:639-671, 2001. Google Scholar
  25. Uriel Feige and Shimon Kogan. Hardness of approximation of the balanced complete bipartite subgraph problem. Technical report, Weizmann Institute, 2004. Google Scholar
  26. Uriel Feige and Robert Krauthgamer. Finding and certifying a large hidden clique in a semirandom graph. Random Structures Algorithms, 16(2):195-208, 2000. URL: https://doi.org/10.1002/(SICI)1098-2418(200003)16:2<195::AID-RSA5>3.3.CO;2-1.
  27. Uriel Feige and Dorit Ron. Finding hidden cliques in linear time. In 21st International Meeting on Probabilistic, Combinatorial, and Asymptotic Methods in the Analysis of Algorithms (AofA'10), Discrete Math. Theor. Comput. Sci. Proc., AM, pages 189-203. Assoc. Discrete Math. Theor. Comput. Sci., Nancy, 2010. Google Scholar
  28. Vitaly Feldman, Elena Grigorescu, Lev Reyzin, Santosh S. Vempala, and Ying Xiao. Statistical algorithms and a lower bound for detecting planted cliques. In STOC'13 - Proceedings of the 2013 ACM Symposium on Theory of Computing, pages 655-664. ACM, New York, 2013. URL: https://doi.org/10.1145/2488608.2488692.
  29. Michael R. Garey and David S. Johnson. Computers and intractability. A Series of Books in the Mathematical Sciences. W. H. Freeman and Co., San Francisco, Calif., 1979. A guide to the theory of NP-completeness. Google Scholar
  30. Naveen Garg, Vijay Vazirani, and Mihalis Yannakakis. Approximate max-flow min-(multi)cut theorems and their applications. SIAM Journal on Computing, 25, January 1998. URL: https://doi.org/10.1137/S0097539793243016.
  31. Shayan Oveis Gharan and Luca Trevisan. Improved arv rounding in small-set expanders and graphs of bounded threshold rank, 2013. URL: http://arxiv.org/abs/1304.2060.
  32. Suprovat Ghoshal and Anand Louis. Approximation algorithms and hardness for strong unique games. In Dániel Marx, editor, Proceedings of the 2021 ACM-SIAM Symposium on Discrete Algorithms, SODA 2021, Virtual Conference, January 10 - 13, 2021, pages 414-433. SIAM, 2021. URL: https://doi.org/10.1137/1.9781611976465.26.
  33. Suprovat Ghoshal, Anand Louis, and Rahul Raychaudhury. Approximation algorithms for partially colorable graphs. In Approximation, randomization, and combinatorial optimization. Algorithms and techniques, volume 145 of LIPIcs. Leibniz Int. Proc. Inform., pages Art. No. 28, 20. Schloss Dagstuhl. Leibniz-Zent. Inform., Wadern, 2019. Google Scholar
  34. Bruce Hajek, Yihong Wu, and Jiaming Xu. Achieving exact cluster recovery threshold via semidefinite programming. IEEE Trans. Inform. Theory, 62(5):2788-2797, 2016. URL: https://doi.org/10.1109/TIT.2016.2546280.
  35. Bruce Hajek, Yihong Wu, and Jiaming Xu. Achieving exact cluster recovery threshold via semidefinite programming: extensions. IEEE Trans. Inform. Theory, 62(10):5918-5937, 2016. URL: https://doi.org/10.1109/TIT.2016.2594812.
  36. Bruce Hajek, Yihong Wu, and Jiaming Xu. Semidefinite programs for exact recovery of a hidden community, 2016. URL: http://arxiv.org/abs/1602.06410.
  37. Mark Jerrum and Gregory B. Sorkin. The Metropolis algorithm for graph bisection. Discrete Appl. Math., 82(1-3):155-175, 1998. URL: https://doi.org/10.1016/S0166-218X(97)00133-9.
  38. David S Johnson. The np-completeness column: An ongoing guide. Journal of Algorithms, 8(3):438-448, 1987. URL: https://doi.org/10.1016/0196-6774(87)90021-6.
  39. Yash Khanna and Anand Louis. Planted models for the densest k-subgraph problem. In 40th IARCS Annual Conference on Foundations of Software Technology and Theoretical Computer Science, volume 182 of LIPIcs. Leibniz Int. Proc. Inform., pages Art. 27, 18. Schloss Dagstuhl. Leibniz-Zent. Inform., Wadern, 2020. Google Scholar
  40. Yash Khanna, Anand Louis, and Rameesh Paul. Independent sets in semi-random hypergraphs. In Anna Lubiw, Mohammad Salavatipour, and Meng He, editors, Algorithms and Data Structures, pages 528-542, Cham, 2021. Springer International Publishing. Google Scholar
  41. Alexandra Kolla. Spectral algorithms for unique games. Comput. Complexity, 20(2):177-206, 2011. URL: https://doi.org/10.1007/s00037-011-0011-7.
  42. Alexandra Kolla and Madhur Tulsiani. Playing random and expanding unique games, 2007. Google Scholar
  43. Ludek Kucera. Expected complexity of graph partitioning problems. Discret. Appl. Math., 57(2-3):193-212, 1995. URL: https://doi.org/10.1016/0166-218X(94)00103-K.
  44. Akash Kumar, Anand Louis, and Madhur Tulsiani. Finding pseudorandom colorings of pseudorandom graphs. In 37th IARCS Annual Conference on Foundations of Software Technology and Theoretical Computer Science, volume 93 of LIPIcs. Leibniz Int. Proc. Inform., pages Art. No. 37, 12. Schloss Dagstuhl. Leibniz-Zent. Inform., Wadern, 2017. Google Scholar
  45. James R. Lee, Shayan Oveis Gharan, and Luca Trevisan. Multi-way spectral partitioning and higher-order Cheeger inequalities. In STOC'12 - Proceedings of the 2012 ACM Symposium on Theory of Computing, pages 1117-1130. ACM, New York, 2012. URL: https://doi.org/10.1145/2213977.2214078.
  46. Yevgeny Levanzov. On finding large cliques in random and semi-random graphs. Master’s thesis, Weizmann Institute of Science, January 2018. Google Scholar
  47. Anand Louis, Prasad Raghavendra, Prasad Tetali, and Santosh Vempala. Algorithmic extensions of Cheeger’s inequality to higher eigenvalues and partitions. In Approximation, randomization, and combinatorial optimization, volume 6845 of Lecture Notes in Comput. Sci., pages 315-326. Springer, Heidelberg, 2011. URL: https://doi.org/10.1007/978-3-642-22935-0_27.
  48. Anand Louis, Prasad Raghavendra, Prasad Tetali, and Santosh Vempala. Many sparse cuts via higher eigenvalues. In STOC'12 - Proceedings of the 2012 ACM Symposium on Theory of Computing, pages 1131-1140. ACM, New York, 2012. URL: https://doi.org/10.1145/2213977.2214079.
  49. Anand Louis and Rakesh Venkat. Semi-random graphs with planted sparse vertex cuts: algorithms for exact and approximate recovery. In 45th International Colloquium on Automata, Languages, and Programming, volume 107 of LIPIcs. Leibniz Int. Proc. Inform., pages Art. No. 101, 15. Schloss Dagstuhl. Leibniz-Zent. Inform., Wadern, 2018. Google Scholar
  50. Anand Louis and Rakesh Venkat. Planted models for k-way edge and vertex expansion. In 39th IARCS Annual Conference on Foundations of Software Technology and Theoretical Computer Science, volume 150 of LIPIcs. Leibniz Int. Proc. Inform., pages Art. No. 23, 15. Schloss Dagstuhl. Leibniz-Zent. Inform., Wadern, 2019. Google Scholar
  51. Konstantin Makarychev, Yury Makarychev, and Aravindan Vijayaraghavan. Approximation algorithms for semi-random partitioning problems. In STOC'12 - Proceedings of the 2012 ACM Symposium on Theory of Computing, pages 367-384. ACM, New York, 2012. URL: https://doi.org/10.1145/2213977.2214013.
  52. Konstantin Makarychev, Yury Makarychev, and Aravindan Vijayaraghavan. Constant factor approximation for balanced cut in the PIE model. In STOC'14 - Proceedings of the 2014 ACM Symposium on Theory of Computing, pages 41-49. ACM, New York, 2014. Google Scholar
  53. Pasin Manurangsi. Inapproximability of maximum edge biclique, maximum balanced biclique and minimum k-cut from the small set expansion hypothesis. In 44th International Colloquium on Automata, Languages, and Programming, volume 80 of LIPIcs. Leibniz Int. Proc. Inform., pages Art. No. 79, 14. Schloss Dagstuhl. Leibniz-Zent. Inform., Wadern, 2017. Google Scholar
  54. Theo McKenzie, Hermish Mehta, and Luca Trevisan. A new algorithm for the robust semi-random independent set problem. In Shuchi Chawla, editor, Proceedings of the 2020 ACM-SIAM Symposium on Discrete Algorithms, SODA 2020, Salt Lake City, UT, USA, January 5-8, 2020, pages 738-746. SIAM, 2020. URL: https://doi.org/10.1137/1.9781611975994.45.
  55. Frank McSherry. Spectral partitioning of random graphs. In 42nd IEEE Symposium on Foundations of Computer Science (Las Vegas, NV, 2001), pages 529-537. IEEE Computer Soc., Los Alamitos, CA, 2001. Google Scholar
  56. Michael Mitzenmacher and Eli Upfal. Probability and computing. Cambridge University Press, Cambridge, second edition, 2017. Randomization and probabilistic techniques in algorithms and data analysis. Google Scholar
  57. Andrew Y. Ng, Michael I. Jordan, and Yair Weiss. On spectral clustering: Analysis and an algorithm. In Thomas G. Dietterich, Suzanna Becker, and Zoubin Ghahramani, editors, Advances in Neural Information Processing Systems 14 [Neural Information Processing Systems: Natural and Synthetic, NIPS 2001, December 3-8, 2001, Vancouver, British Columbia, Canada], pages 849-856. MIT Press, 2001. URL: https://proceedings.neurips.cc/paper/2001/hash/801272ee79cfde7fa5960571fee36b9b-Abstract.html.
  58. René Peeters. The maximum edge biclique problem is NP-complete. Discrete Appl. Math., 131(3):651-654, 2003. URL: https://doi.org/10.1016/S0166-218X(03)00333-0.
  59. Tim Roughgarden. Beyond the Worst-Case Analysis of Algorithms. Cambridge University Press, 2021. URL: https://doi.org/10.1017/9781108637435.
  60. A. Tanay, R. Sharan, and R. Shamir. Discovering statistically significant biclusters in gene expression data. Bioinformatics, 18 Suppl 1:S136-44, 2002. Google Scholar
  61. Roman Vershynin. High-dimensional probability, volume 47 of Cambridge Series in Statistical and Probabilistic Mathematics. Cambridge University Press, Cambridge, 2018. An introduction with applications in data science, With a foreword by Sara van de Geer. URL: https://doi.org/10.1017/9781108231596.
  62. Van Vu. A simple SVD algorithm for finding hidden partitions. Combin. Probab. Comput., 27(1):124-140, 2018. URL: https://doi.org/10.1017/S0963548317000463.
  63. Van H. Vu. Spectral norm of random matrices. Combinatorica, 27(6):721-736, 2007. URL: https://doi.org/10.1007/s00493-007-2190-z.
  64. Mihalis Yannakakis. Node-and edge-deletion np-complete problems. In Proceedings of the Tenth Annual ACM Symposium on Theory of Computing, STOC '78, pages 253-264, New York, NY, USA, 1978. Association for Computing Machinery. URL: https://doi.org/10.1145/800133.804355.
  65. Yun Zhang. Elissa J. Chesler, Michael A. Langston: On finding bicliques in bipartite graphs: a novel algorithm with application to the integration of diverse biological data types. In 41st Hawaii International International Conference on Systems Science (HICSS-41 2008), Proceedings, 7-10 January 2008, Waikoloa, Big Island, HI, USA, page 473. IEEE Computer Society, 2008. URL: https://doi.org/10.1109/HICSS.2008.507.
Questions / Remarks / Feedback
X

Feedback for Dagstuhl Publishing


Thanks for your feedback!

Feedback submitted

Could not send message

Please try again later or send an E-mail