Parameterized Low-Rank Binary Matrix Approximation

Authors Fedor V. Fomin , Petr A. Golovach , Fahad Panolan



PDF
Thumbnail PDF

File

LIPIcs.ICALP.2018.53.pdf
  • Filesize: 0.57 MB
  • 16 pages

Document Identifiers

Author Details

Fedor V. Fomin
  • Department of Informatics, University of Bergen, Norway
Petr A. Golovach
  • Department of Informatics, University of Bergen, Norway
Fahad Panolan
  • Department of Informatics, University of Bergen, Norway

Cite As Get BibTex

Fedor V. Fomin, Petr A. Golovach, and Fahad Panolan. Parameterized Low-Rank Binary Matrix Approximation. In 45th International Colloquium on Automata, Languages, and Programming (ICALP 2018). Leibniz International Proceedings in Informatics (LIPIcs), Volume 107, pp. 53:1-53:16, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2018) https://doi.org/10.4230/LIPIcs.ICALP.2018.53

Abstract

We provide a number of algorithmic results for the following family of problems: For a given binary m x n matrix A and a nonnegative integer k, decide whether there is a "simple" binary matrix B which differs from A in at most k entries. For an integer r, the "simplicity" of B is characterized as follows.
- Binary r-Means: Matrix B has at most r different columns. This problem is known to be NP-complete already for r=2. We show that the problem is solvable in time 2^{O(k log k)}*(nm)^O(1) and thus is fixed-parameter tractable parameterized by k. We also complement this result by showing that when being parameterized by r and k, the problem admits an algorithm of running time 2^{O(r^{3/2}* sqrt{k log k})}(nm)^O(1), which is subexponential in k for r in o((k/log k)^{1/3}).
- Low GF(2)-Rank Approximation: Matrix B is of GF(2)-rank at most r. This problem is known to be NP-complete already for r=1. It is also known to be W[1]-hard when parameterized by k. Interestingly, when parameterized by r and k, the problem is not only fixed-parameter tractable, but it is solvable in time 2^{O(r^{3/2}* sqrt{k log k})}(nm)^O(1), which is subexponential in k for r in o((k/log k)^{1/3}).
- Low Boolean-Rank Approximation: Matrix B is of Boolean rank at most r. The problem is known to be NP-complete for k=0 as well as for r=1. We show that it is solvable in subexponential in k time 2^{O(r2^r * sqrt{k log k})}(nm)^O(1).

Subject Classification

ACM Subject Classification
  • Mathematics of computing → Combinatorial algorithms
  • Theory of computation → Fixed parameter tractability
Keywords
  • Binary matrices
  • clustering
  • low-rank approximation
  • fixed-parameter tractability

Metrics

  • Access Statistics
  • Total Accesses (updated on a weekly basis)
    0
    PDF Downloads

References

  1. Pankaj K. Agarwal, Sariel Har-Peled, and Kasturi R. Varadarajan. Approximating extent measures of points. J. ACM, 51(4):606-635, 2004. URL: http://dx.doi.org/10.1145/1008731.1008736.
  2. Alfred V. Aho, Jeffrey D. Ullman, and Mihalis Yannakakis. On notions of information transfer in VLSI circuits. In Proceedings of the 15th Annual ACM Symposium on Theory of Computing (STOC), pages 133-139. ACM, 1983. URL: http://dx.doi.org/10.1145/800061.808742.
  3. Noga Alon and Benny Sudakov. On two segmentation problems. J. Algorithms, 33(1):173-184, 1999. URL: http://dx.doi.org/10.1006/jagm.1999.1024.
  4. Noga Alon, Raphael Yuster, and Uri Zwick. Color-coding. J. ACM, 42(4):844-856, 1995. URL: http://dx.doi.org/10.1145/210332.210337.
  5. Sanjeev Arora, Rong Ge, Ravindran Kannan, and Ankur Moitra. Computing a nonnegative matrix factorization - provably. In Proceedings of the 44th Annual ACM Symposium on Theory of Computing (STOC), pages 145-162. ACM, 2012. Google Scholar
  6. Mihai Badoiu, Sariel Har-Peled, and Piotr Indyk. Approximate clustering via core-sets. In Proceedings of the 34th Annual ACM Symposium on Theory of Computing (STOC), pages 250-257. ACM, 2002. URL: http://dx.doi.org/10.1145/509907.509947.
  7. Eduard Bartl, Radim Belohlávek, and Jan Konecny. Optimal decompositions of matrices with grades into binary and graded matrices. Annals of Mathematics and Artificial Intelligence, 59(2):151-167, Jun 2010. URL: http://dx.doi.org/10.1007/s10472-010-9185-y.
  8. Amitabh Basu, Michael Dinitz, and Xin Li. Computing approximate PSD factorizations. CoRR, abs/1602.07351, 2016. URL: http://arxiv.org/abs/1602.07351,
  9. Radim Belohlávek and Vilém Vychodil. Discovery of optimal factors in binary data via a novel method of matrix decomposition. J. Computer and System Sciences, 76(1):3-20, 2010. URL: http://dx.doi.org/10.1016/j.jcss.2009.05.002.
  10. Karl Bringmann, Pavel Kolev, and David P. Woodruff. Approximation algorithms for 𝓁₀-low rank approximation. In Advances in Neural Information Processing Systems 30 (NIPS), pages 6651-6662, 2017. URL: http://papers.nips.cc/paper/7242-approximation-algorithms-for-ell_0-low-rank-approximation.
  11. L. Sunil Chandran, Davis Issac, and Andreas Karrenbauer. On the parameterized complexity of biclique cover and partition. In Proceedings of the 11th International Symposium on Parameterized and Exact Computation (IPEC), volume 63 of LIPIcs, pages 11:1-11:13. Schloss Dagstuhl - Leibniz-Zentrum fuer Informatik, 2016. URL: http://dx.doi.org/10.4230/LIPIcs.IPEC.2016.11.
  12. Andrzej Cichocki, Rafal Zdunek, Anh Huy Phan, and Shun-ichi Amari. Nonnegative matrix and tensor factorizations: applications to exploratory multi-way data analysis and blind source separation. John Wiley &Sons, 2009. Google Scholar
  13. Rudi Cilibrasi, Leo van Iersel, Steven Kelk, and John Tromp. The complexity of the single individual SNP haplotyping problem. Algorithmica, 49(1):13-36, 2007. URL: http://dx.doi.org/10.1007/s00453-007-0029-z.
  14. Kenneth L. Clarkson and David P. Woodruff. Input sparsity and hardness for robust subspace approximation. In Proceedings of the 56th Annual Symposium on Foundations of Computer Science (FOCS), pages 310-329. IEEE Computer Society, 2015. Google Scholar
  15. Joel E Cohen and Uriel G Rothblum. Nonnegative ranks, decompositions, and factorizations of nonnegative matrices. Linear Algebra and its Applications, 190:149-168, 1993. Google Scholar
  16. Marek Cygan, Fedor V. Fomin, Lukasz Kowalik, Daniel Lokshtanov, Dániel Marx, Marcin Pilipczuk, Michal Pilipczuk, and Saket Saurabh. Parameterized Algorithms. Springer, 2015. URL: http://dx.doi.org/10.1007/978-3-319-21275-3.
  17. Chen Dan, Kristoffer Arnsfelt Hansen, He Jiang, Liwei Wang, and Yuchen Zhou. On low rank approximation of binary matrices. CoRR, abs/1511.01699, 2015. URL: http://arxiv.org/abs/1511.01699.
  18. Rodney G. Downey and Michael R. Fellows. Fundamentals of Parameterized Complexity. Texts in Computer Science. Springer, 2013. URL: http://dx.doi.org/10.1007/978-1-4471-5559-1.
  19. Pål Grønås Drange, Felix Reidl, Fernando Sanchez Villaamil, and Somnath Sikdar. Fast biclustering by dual parameterization. CoRR, abs/1507.08158, 2015. Google Scholar
  20. Uriel Feige. NP-hardness of hypercube 2-segmentation. CoRR, abs/1411.0821, 2014. URL: http://arxiv.org/abs/1411.0821.
  21. Samuel Fiorini, Serge Massar, Sebastian Pokutta, Hans Raj Tiwary, and Ronald de Wolf. Exponential lower bounds for polytopes in combinatorial optimization. J. ACM, 62(2):17, 2015. URL: http://dx.doi.org/10.1145/2716307.
  22. Fedor V. Fomin, Petr A. Golovach, and Fahad Panolan. Parameterized low-rank binary matrix approximation. CoRR, abs/1803.06102, 2018. Google Scholar
  23. Fedor V. Fomin, Stefan Kratsch, Marcin Pilipczuk, Michał Pilipczuk, and Yngve Villanger. Tight bounds for parameterized complexity of cluster editing with a small number of clusters. J. Computer and System Sciences, 80(7):1430-1447, 2014. Google Scholar
  24. Fedor V. Fomin, Daniel Lokshtanov, Syed Mohammad Meesum, Saket Saurabh, and Meirav Zehavi. Matrix rigidity from the viewpoint of parameterized complexity. In Proceedings of the 34th International Symposium on Theoretical Aspects of Computer Science (STACS), volume 66 of Leibniz International Proceedings in Informatics (LIPIcs), pages 32:1-32:14, 2017. URL: http://dx.doi.org/10.4230/LIPIcs.STACS.2017.32.
  25. Yun Fu. Low-Rank and Sparse Modeling for Visual Analysis. Springer International Publishing, 1 edition, 2014. Google Scholar
  26. Nicolas Gillis and Stephen A. Vavasis. On the complexity of robust PCA and 𝓁₁-norm low-rank matrix approximation. CoRR, abs/1509.09236, 2015. URL: http://arxiv.org/abs/1509.09236,
  27. Jens Gramm, Jiong Guo, Falk Hüffner, and Rolf Niedermeier. Data reduction and exact algorithms for clique cover. ACM Journal of Experimental Algorithmics, 13, 2008. URL: http://dx.doi.org/10.1145/1412228.1412236.
  28. David A. Gregory, Norman J. Pullman, Kathryn F. Jones, and J. Richard Lundgren. Biclique coverings of regular bigraphs and minimum semiring ranks of regular matrices. J. Combinatorial Theory Ser. B, 51(1):73-89, 1991. URL: http://dx.doi.org/10.1016/0095-8956(91)90006-6.
  29. Dmitry Grigoriev. Using the notions of separability and independence for proving the lower bounds on the circuit complexity (in russian). Notes of the Leningrad branch of the Steklov Mathematical Institute, Nauka, 1976. Google Scholar
  30. Dmitry Grigoriev. Using the notions of separability and independence for proving the lower bounds on the circuit complexity. Journal of Soviet Math., 14(5):1450-1456, 1980. Google Scholar
  31. Harold W. Gutch, Peter Gruber, Arie Yeredor, and Fabian J. Theis. ICA over finite fields - separability and algorithms. Signal Processing, 92(8):1796-1808, 2012. URL: http://dx.doi.org/10.1016/j.sigpro.2011.10.003.
  32. Alexander E. Guterman. Rank and determinant functions for matrices over semirings. In Surveys in contemporary mathematics, volume 347 of London Math. Soc. Lecture Note Ser., pages 1-33. Cambridge Univ. Press, Cambridge, 2008. Google Scholar
  33. Mary Inaba, Naoki Katoh, and Hiroshi Imai. Applications of weighted voronoi diagrams and randomization to variance-based k-clustering. In Proceedings of the 10th annual symposium on Computational Geometry, pages 332-339. ACM, 1994. Google Scholar
  34. Peng Jiang and Michael T. Heath. Mining discrete patterns via binary matrix factorization. In ICDM Workshops, pages 1129-1136. IEEE Computer Society, 2013. Google Scholar
  35. Peng Jiang, Jiming Peng, Michael Heath, and Rui Yang. A Clustering Approach to Constrained Binary Matrix Factorization, pages 281-303. Springer Berlin Heidelberg, Berlin, Heidelberg, 2014. Google Scholar
  36. Ravindran Kannan and Santosh Vempala. Spectral algorithms. Foundations and Trends in Theoretical Computer Science, 4(3-4):157-288, 2009. URL: http://dx.doi.org/10.1561/0400000025.
  37. Jon Kleinberg, Christos Papadimitriou, and Prabhakar Raghavan. Segmentation problems. J. ACM, 51(2):263-280, 2004. URL: http://dx.doi.org/10.1145/972639.972644.
  38. Mehmet Koyutürk and Ananth Grama. Proximus: A framework for analyzing very high dimensional discrete-attributed datasets. In Proceedings of the 9th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining (KDD), pages 147-156, New York, NY, USA, 2003. ACM. URL: http://dx.doi.org/10.1145/956750.956770.
  39. Amit Kumar, Yogish Sabharwal, and Sandeep Sen. Linear-time approximation schemes for clustering problems in any dimensions. J. ACM, 57(2):5:1-5:32, 2010. URL: http://dx.doi.org/10.1145/1667053.1667054.
  40. Daniel D Lee and H Sebastian Seung. Learning the parts of objects by non-negative matrix factorization. Nature, 401(6755):788-791, 1999. Google Scholar
  41. Satyanarayana V. Lokam. Complexity lower bounds using linear algebra. Found. Trends Theor. Comput. Sci., 4:1-155, 2009. Google Scholar
  42. László Lovász and Michael E. Saks. Lattices, Möbius functions and communication complexity. In Proceedings of the 29th Annual Symposium on Foundations of Computer Science (FOCS), pages 81-90. IEEE, 1988. Google Scholar
  43. Haibing Lu, Jaideep Vaidya, Vijayalakshmi Atluri, and Yuan Hong. Constraint-aware role mining via extended boolean matrix decomposition. IEEE Trans. Dependable Sec. Comput., 9(5):655-669, 2012. URL: http://dx.doi.org/10.1109/TDSC.2012.21.
  44. Michael W. Mahoney. Randomized algorithms for matrices and data. Foundations and Trends in Machine Learning, 3(2):123-224, 2011. URL: http://dx.doi.org/10.1561/2200000035.
  45. Dániel Marx. Closest substring problems with small distances. SIAM J. Comput., 38(4):1382-1410, 2008. URL: http://dx.doi.org/10.1137/060673898.
  46. S. M. Meesum, Pranabendu Misra, and Saket Saurabh. Reducing rank of the adjacency matrix by graph modification. Theoret. Comput. Sci., 654:70-79, 2016. URL: http://dx.doi.org/10.1016/j.tcs.2016.02.020.
  47. Syed Mohammad Meesum and Saket Saurabh. Rank reduction of directed graphs by vertex and edge deletions. In Proceedings of the 12th Latin American Symposium on (LATIN), volume 9644 of Lecture Notes in Comput. Sci., pages 619-633. Springer, 2016. URL: http://dx.doi.org/10.1007/978-3-662-49529-2_46.
  48. Pauli Miettinen, Taneli Mielikäinen, Aristides Gionis, Gautam Das, and Heikki Mannila. The discrete basis problem. IEEE Trans. Knowl. Data Eng., 20(10):1348-1362, 2008. URL: http://dx.doi.org/10.1109/TKDE.2008.53.
  49. Pauli Miettinen and Jilles Vreeken. Model order selection for boolean matrix factorization. In Proceedings of the 17th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining (KDD), pages 51-59. ACM, 2011. URL: http://dx.doi.org/10.1145/2020408.2020424.
  50. Ankur Moitra. An almost optimal algorithm for computing nonnegative rank. SIAM J. Comput., 45(1):156-173, 2016. URL: http://dx.doi.org/10.1137/140990139.
  51. Ganesh R Naik. Non-negative Matrix Factorization Techniques. Springer, 2016. Google Scholar
  52. James Orlin. Contentment in graph theory: covering graphs with cliques. Nederl. Akad. Wetensch. Proc. Ser. A 80=Indag. Math., 39(5):406-424, 1977. Google Scholar
  53. Rafail Ostrovsky and Yuval Rabani. Polynomial-time approximation schemes for geometric min-sum median clustering. J. ACM, 49(2):139-156, 2002. URL: http://dx.doi.org/10.1145/506147.506149.
  54. Amichai Painsky, Saharon Rosset, and Meir Feder. Generalized independent component analysis over finite alphabets. IEEE Trans. Information Theory, 62(2):1038-1053, 2016. URL: http://dx.doi.org/10.1109/TIT.2015.2510657.
  55. A A Razborov. On rigid matrices. Manuscript in russian, 1989. Google Scholar
  56. Ilya P. Razenshteyn, Zhao Song, and David P. Woodruff. Weighted low rank approximations with provable guarantees. In Proceedings of the 48th Annual ACM Symposium on Theory of Computing (STOC), pages 250-263. ACM, 2016. URL: http://dx.doi.org/10.1145/2897518.2897639.
  57. Bao-Hong Shen, Shuiwang Ji, and Jieping Ye. Mining discrete patterns via binary matrix factorization. In Proceedings of the 15th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining (KDD), pages 757-766, New York, NY, USA, 2009. ACM. URL: http://dx.doi.org/10.1145/1557019.1557103.
  58. Leslie G. Valiant. Graph-theoretic arguments in low-level complexity. In Mathematical Foundations of Computer Science (MFCS), volume 53 of Lecture Notes in Comput. Sci., pages 162-176. Springer, 1977. Google Scholar
  59. David P. Woodruff. Sketching as a tool for numerical linear algebra. Foundations and Trends in Theoretical Computer Science, 10(1-2):1-157, 2014. URL: http://dx.doi.org/10.1561/0400000060.
  60. Sharon Wulff, Ruth Urner, and Shai Ben-David. Monochromatic bi-clustering. In Proceedings of the 30th International Conference on Machine Learning, (ICML), volume 28 of JMLR Workshop and Conference Proceedings, pages 145-153. JMLR.org, 2013. URL: http://jmlr.org/proceedings/papers/v28/.
  61. Mihalis Yannakakis. Expressing combinatorial optimization problems by linear programs. J. Comput. Syst. Sci., 43(3):441-466, 1991. URL: http://dx.doi.org/10.1016/0022-0000(91)90024-Y.
  62. Arie Yeredor. Independent component analysis over Galois fields of prime order. IEEE Trans. Information Theory, 57(8):5342-5359, 2011. URL: http://dx.doi.org/10.1109/TIT.2011.2145090.
Questions / Remarks / Feedback
X

Feedback for Dagstuhl Publishing


Thanks for your feedback!

Feedback submitted

Could not send message

Please try again later or send an E-mail