Low-Rank Binary Matrix Approximation in Column-Sum Norm

Authors Fedor V. Fomin , Petr A. Golovach , Fahad Panolan , Kirill Simonov



PDF
Thumbnail PDF

File

LIPIcs.APPROX-RANDOM.2020.32.pdf
  • Filesize: 0.68 MB
  • 18 pages

Document Identifiers

Author Details

Fedor V. Fomin
  • Department of Informatics, University of Bergen, Norway
Petr A. Golovach
  • Department of Informatics, University of Bergen, Norway
Fahad Panolan
  • Department of Computer Science and Engineering, IIT Hyderabad, India
Kirill Simonov
  • Department of Informatics, University of Bergen, Norway

Cite As Get BibTex

Fedor V. Fomin, Petr A. Golovach, Fahad Panolan, and Kirill Simonov. Low-Rank Binary Matrix Approximation in Column-Sum Norm. In Approximation, Randomization, and Combinatorial Optimization. Algorithms and Techniques (APPROX/RANDOM 2020). Leibniz International Proceedings in Informatics (LIPIcs), Volume 176, pp. 32:1-32:18, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2020) https://doi.org/10.4230/LIPIcs.APPROX/RANDOM.2020.32

Abstract

We consider 𝓁₁-Rank-r Approximation over {GF}(2), where for a binary m× n matrix 𝐀 and a positive integer constant r, one seeks a binary matrix 𝐁 of rank at most r, minimizing the column-sum norm ‖ 𝐀 -𝐁‖₁. We show that for every ε ∈ (0, 1), there is a {randomized} (1+ε)-approximation algorithm for 𝓁₁-Rank-r Approximation over {GF}(2) of running time m^{O(1)}n^{O(2^{4r}⋅ ε^{-4})}. This is the first polynomial time approximation scheme (PTAS) for this problem.

Subject Classification

ACM Subject Classification
  • Theory of computation → Approximation algorithms analysis
Keywords
  • Binary Matrix Factorization
  • PTAS
  • Column-sum norm

Metrics

  • Access Statistics
  • Total Accesses (updated on a weekly basis)
    0
    PDF Downloads

References

  1. Frank Ban, Vijay Bhattiprolu, Karl Bringmann, Pavel Kolev, Euiwoong Lee, and David P. Woodruff. A PTAS for lp-low rank approximation. In Proceedings of the Thirtieth Annual ACM-SIAM Symposium on Discrete Algorithms, SODA 2019, San Diego, California, USA, January 6-9, 2019, pages 747-766. SIAM, 2019. URL: https://doi.org/10.1137/1.9781611975482.47.
  2. Eduard Bartl, Radim Belohlávek, and Jan Konecny. Optimal decompositions of matrices with grades into binary and graded matrices. Annals of Mathematics and Artificial Intelligence, 59(2):151-167, June 2010. URL: https://doi.org/10.1007/s10472-010-9185-y.
  3. Radim Belohlávek and Vilém Vychodil. Discovery of optimal factors in binary data via a novel method of matrix decomposition. J. Computer and System Sciences, 76(1):3-20, 2010. URL: https://doi.org/10.1016/j.jcss.2009.05.002.
  4. Michael W Berry, Susan T Dumais, and Gavin W O'Brien. Using linear algebra for intelligent information retrieval. SIAM review, 37(4):573-595, 1995. Google Scholar
  5. Avrim Blum, John Hopcroft, and Ravi Kannan. Foundations of Data Science. Cambridge University Press, 2020. URL: https://books.google.co.in/books?id=koHCDwAAQBAJ.
  6. Marek Cygan, Daniel Lokshtanov, Marcin Pilipczuk, Michal Pilipczuk, and Saket Saurabh. Lower Bounds for Approximation Schemes for Closest String. In Proceedings of the 15th Scandinavian Symposium and Workshops on Algorithm Theory (SWAT), volume 53 of Leibniz International Proceedings in Informatics (LIPIcs), pages 12:1-12:10, Dagstuhl, Germany, 2016. Schloss Dagstuhl-Leibniz-Zentrum fuer Informatik. URL: https://doi.org/10.4230/LIPIcs.SWAT.2016.12.
  7. Chen Dan, Kristoffer Arnsfelt Hansen, He Jiang, Liwei Wang, and Yuchen Zhou. On low rank approximation of binary matrices. CoRR, abs/1511.01699, 2015. URL: http://arxiv.org/abs/1511.01699.
  8. Carl Eckart and Gale Young. The approximation of one matrix by another of lower rank. Psychometrika, 1(3):211-218, 1936. Google Scholar
  9. Fedor V. Fomin, Petr A. Golovach, Daniel Lokshtanov, Fahad Panolan, and Saket Saurabh. Approximation schemes for low-rank binary matrix approximation problems. CoRR, abs/1807.07156, 2018. URL: http://arxiv.org/abs/1807.07156.
  10. Fedor V. Fomin, Petr A. Golovach, and Fahad Panolan. Parameterized low-rank binary matrix approximation. In Proceedings of the 45th International Colloquium on Automata, Languages, and Programming (ICALP), volume 107 of LIPIcs, pages 53:1-53:16. Schloss Dagstuhl - Leibniz-Zentrum fuer Informatik, 2018. URL: https://doi.org/10.4230/LIPIcs.ICALP.2018.53.
  11. Yinghua Fu, Nianping Jiang, and Hong Sun. Binary matrix factorization and consensus algorithms. In Proceedings of the International Conference on Electrical and Control Engineering (ICECE), pages 4563-4567. IEEE, 2010. Google Scholar
  12. Leszek Gasieniec, Jesper Jansson, and Andrzej Lingas. Approximation algorithms for hamming clustering problems. J. Discrete Algorithms, 2(2):289-301, 2004. URL: https://doi.org/10.1016/S1570-8667(03)00079-0.
  13. Nicolas Gillis and Stephen A. Vavasis. On the complexity of robust PCA and 𝓁₁-norm low-rank matrix approximation. CoRR, abs/1509.09236, 2015. URL: http://arxiv.org/abs/1509.09236.
  14. Harold W. Gutch, Peter Gruber, Arie Yeredor, and Fabian J. Theis. ICA over finite fields - separability and algorithms. Signal Processing, 92(8):1796-1808, 2012. URL: https://doi.org/10.1016/j.sigpro.2011.10.003.
  15. Harold Hotelling. Analysis of a complex of statistical variables into principal components. Journal of educational psychology, 24(6):417, 1933. Google Scholar
  16. Peng Jiang and Michael T. Heath. Mining discrete patterns via binary matrix factorization. In ICDM Workshops, pages 1129-1136. IEEE Computer Society, 2013. Google Scholar
  17. Peng Jiang, Jiming Peng, Michael Heath, and Rui Yang. A clustering approach to constrained binary matrix factorization. In Data Mining and Knowledge Discovery for Big Data: Methodologies, Challenge and Opportunities, pages 281-303. Springer Berlin Heidelberg, Berlin, Heidelberg, 2014. Google Scholar
  18. Yishan Jiao, Jingyi Xu, and Ming Li. On the k-closest substring and k-consensus pattern problems. In Proceedings of the 15th Annual Symposium on Combinatorial Pattern (CPM), volume 3109 of Lecture Notes in Comput. Sci., pages 130-144. Springer, 2004. Google Scholar
  19. Jon M. Kleinberg. Authoritative sources in a hyperlinked environment. J. ACM, 46(5):604-632, 1999. URL: https://doi.org/10.1145/324133.324140.
  20. Mehmet Koyutürk and Ananth Grama. Proximus: A framework for analyzing very high dimensional discrete-attributed datasets. In Proceedings of the 9th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining (KDD), pages 147-156, New York, NY, USA, 2003. ACM. URL: https://doi.org/10.1145/956750.956770.
  21. Ming Li, Bin Ma, and Lusheng Wang. On the closest string and substring problems. J. ACM, 49(2):157-171, 2002. URL: https://doi.org/10.1145/506147.506150.
  22. Haibing Lu, Jaideep Vaidya, Vijayalakshmi Atluri, and Yuan Hong. Constraint-aware role mining via extended boolean matrix decomposition. IEEE Trans. Dependable Sec. Comput., 9(5):655-669, 2012. URL: https://doi.org/10.1109/TDSC.2012.21.
  23. Bin Ma and Xiaoming Sun. More efficient algorithms for closest string and substring problems. SIAM J. Computing, 39(4):1432-1443, 2009. URL: https://doi.org/10.1137/080739069.
  24. Michael W. Mahoney. Randomized algorithms for matrices and data. Foundations and Trends in Machine Learning, 3(2):123-224, 2011. URL: https://doi.org/10.1561/2200000035.
  25. Pauli Miettinen, Taneli Mielikäinen, Aristides Gionis, Gautam Das, and Heikki Mannila. The discrete basis problem. IEEE Trans. Knowl. Data Eng., 20(10):1348-1362, 2008. URL: https://doi.org/10.1109/TKDE.2008.53.
  26. Pauli Miettinen and Jilles Vreeken. Model order selection for boolean matrix factorization. In Proceedings of the 17th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining (KDD), pages 51-59. ACM, 2011. URL: https://doi.org/10.1145/2020408.2020424.
  27. L. Mirsky. Symmetric gauge functions and unitarily invariant norms. Quart. J. Math. Oxford Ser. (2), 11:50-59, 1960. URL: https://doi.org/10.1093/qmath/11.1.50.
  28. Barsha Mitra, Shamik Sural, Jaideep Vaidya, and Vijayalakshmi Atluri. A survey of role mining. ACM Comput. Surv., 48(4):50:1-50:37, February 2016. URL: https://doi.org/10.1145/2871148.
  29. Rafail Ostrovsky and Yuval Rabani. Polynomial-time approximation schemes for geometric min-sum median clustering. J. ACM, 49(2):139-156, 2002. URL: https://doi.org/10.1145/506147.506149.
  30. Amichai Painsky, Saharon Rosset, and Meir Feder. Generalized independent component analysis over finite alphabets. IEEE Trans. Information Theory, 62(2):1038-1053, 2016. URL: https://doi.org/10.1109/TIT.2015.2510657.
  31. Karl Pearson. Liii. on lines and planes of closest fit to systems of points in space. The London, Edinburgh, and Dublin Philosophical Magazine and Journal of Science, 2(11):559-572, 1901. Google Scholar
  32. Bao-Hong Shen, Shuiwang Ji, and Jieping Ye. Mining discrete patterns via binary matrix factorization. In Proceedings of the 15th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining (KDD), pages 757-766, New York, NY, USA, 2009. ACM. URL: https://doi.org/10.1145/1557019.1557103.
  33. Zhao Song, David P. Woodruff, and Peilin Zhong. Low rank approximation with entrywise 𝓁₁-norm error. In Proceedings of the 49th Annual ACM SIGACT Symposium on Theory of Computing (STOC), pages 688-701. ACM, 2017. URL: https://doi.org/10.1145/3055399.3055431.
  34. Jaideep Vaidya. Boolean matrix decomposition problem: Theory, variations and applications to data engineering. In Proceedings of the 28th IEEE International Conference on Data Engineering (ICDE), pages 1222-1224. IEEE Computer Society, 2012. URL: https://doi.org/10.1109/ICDE.2012.144.
  35. Jaideep Vaidya, Vijayalakshmi Atluri, and Qi Guo. The role mining problem: finding a minimal descriptive set of roles. In Proceedings of the 12th ACM Symposium on Access Control Models and (SACMAT), pages 175-184, 2007. URL: https://doi.org/10.1145/1266840.1266870.
  36. David P. Woodruff. Sketching as a tool for numerical linear algebra. Foundations and Trends in Theoretical Computer Science, 10(1-2):1-157, 2014. URL: https://doi.org/10.1561/0400000060.
  37. Arie Yeredor. Independent component analysis over Galois fields of prime order. IEEE Trans. Information Theory, 57(8):5342-5359, 2011. URL: https://doi.org/10.1109/TIT.2011.2145090.
Questions / Remarks / Feedback
X

Feedback for Dagstuhl Publishing


Thanks for your feedback!

Feedback submitted

Could not send message

Please try again later or send an E-mail