Document

# Binary Matrix Completion Under Diameter Constraints

## File

LIPIcs.STACS.2021.47.pdf
• Filesize: 0.78 MB
• 14 pages

## Acknowledgements

We are grateful to Christian Komusiewicz for helpful feedback on an earlier version of this work and to Stefan Szeider for pointing us to the work on clustering incomplete data [Eduard Eiben et al., 2019]. We also thank Curtis Bright for mentioning the connection to the Hadamard matrix completion problem.

## Cite As

Tomohiro Koana, Vincent Froese, and Rolf Niedermeier. Binary Matrix Completion Under Diameter Constraints. In 38th International Symposium on Theoretical Aspects of Computer Science (STACS 2021). Leibniz International Proceedings in Informatics (LIPIcs), Volume 187, pp. 47:1-47:14, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2021)
https://doi.org/10.4230/LIPIcs.STACS.2021.47

## Abstract

We thoroughly study a novel but basic combinatorial matrix completion problem: Given a binary incomplete matrix, fill in the missing entries so that the resulting matrix has a specified maximum diameter (that is, upper-bounding the maximum Hamming distance between any two rows of the completed matrix) as well as a specified minimum Hamming distance between any two of the matrix rows. This scenario is closely related to consensus string problems as well as to recently studied clustering problems on incomplete data. We obtain an almost complete picture concerning the complexity landscape (P vs NP) regarding the diameter constraints and regarding the number of missing entries per row of the incomplete matrix. We develop polynomial-time algorithms for maximum diameter three, which are based on Deza’s theorem [Discret. Math. 1973, J. Comb. Theory, Ser. B 1974] from extremal set theory. In this way, we also provide one of the rare links between sunflower techniques and stringology. On the negative side, we prove NP-hardness for diameter at least four. For the number of missing entries per row, we show polynomial-time solvability when there is only one missing entry and NP-hardness when there can be at least two missing entries. In general, our algorithms heavily rely on Deza’s theorem and the correspondingly identified sunflower structures pave the way towards solutions based on computing graph factors and solving 2-SAT instances.

## Subject Classification

##### ACM Subject Classification
• Theory of computation → Design and analysis of algorithms
• Mathematics of computing → Discrete mathematics
##### Keywords
• sunflowers
• binary matrices
• Hamming distance
• stringology
• consensus problems
• complexity dichotomy
• combinatorial algorithms
• graph factors
• 2-Sat

## Metrics

• Access Statistics
• Total Accesses (updated on a weekly basis)
0

## References

1. Vineet Bafna, Sorin Istrail, Giuseppe Lancia, and Romeo Rizzi. Polynomial and APX-hard cases of the individual haplotyping problem. Theoretical Computer Science, 335(1):109-125, 2005.
2. Piotr Berman, Marek Karpinski, and Alex D. Scott. Approximation hardness of short symmetric instances of MAX-3SAT. Electronic Colloquium on Computational Complexity (ECCC), 049, 2003.
3. Karl Bringmann and Marvin Künnemann. Multivariate fine-grained complexity of longest common subsequence. In 29th Annual ACM-SIAM Symposium on Discrete Algorithms, (SODA '18), pages 1216-1235, 2018.
4. Laurent Bulteau, Vincent Froese, and Rolf Niedermeier. Tight hardness results for consensus problems on circular strings and time series. SIAM Journal on Discrete Mathematics, 34(3):1854-1883, 2020.
5. Laurent Bulteau, Falk Hüffner, Christian Komusiewicz, and Rolf Niedermeier. Multivariate algorithmics for NP-hard string problems. Bulletin of the EATCS, 114, 2014.
6. Laurent Bulteau and Markus L. Schmid. Consensus strings with small maximum distance and small distance sum. Algorithmica, 82(5):1378-1409, 2020.
7. Michel Deza. Une propriété extrémale des plans projectifs finis dans une classe de codes équidistants. Discrete Mathematics, 6(4):343-352, 1973.
8. Michel Deza. Solution d'un problème de Erdős-Lovász. Journal of Combinatorial Theory, Series B, 16(4):166-167, 1974.
9. Eduard Eiben, Robert Ganian, Iyad Kanj, Sebastian Ordyniak, and Stefan Szeider. On clustering incomplete data. CoRR, abs/1911.01465, 2019. URL: http://arxiv.org/abs/1911.01465.
10. Jörg Flum and Martin Grohe. Parameterized Complexity Theory. Texts in Theoretical Computer Science. An EATCS Series. Springer, 2006.
11. Vincent Froese, René van Bevern, Rolf Niedermeier, and Manuel Sorge. Exploiting hidden structure in selecting dimensions that distinguish vectors. Journal of Computer and System Sciences, 82(3):521-535, 2016.
12. Harold N. Gabow. An efficient reduction technique for degree-constrained subgraph and bidirected network flow problems. In 15th Annual ACM Symposium on Theory of Computing, (STOC '83), pages 448-456, 1983.
13. Robert Ganian, Iyad Kanj, Sebastian Ordyniak, and Stefan Szeider. On the parameterized complexity of clustering incomplete data into subspaces of small rank. In The Thirty-Fourth AAAI Conference on Artificial Intelligence, (AAAI '20), pages 3906-3913, 2020.
14. Robert Ganian, Iyad A. Kanj, Sebastian Ordyniak, and Stefan Szeider. Parameterized algorithms for the matrix completion problem. In 35th International Conference on Machine Learning, (ICML '18), volume 80 of Proceedings of Machine Learning Research, pages 1642-1651. PMLR, 2018.
15. Leszek Gasieniec, Jesper Jansson, and Andrzej Lingas. Approximation algorithms for Hamming clustering problems. Journal of Discrete Algorithms, 2(2):289-301, 2004.
16. Jens Gramm, Rolf Niedermeier, and Peter Rossmanith. Fixed-parameter algorithms for Closest String and related problems. Algorithmica, 37(1):25-42, 2003.
17. Danny Hermelin and Liat Rozenberg. Parameterized complexity analysis for the closest string with wildcards problem. Theoretical Computer Science, 600:11-18, 2015.
18. Charles R Johnson. Matrix completion problems: a survey. In Matrix Theory and Applications, volume 40 of Proceedings of Symposia in Applied Mathematics, pages 171-198. American Mathematical Society, 1990.
19. Stasys Jukna. Extremal Combinatorics: With Applications in Computer Science. Springer Science & Business Media, 2011.
20. Tomohiro Koana, Vincent Froese, and Rolf Niedermeier. Parameterized algorithms for matrix completion with radius constraints. In 31st Annual Symposium on Combinatorial Pattern Matching, (CPM '20), pages 20:1-20:14, 2020.
21. Ross Lippert, Russell Schwartz, Giuseppe Lancia, and Sorin Istrail. Algorithmic strategies for the single nucleotide polymorphism haplotype assembly problem. Briefings in Bioinformatics, 3(1):23-31, 2002.
22. Cristopher Moore and J. M. Robson. Hard tiling problems with simple tiles. Discrete & Computational Geometry, 26(4):573-590, 2001.
23. Markus L. Schmid. Finding consensus strings with small length difference between input and solution strings. ACM Transactions on Computation Theory, 9(3):13:1-13:18, 2017.
24. Ryan Williams. A new algorithm for optimal 2-constraint satisfaction and its implications. Theoretical Computer Science, 348(2-3):357-365, 2005.
X

Feedback for Dagstuhl Publishing