Creative Commons Attribution-NonCommercial-NoDerivs 3.0 Unported license
This paper studies the ``explanation problem'' for tree- and linearly-ordered array data, a problem motivated by database applications and recently solved for the one-dimensional tree-ordered case. In this paper, one is given a matrix A=(a_{ij}) whose rows and columns have semantics: special subsets of the rows and special subsets of the columns are meaningful, others are not. A submatrix in A is said to be meaningful if and only if it is the cross product of a meaningful row subset and a meaningful column subset, in which case we call it an ``allowed rectangle.'' The goal is to ``explain'' A as a sparse sum of weighted allowed rectangles. Specifically, we wish to find as few weighted allowed rectangles as possible such that, for all i,j, a_ij equals the sum of the weights of all rectangles which include cell (i,j).
In this paper we consider the natural cases in which the matrix dimensions are tree-ordered or linearly-ordered. In the tree-ordered case, we are given a rooted tree $T_1$ whose leaves are the rows of $A$ and another, $T_2$, whose leaves are the columns. Nodes of the trees correspond in an obvious way to the sets of their leaf descendants. In the linearly-ordered case, a set of rows or columns is meaningful if and only if it is contiguous.
For tree-ordered data, we prove the explanation problem NP-Hard and give a randomized $2$-approximation algorithm for it. For linearly-ordered data, we prove the explanation problem NP-Har and give a $2.56$-approximation algorithm. To our knowledge, these are the first results for the problem of sparsely and exactly representing matrices by weighted rectangles.
@InProceedings{karloff_et_al:LIPIcs.STACS.2011.332,
author = {Karloff, Howard and Korn, Flip and Makarychev, Konstantin and Rabani, Yuval},
title = {{On Parsimonious Explanations For 2-D Tree- and Linearly-Ordered Data}},
booktitle = {28th International Symposium on Theoretical Aspects of Computer Science (STACS 2011)},
pages = {332--343},
series = {Leibniz International Proceedings in Informatics (LIPIcs)},
ISBN = {978-3-939897-25-5},
ISSN = {1868-8969},
year = {2011},
volume = {9},
editor = {Schwentick, Thomas and D\"{u}rr, Christoph},
publisher = {Schloss Dagstuhl -- Leibniz-Zentrum f{\"u}r Informatik},
address = {Dagstuhl, Germany},
URL = {https://drops.dagstuhl.de/entities/document/10.4230/LIPIcs.STACS.2011.332},
URN = {urn:nbn:de:0030-drops-30246},
doi = {10.4230/LIPIcs.STACS.2011.332},
annote = {Keywords: ordered data, explanation problem}
}