Abstract
This paper studies the ``explanation problem'' for tree and linearlyordered array data, a problem motivated by database applications and recently solved for the onedimensional treeordered case. In this paper, one is given a matrix A=(a_{ij}) whose rows and columns have semantics: special subsets of the rows and special subsets of the columns are meaningful, others are not. A submatrix in A is said to be meaningful if and only if it is the cross product of a meaningful row subset and a meaningful column subset, in which case we call it an ``allowed rectangle.'' The goal is to ``explain'' A as a sparse sum of weighted allowed rectangles. Specifically, we wish to find as few weighted allowed rectangles as possible such that, for all i,j, a_ij equals the sum of the weights of all rectangles which include cell (i,j).
In this paper we consider the natural cases in which the matrix dimensions are treeordered or linearlyordered. In the treeordered case, we are given a rooted tree $T_1$ whose leaves are the rows of $A$ and another, $T_2$, whose leaves are the columns. Nodes of the trees correspond in an obvious way to the sets of their leaf descendants. In the linearlyordered case, a set of rows or columns is meaningful if and only if it is contiguous.
For treeordered data, we prove the explanation problem NPHard and give a randomized $2$approximation algorithm for it. For linearlyordered data, we prove the explanation problem NPHar and give a $2.56$approximation algorithm. To our knowledge, these are the first results for the problem of sparsely and exactly representing matrices by weighted rectangles.
BibTeX  Entry
@InProceedings{karloff_et_al:LIPIcs:2011:3024,
author = {Howard Karloff and Flip Korn and Konstantin Makarychev and Yuval Rabani},
title = {{On Parsimonious Explanations For 2D Tree and LinearlyOrdered Data}},
booktitle = {28th International Symposium on Theoretical Aspects of Computer Science (STACS 2011) },
pages = {332343},
series = {Leibniz International Proceedings in Informatics (LIPIcs)},
ISBN = {9783939897255},
ISSN = {18688969},
year = {2011},
volume = {9},
editor = {Thomas Schwentick and Christoph D{\"u}rr},
publisher = {Schloss DagstuhlLeibnizZentrum fuer Informatik},
address = {Dagstuhl, Germany},
URL = {http://drops.dagstuhl.de/opus/volltexte/2011/3024},
URN = {urn:nbn:de:0030drops30246},
doi = {10.4230/LIPIcs.STACS.2011.332},
annote = {Keywords: ordered data, explanation problem}
}
Keywords: 

ordered data, explanation problem 
Collection: 

28th International Symposium on Theoretical Aspects of Computer Science (STACS 2011) 
Issue Date: 

2011 
Date of publication: 

11.03.2011 