Found 2 Possible Name Variants:

Document

Invited Talk

**Published in:** LIPIcs, Volume 220, 25th International Conference on Database Theory (ICDT 2022)

This article is a companion to an invited talk at ICDT'2022 with the same title.
Cardinality estimation is among the most important problems in query optimization. It is well-documented that, when query plans go haywire, in most cases one can trace the root cause to the cardinality estimator being far off. In particular, traditional cardinality estimation based on selectivity estimation may sometimes under-estimate cardinalities by orders of magnitudes, because the independence or the uniformity assumptions do not typically hold.
This talk outlines an approach to cardinality estimation that is "model-free" from a statistical stand-point. Being model-free means the approach tries to avoid making any distributional assumptions. Our approach is information-theoretic, and generalizes recent results on worst-case output size bounds of queries, allowing the estimator to take into account histogram information from the input relations. The estimator turns out to be the objective of a maximization problem subject to concave constraints, over an exponential number of variables. We then explain how the estimator can be computed in polynomial time for some fragment of these constraints. Overall, the talk introduces a new direction to address the classic problem of cardinality estimation that is designed to circumvent some of the pitfalls of selectivity-based estimation. We will also explain connections to other fundamental problems in information theory and database theory regarding information inequalities.
The talk is based on (published and unpublished) joint works with Mahmoud Abo Khamis, Sungjin Im, Hossein Keshavarz, Phokion Kolaitis, Ben Moseley, XuanLong Nguyen, Kirk Pruhs, Dan Suciu, and Alireza Samadian Zakaria

Hung Q. Ngo. On an Information Theoretic Approach to Cardinality Estimation (Invited Talk). In 25th International Conference on Database Theory (ICDT 2022). Leibniz International Proceedings in Informatics (LIPIcs), Volume 220, pp. 1:1-1:21, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2022)

Copy BibTex To Clipboard

@InProceedings{ngo:LIPIcs.ICDT.2022.1, author = {Ngo, Hung Q.}, title = {{On an Information Theoretic Approach to Cardinality Estimation}}, booktitle = {25th International Conference on Database Theory (ICDT 2022)}, pages = {1:1--1:21}, series = {Leibniz International Proceedings in Informatics (LIPIcs)}, ISBN = {978-3-95977-223-5}, ISSN = {1868-8969}, year = {2022}, volume = {220}, editor = {Olteanu, Dan and Vortmeier, Nils}, publisher = {Schloss Dagstuhl -- Leibniz-Zentrum f{\"u}r Informatik}, address = {Dagstuhl, Germany}, URL = {https://drops.dagstuhl.de/entities/document/10.4230/LIPIcs.ICDT.2022.1}, URN = {urn:nbn:de:0030-drops-158750}, doi = {10.4230/LIPIcs.ICDT.2022.1}, annote = {Keywords: Cardinality Estimation, Information Theory, Polymatroid Bound, Worst-case Optimal Join} }

Document

Track B: Automata, Logic, Semantics, and Theory of Programming

**Published in:** LIPIcs, Volume 168, 47th International Colloquium on Automata, Languages, and Programming (ICALP 2020)

Constraints on entropies are considered to be the laws of information theory. Even though the pursuit of their discovery has been a central theme of research in information theory, the algorithmic aspects of constraints on entropies remain largely unexplored. Here, we initiate an investigation of decision problems about constraints on entropies by placing several different such problems into levels of the arithmetical hierarchy. We establish the following results on checking the validity over all almost-entropic functions: first, validity of a Boolean information constraint arising from a monotone Boolean formula is co-recursively enumerable; second, validity of "tight" conditional information constraints is in Π⁰₃. Furthermore, under some restrictions, validity of conditional information constraints "with slack" is in Σ⁰₂, and validity of information inequality constraints involving max is Turing equivalent to validity of information inequality constraints (with no max involved). We also prove that the classical implication problem for conditional independence statements is co-recursively enumerable.

Mahmoud Abo Khamis, Phokion G. Kolaitis, Hung Q. Ngo, and Dan Suciu. Decision Problems in Information Theory. In 47th International Colloquium on Automata, Languages, and Programming (ICALP 2020). Leibniz International Proceedings in Informatics (LIPIcs), Volume 168, pp. 106:1-106:20, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2020)

Copy BibTex To Clipboard

@InProceedings{abokhamis_et_al:LIPIcs.ICALP.2020.106, author = {Abo Khamis, Mahmoud and Kolaitis, Phokion G. and Ngo, Hung Q. and Suciu, Dan}, title = {{Decision Problems in Information Theory}}, booktitle = {47th International Colloquium on Automata, Languages, and Programming (ICALP 2020)}, pages = {106:1--106:20}, series = {Leibniz International Proceedings in Informatics (LIPIcs)}, ISBN = {978-3-95977-138-2}, ISSN = {1868-8969}, year = {2020}, volume = {168}, editor = {Czumaj, Artur and Dawar, Anuj and Merelli, Emanuela}, publisher = {Schloss Dagstuhl -- Leibniz-Zentrum f{\"u}r Informatik}, address = {Dagstuhl, Germany}, URL = {https://drops.dagstuhl.de/entities/document/10.4230/LIPIcs.ICALP.2020.106}, URN = {urn:nbn:de:0030-drops-125137}, doi = {10.4230/LIPIcs.ICALP.2020.106}, annote = {Keywords: Information theory, decision problems, arithmetical hierarchy, entropic functions} }

Document

**Published in:** LIPIcs, Volume 127, 22nd International Conference on Database Theory (ICDT 2019)

We consider the problem of incrementally maintaining the triangle count query under single-tuple updates to the input relations. We introduce an approach that exhibits a space-time tradeoff such that the space-time product is quadratic in the size of the input database and the update time can be as low as the square root of this size. This lowest update time is worst-case optimal conditioned on the Online Matrix-Vector Multiplication conjecture.
The classical and factorized incremental view maintenance approaches are recovered as special cases of our approach within the space-time tradeoff. In particular, they require linear-time maintenance under updates, which is suboptimal. Our approach can also count all triangles in a static database in the worst-case optimal time needed for enumerating them.

Ahmet Kara, Hung Q. Ngo, Milos Nikolic, Dan Olteanu, and Haozhe Zhang. Counting Triangles under Updates in Worst-Case Optimal Time. In 22nd International Conference on Database Theory (ICDT 2019). Leibniz International Proceedings in Informatics (LIPIcs), Volume 127, pp. 4:1-4:18, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2019)

Copy BibTex To Clipboard

@InProceedings{kara_et_al:LIPIcs.ICDT.2019.4, author = {Kara, Ahmet and Ngo, Hung Q. and Nikolic, Milos and Olteanu, Dan and Zhang, Haozhe}, title = {{Counting Triangles under Updates in Worst-Case Optimal Time}}, booktitle = {22nd International Conference on Database Theory (ICDT 2019)}, pages = {4:1--4:18}, series = {Leibniz International Proceedings in Informatics (LIPIcs)}, ISBN = {978-3-95977-101-6}, ISSN = {1868-8969}, year = {2019}, volume = {127}, editor = {Barcelo, Pablo and Calautti, Marco}, publisher = {Schloss Dagstuhl -- Leibniz-Zentrum f{\"u}r Informatik}, address = {Dagstuhl, Germany}, URL = {https://drops.dagstuhl.de/entities/document/10.4230/LIPIcs.ICDT.2019.4}, URN = {urn:nbn:de:0030-drops-103068}, doi = {10.4230/LIPIcs.ICDT.2019.4}, annote = {Keywords: incremental view maintenance, amortized analysis, data skew} }

Document

**Published in:** LIPIcs, Volume 127, 22nd International Conference on Database Theory (ICDT 2019)

We propose an approach for answering conjunctive queries with negation, where the negated relations have bounded degree. Its data complexity matches that of the InsideOut and PANDA algorithms for the positive subquery of the input query and is expressed in terms of the fractional hypertree width and the submodular width respectively. Its query complexity depends on the structure of the conjunction of negated relations; in general it is exponential in the number of join variables occurring in negated relations yet it becomes polynomial for several classes of queries.
This approach relies on several contributions. We show how to rewrite queries with negation on bounded-degree relations into equivalent conjunctive queries with not-all-equal (NAE) predicates, which are a multi-dimensional analog of disequality (!=). We then generalize the known color-coding technique to conjunctions of NAE predicates and explain it via a Boolean tensor decomposition of conjunctions of NAE predicates. This decomposition can be achieved via a probabilistic construction that can be derandomized efficiently.

Mahmoud Abo Khamis, Hung Q. Ngo, Dan Olteanu, and Dan Suciu. Boolean Tensor Decomposition for Conjunctive Queries with Negation. In 22nd International Conference on Database Theory (ICDT 2019). Leibniz International Proceedings in Informatics (LIPIcs), Volume 127, pp. 21:1-21:19, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2019)

Copy BibTex To Clipboard

@InProceedings{abokhamis_et_al:LIPIcs.ICDT.2019.21, author = {Abo Khamis, Mahmoud and Ngo, Hung Q. and Olteanu, Dan and Suciu, Dan}, title = {{Boolean Tensor Decomposition for Conjunctive Queries with Negation}}, booktitle = {22nd International Conference on Database Theory (ICDT 2019)}, pages = {21:1--21:19}, series = {Leibniz International Proceedings in Informatics (LIPIcs)}, ISBN = {978-3-95977-101-6}, ISSN = {1868-8969}, year = {2019}, volume = {127}, editor = {Barcelo, Pablo and Calautti, Marco}, publisher = {Schloss Dagstuhl -- Leibniz-Zentrum f{\"u}r Informatik}, address = {Dagstuhl, Germany}, URL = {https://drops.dagstuhl.de/entities/document/10.4230/LIPIcs.ICDT.2019.21}, URN = {urn:nbn:de:0030-drops-103236}, doi = {10.4230/LIPIcs.ICDT.2019.21}, annote = {Keywords: color-coding, combined complexity, negation, query evaluation} }

Document

**Published in:** LIPIcs, Volume 14, 29th International Symposium on Theoretical Aspects of Computer Science (STACS 2012)

We present two recursive techniques to construct compressed sensing schemes that can be "decoded" in sub-linear time. The first technique is based on the well studied code composition method called code concatenation where the "outer" code has strong list recoverability properties. This technique uses only one level of recursion and critically uses the power of list recovery. The second recursive technique is conceptually similar, and has multiple recursion levels. The following compressed sensing results are obtained using these techniques:
- Strongly explicit efficiently decodable l_1/l_1 compressed sensing matrices: We present a strongly explicit ("for all") compressed sensing measurement matrix with O(d^2log^2 n) measurements that can output near-optimal d-sparse approximations in time poly(d log n).
- Near-optimal efficiently decodable l_1/l_1 compressed sensing matrices for non-negative signals: We present two randomized constructions of ("for all") compressed sensing matrices with near optimal number of measurements: O(d log n loglog_d n) and O_{m,s}(d^{1+1/s} log n (log^(m) n)^s), respectively, for any integer parameters s,m>=1. Both of these constructions can output near optimal d-sparse approximations for non-negative signals in time poly(d log n).
To the best of our knowledge, none of the results are dominated by existing results in the literature.

Hung Q. Ngo, Ely Porat, and Atri Rudra. Efficiently Decodable Compressed Sensing by List-Recoverable Codes and Recursion. In 29th International Symposium on Theoretical Aspects of Computer Science (STACS 2012). Leibniz International Proceedings in Informatics (LIPIcs), Volume 14, pp. 230-241, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2012)

Copy BibTex To Clipboard

@InProceedings{ngo_et_al:LIPIcs.STACS.2012.230, author = {Ngo, Hung Q. and Porat, Ely and Rudra, Atri}, title = {{Efficiently Decodable Compressed Sensing by List-Recoverable Codes and Recursion}}, booktitle = {29th International Symposium on Theoretical Aspects of Computer Science (STACS 2012)}, pages = {230--241}, series = {Leibniz International Proceedings in Informatics (LIPIcs)}, ISBN = {978-3-939897-35-4}, ISSN = {1868-8969}, year = {2012}, volume = {14}, editor = {D\"{u}rr, Christoph and Wilke, Thomas}, publisher = {Schloss Dagstuhl -- Leibniz-Zentrum f{\"u}r Informatik}, address = {Dagstuhl, Germany}, URL = {https://drops.dagstuhl.de/entities/document/10.4230/LIPIcs.STACS.2012.230}, URN = {urn:nbn:de:0030-drops-34011}, doi = {10.4230/LIPIcs.STACS.2012.230}, annote = {Keywords: Compressed Sensing, Sub-Linear Time Decoding, List-Recoverable Codes} }

Document

**Published in:** LIPIcs, Volume 290, 27th International Conference on Database Theory (ICDT 2024)

Datalog^∘ is an extension of Datalog, where instead of a program being a collection of union of conjunctive queries over the standard Boolean semiring, a program may now be a collection of sum-product queries over an arbitrary commutative partially ordered pre-semiring. Datalog^∘ is more powerful than Datalog in that its additional algebraic structure alows for supporting recursion with aggregation. At the same time, Datalog^∘ retains the syntactic and semantic simplicity of Datalog: Datalog^∘ has declarative least fixpoint semantics. The least fixpoint can be found via the naïve evaluation algorithm that repeatedly applies the immediate consequence operator until no further change is possible.
It was shown in [Mahmoud Abo Khamis et al., 2022] that, when the underlying semiring is p-stable, then the naïve evaluation of any Datalog^∘ program over the semiring converges in a finite number of steps. However, the upper bounds on the rate of convergence were exponential in the number n of ground IDB atoms.
This paper establishes polynomial upper bounds on the convergence rate of the naïve algorithm on linear Datalog^∘ programs, which is quite common in practice. In particular, the main result of this paper is that the convergence rate of linear Datalog^∘ programs under any p-stable semiring is O(pn³). Furthermore, we show a matching lower bound by constructing a p-stable semiring and a linear Datalog^∘ program that requires Ω(pn³) iterations for the naïve iteration algorithm to converge. Next, we study the convergence rate in terms of the number of elements in the semiring for linear Datalog^∘ programs. When L is the number of elements, the convergence rate is bounded by O(pn log L). This significantly improves the convergence rate for small L. We show a nearly matching lower bound as well.

Sungjin Im, Benjamin Moseley, Hung Ngo, and Kirk Pruhs. On the Convergence Rate of Linear Datalog ^∘ over Stable Semirings. In 27th International Conference on Database Theory (ICDT 2024). Leibniz International Proceedings in Informatics (LIPIcs), Volume 290, pp. 11:1-11:20, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2024)

Copy BibTex To Clipboard

@InProceedings{im_et_al:LIPIcs.ICDT.2024.11, author = {Im, Sungjin and Moseley, Benjamin and Ngo, Hung and Pruhs, Kirk}, title = {{On the Convergence Rate of Linear Datalog ^∘ over Stable Semirings}}, booktitle = {27th International Conference on Database Theory (ICDT 2024)}, pages = {11:1--11:20}, series = {Leibniz International Proceedings in Informatics (LIPIcs)}, ISBN = {978-3-95977-312-6}, ISSN = {1868-8969}, year = {2024}, volume = {290}, editor = {Cormode, Graham and Shekelyan, Michael}, publisher = {Schloss Dagstuhl -- Leibniz-Zentrum f{\"u}r Informatik}, address = {Dagstuhl, Germany}, URL = {https://drops.dagstuhl.de/entities/document/10.4230/LIPIcs.ICDT.2024.11}, URN = {urn:nbn:de:0030-drops-197939}, doi = {10.4230/LIPIcs.ICDT.2024.11}, annote = {Keywords: Datalog, convergence rate, semiring} }

Document

**Published in:** LIPIcs, Volume 202, 46th International Symposium on Mathematical Foundations of Computer Science (MFCS 2021)

We consider the Matrix Tree Multiplication problem. This problem is a generalization of the classic Matrix Chain Multiplication problem covered in the dynamic programming chapter of many introductory algorithms textbooks. An instance of the Matrix Tree Multiplication problem consists of a rooted tree with a matrix associated with each edge. The output is, for each leaf in the tree, the product of the matrices on the chain/path from the root to that leaf. Matrix multiplications that are shared between various chains need only be computed once, potentially being shared between different root to leaf chains. Algorithms are evaluated by the number of scalar multiplications performed. Our main result is a linear time algorithm for which the number of scalar multiplications performed is at most 15 times the optimal number of scalar multiplications.

Mahmoud Abo-Khamis, Ryan Curtin, Sungjin Im, Benjamin Moseley, Hung Ngo, Kirk Pruhs, and Alireza Samadian. An Approximation Algorithm for the Matrix Tree Multiplication Problem. In 46th International Symposium on Mathematical Foundations of Computer Science (MFCS 2021). Leibniz International Proceedings in Informatics (LIPIcs), Volume 202, pp. 6:1-6:14, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2021)

Copy BibTex To Clipboard

@InProceedings{abokhamis_et_al:LIPIcs.MFCS.2021.6, author = {Abo-Khamis, Mahmoud and Curtin, Ryan and Im, Sungjin and Moseley, Benjamin and Ngo, Hung and Pruhs, Kirk and Samadian, Alireza}, title = {{An Approximation Algorithm for the Matrix Tree Multiplication Problem}}, booktitle = {46th International Symposium on Mathematical Foundations of Computer Science (MFCS 2021)}, pages = {6:1--6:14}, series = {Leibniz International Proceedings in Informatics (LIPIcs)}, ISBN = {978-3-95977-201-3}, ISSN = {1868-8969}, year = {2021}, volume = {202}, editor = {Bonchi, Filippo and Puglisi, Simon J.}, publisher = {Schloss Dagstuhl -- Leibniz-Zentrum f{\"u}r Informatik}, address = {Dagstuhl, Germany}, URL = {https://drops.dagstuhl.de/entities/document/10.4230/LIPIcs.MFCS.2021.6}, URN = {urn:nbn:de:0030-drops-144464}, doi = {10.4230/LIPIcs.MFCS.2021.6}, annote = {Keywords: Matrix Multiplication, Approximation Algorithm} }

X

Feedback for Dagstuhl Publishing

Feedback submitted

Please try again later or send an E-mail