Neven, Frank ;
Schwentick, Thomas ;
Spinrath, Christopher ;
Vandevoort, Brecht
ParallelCorrectness and ParallelBoundedness for Datalog Programs
Abstract
Recently, Ketsman et al. started the investigation of the parallel evaluation of recursive queries in the Massively Parallel Communication (MPC) model. Among other things, it was shown that parallelcorrectness and parallelboundedness for general Datalog programs is undecidable, by a reduction from the undecidable containment problem for Datalog. Furthermore, economic policies were introduced as a means to specify data distribution in a recursive setting. In this paper, we extend the latter framework to account for more general distributed evaluation strategies in terms of communication policies. We then show that the undecidability of parallelcorrectness runs deeper: it already holds for fragments of Datalog, e.g., monadic and frontierguarded Datalog, with a decidable containment problem, under relatively simple evaluation strategies. These simple evaluation strategies are defined w.r.t. datamoving distribution constraints. We then investigate restrictions of economic policies that yield decidability. In particular, we show that parallelcorrectness is 2EXPTIMEcomplete for monadic and frontierguarded Datalog under hashbased economic policies. Next, we consider restrictions of datamoving constraints and show that parallelcorrectness and parallelboundedness are 2EXPTIMEcomplete for frontierguarded Datalog. Interestingly, distributed evaluation no longer preserves the usual containment relationships between fragments of Datalog. Indeed, not every monadic Datalog program is equivalent to a frontierguarded one in the distributed setting. We illustrate the latter by considering two alternative settings where in one of these parallelcorrectness is decidable for frontierguarded Datalog but undecidable for monadic Datalog.
BibTeX  Entry
@InProceedings{neven_et_al:LIPIcs:2019:10316,
author = {Frank Neven and Thomas Schwentick and Christopher Spinrath and Brecht Vandevoort},
title = {{ParallelCorrectness and ParallelBoundedness for Datalog Programs}},
booktitle = {22nd International Conference on Database Theory (ICDT 2019)},
pages = {14:114:19},
series = {Leibniz International Proceedings in Informatics (LIPIcs)},
ISBN = {9783959771016},
ISSN = {18688969},
year = {2019},
volume = {127},
editor = {Pablo Barcelo and Marco Calautti},
publisher = {Schloss DagstuhlLeibnizZentrum fuer Informatik},
address = {Dagstuhl, Germany},
URL = {http://drops.dagstuhl.de/opus/volltexte/2019/10316},
doi = {10.4230/LIPIcs.ICDT.2019.14},
annote = {Keywords: Datalog, distributed databases, distributed evaluation, decision problems, complexity}
}
2019
Keywords: 

Datalog, distributed databases, distributed evaluation, decision problems, complexity 
Seminar: 

22nd International Conference on Database Theory (ICDT 2019)

Issue date: 

2019 
Date of publication: 

2019 