Optimal Broadcasting Strategies for Conjunctive Queries over Distributed Data

Authors Bas Ketsman, Frank Neven



PDF
Thumbnail PDF

File

LIPIcs.ICDT.2015.291.pdf
  • Filesize: 0.53 MB
  • 17 pages

Document Identifiers

Author Details

Bas Ketsman
Frank Neven

Cite As Get BibTex

Bas Ketsman and Frank Neven. Optimal Broadcasting Strategies for Conjunctive Queries over Distributed Data. In 18th International Conference on Database Theory (ICDT 2015). Leibniz International Proceedings in Informatics (LIPIcs), Volume 31, pp. 291-307, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2015) https://doi.org/10.4230/LIPIcs.ICDT.2015.291

Abstract

In a distributed context where data is dispersed over many computing nodes, monotone queries can be evaluated in an eventually consistent and coordination-free manner through a simple but naive broadcasting strategy which makes all data available on every computing node. In this paper, we investigate more economical broadcasting strategies for full conjunctive queries without self-joins that only transmit a part of the local data necessary to evaluate the query at hand. We consider oblivious broadcasting strategies which determine which local facts to broadcast independent of the data at other computing nodes. We introduce the notion of broadcast dependency set (BDS) as a sound and complete formalism to represent locally optimal oblivious broadcasting functions. We provide algorithms to construct a BDS for a given conjunctive query and study the complexity of various decision problems related to these algorithms.

Subject Classification

Keywords
  • Coordination-free evaluation
  • conjunctive queries
  • broadcasting

Metrics

  • Access Statistics
  • Total Accesses (updated on a weekly basis)
    0
    PDF Downloads

References

  1. F. N. Afrati and J. D. Ullman. Optimizing joins in a map-reduce environment. In International Conference on Extending Database Technology (EDBT), pages 99-110, 2010. Google Scholar
  2. Foto N. Afrati, Paraschos Koutris, Dan Suciu, and Jeffrey D. Ullman. Parallel skyline queries. In International Conference on Database Theory (ICDT), pages 274-284, 2012. Google Scholar
  3. Peter Alvaro, Neil Conway, Joe Hellerstein, and William R. Marczak. Consistency analysis in bloom: a CALM and collected approach. In Conference on Innovative Data Systems Research (CIDR), pages 249-260, 2011. Google Scholar
  4. Peter Alvaro, Neil Conway, Joseph M. Hellerstein, and David Maier. Blazes: Coordination analysis for distributed programs. In International Conference on Data Engineering (ICDE), pages 52-63. IEEE, 2014. Google Scholar
  5. Tom J. Ameloot, Bas Ketsman, Frank Neven, and Daniel Zinn. Weaker forms of monotonicity for declarative networking: a more fine-grained answer to the CALM-conjecture. In Symposium on Principles of Database Systems (PODS), pages 64-75. ACM, 2014. Google Scholar
  6. Tom J. Ameloot, Frank Neven, and Jan Van den Bussche. Relational transducers for declarative networking. Journal of the ACM, 60(2):15, 2013. Google Scholar
  7. Paul Beame, Paraschos Koutris, and Dan Suciu. Communication steps for parallel query processing. In Symposium on Principles of Database Systems (PODS), pages 273-284, 2013. Google Scholar
  8. Paul Beame, Paraschos Koutris, and Dan Suciu. Skew in parallel query processing. In Symposium on Principles of Database Systems (PODS), pages 212-223, 2014. Google Scholar
  9. Peter Buneman, James Cheney, Wang Chiew Tan, and Stijn Vansummeren. Curated databases. In Symposium on Principles of Database Systems (PODS), pages 1-12. ACM, 2008. Google Scholar
  10. Peter Buneman, Sanjeev Khanna, and Wang Chiew Tan. Why and where: A characterization of data provenance. In International Conference on Database Theory (ICDT), volume 1973 of Lecture Notes in Computer Science, pages 316-330. Springer, 2001. Google Scholar
  11. Neil Conway, William R. Marczak, Peter Alvaro, Joseph M. Hellerstein, and David Maier. Logic and lattices for distributed programming. In Symposium on Cloud Computing (SoCC), page 1. ACM, 2012. Google Scholar
  12. Wenfei Fan, Floris Geerts, and Leonid Libkin. On scale independence for querying big data. In Symposium on Principles of Database Systems (PODS), pages 51-62. ACM, 2014. Google Scholar
  13. Sumit Ganguly, Abraham Silberschatz, and Shalom Tsur. Parallel bottom-up processing of datalog queries. Journal of Logic Programming, 14(1&2):101-126, 1992. Google Scholar
  14. Joseph M. Hellerstein. The declarative imperative: experiences and conjectures in distributed logic. SIGMOD Record, 39(1):5-19, 2010. Google Scholar
  15. Paraschos Koutris and Dan Suciu. Parallel evaluation of conjunctive queries. In Symposium on Principles of Database Systems (PODS), pages 223-234, 2011. Google Scholar
  16. Alexandra Meliou, Wolfgang Gatterbauer, Joseph Y. Halpern, Christoph Koch, Katherine F. Moore, and Dan Suciu. Causality in databases. IEEE Data Engineering Bulletin, 33(3):59-67, 2010. Google Scholar
  17. Alexandra Meliou, Wolfgang Gatterbauer, Katherine F. Moore, and Dan Suciu. The complexity of causality and responsibility for query answers and non-answers. Proceedings of the VLDB Endowmen (PVLDB), 4(1):34-45, 2010. Google Scholar
  18. Matei Zaharia, Mosharaf Chowdhury, Tathagata Das, Ankur Dave, Justin Ma, Murphy McCauly, Michael J. Franklin, Scott Shenker, and Ion Stoica. Resilient distributed datasets: A fault-tolerant abstraction for in-memory cluster computing. In USENIX Symposium on Networked Systems Design and Implementation (NSDI), pages 15-28. USENIX Association, 2012. Google Scholar
  19. Daniel Zinn, Todd J. Green, and Bertram Ludäscher. Win-move is coordination-free (sometimes). In International Conference on Database Theory (ICDT), pages 99-113, 2012. Google Scholar
Questions / Remarks / Feedback
X

Feedback for Dagstuhl Publishing


Thanks for your feedback!

Feedback submitted

Could not send message

Please try again later or send an E-mail