Distributed Query Monitoring through Convex Analysis: Towards Composable Safe Zones

Authors: Minos Garofalakis, Vasilis Samoladas



File

LIPIcs.ICDT.2017.14.pdf
  • Filesize: 0.54 MB
  • 18 pages

Cite As

Minos Garofalakis and Vasilis Samoladas. Distributed Query Monitoring through Convex Analysis: Towards Composable Safe Zones. In 20th International Conference on Database Theory (ICDT 2017). Leibniz International Proceedings in Informatics (LIPIcs), Volume 68, pp. 14:1-14:18, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2017)
https://doi.org/10.4230/LIPIcs.ICDT.2017.14

Abstract

Continuous tracking of complex data analytics queries over high-speed distributed streams is becoming increasingly important. Query tracking can be reduced to continuous monitoring of a condition over the global stream. Communication-efficient monitoring relies on locally processing stream data at the sites where it is generated, by deriving site-local conditions which collectively guarantee the global condition. Recently proposed geometric techniques offer a generic approach for splitting an arbitrary global condition into local geometric monitoring constraints (known as "Safe Zones"); still, their application to various problem domains has so far been based on heuristics and lacking a principled, compositional methodology. In this paper, we present the first known formal results on the difficult problem of effective Safe Zone (SZ) design for complex query monitoring over distributed streams. Exploiting tools from convex analysis, our approach relies on an algebraic representation of SZs which allows us to: (1) Formally define the notion of a "good" SZ for distributed monitoring problems; and, most importantly, (2) Tackle and solve the important problem of systematically composing SZs for monitored conditions expressed as Boolean formulas over simpler conditions (for which SZs are known); furthermore, we prove that, under broad assumptions, the composed SZ is good if the component SZs are good. Our results are, therefore, a first step towards a principled compositional solution to SZ design for distributed query monitoring. Finally, we discuss a number of important applications for our SZ design algorithms, also demonstrating how earlier geometric techniques can be seen as special cases of our framework.
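
As a rough illustration of the Safe Zone idea described in the abstract, the sketch below encodes a convex zone as the set where a concave function is nonnegative, and composes two zones for a conjunction of conditions by taking the pointwise minimum. This encoding and all names (ball_safe_function, halfspace_safe_function, conjunction, all_sites_inside) are assumptions made for illustration; they are not taken from the paper and should not be read as its algebraic representation or its composition algorithm. The only facts relied on are standard: the minimum of concave functions is concave, and if every local vector lies in a convex set then so does their average.

```python
from typing import Callable, List, Sequence

Vector = Sequence[float]
# Hypothetical encoding: a zone is {x : zeta(x) >= 0} for a concave zeta.
SafeFunction = Callable[[Vector], float]


def ball_safe_function(center: Vector, radius: float) -> SafeFunction:
    """Concave function that is >= 0 exactly on the closed ball."""
    def zeta(x: Vector) -> float:
        dist = sum((xi - ci) ** 2 for xi, ci in zip(x, center)) ** 0.5
        return radius - dist
    return zeta


def halfspace_safe_function(normal: Vector, offset: float) -> SafeFunction:
    """Affine (hence concave) function, >= 0 on {x : normal . x <= offset}."""
    def zeta(x: Vector) -> float:
        return offset - sum(ni * xi for ni, xi in zip(normal, x))
    return zeta


def conjunction(*zetas: SafeFunction) -> SafeFunction:
    """Pointwise minimum: its zero-superlevel set is the intersection of zones."""
    def zeta(x: Vector) -> float:
        return min(z(x) for z in zetas)
    return zeta


def all_sites_inside(zeta: SafeFunction, local_vectors: Sequence[Vector]) -> bool:
    """Each site checks only its own vector; no communication is needed."""
    return all(zeta(x) >= 0 for x in local_vectors)


def mean(vectors: Sequence[Vector]) -> List[float]:
    """Global state, modeled here as the average of the local vectors."""
    n = len(vectors)
    return [sum(column) / n for column in zip(*vectors)]


# Two simple conditions, each with a known convex zone, combined with AND.
zone = conjunction(
    ball_safe_function([0.0, 0.0], 2.0),
    halfspace_safe_function([1.0, 0.0], 1.0),
)
sites = [[0.5, 1.0], [-1.0, 0.5], [0.9, -1.0]]

if all_sites_inside(zone, sites):
    # Convexity guarantees the global average also satisfies both conditions.
    assert zone(mean(sites)) >= 0
```

In this toy setting a site would contact the coordinator only when its own check fails. Disjunctions are deliberately left out, since the union of convex sets is generally not convex and a simple pointwise rule does not apply.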

Keywords

  • distributed data streams
  • geometric method
