Intermediate Value Linearizability: A Quantitative Correctness Criterion

Authors Arik Rinberg, Idit Keidar



PDF
Thumbnail PDF

File

LIPIcs.DISC.2020.2.pdf
  • Filesize: 0.61 MB
  • 17 pages

Document Identifiers

Author Details

Arik Rinberg
  • Technion - Israel Institute of Technology, Haifa, Israel
Idit Keidar
  • Technion - Israel Institute of Technology, Haifa, Israel

Acknowledgements

We thank Hagit Attiya and Gadi Taubenfeld for their insightful comments, and the anonymous reviewers for their detailed feedback.

Cite AsGet BibTex

Arik Rinberg and Idit Keidar. Intermediate Value Linearizability: A Quantitative Correctness Criterion. In 34th International Symposium on Distributed Computing (DISC 2020). Leibniz International Proceedings in Informatics (LIPIcs), Volume 179, pp. 2:1-2:17, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2020)
https://doi.org/10.4230/LIPIcs.DISC.2020.2

Abstract

Big data processing systems often employ batched updates and data sketches to estimate certain properties of large data. For example, a CountMin sketch approximates the frequencies at which elements occur in a data stream, and a batched counter counts events in batches. This paper focuses on correctness criteria for concurrent implementations of such objects. Specifically, we consider quantitative objects, whose return values are from a totally ordered domain, with a particular emphasis on (ε,δ)-bounded objects that estimate a numerical quantity with an error of at most ε with probability at least 1 - δ. The de facto correctness criterion for concurrent objects is linearizability. Intuitively, under linearizability, when a read overlaps an update, it must return the object’s value either before the update or after it. Consider, for example, a single batched increment operation that counts three new events, bumping a batched counter’s value from 7 to 10. In a linearizable implementation of the counter, a read overlapping this update must return either 7 or 10. We observe, however, that in typical use cases, any intermediate value between 7 and 10 would also be acceptable. To capture this additional degree of freedom, we propose Intermediate Value Linearizability (IVL), a new correctness criterion that relaxes linearizability to allow returning intermediate values, for instance 8 in the example above. Roughly speaking, IVL allows reads to return any value that is bounded between two return values that are legal under linearizability. A key feature of IVL is that we can prove that concurrent IVL implementations of (ε,δ)-bounded objects are themselves (ε,δ)-bounded. To illustrate the power of this result, we give a straightforward and efficient concurrent implementation of an (ε, δ)-bounded CountMin sketch, which is IVL (albeit not linearizable). Finally, we show that IVL allows for inherently cheaper implementations than linearizable ones. In particular, we show a lower bound of Ω(n) on the step complexity of the update operation of any wait-free linearizable batched counter from single-writer objects, and propose a wait-free IVL implementation of the same object with an O(1) step complexity for update.

Subject Classification

ACM Subject Classification
  • Theory of computation → Semantics and reasoning
  • Computer systems organization → Parallel architectures
  • Computing methodologies → Concurrent computing methodologies
Keywords
  • concurrency
  • concurrent objects
  • linearizability

Metrics

  • Access Statistics
  • Total Accesses (updated on a weekly basis)
    0
    PDF Downloads

References

  1. Pankaj K Agarwal, Graham Cormode, Zengfeng Huang, Jeff M Phillips, Zhewei Wei, and Ke Yi. Mergeable summaries. ACM Transactions on Database Systems (TODS), 38(4):1-28, 2013. Google Scholar
  2. Dan Alistarh, Trevor Brown, Justin Kopinsky, Jerry Z Li, and Giorgi Nadiradze. Distributionally linearizable data structures. In Proceedings of the 30th on Symposium on Parallelism in Algorithms and Architectures, pages 133-142. ACM, 2018. Google Scholar
  3. Hagit Attiya, Faith Ellen, and Panagiota Fatourou. The complexity of updating multi-writer snapshot objects. In International Conference on Distributed Computing and Networking, pages 319-330. Springer, 2006. Google Scholar
  4. Armando Castañeda, Sergio Rajsbaum, and Michel Raynal. Unifying concurrent objects and distributed tasks: Interval-linearizability. Journal of the ACM (JACM), 65(6):1-42, 2018. Google Scholar
  5. Jacek Cichon and Wojciech Macyna. Approximate counters for flash memory. In 2011 IEEE 17th International Conference on Embedded and Real-Time Computing Systems and Applications, volume 1, pages 185-189. IEEE, 2011. Google Scholar
  6. Graham Cormode, Minos Garofalakis, Peter J Haas, and Chris Jermaine. Synopses for massive data: Samples, histograms, wavelets, sketches. Foundations and Trends in Databases, 4(1-3):1-294, 2012. Google Scholar
  7. Graham Cormode and Shan Muthukrishnan. An improved data stream summary: the count-min sketch and its applications. Journal of Algorithms, 55(1):58-75, 2005. Google Scholar
  8. Graham Cormode, Shanmugavelayutham Muthukrishnan, and Ke Yi. Algorithms for distributed functional monitoring. ACM Transactions on Algorithms (TALG), 7(2):1-20, 2011. Google Scholar
  9. Mayur Datar and Piotr Indyk. Comparing data streams using hamming norms. In Proceedings 2002 VLDB Conference: 28th International Conference on Very Large Databases (VLDB), page 335. Elsevier, 2002. Google Scholar
  10. Druid. Apache DataSketches (Incubating). https://incubator.apache.org/clutch/datasketches.html, Accessed May 14, 2020.
  11. Druid. Druid. https://druid.apache.org/blog/2014/02/18/hyperloglog-optimizations-for-real-world-systems.html, Accessed May 14, 2020.
  12. Philippe Flajolet. Approximate counting: a detailed analysis. BIT Numerical Mathematics, 25(1):113-134, 1985. Google Scholar
  13. Philippe Flajolet and G Nigel Martin. Probabilistic counting. In 24th Annual Symposium on Foundations of Computer Science (sfcs 1983), pages 76-82. IEEE, 1983. Google Scholar
  14. Phillip B Gibbons and Srikanta Tirthapura. Estimating simple functions on the union of data streams. In Proceedings of the thirteenth annual ACM symposium on Parallel algorithms and architectures, pages 281-291, 2001. Google Scholar
  15. Wojciech Golab, Lisa Higham, and Philipp Woelfel. Linearizable implementations do not suffice for randomized distributed computation. In Proceedings of the forty-third annual ACM symposium on Theory of computing, pages 373-382, 2011. Google Scholar
  16. Thomas A Henzinger, Christoph M Kirsch, Hannes Payer, Ali Sezgin, and Ana Sokolova. Quantitative relaxation of concurrent data structures. In Proceedings of the 40th annual ACM SIGPLAN-SIGACT symposium on Principles of programming languages, pages 317-328, 2013. Google Scholar
  17. Maurice P Herlihy and Jeannette M Wing. Linearizability: A correctness condition for concurrent objects. ACM Transactions on Programming Languages and Systems (TOPLAS), 12(3):463-492, 1990. Google Scholar
  18. Stefan Heule, Marc Nunkesser, and Alexander Hall. Hyperloglog in practice: algorithmic engineering of a state of the art cardinality estimation algorithm. In Proceedings of the 16th International Conference on Extending Database Technology, pages 683-692, 2013. Google Scholar
  19. Hillview. Hillview: A Big Data Spreadsheet. https://research.vmware.com/projects/hillview, Accessed May 14, 2020.
  20. Jaap-Henk Hoepman and John Tromp. Binary snapshots. In International Workshop on Distributed Algorithms, pages 18-25. Springer, 1993. Google Scholar
  21. Amos Israeli and Asaf Shirazi. The time complexity of updating snapshot memories. Information Processing Letters, 65(1):33-40, 1998. Google Scholar
  22. Leslie Lamport. On interprocess communication. Distributed computing, 1(2):86-101, 1986. Google Scholar
  23. Leslie Lamport. Concurrent reading and writing of clocks. ACM Transactions on Computer Systems (TOCS), 8(4):305-310, 1990. Google Scholar
  24. Zaoxing Liu, Antonis Manousis, Gregory Vorsanger, Vyas Sekar, and Vladimir Braverman. One sketch to rule them all: Rethinking network flow monitoring with univmon. In Proceedings of the 2016 ACM SIGCOMM Conference, pages 101-114, 2016. Google Scholar
  25. Nihar R Mahapatra and Balakrishna Venkatrao. The processor-memory bottleneck: problems and solutions. XRDS: Crossroads, The ACM Magazine for Students, 5(3es):2, 1999. Google Scholar
  26. Ahmed Metwally, Divyakant Agrawal, and Amr El Abbadi. Efficient computation of frequent and top-k elements in data streams. In International Conference on Database Theory, pages 398-412. Springer, 2005. Google Scholar
  27. Robert Morris. Counting large numbers of events in small registers. Communications of the ACM, 21(10):840-842, 1978. Google Scholar
  28. Gil Neiger. Set-linearizability. In Proceedings of the thirteenth annual ACM symposium on Principles of distributed computing, page 396, 1994. Google Scholar
  29. Sean Ovens and Philipp Woelfel. Strongly linearizable implementations of snapshots and other types. In Proceedings of the 2019 ACM Symposium on Principles of Distributed Computing, pages 197-206, 2019. Google Scholar
  30. Presto. HyperLogLog in Presto: A significantly faster way to handle cardinality estimation. https://engineering.fb.com/data-infrastructure/hyperloglog/, Accessed May 14, 2020.
  31. Arik Rinberg and Idit Keidar. Intermediate value linearizability: A quantitative correctness criterion. arXiv preprint arXiv:2006.12889, 2020. Google Scholar
  32. Arik Rinberg, Alexander Spiegelman, Edward Bortnikov, Eshcar Hillel, Idit Keidar, Lee Rhodes, and Hadar Serviansky. Fast concurrent data sketches. In Proceedings of the 2020 ACM Symposium on Principles and Practice of Parallel Programming. ACM, 2020. Google Scholar
  33. Charalampos Stylianopoulos, Ivan Walulya, Magnus Almgren, Olaf Landsiedel, and Marina Papatriantafilou. Delegation sketch: a parallel design with support for fast and accurate concurrent operations. In Proceedings of the Fifteenth European Conference on Computer Systems, pages 1-16, 2020. Google Scholar
Questions / Remarks / Feedback
X

Feedback for Dagstuhl Publishing


Thanks for your feedback!

Feedback submitted

Could not send message

Please try again later or send an E-mail