Intermediate Value Linearizability: A Quantitative Correctness Criterion

Rinberg, Arik; Keidar, Idit

doi:10.4230/LIPIcs.DISC.2020.2

File

Subject Classification

ACM Subject Classification

Theory of computation → Semantics and reasoning
Computer systems organization → Parallel architectures
Computing methodologies → Concurrent computing methodologies

Keywords

concurrency
concurrent objects
linearizability

Metrics

Access Statistics
Total Accesses (updated on a weekly basis)

0

Document

0

Metadata

Abstract

Big data processing systems often employ batched updates and data sketches to estimate certain properties of large data. For example, a CountMin sketch approximates the frequencies at which elements occur in a data stream, and a batched counter counts events in batches. This paper focuses on correctness criteria for concurrent implementations of such objects. Specifically, we consider quantitative objects, whose return values are from a totally ordered domain, with a particular emphasis on (ε,δ)-bounded objects that estimate a numerical quantity with an error of at most ε with probability at least 1 - δ. The de facto correctness criterion for concurrent objects is linearizability. Intuitively, under linearizability, when a read overlaps an update, it must return the object’s value either before the update or after it. Consider, for example, a single batched increment operation that counts three new events, bumping a batched counter’s value from 7 to 10. In a linearizable implementation of the counter, a read overlapping this update must return either 7 or 10. We observe, however, that in typical use cases, any intermediate value between 7 and 10 would also be acceptable. To capture this additional degree of freedom, we propose Intermediate Value Linearizability (IVL), a new correctness criterion that relaxes linearizability to allow returning intermediate values, for instance 8 in the example above. Roughly speaking, IVL allows reads to return any value that is bounded between two return values that are legal under linearizability. A key feature of IVL is that we can prove that concurrent IVL implementations of (ε,δ)-bounded objects are themselves (ε,δ)-bounded. To illustrate the power of this result, we give a straightforward and efficient concurrent implementation of an (ε, δ)-bounded CountMin sketch, which is IVL (albeit not linearizable). Finally, we show that IVL allows for inherently cheaper implementations than linearizable ones. In particular, we show a lower bound of Ω(n) on the step complexity of the update operation of any wait-free linearizable batched counter from single-writer objects, and propose a wait-free IVL implementation of the same object with an O(1) step complexity for update.

Cite As Get BibTex

Arik Rinberg and Idit Keidar. Intermediate Value Linearizability: A Quantitative Correctness Criterion. In 34th International Symposium on Distributed Computing (DISC 2020). Leibniz International Proceedings in Informatics (LIPIcs), Volume 179, pp. 2:1-2:17, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2020) https://doi.org/10.4230/LIPIcs.DISC.2020.2

Author Details

Arik Rinberg

Technion - Israel Institute of Technology, Haifa, Israel

Idit Keidar

Technion - Israel Institute of Technology, Haifa, Israel

References

Pankaj K Agarwal, Graham Cormode, Zengfeng Huang, Jeff M Phillips, Zhewei Wei, and Ke Yi. Mergeable summaries. ACM Transactions on Database Systems (TODS), 38(4):1-28, 2013.
Dan Alistarh, Trevor Brown, Justin Kopinsky, Jerry Z Li, and Giorgi Nadiradze. Distributionally linearizable data structures. In Proceedings of the 30th on Symposium on Parallelism in Algorithms and Architectures, pages 133-142. ACM, 2018.
Hagit Attiya, Faith Ellen, and Panagiota Fatourou. The complexity of updating multi-writer snapshot objects. In International Conference on Distributed Computing and Networking, pages 319-330. Springer, 2006.
Armando Castañeda, Sergio Rajsbaum, and Michel Raynal. Unifying concurrent objects and distributed tasks: Interval-linearizability. Journal of the ACM (JACM), 65(6):1-42, 2018.
Jacek Cichon and Wojciech Macyna. Approximate counters for flash memory. In 2011 IEEE 17th International Conference on Embedded and Real-Time Computing Systems and Applications, volume 1, pages 185-189. IEEE, 2011.
Graham Cormode, Minos Garofalakis, Peter J Haas, and Chris Jermaine. Synopses for massive data: Samples, histograms, wavelets, sketches. Foundations and Trends in Databases, 4(1-3):1-294, 2012.
Graham Cormode and Shan Muthukrishnan. An improved data stream summary: the count-min sketch and its applications. Journal of Algorithms, 55(1):58-75, 2005.
Graham Cormode, Shanmugavelayutham Muthukrishnan, and Ke Yi. Algorithms for distributed functional monitoring. ACM Transactions on Algorithms (TALG), 7(2):1-20, 2011.
Mayur Datar and Piotr Indyk. Comparing data streams using hamming norms. In Proceedings 2002 VLDB Conference: 28th International Conference on Very Large Databases (VLDB), page 335. Elsevier, 2002.
Druid. Apache DataSketches (Incubating). https://incubator.apache.org/clutch/datasketches.html, Accessed May 14, 2020.
Druid. Druid. https://druid.apache.org/blog/2014/02/18/hyperloglog-optimizations-for-real-world-systems.html, Accessed May 14, 2020.
Philippe Flajolet. Approximate counting: a detailed analysis. BIT Numerical Mathematics, 25(1):113-134, 1985.
Philippe Flajolet and G Nigel Martin. Probabilistic counting. In 24th Annual Symposium on Foundations of Computer Science (sfcs 1983), pages 76-82. IEEE, 1983.
Phillip B Gibbons and Srikanta Tirthapura. Estimating simple functions on the union of data streams. In Proceedings of the thirteenth annual ACM symposium on Parallel algorithms and architectures, pages 281-291, 2001.
Wojciech Golab, Lisa Higham, and Philipp Woelfel. Linearizable implementations do not suffice for randomized distributed computation. In Proceedings of the forty-third annual ACM symposium on Theory of computing, pages 373-382, 2011.
Thomas A Henzinger, Christoph M Kirsch, Hannes Payer, Ali Sezgin, and Ana Sokolova. Quantitative relaxation of concurrent data structures. In Proceedings of the 40th annual ACM SIGPLAN-SIGACT symposium on Principles of programming languages, pages 317-328, 2013.
Maurice P Herlihy and Jeannette M Wing. Linearizability: A correctness condition for concurrent objects. ACM Transactions on Programming Languages and Systems (TOPLAS), 12(3):463-492, 1990.
Stefan Heule, Marc Nunkesser, and Alexander Hall. Hyperloglog in practice: algorithmic engineering of a state of the art cardinality estimation algorithm. In Proceedings of the 16th International Conference on Extending Database Technology, pages 683-692, 2013.
Hillview. Hillview: A Big Data Spreadsheet. https://research.vmware.com/projects/hillview, Accessed May 14, 2020.
Jaap-Henk Hoepman and John Tromp. Binary snapshots. In International Workshop on Distributed Algorithms, pages 18-25. Springer, 1993.
Amos Israeli and Asaf Shirazi. The time complexity of updating snapshot memories. Information Processing Letters, 65(1):33-40, 1998.
Leslie Lamport. On interprocess communication. Distributed computing, 1(2):86-101, 1986.
Leslie Lamport. Concurrent reading and writing of clocks. ACM Transactions on Computer Systems (TOCS), 8(4):305-310, 1990.
Zaoxing Liu, Antonis Manousis, Gregory Vorsanger, Vyas Sekar, and Vladimir Braverman. One sketch to rule them all: Rethinking network flow monitoring with univmon. In Proceedings of the 2016 ACM SIGCOMM Conference, pages 101-114, 2016.
Nihar R Mahapatra and Balakrishna Venkatrao. The processor-memory bottleneck: problems and solutions. XRDS: Crossroads, The ACM Magazine for Students, 5(3es):2, 1999.
Ahmed Metwally, Divyakant Agrawal, and Amr El Abbadi. Efficient computation of frequent and top-k elements in data streams. In International Conference on Database Theory, pages 398-412. Springer, 2005.
Robert Morris. Counting large numbers of events in small registers. Communications of the ACM, 21(10):840-842, 1978.
Gil Neiger. Set-linearizability. In Proceedings of the thirteenth annual ACM symposium on Principles of distributed computing, page 396, 1994.
Sean Ovens and Philipp Woelfel. Strongly linearizable implementations of snapshots and other types. In Proceedings of the 2019 ACM Symposium on Principles of Distributed Computing, pages 197-206, 2019.
Presto. HyperLogLog in Presto: A significantly faster way to handle cardinality estimation. https://engineering.fb.com/data-infrastructure/hyperloglog/, Accessed May 14, 2020.
Arik Rinberg and Idit Keidar. Intermediate value linearizability: A quantitative correctness criterion. arXiv preprint arXiv:2006.12889, 2020.
Arik Rinberg, Alexander Spiegelman, Edward Bortnikov, Eshcar Hillel, Idit Keidar, Lee Rhodes, and Hadar Serviansky. Fast concurrent data sketches. In Proceedings of the 2020 ACM Symposium on Principles and Practice of Parallel Programming. ACM, 2020.
Charalampos Stylianopoulos, Ivan Walulya, Magnus Almgren, Olaf Landsiedel, and Marina Papatriantafilou. Delegation sketch: a parallel design with support for fast and accurate concurrent operations. In Proceedings of the Fifteenth European Conference on Computer Systems, pages 1-16, 2020.

Intermediate Value Linearizability: A Quantitative Correctness Criterion

Authors Arik Rinberg, Idit Keidar

File

Document Identifiers

Subject Classification

ACM Subject Classification

Keywords

Metrics

Abstract

Cite As Get BibTex

Author Details

Acknowledgements

References

Thanks for your feedback!

Could not send message

Intermediate Value Linearizability: A Quantitative Correctness Criterion

Authors Arik Rinberg, Idit Keidar

File

Document Identifiers

Related Versions

Subject Classification

ACM Subject Classification

Keywords

Metrics

Abstract

Cite As Get BibTex

Author Details

Acknowledgements

References

Thanks for your feedback!

Could not send message