Sketching the Path to Efficiency: Lightweight Learned Cache Replacement

Authors Rana Shahout, Roy Friedman

PDF

  • Filesize: 1.4 MB
  • 21 pages

Author Details

Rana Shahout
  • Harvard University, Cambridge, MA, USA
Roy Friedman
  • Technion, Haifa, Israel


We thank Ohad Eytan for helping run Caffeine’s simulator.

Cite As

Rana Shahout and Roy Friedman. Sketching the Path to Efficiency: Lightweight Learned Cache Replacement. In 27th International Conference on Principles of Distributed Systems (OPODIS 2023). Leibniz International Proceedings in Informatics (LIPIcs), Volume 286, pp. 34:1-34:21, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2024)


Abstract

Cache management policies are responsible for selecting the items that should be kept in the cache, and are therefore a fundamental design choice for obtaining an effective caching solution. Heuristic approaches have been used to identify access patterns that affect cache management decisions. However, their behavior is inconsistent, as they can perform well for certain access patterns and poorly for others. Given machine learning's (ML) remarkable success on diverse prediction problems, ML techniques can be applied to create a cache management policy. Yet a significant challenge arises from the memory overhead associated with ML components. These components retain per-item information and must be invoked on each access, contradicting the goal of minimizing the cache's resource signature. In this work, we propose ALPS, a lightweight cache management policy that takes into account the cost of the ML component. ALPS combines ML with traditional heuristic-based approaches and facilitates learning by identifying several statistical features derived from space-efficient sketches. Because ALPS's ML process derives its features from these sketches rather than from per-item state, the result is a lightweight and highly effective meta-policy for cache management. We evaluate our approach over real-world workloads run against five popular heuristic cache management policies as well as a state-of-the-art ML-based policy. In our experiments, ALPS always obtained the best hit ratio. Specifically, ALPS improves the hit ratio compared to LRU by up to 20%, Hyperbolic by up to 31%, ARC by up to 9% and W-TinyLFU by up to 26% on various real-world workloads. Its resource requirements are orders of magnitude lower than those of previous ML-based approaches.
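To illustrate what "statistical features derived from space-efficient sketches" can look like in practice, the following is a minimal sketch of a Count-Min sketch used as an approximate frequency counter. It is not ALPS's actual implementation: the class name `CountMinSketch`, the `width` and `depth` parameters, the BLAKE2b-based hashing, and the `frequency_feature` helper are all assumptions chosen for illustration. The point is only that per-key frequency can be estimated in fixed memory, independent of the number of distinct items.

```python
import hashlib


class CountMinSketch:
    """Illustrative Count-Min sketch (hypothetical helper, not taken from ALPS)."""

    def __init__(self, width=256, depth=4):
        self.width = width
        self.depth = depth
        # depth rows of width counters: memory is fixed regardless of key count.
        self.rows = [[0] * width for _ in range(depth)]
        self.total = 0

    def _index(self, row, key):
        # One independent-looking hash per row, obtained by salting BLAKE2b
        # with the row number.
        digest = hashlib.blake2b(
            key.encode("utf-8"),
            digest_size=8,
            salt=row.to_bytes(8, "little"),
        ).digest()
        return int.from_bytes(digest, "little") % self.width

    def add(self, key):
        # Record one access to key by incrementing one counter in every row.
        self.total += 1
        for r in range(self.depth):
            self.rows[r][self._index(r, key)] += 1

    def estimate(self, key):
        # Collisions can only inflate counters, so the minimum over rows is an
        # upper bound on the true count that never underestimates it.
        return min(self.rows[r][self._index(r, key)] for r in range(self.depth))


def frequency_feature(sketch, key):
    # Hypothetical feature: estimated share of all accesses that went to key.
    return sketch.estimate(key) / sketch.total if sketch.total else 0.0


# Assumed toy access trace: "a" is accessed 5 times, "b" twice, "c" once.
trace = ["a", "b", "a", "c", "a", "a", "b", "a"]
sketch = CountMinSketch()
for item in trace:
    sketch.add(item)
```

A learned policy could feed a value such as `frequency_feature(sketch, key)` to its model instead of storing an exact counter per item, which is the kind of memory saving the abstract refers to.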

Subject Classification

ACM Subject Classification
  • Information systems → Data streams

Keywords
  • Data streams
  • Memory Management
  • Cache Policy
  • ML


