On the Impact of Provenance Semiring Theory on the Design of a Provenance-Aware Database System

Author Pierre Senellart






Author Details

Pierre Senellart
  • DI ENS, ENS, PSL University, CNRS, Paris, France
  • Inria, Paris, France
  • Institut Universitaire de France, Paris, France
  • CNRS@CREATE LTD, Singapore
  • IPAL, CNRS, Singapore

Acknowledgements

ProvSQL is a collective effort; I acknowledge the contributions of Yann Ramusat, Silviu Maniu, Louis Jachiet, and Baptiste Lafosse to the development of the system.

Cite As

Pierre Senellart. On the Impact of Provenance Semiring Theory on the Design of a Provenance-Aware Database System. In The Provenance of Elegance in Computation - Essays Dedicated to Val Tannen. Open Access Series in Informatics (OASIcs), Volume 119, pp. 9:1-9:10, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2024)
https://doi.org/10.4230/OASIcs.Tannen.9

Abstract

We report on the impact that the theory of provenance semirings, developed by Val Tannen and his collaborators, has had on the design of ProvSQL, a practical system for maintaining the provenance of query results over a relational database.

Subject Classification

ACM Subject Classification
  • Theory of computation → Data provenance
  • Information systems → Database management system engines
Keywords
  • provenance
  • provenance semiring
  • ProvSQL

