The Complexity of the Shapley Value for Regular Path Queries

Authors Majd Khalil, Benny Kimelfeld



PDF
Thumbnail PDF

File

LIPIcs.ICDT.2023.11.pdf
  • Filesize: 0.74 MB
  • 19 pages

Document Identifiers

Author Details

Majd Khalil
  • Technion, Haifa, Israel
Benny Kimelfeld
  • Technion, Haifa, Israel

Cite AsGet BibTex

Majd Khalil and Benny Kimelfeld. The Complexity of the Shapley Value for Regular Path Queries. In 26th International Conference on Database Theory (ICDT 2023). Leibniz International Proceedings in Informatics (LIPIcs), Volume 255, pp. 11:1-11:19, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2023)
https://doi.org/10.4230/LIPIcs.ICDT.2023.11

Abstract

A path query extracts vertex tuples from a labeled graph, based on the words that are formed by the paths connecting the vertices. We study the computational complexity of measuring the contribution of edges and vertices to an answer to a path query, focusing on the class of conjunctive regular path queries. To measure this contribution, we adopt the traditional Shapley value from cooperative game theory. This value has been recently proposed and studied in the context of relational database queries and has uses in a plethora of other domains. We first study the contribution of edges and show that the exact Shapley value is almost always hard to compute. Specifically, it is #P-hard to calculate the contribution of an edge whenever at least one (non-redundant) conjunct allows for a word of length three or more. In the case of regular path queries (i.e., no conjunction), the problem is tractable if the query has only words of length at most two; hence, this property fully characterizes the tractability of the problem. On the other hand, if we allow for an approximation error, then it is straightforward to obtain an efficient scheme (FPRAS) for an additive approximation. Yet, a multiplicative approximation is harder to obtain. We establish that in the case of conjunctive regular path queries, a multiplicative approximation of the Shapley value of an edge can be computed in polynomial time if and only if all query atoms are finite languages (assuming non-redundancy and conventional complexity limitations). We also study the analogous situation where we wish to determine the contribution of a vertex, rather than an edge, and establish complexity results of similar nature.

Subject Classification

ACM Subject Classification
  • Theory of computation → Data provenance
Keywords
  • Path queries
  • regular path queries
  • graph databases
  • Shapley value

Metrics

  • Access Statistics
  • Total Accesses (updated on a weekly basis)
    0
    PDF Downloads

References

  1. Manish Kumar Anand, Shawn Bowers, and Bertram Ludäscher. Techniques for efficiently querying scientific workflow provenance graphs. In EDBT, volume 426, pages 287-298. ACM, 2010. URL: https://doi.org/10.1145/1739041.1739078.
  2. Marcelo Arenas and Jorge Pérez. Querying semantic web data with SPARQL. In Maurizio Lenzerini and Thomas Schwentick, editors, PODS, pages 305-316. ACM, 2011. URL: https://doi.org/10.1145/1989284.1989312.
  3. Pablo Barceló Baeza. Querying graph databases. In Richard Hull and Wenfei Fan, editors, PODS, pages 175-188. ACM, 2013. URL: https://doi.org/10.1145/2463664.2465216.
  4. Pablo Barceló, Leonid Libkin, Anthony W Lin, and Peter T Wood. Expressive languages for path queries over graph-structured data. ACM Transactions on Database Systems (TODS), 37(4):1-46, 2012. Google Scholar
  5. Diego Calvanese, Giuseppe De Giacomo, Maurizio Lenzerini, and Moshe Y. Vardi. Rewriting of regular expressions and regular path queries. In PODS, pages 194-204. ACM, 1999. URL: https://doi.org/10.1145/303976.303996.
  6. Diego Calvanese, Giuseppe De Giacomo, Maurizio Lenzerini, and Moshe Y. Vardi. Containment of conjunctive regular path queries with inverse. In KR, pages 176-185. Morgan Kaufmann, 2000. Google Scholar
  7. Mariano P. Consens and Alberto O. Mendelzon. GraphLog: a visual formalism for real life recursion. In PODS, pages 404-416. ACM, 1990. URL: https://doi.org/10.1145/298514.298591.
  8. Isabel F. Cruz, Alberto O. Mendelzon, and Peter T. Wood. A graphical query language supporting recursion. In SIGMOD, pages 323-330. ACM, 1987. URL: https://doi.org/10.1145/38713.38749.
  9. Daniel Deutch, Nave Frost, Benny Kimelfeld, and Mikaël Monet. Computing the Shapley value of facts in query answering. In SIGMOD, pages 1570-1583. ACM, 2022. Google Scholar
  10. Wenfei Fan. Graph pattern matching revised for social network analysis. In ICDT, pages 8-21. ACM, 2012. URL: https://doi.org/10.1145/2274576.2274578.
  11. Daniela Florescu, Alon Y. Levy, and Dan Suciu. Query containment for conjunctive queries with regular expressions. In Alberto O. Mendelzon and Jan Paredaens, editors, PODS, pages 139-148. ACM, 1998. URL: https://doi.org/10.1145/275487.275503.
  12. Steven Fortune, John E. Hopcroft, and James Wyllie. The directed subgraph homeomorphism problem. Theor. Comput. Sci., 10:111-121, 1980. URL: https://doi.org/10.1016/0304-3975(80)90009-2.
  13. Joseph Y. Halpern and Judea Pearl. Causes and explanations: A structural-model approach: Part 1: Causes. In UAI, pages 194-202, 2001. Google Scholar
  14. Anthony Hunter and Sébastien Konieczny. On the measure of conflicts: Shapley inconsistency values. Artif. Intell., 174(14):1007-1026, 2010. Google Scholar
  15. Majd Khalil and Benny Kimelfeld. The complexity of the Shapley value for regular path queries, 2022. Google Scholar
  16. Ester Livshits, Leopoldo E. Bertossi, Benny Kimelfeld, and Moshe Sebag. Query games in databases. SIGMOD Rec., 50(1):78-85, 2021. Google Scholar
  17. Ester Livshits, Leopoldo E. Bertossi, Benny Kimelfeld, and Moshe Sebag. The Shapley value of tuples in query answering. Log. Methods Comput. Sci., 17(3), 2021. Google Scholar
  18. Ester Livshits and Benny Kimelfeld. The Shapley value of inconsistency measures for functional dependencies. In ICDT, volume 186 of LIPIcs, pages 15:1-15:19. Schloss Dagstuhl - Leibniz-Zentrum für Informatik, 2021. URL: https://doi.org/10.4230/LIPIcs.ICDT.2021.15.
  19. Scott M Lundberg and Su-In Lee. A unified approach to interpreting model predictions. In Advances in Neural Information Processing Systems 30, pages 4765-4774. Curran Associates, Inc., 2017. Google Scholar
  20. Artem Lysenko, Irina A. Roznovat, Mansoor Saqi, Alexander Mazein, Christopher J. Rawlings, and Charles Auffray. Representing and querying disease networks using graph databases. BioData Min., 9:23, 2016. URL: https://doi.org/10.1186/s13040-016-0102-8.
  21. Richard T. B. Ma, Dah-Ming Chiu, John Chi-Shing Lui, Vishal Misra, and Dan Rubenstein. Internet economics: The use of Shapley value for ISP settlement. IEEE/ACM Trans. Netw., 18(3):775-787, 2010. URL: https://doi.org/10.1109/TNET.2010.2049205.
  22. Wim Martens and Tina Trautner. Evaluation and enumeration problems for regular path queries. In ICDT, volume 98 of LIPIcs, pages 19:1-19:21. Schloss Dagstuhl - Leibniz-Zentrum für Informatik, 2018. URL: https://doi.org/10.4230/LIPIcs.ICDT.2018.19.
  23. Wim Martens and Tina Trautner. Dichotomies for evaluating simple regular path queries. ACM Trans. Database Syst., 44(4):16:1-16:46, 2019. URL: https://doi.org/10.1145/3331446.
  24. Corto Mascle, Christel Baier, Florian Funkev, Simon Jantsch, and Stefan Kiefer. Responsibility and verification: Importance value in temporal logics. In 2021 36th Annual ACM/IEEE Symposium on Logic in Computer Science (LICS), pages 1-14. IEEE, 2021. Google Scholar
  25. Alexandra Meliou, Wolfgang Gatterbauer, Katherine F. Moore, and Dan Suciu. The complexity of causality and responsibility for query answers and non-answers. Proc. VLDB Endow., 4(1):34-45, 2010. Google Scholar
  26. Alberto O. Mendelzon and Peter T. Wood. Finding regular simple paths in graph databases. SIAM J. Comput., 24(6):1235-1258, 1995. URL: https://doi.org/10.1137/S009753979122370X.
  27. Stefano Moretti, Fioravante Patrone, and Stefano Bonassi. The class of microarray games and the relevance index for genes. Top, 15(2):256-280, 2007. Google Scholar
  28. Ramasuri Narayanam and Yadati Narahari. A Shapley value-based approach to discover influential nodes in social networks. IEEE Trans Autom. Sci. Eng., 8(1):130-147, 2011. URL: https://doi.org/10.1109/TASE.2010.2052042.
  29. Alon Reshef, Benny Kimelfeld, and Ester Livshits. The impact of negation on the complexity of the Shapley value in conjunctive queries. In Dan Suciu, Yufei Tao, and Zhewei Wei, editors, PODS, pages 285-297. ACM, 2020. URL: https://doi.org/10.1145/3375395.3387664.
  30. Gorka Sadowski and Philip Rathle. Fraud detection: Discovering connections with graph databases. White Paper-Neo Technology-Graphs are Everywhere, 13, 2014. Google Scholar
  31. Babak Salimi, Leopoldo E. Bertossi, Dan Suciu, and Guy Van den Broeck. Quantifying causal effects on query answering in databases. In TaPP. USENIX Association, 2016. Google Scholar
  32. Lloyd S Shapley. A value for n-person games. In Harold W. Kuhn and Albert W. Tucker, editors, Contributions to the Theory of Games II, pages 307-317. Princeton University Press, Princeton, 1953. Google Scholar
  33. Seinosuke Toda. PP is as hard as the polynomial-time hierarchy. SIAM J. Comput., 20(5):865-877, 1991. Google Scholar
  34. Tjeerd van Campen, Herbert Hamers, Bart Husslage, and Roy Lindelauf. A new approximation method for the Shapley value applied to the WTC 9/11 terrorist attack. Soc. Netw. Anal. Min., 8(1):3:1-3:12, 2018. URL: https://doi.org/10.1007/s13278-017-0480-z.
  35. Moshe Y. Vardi. The complexity of relational query languages (extended abstract). In STOC, pages 137-146. ACM, 1982. URL: https://doi.org/10.1145/800070.802186.
  36. Mihalis Yannakakis. Graph-theoretic methods in database theory. In PODS, pages 230-242, 1990. URL: https://doi.org/10.1145/298514.298576.
  37. Ningning Yi, Chunfang Li, Xin Feng, and Minyong Shi. Design and implementation of movie recommender system based on graph database. In WISA, pages 132-135. IEEE, 2017. URL: https://doi.org/10.1109/WISA.2017.34.
  38. Byoung-Ha Yoon, Seon-Kyu Kim, and Seon-Young Kim. Use of graph database for the integration of heterogeneous biological data. Genomics & informatics, 15(1):19, 2017. Google Scholar
  39. Bruno Yun, Srdjan Vesic, Madalina Croitoru, and Pierre Bisquert. Inconsistency measures for repair semantics in OBDA. In IJCAI, pages 1977-1983. ijcai.org, 2018. Google Scholar
Questions / Remarks / Feedback
X

Feedback for Dagstuhl Publishing


Thanks for your feedback!

Feedback submitted

Could not send message

Please try again later or send an E-mail