Reasoning About Integrity Constraints for Tree-Structured Data

Authors Wojciech Czerwinski, Claire David, Filip Murlak, Pawel Parys



Cite As

Wojciech Czerwinski, Claire David, Filip Murlak, and Pawel Parys. Reasoning About Integrity Constraints for Tree-Structured Data. In 19th International Conference on Database Theory (ICDT 2016). Leibniz International Proceedings in Informatics (LIPIcs), Volume 48, pp. 20:1-20:18, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2016)
https://doi.org/10.4230/LIPIcs.ICDT.2016.20

Abstract

We study a class of integrity constraints for tree-structured data modelled as data trees, whose nodes have a label from a finite alphabet and store a data value from an infinite data domain. The constraints require each tuple of nodes selected by a conjunctive query (using navigational axes and labels) to satisfy a positive combination of equalities and a positive combination of inequalities over the stored data values. Such constraints are instances of the general framework of XML-to-relational constraints proposed recently by Niewerth and Schwentick. They cover some common classes of constraints, including W3C XML Schema key and unique constraints, as well as domain restrictions and denial constraints, but cannot express inclusion constraints, such as reference keys. Our main result is that consistency of such integrity constraints with respect to a given schema (modelled as a tree automaton) is decidable. An easy extension gives decidability for the entailment problem. Equivalently, we show that validity and containment of unions of conjunctive queries using navigational axes, labels, data equalities and inequalities are decidable, as long as none of the conjunctive queries uses both equalities and inequalities; without this restriction, both problems are known to be undecidable. In the context of XML data exchange, our result can be used to establish decidability for a consistency problem for XML schema mappings. All the decision procedures are doubly exponential, with matching lower bounds. The complexity may be lowered to singly exponential when conjunctive queries are replaced by tree patterns and the number of data comparisons is bounded.
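
To make the objects in the abstract concrete, the following is a minimal sketch of a data tree and of checking one uniqueness-style constraint on a single given tree. It is an illustration only, not the paper's decision procedure: the paper decides whether some tree accepted by a schema satisfies the constraints, which this sketch does not attempt. All names (Node, descendants, distinct_pairs_with_label, satisfies) and the library/book/isbn labels are hypothetical, and the hand-written selector merely stands in for a conjunctive query over navigational axes and labels.

```python
from dataclasses import dataclass, field
from itertools import permutations
from typing import Callable, Iterable, Iterator, List, Tuple


@dataclass
class Node:
    """A data tree node: a label from a finite alphabet plus a data value."""
    label: str
    value: int  # integers stand in for the infinite data domain
    children: List["Node"] = field(default_factory=list)


def descendants(node: Node) -> Iterator[Node]:
    """Yield the node itself and every node below it, in document order."""
    yield node
    for child in node.children:
        yield from descendants(child)


def distinct_pairs_with_label(root: Node, label: str) -> Iterator[Tuple[Node, Node]]:
    """Hypothetical stand-in for a conjunctive query: select all ordered
    pairs of two different nodes that both carry the given label."""
    nodes = [n for n in descendants(root) if n.label == label]
    return permutations(nodes, 2)


def satisfies(
    selected: Iterable[Tuple[Node, ...]],
    condition: Callable[[Tuple[Node, ...]], bool],
) -> bool:
    """A constraint holds on a tree if every selected tuple meets the
    condition on its stored data values."""
    return all(condition(t) for t in selected)


def values_differ(pair: Tuple[Node, ...]) -> bool:
    """Inequality condition: the two selected nodes store different values."""
    return pair[0].value != pair[1].value


# Two small trees: one with distinct isbn values, one with a duplicate.
good = Node("library", 0, [
    Node("book", 10, [Node("isbn", 1)]),
    Node("book", 11, [Node("isbn", 2)]),
])
bad = Node("library", 0, [
    Node("book", 10, [Node("isbn", 1)]),
    Node("book", 11, [Node("isbn", 1)]),
])

good_ok = satisfies(distinct_pairs_with_label(good, "isbn"), values_differ)  # True
bad_ok = satisfies(distinct_pairs_with_label(bad, "isbn"), values_differ)    # False
```

Checking a fixed tree like this is straightforward; the difficulty addressed in the paper lies in reasoning over all trees admitted by the schema at once.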

Keywords

  • data trees
  • integrity constraints
  • unions of conjunctive queries
  • schema mappings
  • entailment
  • containment
  • consistency

References

  1. Marcelo Arenas, Pablo Barceló, Leonid Libkin, and Filip Murlak. Foundations of Data Exchange. Cambridge University Press, 2014.
  2. Marcelo Arenas, Wenfei Fan, and Leonid Libkin. On the complexity of verifying consistency of XML specifications. SIAM J. Comput., 38(3):841-880, 2008. URL: http://dx.doi.org/10.1137/050646895.
  3. Marcelo Arenas and Leonid Libkin. A normal form for XML documents. ACM Trans. Database Syst., 29:195-232, 2004. URL: http://dx.doi.org/10.1145/974750.974757.
  4. Marcelo Arenas and Leonid Libkin. XML data exchange: Consistency and query answering. J. ACM, 55(2), 2008. URL: http://dx.doi.org/10.1145/1346330.1346332.
  5. Michael Benedikt, Wenfei Fan, and Floris Geerts. XPath satisfiability in the presence of DTDs. J. ACM, 55(2), 2008. URL: http://dx.doi.org/10.1145/1346330.1346333.
  6. Henrik Björklund, Wim Martens, and Thomas Schwentick. Optimizing conjunctive queries over trees using schema information. In Proc. MFCS 2008, pages 132-143, 2008. URL: http://dx.doi.org/10.1007/978-3-540-85238-4_10.
  7. Mikołaj Bojańczyk, Filip Murlak, and Adam Witkowski. Containment of monadic datalog programs via bounded clique-width. In Proc. ICALP 2015, pages 427-439, 2015. URL: http://dx.doi.org/10.1007/978-3-662-47666-6_34.
  8. Ashok K. Chandra and Philip M. Merlin. Optimal implementation of conjunctive queries in relational data bases. In Proc. STOC 1977, pages 77-90, 1977. URL: http://dx.doi.org/10.1145/800105.803397.
  9. Claire David, Amélie Gheerbrant, Leonid Libkin, and Wim Martens. Containment of pattern-based queries over data trees. In Proc. ICDT 2013, pages 201-212, 2013. URL: http://dx.doi.org/10.1145/2448496.2448521.
  10. Claire David, Piotr Hofman, Filip Murlak, and Michał Pilipczuk. Synthesizing transformations from XML schema mappings. In Proc. ICDT 2014, pages 61-71, 2014. URL: http://dx.doi.org/10.5441/002/icdt.2014.10.
  11. Ronald Fagin, Phokion G. Kolaitis, Renée J. Miller, and Lucian Popa. Data exchange: semantics and query answering. Theor. Comput. Sci., 336(1):89-124, 2005. URL: http://dx.doi.org/10.1016/j.tcs.2004.10.033.
  12. Ronald Fagin, Phokion G. Kolaitis, Lucian Popa, and Wang Chiew Tan. Composing schema mappings: Second-order dependencies to the rescue. ACM Trans. Database Syst., 30(4):994-1055, 2005. URL: http://dx.doi.org/10.1145/1114244.1114249.
  13. Ronald Fagin and Moshe Y. Vardi. The theory of data dependencies - a survey. In Mathematics of Information Processing, volume 34 of Proceedings of Symposia in Applied Mathematics, pages 19-71, Providence, Rhode Island, 1986. American Mathematical Society.
  14. S. Gao, C. M. Sperberg-McQueen, H.S. Thompson, N. Mendelsohn, D. Beech, and M. Maloney. W3C XML Schema Definition Language (XSD) 1.1, Part 1: Structures. Technical report, World Wide Web Consortium, April 2009. URL: http://www.w3.org/TR/2009/CR-xmlschema11-1-20090430/.
  15. Tomasz Gogacz and Jerzy Marcinkowski. All-instances termination of chase is undecidable. In Proc. ICALP 2014, pages 293-304, 2014. URL: http://dx.doi.org/10.1007/978-3-662-43951-7_25.
  16. Tomasz Gogacz and Jerzy Marcinkowski. The hunt for a red spider: Conjunctive query determinacy is undecidable. In Proc. LICS 2015, pages 281-292, 2015. URL: http://dx.doi.org/10.1109/LICS.2015.35.
  17. Sven Hartmann and Sebastian Link. More functional dependencies for XML. In Proc. ADBIS 2003, pages 355-369, 2003. URL: http://dx.doi.org/10.1007/978-3-540-39403-7_27.
  18. Sven Hartmann, Sebastian Link, and Thu Trinh. Solving the implication problem for XML functional dependencies with properties. In Proc. WoLLIC 2010, pages 161-175, 2010. URL: http://dx.doi.org/10.1007/978-3-642-13824-9_14.
  19. Michael Kaminski and Nissim Francez. Finite-memory automata. Theor. Comput. Sci., 134(2):329-363, 1994. URL: http://dx.doi.org/10.1016/0304-3975(94)90242-9.
  20. Michael Kaminski and Tony Tan. Tree automata over infinite alphabets. In Arnon Avron, Nachum Dershowitz, and Alexander Rabinovich, editors, Pillars of Computer Science, Essays Dedicated to Boris (Boaz) Trakhtenbrot on the Occasion of His 85th Birthday, volume 4800 of Lecture Notes in Computer Science, pages 386-423. Springer, 2008. URL: http://dx.doi.org/10.1007/978-3-540-78127-1_21.
  21. Phokion G. Kolaitis. Schema mappings, data exchange, and metadata management. In Proc. PODS 2005, pages 61-75, 2005. URL: http://dx.doi.org/10.1145/1065167.1065176.
  22. Maurizio Lenzerini. Data integration: A theoretical perspective. In Proc. PODS 2002, pages 233-246, 2002. URL: http://dx.doi.org/10.1145/543613.543644.
  23. Gerome Miklau and Dan Suciu. Containment and equivalence for a fragment of XPath. J. ACM, 51(1):2-45, 2004. URL: http://dx.doi.org/10.1145/962446.962448.
  24. Frank Neven and Thomas Schwentick. On the complexity of XPath containment in the presence of disjunction, DTDs, and variables. Log. Meth. Comput. Sci., 2(3), 2006. URL: http://dx.doi.org/10.2168/LMCS-2(3:1)2006.
  25. Matthias Niewerth and Thomas Schwentick. Reasoning about XML constraints based on XML-to-relational mappings. In Proc. ICDT 2014, pages 72-83, 2014. URL: http://dx.doi.org/10.5441/002/icdt.2014.11.
  26. Moshe Y. Vardi. Fundamentals of dependency theory. In E. Borger, editor, Trends in Theoretical Computer Science, pages 171-224. Computer Science Press, 1987.