Characterizing XML Twig Queries with Examples

Authors: Slawek Staworko, Piotr Wieczorek



Cite As

Slawek Staworko and Piotr Wieczorek. Characterizing XML Twig Queries with Examples. In 18th International Conference on Database Theory (ICDT 2015). Leibniz International Proceedings in Informatics (LIPIcs), Volume 31, pp. 144-160, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2015)


Abstract

Typically, a (Boolean) query is a finite formula that defines a possibly infinite set of database instances that satisfy it (positive examples) and, implicitly, the set of instances that do not satisfy it (negative examples). We investigate the following natural question: for a given class of queries, is it possible to characterize every query with a finite set of positive and negative examples with which no other query is consistent? We study this question for twig queries and XML databases. We show that while twig queries are characterizable, they generally require exponentially large sets of examples. Consequently, we focus on a practical subclass of anchored twig queries and show that they are not only characterizable but characterizable with polynomially sized sets of examples. This result is obtained using generalization operations on twig queries, whose application to an anchored twig query yields a properly contained and minimally different query. Our results illustrate further interesting and strong connections between the structure and the semantics of anchored twig queries that the class of arbitrary twig queries does not enjoy. Finally, we show that the class of unions of twig queries is not characterizable.

Keywords

  • Query characterization
  • Query examples
  • Query fitting
  • Twig queries
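
The notion of a query being consistent with positive and negative examples, as used in the abstract, can be illustrated with a minimal Python sketch. Everything in it is an assumption made for illustration and is not taken from the paper: the `Tree` and `Twig` classes, the `"child"`/`"desc"` axis names, the `*` wildcard, and the convention that the query root must map to the document root. It evaluates a Boolean twig query on a small XML-like tree and checks which queries from a hand-picked candidate list agree with a given set of examples.

```python
from dataclasses import dataclass, field
from typing import Dict, Iterator, List, Tuple

WILDCARD = "*"  # assumed notation for "any label"


@dataclass
class Tree:
    """An XML document reduced to a labeled tree (hypothetical model)."""
    label: str
    children: List["Tree"] = field(default_factory=list)


@dataclass
class Twig:
    """A twig query node; each edge is (axis, subquery), axis in {"child", "desc"}."""
    label: str
    edges: List[Tuple[str, "Twig"]] = field(default_factory=list)


def descendants(t: Tree) -> Iterator[Tree]:
    """Yield all proper descendants of t."""
    for c in t.children:
        yield c
        yield from descendants(c)


def matches_at(q: Twig, t: Tree) -> bool:
    """True if query node q can be embedded at tree node t."""
    if q.label != WILDCARD and q.label != t.label:
        return False
    for axis, sub in q.edges:
        candidates = t.children if axis == "child" else descendants(t)
        if not any(matches_at(sub, c) for c in candidates):
            return False
    return True


def satisfies(t: Tree, q: Twig) -> bool:
    """Boolean semantics, assuming the query root maps to the document root."""
    return matches_at(q, t)


def consistent(q: Twig, positives: List[Tree], negatives: List[Tree]) -> bool:
    """q accepts every positive example and rejects every negative one."""
    return (all(satisfies(t, q) for t in positives)
            and not any(satisfies(t, q) for t in negatives))


def consistent_queries(candidates: Dict[str, Twig],
                       positives: List[Tree],
                       negatives: List[Tree]) -> List[str]:
    """Names of the candidate queries consistent with the examples."""
    return [name for name, q in candidates.items()
            if consistent(q, positives, negatives)]


# Three candidate queries, written in XPath-like notation as dictionary keys.
candidates = {
    "/a/b": Twig("a", [("child", Twig("b"))]),
    "/a//b": Twig("a", [("desc", Twig("b"))]),
    "/a/*": Twig("a", [("child", Twig(WILDCARD))]),
}

positives = [Tree("a", [Tree("b")])]               # <a><b/></a>
negatives = [Tree("a", [Tree("c", [Tree("b")])])]  # <a><c><b/></c></a>

print(consistent_queries(candidates, positives, negatives))
```

In this toy setting the two examples leave only `/a/b` among the three listed candidates. A characterization in the sense of the abstract is a stronger requirement: the examples must rule out every other query of the whole class, not just those in a finite candidate list.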



