Learning Tree Patterns from Example Graphs

Authors Sara Cohen, Yaacov Y. Weiss




File

LIPIcs.ICDT.2015.127.pdf
  • Filesize: 456 kB
  • 17 pages


Cite As

Sara Cohen and Yaacov Y. Weiss. Learning Tree Patterns from Example Graphs. In 18th International Conference on Database Theory (ICDT 2015). Leibniz International Proceedings in Informatics (LIPIcs), Volume 31, pp. 127-143, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2015)
https://doi.org/10.4230/LIPIcs.ICDT.2015.127

Abstract

This paper investigates the problem of learning tree patterns that return nodes with a given set of labels, from example graphs provided by the user. Example graphs are annotated by the user as being either positive or negative. The goal is then to determine whether there exists a tree pattern returning tuples of nodes with the given labels in each of the positive examples, but in none of the negative examples, and, furthermore, to find one such pattern if it exists. These are called the satisfiability and learning problems, respectively. This paper thoroughly investigates the satisfiability and learning problems in a variety of settings. In particular, we consider example sets that (1) may contain only positive examples, or both positive and negative examples, (2) may contain directed or undirected graphs, and (3) may have multiple occurrences of labels or be uniquely labeled (to some degree). In addition, we consider tree patterns of different types that can allow, or prohibit, wildcard-labeled nodes and descendant edges. We also consider two different semantics for mapping tree patterns to graphs. The complexity of satisfiability is determined for the different combinations of settings. For cases in which satisfiability is polynomial, it is also shown that learning is polynomial (this is non-trivial, as satisfying patterns may be exponential in size). Finally, the minimal learning problem, i.e., that of finding a minimal-sized satisfying pattern, is studied for cases in which satisfiability is polynomial.
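
To make the problem statement concrete, the following minimal Python sketch checks whether one given candidate pattern is consistent with a set of annotated examples. It is an illustration under simplifying assumptions, not the paper's algorithm: the encodings of graphs and patterns, the function names `embeds_at`, `matches`, and `is_consistent`, and the choice of a homomorphism-style mapping over directed graphs with child edges only are all hypothetical. It also ignores which nodes the pattern returns and treats matching as a yes/no question.

```python
from typing import Dict, List, Set, Tuple

# Hypothetical encodings, chosen only for this sketch:
# a graph is (labels, edges), where labels maps node -> label and
# edges maps node -> set of successor nodes (directed edges).
Graph = Tuple[Dict[int, str], Dict[int, Set[int]]]
# a pattern is (label, [child patterns]); the label "*" is a wildcard.
Pattern = Tuple[str, list]


def embeds_at(pattern: Pattern, graph: Graph, node: int) -> bool:
    """True if the pattern root can be mapped to `node` in the graph."""
    label, children = pattern
    labels, edges = graph
    if label != "*" and labels[node] != label:
        return False
    # Assumption: two child patterns may map to the same successor
    # (a non-injective, homomorphism-style mapping).
    return all(
        any(embeds_at(child, graph, succ) for succ in edges.get(node, ()))
        for child in children
    )


def matches(pattern: Pattern, graph: Graph) -> bool:
    """True if the pattern embeds at some node of the graph."""
    labels, _ = graph
    return any(embeds_at(pattern, graph, node) for node in labels)


def is_consistent(pattern: Pattern,
                  positives: List[Graph],
                  negatives: List[Graph]) -> bool:
    """True if the pattern matches every positive and no negative example."""
    return (all(matches(pattern, g) for g in positives)
            and not any(matches(pattern, g) for g in negatives))


# One positive example (a -> b, a -> c) and one negative example (a -> b).
positive = ({1: "a", 2: "b", 3: "c"}, {1: {2, 3}})
negative = ({1: "a", 2: "b"}, {1: {2}})

separating = ("a", [("b", []), ("c", [])])   # needs both a b-child and a c-child
too_general = ("a", [("b", [])])             # also matches the negative example

print(is_consistent(separating, [positive], [negative]))
print(is_consistent(too_general, [positive], [negative]))
```

The sketch only verifies a candidate. The satisfiability problem asks whether any such pattern exists, and the learning problem asks to produce one, which is harder because the space of candidate patterns is not bounded in advance.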
Keywords
  • tree patterns
  • learning
  • examples

