The Complexity of Reverse Engineering Problems for Conjunctive Queries

Authors Pablo Barceló, Miguel Romero

Cite As

Pablo Barceló and Miguel Romero. The Complexity of Reverse Engineering Problems for Conjunctive Queries. In 20th International Conference on Database Theory (ICDT 2017). Leibniz International Proceedings in Informatics (LIPIcs), Volume 68, pp. 7:1-7:17, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2017)


Abstract

Reverse engineering problems for conjunctive queries (CQs), such as query by example (QBE) or definability, take a set of user examples and convert them into an explanatory CQ. Despite their importance, the complexity of these problems is prohibitively high (coNEXPTIME-complete). We isolate their two main sources of complexity and propose relaxations of them that reduce the complexity while having meaningful theoretical interpretations. The first relaxation is based on the idea of using existential pebble games for approximating homomorphism tests. We show that this characterizes QBE/definability for CQs up to treewidth k, while reducing the complexity to EXPTIME. As a side result, we obtain that the complexity of the QBE/definability problems for CQs of treewidth k is EXPTIME-complete for each k > 1. The second relaxation is based on the idea of "desynchronizing" direct products, which characterizes QBE/definability for unions of CQs and reduces the complexity to coNP. The combination of these two relaxations yields tractability for QBE and characterizes it in terms of unions of CQs of treewidth at most k. We also study the complexity of these problems for conjunctive regular path queries over graph databases, showing them to be no more difficult than for CQs.

Keywords

  • reverse engineering
  • conjunctive queries
  • query by example
  • definability
  • treewidth
  • complexity of pebble games
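
The two sources of complexity named in the abstract, the direct product of the examples and the homomorphism test, can be illustrated with a small brute-force sketch. It relies on the standard product-homomorphism view of QBE for CQs: an explanatory CQ exists when the direct product of the positive examples has no homomorphism to any negative example. The function names, the dictionary encoding of databases, and the toy graph are assumptions made for illustration, not taken from the paper. The sketch also ignores the side condition that the distinguished elements of the product must occur in some fact, and it implements neither of the paper's relaxations.

```python
from itertools import product


def direct_product(db, tuples):
    """Direct product of the pointed databases (db, t) for t in tuples.

    db maps a relation name to a set of facts (tuples of constants).
    Elements of the product are len(tuples)-tuples of constants, so the
    product grows exponentially with the number of positive examples.
    """
    n = len(tuples)
    prod_db = {
        rel: {tuple(zip(*combo)) for combo in product(facts, repeat=n)}
        for rel, facts in db.items()
    }
    # The distinguished tuple pairs up the examples position by position.
    distinguished = tuple(zip(*tuples))
    return prod_db, distinguished


def has_homomorphism(src, src_tuple, dst, dst_tuple):
    """Backtracking test for a homomorphism src -> dst that sends
    src_tuple to dst_tuple (exponential time in the worst case)."""
    start = {}
    for s, d in zip(src_tuple, dst_tuple):
        if start.setdefault(s, d) != d:
            return False
    facts = [(rel, fact) for rel, fs in src.items() for fact in fs]

    def extend(i, h):
        if i == len(facts):
            return True
        rel, fact = facts[i]
        for target in dst.get(rel, ()):
            new_h = dict(h)
            # Map the fact onto the target, respecting earlier choices.
            if all(new_h.setdefault(x, y) == y for x, y in zip(fact, target)):
                if extend(i + 1, new_h):
                    return True
        return False

    return extend(0, start)


def cq_explanation_exists(db, positives, negatives):
    """True if the product of the positives maps to no negative example."""
    prod_db, dist = direct_product(db, positives)
    return not any(
        has_homomorphism(prod_db, dist, db, tuple(neg)) for neg in negatives
    )


# Hypothetical toy graph: a path a -> b -> c plus a self-loop on d.
db = {"E": {("a", "b"), ("b", "c"), ("d", "d")}}
separable = cq_explanation_exists(db, [("a",), ("b",)], [("c",)])
not_separable = cq_explanation_exists(db, [("a",), ("b",)], [("d",)])
```

In this toy instance the first call succeeds because a and b have an outgoing edge while c has none, so a CQ such as q(x) :- E(x, y) separates them. The second call fails because every element can be mapped to the self-loop on d, so any CQ that holds at a and b also holds at d.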



