A Practically Efficient Algorithm for Generating Answers to Keyword Search Over Data Graphs

Authors Konstantin Golenberg, Yehoshua Sagiv







Cite As

Konstantin Golenberg and Yehoshua Sagiv. A Practically Efficient Algorithm for Generating Answers to Keyword Search Over Data Graphs. In 19th International Conference on Database Theory (ICDT 2016). Leibniz International Proceedings in Informatics (LIPIcs), Volume 48, pp. 23:1-23:17, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2016) https://doi.org/10.4230/LIPIcs.ICDT.2016.23

Abstract

In keyword search over a data graph, an answer is a non-redundant subtree that contains all the keywords of the query. A naive approach to producing all the answers by increasing height is to generalize Dijkstra's algorithm to enumerating all acyclic paths by increasing weight. The idea of freezing is introduced so that (most) non-shortest paths are generated only if they are actually needed for producing answers. The resulting algorithm for generating subtrees, called GTF, is subtle and its proof of correctness is intricate. Extensive experiments show that GTF outperforms existing systems, even ones that for efficiency's sake are incomplete (i.e., cannot produce all the answers). In particular, GTF is scalable and performs well even on large data graphs and when many answers are needed.
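
The naive approach mentioned in the abstract, generalizing Dijkstra's algorithm to enumerate all acyclic paths by increasing weight, can be illustrated with a minimal sketch. This is not the GTF algorithm and does not implement freezing; it only shows the baseline idea of a best-first search that keeps non-shortest paths instead of discarding them. The function name `acyclic_paths_by_weight`, the adjacency-list representation, and the assumption of non-negative edge weights are illustrative choices, not taken from the paper.

```python
import heapq
from typing import Dict, Iterator, List, Tuple

# Hypothetical representation: node -> list of (neighbor, edge weight).
# Edge weights are assumed non-negative, as in Dijkstra's algorithm.
Graph = Dict[str, List[Tuple[str, float]]]


def acyclic_paths_by_weight(
    graph: Graph, source: str
) -> Iterator[Tuple[float, List[str]]]:
    """Yield every acyclic path that starts at `source`,
    in order of non-decreasing total weight."""
    counter = 0  # tie-breaker so the heap never compares two paths directly
    heap: List[Tuple[float, int, List[str]]] = [(0.0, counter, [source])]
    while heap:
        weight, _, path = heapq.heappop(heap)
        yield weight, path
        # Unlike Dijkstra's algorithm, a node may be reached many times:
        # every extension that does not revisit a node on the path is kept.
        for neighbor, edge_weight in graph.get(path[-1], []):
            if neighbor in path:
                continue  # extending would create a cycle
            counter += 1
            heapq.heappush(
                heap, (weight + edge_weight, counter, path + [neighbor])
            )


if __name__ == "__main__":
    example: Graph = {
        "a": [("b", 1.0), ("c", 4.0)],
        "b": [("c", 1.0)],
        "c": [("a", 1.0)],
    }
    for total, nodes in acyclic_paths_by_weight(example, "a"):
        print(total, " -> ".join(nodes))
```

Because the number of acyclic paths can grow exponentially with the size of the graph, a search of this kind produces many paths that never contribute to an answer, which is the cost the abstract says freezing is meant to avoid.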

Subject Classification

Keywords
  • Keyword search over data graphs
  • subtree enumeration by height
  • top-k answers
  • efficiency

