1 Search Results for "Šestáková, Eliška"


Document
Indexing XML Documents Using Tree Paths Automaton

Authors: Eliška Šestáková and Jan Janoušek

Published in: OASIcs, Volume 56, 6th Symposium on Languages, Applications and Technologies (SLATE 2017)


Abstract
An XML document can be viewed as a tree in a natural way. Processing tree data structures usually requires a pushdown automaton as a model of computation. Therefore, it is interesting that a finite automaton can be used to solve the XML index problem. In this paper, we attempt to support a significant fragment of XPath queries which may use any combination of child (i.e., /) and descendant-or-self (i.e., //) axis. A systematic approach to the construction of such XML index, which is a finite automaton called Tree Paths Automaton, is presented. Given an XML tree model T, the tree is first of all preprocessed by means of its linear fragments called string paths. Since only path queries are considered, the branching structure of the XML tree model can be omitted. For individual string paths, smaller Tree Paths Automata are built, and they are afterwards combined to form the index. The searching phase uses the index, reads an input query Q of size m, and computes the list of positions of all occurrences of Q in the tree T. The searching is performed in time O(m) and does not depend on the size of the XML document. Although the number of queries is clearly exponential in the number of nodes of the XML tree model, the size of the index seems to be, according to our experimental results, usually only about 2.5 times larger than the size of the original document.

Cite as

Eliška Šestáková and Jan Janoušek. Indexing XML Documents Using Tree Paths Automaton. In 6th Symposium on Languages, Applications and Technologies (SLATE 2017). Open Access Series in Informatics (OASIcs), Volume 56, pp. 10:1-10:14, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2017)


Copy BibTex To Clipboard

@InProceedings{sestakova_et_al:OASIcs.SLATE.2017.10,
  author =	{\v{S}est\'{a}kov\'{a}, Eli\v{s}ka and Janou\v{s}ek, Jan},
  title =	{{Indexing XML Documents Using Tree Paths Automaton}},
  booktitle =	{6th Symposium on Languages, Applications and Technologies (SLATE 2017)},
  pages =	{10:1--10:14},
  series =	{Open Access Series in Informatics (OASIcs)},
  ISBN =	{978-3-95977-056-9},
  ISSN =	{2190-6807},
  year =	{2017},
  volume =	{56},
  editor =	{Queir\'{o}s, Ricardo and Pinto, M\'{a}rio and Sim\~{o}es, Alberto and Leal, Jos\'{e} Paulo and Varanda, Maria Jo\~{a}o},
  publisher =	{Schloss Dagstuhl -- Leibniz-Zentrum f{\"u}r Informatik},
  address =	{Dagstuhl, Germany},
  URL =		{https://drops-dev.dagstuhl.de/entities/document/10.4230/OASIcs.SLATE.2017.10},
  URN =		{urn:nbn:de:0030-drops-79457},
  doi =		{10.4230/OASIcs.SLATE.2017.10},
  annote =	{Keywords: XML, XPath, index, tree, finite automaton}
}
  • Refine by Author
  • 1 Janoušek, Jan
  • 1 Šestáková, Eliška

  • Refine by Classification

  • Refine by Keyword
  • 1 XML
  • 1 XPath
  • 1 finite automaton
  • 1 index
  • 1 tree

  • Refine by Type
  • 1 document

  • Refine by Publication Year
  • 1 2017

Questions / Remarks / Feedback
X

Feedback for Dagstuhl Publishing


Thanks for your feedback!

Feedback submitted

Could not send message

Please try again later or send an E-mail