Search Results

Documents authored by Staworko, Slawek


Document
Characterizing XML Twig Queries with Examples

Authors: Slawek Staworko and Piotr Wieczorek

Published in: LIPIcs, Volume 31, 18th International Conference on Database Theory (ICDT 2015)


Abstract
Typically, a (Boolean) query is a finite formula that defines a possibly infinite set of database instances that satisfy it (positive examples), and implicitly, the set of instances that do not satisfy the query (negative examples). We investigate the following natural question: for a given class of queries, is it possible to characterize every query with a finite set of positive and negative examples that no other query is consistent with. We study this question for twig queries and XML databases. We show that while twig queries are characterizable, they generally require exponential sets of examples. Consequently, we focus on a practical subclass of anchored twig queries and show that not only are they characterizable but also with polynomially-sized sets of examples. This result is obtained with the use of generalization operations on twig queries, whose application to an anchored twig query yields a properly contained and minimally different query. Our results illustrate further interesting and strong connections between the structure and the semantics of anchored twig queries that the class of arbitrary twig queries does not enjoy. Finally, we show that the class of unions of twig queries is not characterizable.

Cite as

Slawek Staworko and Piotr Wieczorek. Characterizing XML Twig Queries with Examples. In 18th International Conference on Database Theory (ICDT 2015). Leibniz International Proceedings in Informatics (LIPIcs), Volume 31, pp. 144-160, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2015)


Copy BibTex To Clipboard

@InProceedings{staworko_et_al:LIPIcs.ICDT.2015.144,
  author =	{Staworko, Slawek and Wieczorek, Piotr},
  title =	{{Characterizing XML Twig Queries with Examples}},
  booktitle =	{18th International Conference on Database Theory (ICDT 2015)},
  pages =	{144--160},
  series =	{Leibniz International Proceedings in Informatics (LIPIcs)},
  ISBN =	{978-3-939897-79-8},
  ISSN =	{1868-8969},
  year =	{2015},
  volume =	{31},
  editor =	{Arenas, Marcelo and Ugarte, Mart{\'\i}n},
  publisher =	{Schloss Dagstuhl -- Leibniz-Zentrum f{\"u}r Informatik},
  address =	{Dagstuhl, Germany},
  URL =		{https://drops.dagstuhl.de/entities/document/10.4230/LIPIcs.ICDT.2015.144},
  URN =		{urn:nbn:de:0030-drops-49828},
  doi =		{10.4230/LIPIcs.ICDT.2015.144},
  annote =	{Keywords: Query characterization, Query examples, Query fitting, Twig queries}
}
Document
Complexity and Expressiveness of ShEx for RDF

Authors: Slawek Staworko, Iovka Boneva, Jose E. Labra Gayo, Samuel Hym, Eric G. Prud'hommeaux, and Harold Solbrig

Published in: LIPIcs, Volume 31, 18th International Conference on Database Theory (ICDT 2015)


Abstract
We study the expressiveness and complexity of Shape Expression Schema (ShEx), a novel schema formalism for RDF currently under development by W3C. A ShEx assigns types to the nodes of an RDF graph and allows to constrain the admissible neighborhoods of nodes of a given type with regular bag expressions (RBEs). We formalize and investigate two alternative semantics, multi- and single-type, depending on whether or not a node may have more than one type. We study the expressive power of ShEx and study the complexity of the validation problem. We show that the single-type semantics is strictly more expressive than the multi-type semantics, single-type validation is generally intractable and multi-type validation is feasible for a small (yet practical) subclass of RBEs. To curb the high computational complexity of validation, we propose a natural notion of determinism and show that multi-type validation for the class of deterministic schemas using single-occurrence regular bag expressions (SORBEs) is tractable.

Cite as

Slawek Staworko, Iovka Boneva, Jose E. Labra Gayo, Samuel Hym, Eric G. Prud'hommeaux, and Harold Solbrig. Complexity and Expressiveness of ShEx for RDF. In 18th International Conference on Database Theory (ICDT 2015). Leibniz International Proceedings in Informatics (LIPIcs), Volume 31, pp. 195-211, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2015)


Copy BibTex To Clipboard

@InProceedings{staworko_et_al:LIPIcs.ICDT.2015.195,
  author =	{Staworko, Slawek and Boneva, Iovka and Labra Gayo, Jose E. and Hym, Samuel and Prud'hommeaux, Eric G. and Solbrig, Harold},
  title =	{{Complexity and Expressiveness of ShEx for RDF}},
  booktitle =	{18th International Conference on Database Theory (ICDT 2015)},
  pages =	{195--211},
  series =	{Leibniz International Proceedings in Informatics (LIPIcs)},
  ISBN =	{978-3-939897-79-8},
  ISSN =	{1868-8969},
  year =	{2015},
  volume =	{31},
  editor =	{Arenas, Marcelo and Ugarte, Mart{\'\i}n},
  publisher =	{Schloss Dagstuhl -- Leibniz-Zentrum f{\"u}r Informatik},
  address =	{Dagstuhl, Germany},
  URL =		{https://drops.dagstuhl.de/entities/document/10.4230/LIPIcs.ICDT.2015.195},
  URN =		{urn:nbn:de:0030-drops-49856},
  doi =		{10.4230/LIPIcs.ICDT.2015.195},
  annote =	{Keywords: RDF, Schema, Graph topology, Validation, Complexity, Expressiveness}
}
Document
Management of Inconsistencies in Data Integration

Authors: Ekaterini Ioannou and Slawek Staworko

Published in: Dagstuhl Follow-Ups, Volume 5, Data Exchange, Integration, and Streams (2013)


Abstract
Data integration aims at providing a unified view over data coming from various sources. One of the most challenging tasks for data integration is handling the inconsistencies that appear in the integrated data in an efficient and effective manner. In this chapter, we provide a survey on techniques introduced for handling inconsistencies in data integration, focusing on two groups. The first group contains techniques for computing consistent query answers, and includes mechanisms for the compact representation of repairs, query rewriting, and logic programs. The second group contains techniques focusing on the resolution of inconsistencies. This includes methodologies for computing similarity between atomic values as well as similarity between groups of data, collective techniques, scaling to large datasets, and dealing with uncertainty that is related to inconsistencies.

Cite as

Ekaterini Ioannou and Slawek Staworko. Management of Inconsistencies in Data Integration. In Data Exchange, Integration, and Streams. Dagstuhl Follow-Ups, Volume 5, pp. 217-235, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2013)


Copy BibTex To Clipboard

@InCollection{ioannou_et_al:DFU.Vol5.10452.217,
  author =	{Ioannou, Ekaterini and Staworko, Slawek},
  title =	{{Management of Inconsistencies in Data Integration}},
  booktitle =	{Data Exchange, Integration, and Streams},
  pages =	{217--235},
  series =	{Dagstuhl Follow-Ups},
  ISBN =	{978-3-939897-61-3},
  ISSN =	{1868-8977},
  year =	{2013},
  volume =	{5},
  editor =	{Kolaitis, Phokion G. and Lenzerini, Maurizio and Schweikardt, Nicole},
  publisher =	{Schloss Dagstuhl -- Leibniz-Zentrum f{\"u}r Informatik},
  address =	{Dagstuhl, Germany},
  URL =		{https://drops.dagstuhl.de/entities/document/10.4230/DFU.Vol5.10452.217},
  URN =		{urn:nbn:de:0030-drops-42952},
  doi =		{10.4230/DFU.Vol5.10452.217},
  annote =	{Keywords: Data integration, Consistent query answers, Resolution of inconsistencies}
}
Questions / Remarks / Feedback
X

Feedback for Dagstuhl Publishing


Thanks for your feedback!

Feedback submitted

Could not send message

Please try again later or send an E-mail