1 Search Results for "Avron, Uri"


Document
On the Hardness of Category Tree Construction

Authors: Shay Gershtein, Uri Avron, Ido Guy, Tova Milo, and Slava Novgorodov

Published in: LIPIcs, Volume 220, 25th International Conference on Database Theory (ICDT 2022)


Abstract
Category trees, or taxonomies, are rooted trees where each node, called a category, corresponds to a set of related items. The construction of taxonomies has been studied in various domains, including e-commerce, document management, and question answering. Multiple algorithms for automating construction have been proposed, employing a variety of clustering approaches and crowdsourcing. However, no formal model to capture such categorization problems has been devised, and their complexity has not been studied. To address this, we propose in this work a combinatorial model that captures many practical settings and show that the aforementioned empirical approach has been warranted, as we prove strong inapproximability bounds for various problem variants and special cases when the goal is to produce a categorization of the maximum utility. In our model, the input is a set of n weighted item sets that the tree would ideally contain as categories. Each category, rather than perfectly match the corresponding input set, is allowed to exceed a given threshold for a given similarity function. The goal is to produce a tree that maximizes the total weight of the sets for which it contains a matching category. A key parameter is an upper bound on the number of categories an item may belong to, which produces the hardness of the problem, as initially each item may be contained in an arbitrary number of input sets. For this model, we prove inapproximability bounds, of order Θ̃(√n) or Θ̃(n), for various problem variants and special cases, loosely justifying the aforementioned heuristic approach. Our work includes reductions based on parameterized randomized constructions that highlight how various problem parameters and properties of the input may affect the hardness. Moreover, for the special case where the category must be identical to the corresponding input set, we devise an algorithm whose approximation guarantee depends solely on a more granular parameter, allowing improved worst-case guarantees. Finally, we also generalize our results to DAG-based and non-hierarchical categorization.

Cite as

Shay Gershtein, Uri Avron, Ido Guy, Tova Milo, and Slava Novgorodov. On the Hardness of Category Tree Construction. In 25th International Conference on Database Theory (ICDT 2022). Leibniz International Proceedings in Informatics (LIPIcs), Volume 220, pp. 4:1-4:17, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2022)


Copy BibTex To Clipboard

@InProceedings{gershtein_et_al:LIPIcs.ICDT.2022.4,
  author =	{Gershtein, Shay and Avron, Uri and Guy, Ido and Milo, Tova and Novgorodov, Slava},
  title =	{{On the Hardness of Category Tree Construction}},
  booktitle =	{25th International Conference on Database Theory (ICDT 2022)},
  pages =	{4:1--4:17},
  series =	{Leibniz International Proceedings in Informatics (LIPIcs)},
  ISBN =	{978-3-95977-223-5},
  ISSN =	{1868-8969},
  year =	{2022},
  volume =	{220},
  editor =	{Olteanu, Dan and Vortmeier, Nils},
  publisher =	{Schloss Dagstuhl -- Leibniz-Zentrum f{\"u}r Informatik},
  address =	{Dagstuhl, Germany},
  URL =		{https://drops-dev.dagstuhl.de/entities/document/10.4230/LIPIcs.ICDT.2022.4},
  URN =		{urn:nbn:de:0030-drops-158785},
  doi =		{10.4230/LIPIcs.ICDT.2022.4},
  annote =	{Keywords: maximum independent set, approximation algorithms, approximation hardness bounds, taxonomy construction, category tree construction}
}
  • Refine by Author
  • 1 Avron, Uri
  • 1 Gershtein, Shay
  • 1 Guy, Ido
  • 1 Milo, Tova
  • 1 Novgorodov, Slava

  • Refine by Classification
  • 1 Theory of computation → Approximation algorithms analysis
  • 1 Theory of computation → Data structures and algorithms for data management
  • 1 Theory of computation → Problems, reductions and completeness

  • Refine by Keyword
  • 1 approximation algorithms
  • 1 approximation hardness bounds
  • 1 category tree construction
  • 1 maximum independent set
  • 1 taxonomy construction

  • Refine by Type
  • 1 document

  • Refine by Publication Year
  • 1 2022

Questions / Remarks / Feedback
X

Feedback for Dagstuhl Publishing


Thanks for your feedback!

Feedback submitted

Could not send message

Please try again later or send an E-mail