4 Search Results for "Long, Philip M."


Document
APPROX
Upper and Lower Bounds for Complete Linkage in General Metric Spaces

Authors: Anna Arutyunova, Anna Großwendt, Heiko Röglin, Melanie Schmidt, and Julian Wargalla

Published in: LIPIcs, Volume 207, Approximation, Randomization, and Combinatorial Optimization. Algorithms and Techniques (APPROX/RANDOM 2021)


Abstract
In a hierarchical clustering problem the task is to compute a series of mutually compatible clusterings of a finite metric space (P,dist). Starting with the clustering where every point forms its own cluster, one iteratively merges two clusters until only one cluster remains. Complete linkage is a well-known and popular algorithm to compute such clusterings: in every step it merges the two clusters whose union has the smallest radius (or diameter) among all currently possible merges. We prove that the radius (or diameter) of every k-clustering computed by complete linkage is worse by at most a factor of O(k) (or O(k²)) than that of an optimal k-clustering minimizing the radius (or diameter). Furthermore, we give a negative answer to the question posed by Dasgupta and Long [Sanjoy Dasgupta and Philip M. Long, 2005], who show a lower bound of Ω(log(k)) and ask whether the approximation guarantee is in fact Θ(log(k)). We present instances where complete linkage performs poorly in the sense that the k-clustering computed by complete linkage is off by a factor of Ω(k) from an optimal solution for radius and diameter. We conclude that in general metric spaces complete linkage does not perform asymptotically better than single linkage, which merges the two clusters with the smallest inter-cluster distance and for which we prove an approximation guarantee of O(k).
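
The merge rule described in the abstract can be illustrated with a minimal sketch of the diameter variant of complete linkage. This is a naive illustration written for this listing, not code from the paper; the function name `complete_linkage`, its parameters, and the brute-force search over all pairs are assumptions made for readability rather than efficiency.

```python
from itertools import combinations


def complete_linkage(points, dist, k):
    """Greedily merge clusters until only k remain (diameter variant).

    points: list of hashable or comparable items (hypothetical input format)
    dist:   function (a, b) -> distance, assumed to be a metric
    k:      target number of clusters, 1 <= k <= len(points)
    """
    # Start with every point in its own cluster.
    clusters = [[p] for p in points]

    def diameter(cluster):
        # Largest pairwise distance inside the cluster (0 for singletons).
        return max((dist(a, b) for a, b in combinations(cluster, 2)), default=0.0)

    while len(clusters) > k:
        # Pick the pair of clusters whose union has the smallest diameter.
        i, j = min(
            combinations(range(len(clusters)), 2),
            key=lambda ij: diameter(clusters[ij[0]] + clusters[ij[1]]),
        )
        merged = clusters[i] + clusters[j]
        clusters = [c for idx, c in enumerate(clusters) if idx not in (i, j)]
        clusters.append(merged)
    return clusters


if __name__ == "__main__":
    # Four points on a line with the absolute difference as the metric.
    print(complete_linkage([0, 1, 10, 11], lambda a, b: abs(a - b), 2))
```

Stopping the loop at each value of k from n down to 1 yields the hierarchy of mutually compatible clusterings; the radius variant would replace `diameter` with the smallest radius of a ball covering the merged cluster.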

Cite as

Anna Arutyunova, Anna Großwendt, Heiko Röglin, Melanie Schmidt, and Julian Wargalla. Upper and Lower Bounds for Complete Linkage in General Metric Spaces. In Approximation, Randomization, and Combinatorial Optimization. Algorithms and Techniques (APPROX/RANDOM 2021). Leibniz International Proceedings in Informatics (LIPIcs), Volume 207, pp. 18:1-18:22, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2021)


@InProceedings{arutyunova_et_al:LIPIcs.APPROX/RANDOM.2021.18,
  author =	{Arutyunova, Anna and Gro{\ss}wendt, Anna and R\"{o}glin, Heiko and Schmidt, Melanie and Wargalla, Julian},
  title =	{{Upper and Lower Bounds for Complete Linkage in General Metric Spaces}},
  booktitle =	{Approximation, Randomization, and Combinatorial Optimization. Algorithms and Techniques (APPROX/RANDOM 2021)},
  pages =	{18:1--18:22},
  series =	{Leibniz International Proceedings in Informatics (LIPIcs)},
  ISBN =	{978-3-95977-207-5},
  ISSN =	{1868-8969},
  year =	{2021},
  volume =	{207},
  editor =	{Wootters, Mary and Sanit\`{a}, Laura},
  publisher =	{Schloss Dagstuhl -- Leibniz-Zentrum f{\"u}r Informatik},
  address =	{Dagstuhl, Germany},
  URL =		{https://drops-dev.dagstuhl.de/entities/document/10.4230/LIPIcs.APPROX/RANDOM.2021.18},
  URN =		{urn:nbn:de:0030-drops-147115},
  doi =		{10.4230/LIPIcs.APPROX/RANDOM.2021.18},
  annote =	{Keywords: Hierarchical Clustering, Complete Linkage, agglomerative Clustering, k-Center}
}
Document
Strongly Normalizing Higher-Order Relational Queries

Authors: Wilmer Ricciotti and James Cheney

Published in: LIPIcs, Volume 167, 5th International Conference on Formal Structures for Computation and Deduction (FSCD 2020)


Abstract
Language-integrated query is a powerful programming construct allowing database queries and ordinary program code to interoperate seamlessly and safely. Language-integrated query techniques rely on classical results about monadic comprehension calculi, including the conservativity theorem for nested relational calculus. Conservativity implies that query expressions can freely use nesting and unnesting, yet as long as the query result type is a flat relation, these capabilities do not lead to an increase in expressiveness over flat relational queries. Wong showed how such queries can be translated to SQL via a constructive rewriting algorithm, and Cooper and others advocated higher-order nested relational calculi as a basis for language-integrated queries in functional languages such as Links and F#. However, there is no published proof of the central strong normalization property for higher-order nested relational queries: a previous proof attempt does not deal correctly with rewrite rules that duplicate subterms. This paper fills the gap in the literature, explaining the difficulty with a previous proof attempt, and showing how to extend the ⊤⊤-lifting approach of Lindley and Stark to accommodate duplicating rewrites. We also sketch how to extend the proof to a recently introduced calculus for heterogeneous queries mixing set and multiset semantics.

Cite as

Wilmer Ricciotti and James Cheney. Strongly Normalizing Higher-Order Relational Queries. In 5th International Conference on Formal Structures for Computation and Deduction (FSCD 2020). Leibniz International Proceedings in Informatics (LIPIcs), Volume 167, pp. 28:1-28:22, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2020)


@InProceedings{ricciotti_et_al:LIPIcs.FSCD.2020.28,
  author =	{Ricciotti, Wilmer and Cheney, James},
  title =	{{Strongly Normalizing Higher-Order Relational Queries}},
  booktitle =	{5th International Conference on Formal Structures for Computation and Deduction (FSCD 2020)},
  pages =	{28:1--28:22},
  series =	{Leibniz International Proceedings in Informatics (LIPIcs)},
  ISBN =	{978-3-95977-155-9},
  ISSN =	{1868-8969},
  year =	{2020},
  volume =	{167},
  editor =	{Ariola, Zena M.},
  publisher =	{Schloss Dagstuhl -- Leibniz-Zentrum f{\"u}r Informatik},
  address =	{Dagstuhl, Germany},
  URL =		{https://drops-dev.dagstuhl.de/entities/document/10.4230/LIPIcs.FSCD.2020.28},
  URN =		{urn:nbn:de:0030-drops-123506},
  doi =		{10.4230/LIPIcs.FSCD.2020.28},
  annote =	{Keywords: Strong normalization, ⊤⊤-lifting, Nested relational calculus, Language-integrated query}
}
Document
Density Estimation for Shift-Invariant Multidimensional Distributions

Authors: Anindya De, Philip M. Long, and Rocco A. Servedio

Published in: LIPIcs, Volume 124, 10th Innovations in Theoretical Computer Science Conference (ITCS 2019)


Abstract
We study density estimation for classes of shift-invariant distributions over R^d. A multidimensional distribution is "shift-invariant" if, roughly speaking, it is close in total variation distance to a small shift of it in any direction. Shift-invariance relaxes smoothness assumptions commonly used in non-parametric density estimation to allow jump discontinuities. The different classes of distributions that we consider correspond to different rates of tail decay. For each such class we give an efficient algorithm that learns any distribution in the class from independent samples with respect to total variation distance. As a special case of our general result, we show that d-dimensional shift-invariant distributions which satisfy an exponential tail bound can be learned to total variation distance error ε using Õ_d(1/ε^{d+2}) examples and Õ_d(1/ε^{2d+2}) time. This implies that, for constant d, multivariate log-concave distributions can be learned in Õ_d(1/ε^{2d+2}) time using Õ_d(1/ε^{d+2}) samples, answering a question of [Diakonikolas et al., 2016]. All of our results extend to a model of noise-tolerant density estimation using Huber's contamination model, in which the target distribution to be learned is a (1-ε, ε) mixture of some unknown distribution in the class with some other arbitrary and unknown distribution, and the learning algorithm must output a hypothesis distribution with total variation distance error O(ε) from the target distribution. We show that our general results are close to best possible by proving a simple Ω(1/ε^d) information-theoretic lower bound on sample complexity even for learning bounded distributions that are shift-invariant.

Cite as

Anindya De, Philip M. Long, and Rocco A. Servedio. Density Estimation for Shift-Invariant Multidimensional Distributions. In 10th Innovations in Theoretical Computer Science Conference (ITCS 2019). Leibniz International Proceedings in Informatics (LIPIcs), Volume 124, pp. 28:1-28:20, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2019)


@InProceedings{de_et_al:LIPIcs.ITCS.2019.28,
  author =	{De, Anindya and Long, Philip M. and Servedio, Rocco A.},
  title =	{{Density Estimation for Shift-Invariant Multidimensional Distributions}},
  booktitle =	{10th Innovations in Theoretical Computer Science Conference (ITCS 2019)},
  pages =	{28:1--28:20},
  series =	{Leibniz International Proceedings in Informatics (LIPIcs)},
  ISBN =	{978-3-95977-095-8},
  ISSN =	{1868-8969},
  year =	{2019},
  volume =	{124},
  editor =	{Blum, Avrim},
  publisher =	{Schloss Dagstuhl -- Leibniz-Zentrum f{\"u}r Informatik},
  address =	{Dagstuhl, Germany},
  URL =		{https://drops-dev.dagstuhl.de/entities/document/10.4230/LIPIcs.ITCS.2019.28},
  URN =		{urn:nbn:de:0030-drops-101214},
  doi =		{10.4230/LIPIcs.ITCS.2019.28},
  annote =	{Keywords: Density estimation, unsupervised learning, log-concave distributions, non-parametrics}
}
Document
Vertex Exponential Algorithms for Connected f-Factors

Authors: Geevarghese Philip and M. S. Ramanujan

Published in: LIPIcs, Volume 29, 34th International Conference on Foundation of Software Technology and Theoretical Computer Science (FSTTCS 2014)


Abstract
Given a graph G and a function f:V(G) → [V(G)], an f-factor is a subgraph H of G such that deg_H(v) = f(v) for every vertex v in V(G); we say that H is a connected f-factor if, in addition, the subgraph H is connected. Tutte (1954) showed that one can check whether a given graph has a specified f-factor in polynomial time. However, detecting a connected f-factor is NP-complete, even when f is a constant function; a foremost example is the problem of checking whether a graph has a Hamiltonian cycle, where f maps every vertex to 2. The current best algorithm for this latter problem is due to Björklund (FOCS 2010), and runs in randomized O^*(1.657^n) time (the O^*() notation hides polynomial factors). This was the first superpolynomial improvement, in nearly fifty years, over the previous best algorithm of Bellman, Held and Karp (1962), which checks for a Hamiltonian cycle in deterministic O(2^n · n^2) time. In this paper we present the first vertex-exponential algorithms for the more general problem of finding a connected f-factor. Our first result is a randomized algorithm which, given a graph G on n vertices and a function f:V(G) → [n], checks whether G has a connected f-factor in O^*(2^n) time. We then extend our result to the case when f is a mapping from V(G) to {0,1} and the degree of every vertex v in the subgraph H is required to be f(v) (mod 2). This generalizes the problem of checking whether a graph has an Eulerian subgraph, that is, a connected subgraph whose degrees are all even (f(v) ≡ 0). Furthermore, we show that the min-cost editing and edge-weighted versions of these problems can be solved in randomized O^*(2^n) time as long as the costs/weights are bounded polynomially in n.
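
For orientation, the deterministic O(2^n · n^2) baseline mentioned in the abstract can be sketched as a subset dynamic program in the style of Bellman, Held and Karp. The sketch below is an illustration only, not the randomized connected f-factor algorithm of the paper; the function name `has_hamiltonian_cycle` and the edge-list input format are assumptions.

```python
def has_hamiltonian_cycle(n, edges):
    """Decide whether an undirected simple graph on vertices 0..n-1 has a
    Hamiltonian cycle, using a subset dynamic program over bitmasks."""
    if n < 3:
        return False  # a simple graph needs at least 3 vertices for a cycle

    # adj[v] is a bitmask of the neighbours of v.
    adj = [0] * n
    for u, v in edges:
        adj[u] |= 1 << v
        adj[v] |= 1 << u

    # reach[mask] is a bitmask of the vertices v such that some path starts
    # at vertex 0, visits exactly the vertices in mask, and ends at v.
    reach = [0] * (1 << n)
    reach[1] = 1  # the trivial path consisting of vertex 0 alone

    # Only odd masks contain vertex 0; extensions always go to larger masks,
    # so increasing order processes every state before it is extended.
    for mask in range(1, 1 << n, 2):
        for v in range(n):
            if not (reach[mask] >> v) & 1:
                continue
            for w in range(n):
                if (adj[v] >> w) & 1 and not (mask >> w) & 1:
                    reach[mask | (1 << w)] |= 1 << w

    full = (1 << n) - 1
    # A Hamiltonian path from 0 closes into a cycle if its endpoint is
    # adjacent to vertex 0.
    return bool(reach[full] & adj[0])


if __name__ == "__main__":
    print(has_hamiltonian_cycle(4, [(0, 1), (1, 2), (2, 3), (3, 0)]))
    print(has_hamiltonian_cycle(4, [(0, 1), (1, 2), (2, 3)]))
```

The table has 2^n entries and each is extended by at most n^2 (endpoint, neighbour) pairs, which matches the running time quoted for the classical algorithm.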

Cite as

Geevarghese Philip and M. S. Ramanujan. Vertex Exponential Algorithms for Connected f-Factors. In 34th International Conference on Foundation of Software Technology and Theoretical Computer Science (FSTTCS 2014). Leibniz International Proceedings in Informatics (LIPIcs), Volume 29, pp. 61-71, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2014)


@InProceedings{philip_et_al:LIPIcs.FSTTCS.2014.61,
  author =	{Philip, Geevarghese and Ramanujan, M. S.},
  title =	{{Vertex Exponential Algorithms for Connected f-Factors}},
  booktitle =	{34th International Conference on Foundation of Software Technology and Theoretical Computer Science (FSTTCS 2014)},
  pages =	{61--71},
  series =	{Leibniz International Proceedings in Informatics (LIPIcs)},
  ISBN =	{978-3-939897-77-4},
  ISSN =	{1868-8969},
  year =	{2014},
  volume =	{29},
  editor =	{Raman, Venkatesh and Suresh, S. P.},
  publisher =	{Schloss Dagstuhl -- Leibniz-Zentrum f{\"u}r Informatik},
  address =	{Dagstuhl, Germany},
  URL =		{https://drops-dev.dagstuhl.de/entities/document/10.4230/LIPIcs.FSTTCS.2014.61},
  URN =		{urn:nbn:de:0030-drops-48337},
  doi =		{10.4230/LIPIcs.FSTTCS.2014.61},
  annote =	{Keywords: Exact Exponential Time Algorithms, f-Factors}
}