DROPS

Document

Research

Mining Inter-Document Argument Structures in Scientific Papers for an Argument Web

Authors: Florian Ruosch, Cristina Sarasua, and Abraham Bernstein

Published in: TGDK, Volume 3, Issue 3 (2025). Transactions on Graph Data and Knowledge, Volume 3, Issue 3

Abstract

In Argument Mining, predicting argumentative relations between texts (or spans) remains one of the most challenging aspects, even more so in the cross-document setting. This paper makes three key contributions to advance research in this domain. We first extend an existing dataset, the Sci-Arg corpus, by annotating it with explicit inter-document argumentative relations, thereby allowing arguments to be distributed over several documents forming an Argument Web; these new annotations are published using Semantic Web technologies (RDF, OWL). Second, we explore and evaluate three automated approaches for predicting these inter-document argumentative relations, establishing critical baselines on the new dataset. We find that a simple classifier based on discourse indicators with access to context outperforms neural methods. Third, we conduct a comparative analysis of these approaches for both intra- and inter-document settings, identifying statistically significant differences in results that indicate the necessity of distinguishing between these two scenarios. Our findings highlight significant challenges in this complex domain and open crucial avenues for future research on the Argument Web of Science, particularly for those interested in leveraging Semantic Web technologies and knowledge graphs to understand scholarly discourse. With this, we provide the first stepping stones in the form of a benchmark dataset, three baseline methods, and an initial analysis for a systematic exploration of this field relevant to the Web of Data and Science.

Cite as

Florian Ruosch, Cristina Sarasua, and Abraham Bernstein. Mining Inter-Document Argument Structures in Scientific Papers for an Argument Web. In Transactions on Graph Data and Knowledge (TGDK), Volume 3, Issue 3, pp. 4:1-4:33, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2025)

Copy BibTex To Clipboard

@Article{ruosch_et_al:TGDK.3.3.4,
  author =	{Ruosch, Florian and Sarasua, Cristina and Bernstein, Abraham},
  title =	{{Mining Inter-Document Argument Structures in Scientific Papers for an Argument Web}},
  journal =	{Transactions on Graph Data and Knowledge},
  pages =	{4:1--4:33},
  ISSN =	{2942-7517},
  year =	{2025},
  volume =	{3},
  number =	{3},
  publisher =	{Schloss Dagstuhl -- Leibniz-Zentrum f{\"u}r Informatik},
  address =	{Dagstuhl, Germany},
  URL =		{https://drops.dagstuhl.de/entities/document/10.4230/TGDK.3.3.4},
  URN =		{urn:nbn:de:0030-drops-252159},
  doi =		{10.4230/TGDK.3.3.4},
  annote =	{Keywords: Argument Mining, Large Language Models, Knowledge Graphs, Link Prediction}
}

Document

DOI: 10.4230/LIPIcs.ESA.2025.12

Subtrajectory Clustering and Coverage Maximization in Cubic Time, or Better

Authors: Jacobus Conradi and Anne Driemel

Published in: LIPIcs, Volume 351, 33rd Annual European Symposium on Algorithms (ESA 2025)

Abstract

Many application areas collect unstructured trajectory data. In subtrajectory clustering, one is interested to find patterns in this data using a hybrid combination of segmentation and clustering. We analyze two variants of this problem based on the well-known SetCover and CoverageMaximization problems. In both variants the set system is induced by metric balls under the Fréchet distance centered at polygonal curves. Our algorithms focus on improving the running time of the update step of the generic greedy algorithm by means of a careful combination of sweeps through a candidate space. In the first variant, we are given a polygonal curve P of complexity n, distance threshold Δ and complexity bound 𝓁 and the goal is to identify a minimum-size set of center curves 𝒞, where each center curve is of complexity at most 𝓁 and every point p on P is covered. A point p on P is covered if it is part of a subtrajectory π_p of P such that there is a center c ∈ 𝒞 whose Fréchet distance to π_p is at most Δ. We present an approximation algorithm for this problem with a running time of 𝒪((n²𝓁 + √{k_Δ}n^{5/2})log²n), where k_Δ is the size of an optimal solution. The algorithm gives a bicriterial approximation guarantee that relaxes the Fréchet distance threshold by a constant factor and the size of the solution by a factor of 𝒪(log n). The second problem variant asks for the maximum fraction of the input curve P that can be covered using k center curves, where k ≤ n is a parameter to the algorithm. For the second problem variant, our techniques lead to an algorithm with a running time of 𝒪((k+𝓁)n²log²n) and similar approximation guarantees. Note that in both algorithms k,k_Δ ∈ O(n) and hence the running time is cubic, or better if k ≪ n.

Cite as

Jacobus Conradi and Anne Driemel. Subtrajectory Clustering and Coverage Maximization in Cubic Time, or Better. In 33rd Annual European Symposium on Algorithms (ESA 2025). Leibniz International Proceedings in Informatics (LIPIcs), Volume 351, pp. 12:1-12:18, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2025)

Copy BibTex To Clipboard

@InProceedings{conradi_et_al:LIPIcs.ESA.2025.12,
  author =	{Conradi, Jacobus and Driemel, Anne},
  title =	{{Subtrajectory Clustering and Coverage Maximization in Cubic Time, or Better}},
  booktitle =	{33rd Annual European Symposium on Algorithms (ESA 2025)},
  pages =	{12:1--12:18},
  series =	{Leibniz International Proceedings in Informatics (LIPIcs)},
  ISBN =	{978-3-95977-395-9},
  ISSN =	{1868-8969},
  year =	{2025},
  volume =	{351},
  editor =	{Benoit, Anne and Kaplan, Haim and Wild, Sebastian and Herman, Grzegorz},
  publisher =	{Schloss Dagstuhl -- Leibniz-Zentrum f{\"u}r Informatik},
  address =	{Dagstuhl, Germany},
  URL =		{https://drops.dagstuhl.de/entities/document/10.4230/LIPIcs.ESA.2025.12},
  URN =		{urn:nbn:de:0030-drops-244806},
  doi =		{10.4230/LIPIcs.ESA.2025.12},
  annote =	{Keywords: Clustering, Set cover, Fr\'{e}chet distance, Approximation algorithms}
}

Document

DOI: 10.4230/OASIcs.SpaceCHI.2025.3

Multidimensional Usability Assessment in Spaceflight Analog Missions

Authors: Shivang Shelat, Katherine E. Homer, John A. Karasinski, and Jessica J. Marquez

Published in: OASIcs, Volume 130, Advancing Human-Computer Interaction for Space Exploration (SpaceCHI 2025)

Abstract

Software planning tools enable the self-scheduling of operational timelines during spaceflight, reducing reliance on ground support crews. Here, we assess analog crew perceptions of a planning tool’s usability across two space mission analogs with two validated questionnaires: the unidimensional System Usability Scale and the multidimensional User Experience Questionnaire. Critically, half the missions had assistive countermeasures integrated into the planning software interface whereas the other half did not. Correlation tests revealed high convergence between usability measures in the spaceflight analog setting. Group comparisons showed that the interface countermeasures enhanced several dimensions of usability, particularly for perceptions of the tool’s efficiency and dependability. These findings highlight the utility of a multidimensional approach to characterizing usability in order to capture fine-grained shifts in human-software dynamics, especially in spaceflight environments.

Cite as

Shivang Shelat, Katherine E. Homer, John A. Karasinski, and Jessica J. Marquez. Multidimensional Usability Assessment in Spaceflight Analog Missions. In Advancing Human-Computer Interaction for Space Exploration (SpaceCHI 2025). Open Access Series in Informatics (OASIcs), Volume 130, pp. 3:1-3:9, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2025)

Copy BibTex To Clipboard

@InProceedings{shelat_et_al:OASIcs.SpaceCHI.2025.3,
  author =	{Shelat, Shivang and Homer, Katherine E. and Karasinski, John A. and Marquez, Jessica J.},
  title =	{{Multidimensional Usability Assessment in Spaceflight Analog Missions}},
  booktitle =	{Advancing Human-Computer Interaction for Space Exploration (SpaceCHI 2025)},
  pages =	{3:1--3:9},
  series =	{Open Access Series in Informatics (OASIcs)},
  ISBN =	{978-3-95977-384-3},
  ISSN =	{2190-6807},
  year =	{2025},
  volume =	{130},
  editor =	{Bensch, Leonie and Nilsson, Tommy and Nisser, Martin and Pataranutaporn, Pat and Schmidt, Albrecht and Sumini, Valentina},
  publisher =	{Schloss Dagstuhl -- Leibniz-Zentrum f{\"u}r Informatik},
  address =	{Dagstuhl, Germany},
  URL =		{https://drops.dagstuhl.de/entities/document/10.4230/OASIcs.SpaceCHI.2025.3},
  URN =		{urn:nbn:de:0030-drops-239934},
  doi =		{10.4230/OASIcs.SpaceCHI.2025.3},
  annote =	{Keywords: space usability, crew autonomy, self-scheduling software}
}

Document

DOI: 10.4230/LIPIcs.WABI.2025.11

Mutational Signature Refitting on Sparse Pan-Cancer Data

Authors: Gal Gilad, Teresa M. Przytycka, and Roded Sharan

Published in: LIPIcs, Volume 344, 25th International Conference on Algorithms for Bioinformatics (WABI 2025)

Abstract

Mutational processes shape cancer genomes, leaving characteristic marks that are termed signatures. The level of activity of each such process, or its signature exposure, provides important information on the disease, improving patient stratification and the prediction of drug response. Thus, there is growing interest in developing refitting methods that decipher those exposures. Previous work in this domain was unsupervised in nature, employing algebraic decomposition and probabilistic inference methods. Here we provide a supervised approach to the problem of signature refitting and show its superiority over current methods. Our method, SuRe, leverages a neural network model to capture correlations between signature exposures in real data. We show that SuRe outperforms previous methods on sparse mutation data from tumor type specific data sets, as well as pan-cancer data sets, with an increasing advantage as the data become sparser. We further demonstrate its utility in clinical settings.

Cite as

Gal Gilad, Teresa M. Przytycka, and Roded Sharan. Mutational Signature Refitting on Sparse Pan-Cancer Data. In 25th International Conference on Algorithms for Bioinformatics (WABI 2025). Leibniz International Proceedings in Informatics (LIPIcs), Volume 344, pp. 11:1-11:23, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2025)

Copy BibTex To Clipboard

@InProceedings{gilad_et_al:LIPIcs.WABI.2025.11,
  author =	{Gilad, Gal and Przytycka, Teresa M. and Sharan, Roded},
  title =	{{Mutational Signature Refitting on Sparse Pan-Cancer Data}},
  booktitle =	{25th International Conference on Algorithms for Bioinformatics (WABI 2025)},
  pages =	{11:1--11:23},
  series =	{Leibniz International Proceedings in Informatics (LIPIcs)},
  ISBN =	{978-3-95977-386-7},
  ISSN =	{1868-8969},
  year =	{2025},
  volume =	{344},
  editor =	{Brejov\'{a}, Bro\v{n}a and Patro, Rob},
  publisher =	{Schloss Dagstuhl -- Leibniz-Zentrum f{\"u}r Informatik},
  address =	{Dagstuhl, Germany},
  URL =		{https://drops.dagstuhl.de/entities/document/10.4230/LIPIcs.WABI.2025.11},
  URN =		{urn:nbn:de:0030-drops-239374},
  doi =		{10.4230/LIPIcs.WABI.2025.11},
  annote =	{Keywords: mutational signatures, signature refitting, cancer genomics, genomic data analysis, somatic mutations}
}

Document

Track A: Algorithms, Complexity and Games

DOI: 10.4230/LIPIcs.ICALP.2025.133

Faster, Deterministic and Space Efficient Subtrajectory Clustering

Authors: Ivor van der Hoog, Thijs van der Horst, and Tim Ophelders

Published in: LIPIcs, Volume 334, 52nd International Colloquium on Automata, Languages, and Programming (ICALP 2025)

Abstract

Given a trajectory T and a distance Δ, we wish to find a set C of curves of complexity at most 𝓁, such that we can cover T with subcurves that each are within Fréchet distance Δ to at least one curve in C. We call C an (𝓁,Δ)-clustering and aim to find an (𝓁,Δ)-clustering of minimum cardinality. This problem variant was introduced by Akitaya et al. (2021) and shown to be NP-complete. The main focus has therefore been on bicriteria approximation algorithms, allowing for the clustering to be an (𝓁, Θ(Δ))-clustering of roughly optimal size. We present algorithms that construct (𝓁,4Δ)-clusterings of 𝒪(k log n) size, where k is the size of the optimal (𝓁, Δ)-clustering. We use 𝒪(n³) space and 𝒪(k n³ log⁴ n) time. Our algorithms significantly improve upon the clustering quality (improving the approximation factor in Δ) and size (whenever 𝓁 ∈ Ω(log n / log k)). We offer deterministic running times improving known expected bounds by a factor near-linear in 𝓁. Additionally, we match the space usage of prior work, and improve it substantially, by a factor super-linear in n𝓁, when compared to deterministic results.

Cite as

Ivor van der Hoog, Thijs van der Horst, and Tim Ophelders. Faster, Deterministic and Space Efficient Subtrajectory Clustering. In 52nd International Colloquium on Automata, Languages, and Programming (ICALP 2025). Leibniz International Proceedings in Informatics (LIPIcs), Volume 334, pp. 133:1-133:18, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2025)

Copy BibTex To Clipboard

@InProceedings{vanderhoog_et_al:LIPIcs.ICALP.2025.133,
  author =	{van der Hoog, Ivor and van der Horst, Thijs and Ophelders, Tim},
  title =	{{Faster, Deterministic and Space Efficient Subtrajectory Clustering}},
  booktitle =	{52nd International Colloquium on Automata, Languages, and Programming (ICALP 2025)},
  pages =	{133:1--133:18},
  series =	{Leibniz International Proceedings in Informatics (LIPIcs)},
  ISBN =	{978-3-95977-372-0},
  ISSN =	{1868-8969},
  year =	{2025},
  volume =	{334},
  editor =	{Censor-Hillel, Keren and Grandoni, Fabrizio and Ouaknine, Jo\"{e}l and Puppis, Gabriele},
  publisher =	{Schloss Dagstuhl -- Leibniz-Zentrum f{\"u}r Informatik},
  address =	{Dagstuhl, Germany},
  URL =		{https://drops.dagstuhl.de/entities/document/10.4230/LIPIcs.ICALP.2025.133},
  URN =		{urn:nbn:de:0030-drops-235109},
  doi =		{10.4230/LIPIcs.ICALP.2025.133},
  annote =	{Keywords: Fr\'{e}chet distance, clustering, set cover}
}

Document

Track A: Algorithms, Complexity and Games

DOI: 10.4230/LIPIcs.ICALP.2025.36

Faster Fréchet Distance Under Transformations

Authors: Kevin Buchin, Maike Buchin, Zijin Huang, André Nusser, and Sampson Wong

Published in: LIPIcs, Volume 334, 52nd International Colloquium on Automata, Languages, and Programming (ICALP 2025)

Abstract

We study the problem of computing the Fréchet distance between two polygonal curves under transformations. First, we consider translations in the Euclidean plane. Given two curves π and σ of total complexity n and a threshold δ ≥ 0, we present an 𝒪̃(n^{7 + 1/3}) time algorithm to determine whether there exists a translation t ∈ ℝ² such that the Fréchet distance between π and σ + t is at most δ. This improves on the previous best result, which is an 𝒪(n⁸) time algorithm. We then generalize this result to any class of rationally parameterized transformations, which includes translation, rotation, scaling, and arbitrary affine transformations. For a class T of rationally parametrized transformations with k degrees of freedom, we show that one can determine whether there is a transformation τ ∈ T such that the Fréchet distance between π and τ(σ) is at most δ in 𝒪̃(n^{3k+4/3}) time.

Cite as

Kevin Buchin, Maike Buchin, Zijin Huang, André Nusser, and Sampson Wong. Faster Fréchet Distance Under Transformations. In 52nd International Colloquium on Automata, Languages, and Programming (ICALP 2025). Leibniz International Proceedings in Informatics (LIPIcs), Volume 334, pp. 36:1-36:20, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2025)

Copy BibTex To Clipboard

@InProceedings{buchin_et_al:LIPIcs.ICALP.2025.36,
  author =	{Buchin, Kevin and Buchin, Maike and Huang, Zijin and Nusser, Andr\'{e} and Wong, Sampson},
  title =	{{Faster Fr\'{e}chet Distance Under Transformations}},
  booktitle =	{52nd International Colloquium on Automata, Languages, and Programming (ICALP 2025)},
  pages =	{36:1--36:20},
  series =	{Leibniz International Proceedings in Informatics (LIPIcs)},
  ISBN =	{978-3-95977-372-0},
  ISSN =	{1868-8969},
  year =	{2025},
  volume =	{334},
  editor =	{Censor-Hillel, Keren and Grandoni, Fabrizio and Ouaknine, Jo\"{e}l and Puppis, Gabriele},
  publisher =	{Schloss Dagstuhl -- Leibniz-Zentrum f{\"u}r Informatik},
  address =	{Dagstuhl, Germany},
  URL =		{https://drops.dagstuhl.de/entities/document/10.4230/LIPIcs.ICALP.2025.36},
  URN =		{urn:nbn:de:0030-drops-234137},
  doi =		{10.4230/LIPIcs.ICALP.2025.36},
  annote =	{Keywords: Fr\'{e}chet distance, curve similarity, shape matching}
}

Document

DOI: 10.4230/LIPIcs.SoCG.2025.6

A Sparse Multicover Bifiltration of Linear Size

Authors: Ángel Javier Alonso

Published in: LIPIcs, Volume 332, 41st International Symposium on Computational Geometry (SoCG 2025)

Abstract

The k-cover of a point cloud X of ℝ^d at radius r is the set of all those points within distance r of at least k points of X. By varying r and k we obtain a two-parameter filtration known as the multicover bifiltration. This bifiltration has received attention recently due to being choice-free and robust to outliers. However, it is hard to compute: the smallest known equivalent simplicial bifiltration has O(|X|^{d+1}) simplices. In this paper we introduce a (1+ε)-approximation of the multicover bifiltration of linear size O(|X|), for fixed d and ε. The methods also apply to the subdivision Rips bifiltration on metric spaces of bounded doubling dimension yielding analogous results.

Cite as

Ángel Javier Alonso. A Sparse Multicover Bifiltration of Linear Size. In 41st International Symposium on Computational Geometry (SoCG 2025). Leibniz International Proceedings in Informatics (LIPIcs), Volume 332, pp. 6:1-6:18, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2025)

Copy BibTex To Clipboard

@InProceedings{alonso:LIPIcs.SoCG.2025.6,
  author =	{Alonso, \'{A}ngel Javier},
  title =	{{A Sparse Multicover Bifiltration of Linear Size}},
  booktitle =	{41st International Symposium on Computational Geometry (SoCG 2025)},
  pages =	{6:1--6:18},
  series =	{Leibniz International Proceedings in Informatics (LIPIcs)},
  ISBN =	{978-3-95977-370-6},
  ISSN =	{1868-8969},
  year =	{2025},
  volume =	{332},
  editor =	{Aichholzer, Oswin and Wang, Haitao},
  publisher =	{Schloss Dagstuhl -- Leibniz-Zentrum f{\"u}r Informatik},
  address =	{Dagstuhl, Germany},
  URL =		{https://drops.dagstuhl.de/entities/document/10.4230/LIPIcs.SoCG.2025.6},
  URN =		{urn:nbn:de:0030-drops-231587},
  doi =		{10.4230/LIPIcs.SoCG.2025.6},
  annote =	{Keywords: Multicover, Approximation, Sparsification, Multiparameter persistence}
}

Document

DOI: 10.4230/LIPIcs.SoCG.2025.5

Optimal Motion Planning for Two Square Robots in a Rectilinear Environment

Authors: Pankaj K. Agarwal, Mark de Berg, Benjamin Holmgren, Alex Steiger, and Martijn Struijs

Published in: LIPIcs, Volume 332, 41st International Symposium on Computational Geometry (SoCG 2025)

Abstract

Let W ⊂ ℝ² be a rectilinear polygonal environment (that is, a rectilinear polygon potentially with holes) with a total of n vertices, and let A,B be two robots, each modeled as an axis-aligned unit square, that can move rectilinearly inside W. The goal is to compute an optimal collision-free motion plan π for A and B between a given pair of source and target configurations. We study two variants of this problem and obtain the following results. - Min-Sum: Here the goal is to compute a motion plan that minimizes the sum of the lengths of the paths of the robots. We present an O(n⁴log n)-time algorithm for computing an optimal solution to the min-sum problem. This is the first polynomial-time algorithm to compute an optimal, collision-free motion of two robots amid obstacles in a planar polygonal environment. - Min-Makespan: Here the robots can move with at most unit speed, and the goal is to compute a motion plan that minimizes the maximum time taken by a robot to reach its target location. We prove that the min-makespan variant is NP-hard.

Cite as

Pankaj K. Agarwal, Mark de Berg, Benjamin Holmgren, Alex Steiger, and Martijn Struijs. Optimal Motion Planning for Two Square Robots in a Rectilinear Environment. In 41st International Symposium on Computational Geometry (SoCG 2025). Leibniz International Proceedings in Informatics (LIPIcs), Volume 332, pp. 5:1-5:17, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2025)

Copy BibTex To Clipboard

@InProceedings{agarwal_et_al:LIPIcs.SoCG.2025.5,
  author =	{Agarwal, Pankaj K. and de Berg, Mark and Holmgren, Benjamin and Steiger, Alex and Struijs, Martijn},
  title =	{{Optimal Motion Planning for Two Square Robots in a Rectilinear Environment}},
  booktitle =	{41st International Symposium on Computational Geometry (SoCG 2025)},
  pages =	{5:1--5:17},
  series =	{Leibniz International Proceedings in Informatics (LIPIcs)},
  ISBN =	{978-3-95977-370-6},
  ISSN =	{1868-8969},
  year =	{2025},
  volume =	{332},
  editor =	{Aichholzer, Oswin and Wang, Haitao},
  publisher =	{Schloss Dagstuhl -- Leibniz-Zentrum f{\"u}r Informatik},
  address =	{Dagstuhl, Germany},
  URL =		{https://drops.dagstuhl.de/entities/document/10.4230/LIPIcs.SoCG.2025.5},
  URN =		{urn:nbn:de:0030-drops-231577},
  doi =		{10.4230/LIPIcs.SoCG.2025.5},
  annote =	{Keywords: Computational geometry, motion planning, multiple robots, rectilinear paths}
}

Document

DOI: 10.4230/LIPIcs.SoCG.2025.78

Efficient Greedy Discrete Subtrajectory Clustering

Authors: Ivor van der Hoog, Lara Ost, Eva Rotenberg, and Daniel Rutschmann

Published in: LIPIcs, Volume 332, 41st International Symposium on Computational Geometry (SoCG 2025)

Abstract

We cluster a set of trajectories 𝒯 using subtrajectories of 𝒯. We require for a clustering C that any two subtrajectories (𝒯[a, b], 𝒯[c, d]) in a cluster have disjoint intervals [a,b] and [c, d]. Clustering quality may be measured by the number of clusters, the number of vertices of 𝒯 that are absent from the clustering, and by the Fréchet distance between subtrajectories in a cluster. A Δ-cluster of 𝒯 is a cluster 𝒫 of subtrajectories of 𝒯 with a centre P ∈ 𝒫, where all subtrajectories in 𝒫 have Fréchet distance at most Δ to P. Buchin, Buchin, Gudmundsson, Löffler and Luo present two O(n² + n m 𝓁)-time algorithms: SC(max, 𝓁, Δ, 𝒯) computes a single Δ-cluster where P has at least 𝓁 vertices and maximises the cardinality m of 𝒫. SC(m, max, Δ, 𝒯) computes a single Δ-cluster where 𝒫 has cardinality m and maximises the complexity 𝓁 of P. In this paper, which is a mixture of algorithms engineering and theoretical insights, we use such maximum-cardinality clusters in a greedy clustering algorithm. We first provide an efficient implementation of SC(max, 𝓁, Δ, 𝒯) and SC(m, max, Δ, 𝒯) that significantly outperforms previous implementations. Next, we use these functions as a subroutine in a greedy clustering algorithm, which performs well when compared to existing subtrajectory clustering algorithms on real-world data. Finally, we observe that, for fixed Δ and 𝒯, these two functions always output a point on the Pareto front of some bivariate function θ(𝓁, m). We design a new algorithm PSC(Δ, 𝒯) that in O(n² log⁴ n) time computes a 2-approximation of this Pareto front. This yields a broader set of candidate clusters, with comparable quality to the output of the previous functions. We show that using PSC(Δ, 𝒯) as a subroutine improves the clustering quality and performance even further.

Cite as

Ivor van der Hoog, Lara Ost, Eva Rotenberg, and Daniel Rutschmann. Efficient Greedy Discrete Subtrajectory Clustering. In 41st International Symposium on Computational Geometry (SoCG 2025). Leibniz International Proceedings in Informatics (LIPIcs), Volume 332, pp. 78:1-78:20, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2025)

Copy BibTex To Clipboard

@InProceedings{vanderhoog_et_al:LIPIcs.SoCG.2025.78,
  author =	{van der Hoog, Ivor and Ost, Lara and Rotenberg, Eva and Rutschmann, Daniel},
  title =	{{Efficient Greedy Discrete Subtrajectory Clustering}},
  booktitle =	{41st International Symposium on Computational Geometry (SoCG 2025)},
  pages =	{78:1--78:20},
  series =	{Leibniz International Proceedings in Informatics (LIPIcs)},
  ISBN =	{978-3-95977-370-6},
  ISSN =	{1868-8969},
  year =	{2025},
  volume =	{332},
  editor =	{Aichholzer, Oswin and Wang, Haitao},
  publisher =	{Schloss Dagstuhl -- Leibniz-Zentrum f{\"u}r Informatik},
  address =	{Dagstuhl, Germany},
  URL =		{https://drops.dagstuhl.de/entities/document/10.4230/LIPIcs.SoCG.2025.78},
  URN =		{urn:nbn:de:0030-drops-232308},
  doi =		{10.4230/LIPIcs.SoCG.2025.78},
  annote =	{Keywords: Algorithms engineering, Fr\'{e}chet distance, subtrajectory clustering}
}

Document

Resource Paper

DOI: 10.4230/TGDK.2.2.7

Whelk: An OWL EL+RL Reasoner Enabling New Use Cases

Authors: James P. Balhoff and Christopher J. Mungall

Published in: TGDK, Volume 2, Issue 2 (2024): Special Issue on Resources for Graph Data and Knowledge. Transactions on Graph Data and Knowledge, Volume 2, Issue 2

Abstract

Many tasks in the biosciences rely on reasoning with large OWL terminologies (Tboxes), often combined with even larger databases. In particular, a common task is retrieval queries that utilize relational expressions; for example, “find all genes expressed in the brain or any part of the brain”. Automated reasoning on these ontologies typically relies on scalable reasoners targeting the EL subset of OWL, such as ELK. While the introduction of ELK has been transformative in the incorporation of reasoning into bio-ontology quality control and production pipelines, we have encountered limitations when applying it to use cases involving high throughput query answering or reasoning about datasets describing instances (Aboxes). Whelk is a fast OWL reasoner for combined EL+RL reasoning. As such, it is particularly useful for many biological ontology tasks, particularly those characterized by large Tboxes using the EL subset of OWL, combined with Aboxes targeting the RL subset of OWL. Whelk is implemented in Scala and utilizes immutable functional data structures, which provides advantages when performing incremental or dynamic reasoning tasks. Whelk supports querying complex class expressions at a substantially greater rate than ELK, and can answer queries or perform incremental reasoning tasks in parallel, enabling novel applications of OWL reasoning.

Cite as

James P. Balhoff and Christopher J. Mungall. Whelk: An OWL EL+RL Reasoner Enabling New Use Cases. In Special Issue on Resources for Graph Data and Knowledge. Transactions on Graph Data and Knowledge (TGDK), Volume 2, Issue 2, pp. 7:1-7:17, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2024)

Copy BibTex To Clipboard

@Article{balhoff_et_al:TGDK.2.2.7,
  author =	{Balhoff, James P. and Mungall, Christopher J.},
  title =	{{Whelk: An OWL EL+RL Reasoner Enabling New Use Cases}},
  journal =	{Transactions on Graph Data and Knowledge},
  pages =	{7:1--7:17},
  ISSN =	{2942-7517},
  year =	{2024},
  volume =	{2},
  number =	{2},
  publisher =	{Schloss Dagstuhl -- Leibniz-Zentrum f{\"u}r Informatik},
  address =	{Dagstuhl, Germany},
  URL =		{https://drops.dagstuhl.de/entities/document/10.4230/TGDK.2.2.7},
  URN =		{urn:nbn:de:0030-drops-225918},
  doi =		{10.4230/TGDK.2.2.7},
  annote =	{Keywords: Web Ontology Language, OWL, Semantic Web, ontology, reasoner}
}

Document

Resource Paper

DOI: 10.4230/TGDK.2.2.4

FAIR Jupyter: A Knowledge Graph Approach to Semantic Sharing and Granular Exploration of a Computational Notebook Reproducibility Dataset

Authors: Sheeba Samuel and Daniel Mietchen

Published in: TGDK, Volume 2, Issue 2 (2024): Special Issue on Resources for Graph Data and Knowledge. Transactions on Graph Data and Knowledge, Volume 2, Issue 2

Abstract

The way in which data are shared can affect their utility and reusability. Here, we demonstrate how data that we had previously shared in bulk can be mobilized further through a knowledge graph that allows for much more granular exploration and interrogation. The original dataset is about the computational reproducibility of GitHub-hosted Jupyter notebooks associated with biomedical publications. It contains rich metadata about the publications, associated GitHub repositories and Jupyter notebooks, and the notebooks' reproducibility. We took this dataset, converted it into semantic triples and loaded these into a triple store to create a knowledge graph - FAIR Jupyter - that we made accessible via a web service. This enables granular data exploration and analysis through queries that can be tailored to specific use cases. Such queries may provide details about any of the variables from the original dataset, highlight relationships between them or combine some of the graph’s content with materials from corresponding external resources. We provide a collection of example queries addressing a range of use cases in research and education. We also outline how sets of such queries can be used to profile specific content types, either individually or by class. We conclude by discussing how such a semantically enhanced sharing of complex datasets can both enhance their FAIRness - i.e., their findability, accessibility, interoperability, and reusability - and help identify and communicate best practices, particularly with regards to data quality, standardization, automation and reproducibility.

Cite as

Sheeba Samuel and Daniel Mietchen. FAIR Jupyter: A Knowledge Graph Approach to Semantic Sharing and Granular Exploration of a Computational Notebook Reproducibility Dataset. In Special Issue on Resources for Graph Data and Knowledge. Transactions on Graph Data and Knowledge (TGDK), Volume 2, Issue 2, pp. 4:1-4:24, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2024)

Copy BibTex To Clipboard

@Article{samuel_et_al:TGDK.2.2.4,
  author =	{Samuel, Sheeba and Mietchen, Daniel},
  title =	{{FAIR Jupyter: A Knowledge Graph Approach to Semantic Sharing and Granular Exploration of a Computational Notebook Reproducibility Dataset}},
  journal =	{Transactions on Graph Data and Knowledge},
  pages =	{4:1--4:24},
  ISSN =	{2942-7517},
  year =	{2024},
  volume =	{2},
  number =	{2},
  publisher =	{Schloss Dagstuhl -- Leibniz-Zentrum f{\"u}r Informatik},
  address =	{Dagstuhl, Germany},
  URL =		{https://drops.dagstuhl.de/entities/document/10.4230/TGDK.2.2.4},
  URN =		{urn:nbn:de:0030-drops-225886},
  doi =		{10.4230/TGDK.2.2.4},
  annote =	{Keywords: Knowledge Graph, Computational reproducibility, Jupyter notebooks, FAIR data, PubMed Central, GitHub, Python, SPARQL}
}

Document

APPROX

DOI: 10.4230/LIPIcs.APPROX/RANDOM.2023.2

Probabilistic Metric Embedding via Metric Labeling

Authors: Kamesh Munagala, Govind S. Sankar, and Erin Taylor

Published in: LIPIcs, Volume 275, Approximation, Randomization, and Combinatorial Optimization. Algorithms and Techniques (APPROX/RANDOM 2023)

Abstract

We consider probabilistic embedding of metric spaces into ultra-metrics (or equivalently to a constant factor, into hierarchically separated trees) to minimize the expected distortion of any pairwise distance. Such embeddings have been widely used in network design and online algorithms. Our main result is a polynomial time algorithm that approximates the optimal distortion on any instance to within a constant factor. We achieve this via a novel LP formulation that reduces this problem to a probabilistic version of uniform metric labeling.

Cite as

Kamesh Munagala, Govind S. Sankar, and Erin Taylor. Probabilistic Metric Embedding via Metric Labeling. In Approximation, Randomization, and Combinatorial Optimization. Algorithms and Techniques (APPROX/RANDOM 2023). Leibniz International Proceedings in Informatics (LIPIcs), Volume 275, pp. 2:1-2:10, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2023)

Copy BibTex To Clipboard

@InProceedings{munagala_et_al:LIPIcs.APPROX/RANDOM.2023.2,
  author =	{Munagala, Kamesh and Sankar, Govind S. and Taylor, Erin},
  title =	{{Probabilistic Metric Embedding via Metric Labeling}},
  booktitle =	{Approximation, Randomization, and Combinatorial Optimization. Algorithms and Techniques (APPROX/RANDOM 2023)},
  pages =	{2:1--2:10},
  series =	{Leibniz International Proceedings in Informatics (LIPIcs)},
  ISBN =	{978-3-95977-296-9},
  ISSN =	{1868-8969},
  year =	{2023},
  volume =	{275},
  editor =	{Megow, Nicole and Smith, Adam},
  publisher =	{Schloss Dagstuhl -- Leibniz-Zentrum f{\"u}r Informatik},
  address =	{Dagstuhl, Germany},
  URL =		{https://drops.dagstuhl.de/entities/document/10.4230/LIPIcs.APPROX/RANDOM.2023.2},
  URN =		{urn:nbn:de:0030-drops-188279},
  doi =		{10.4230/LIPIcs.APPROX/RANDOM.2023.2},
  annote =	{Keywords: Metric Embedding, Approximation Algorithms, Ultrametrics}
}

Document

DOI: 10.4230/LIPIcs.ISAAC.2022.35

Multi-Robot Motion Planning for Unit Discs with Revolving Areas

Authors: Pankaj K. Agarwal, Tzvika Geft, Dan Halperin, and Erin Taylor

Published in: LIPIcs, Volume 248, 33rd International Symposium on Algorithms and Computation (ISAAC 2022)

Abstract

We study the problem of motion planning for a collection of n labeled unit disc robots in a polygonal environment. We assume that the robots have revolving areas around their start and final positions: that each start and each final is contained in a radius 2 disc lying in the free space, not necessarily concentric with the start or final position, which is free from other start or final positions. This assumption allows a weakly-monotone motion plan, in which robots move according to an ordering as follows: during the turn of a robot R in the ordering, it moves fully from its start to final position, while other robots do not leave their revolving areas. As R passes through a revolving area, a robot R' that is inside this area may move within the revolving area to avoid a collision. Notwithstanding the existence of a motion plan, we show that minimizing the total traveled distance in this setting, specifically even when the motion plan is restricted to be weakly-monotone, is APX-hard, ruling out any polynomial-time (1+ε)-approximation algorithm. On the positive side, we present the first constant-factor approximation algorithm for computing a feasible weakly-monotone motion plan. The total distance traveled by the robots is within an O(1) factor of that of the optimal motion plan, which need not be weakly monotone. Our algorithm extends to an online setting in which the polygonal environment is fixed but the initial and final positions of robots are specified in an online manner. Finally, we observe that the overhead in the overall cost that we add while editing the paths to avoid robot-robot collision can vary significantly depending on the ordering we chose. Finding the best ordering in this respect is known to be NP-hard, and we provide a polynomial time O(log n log log n)-approximation algorithm for this problem.

Cite as

Pankaj K. Agarwal, Tzvika Geft, Dan Halperin, and Erin Taylor. Multi-Robot Motion Planning for Unit Discs with Revolving Areas. In 33rd International Symposium on Algorithms and Computation (ISAAC 2022). Leibniz International Proceedings in Informatics (LIPIcs), Volume 248, pp. 35:1-35:20, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2022)

Copy BibTex To Clipboard

@InProceedings{agarwal_et_al:LIPIcs.ISAAC.2022.35,
  author =	{Agarwal, Pankaj K. and Geft, Tzvika and Halperin, Dan and Taylor, Erin},
  title =	{{Multi-Robot Motion Planning for Unit Discs with Revolving Areas}},
  booktitle =	{33rd International Symposium on Algorithms and Computation (ISAAC 2022)},
  pages =	{35:1--35:20},
  series =	{Leibniz International Proceedings in Informatics (LIPIcs)},
  ISBN =	{978-3-95977-258-7},
  ISSN =	{1868-8969},
  year =	{2022},
  volume =	{248},
  editor =	{Bae, Sang Won and Park, Heejin},
  publisher =	{Schloss Dagstuhl -- Leibniz-Zentrum f{\"u}r Informatik},
  address =	{Dagstuhl, Germany},
  URL =		{https://drops.dagstuhl.de/entities/document/10.4230/LIPIcs.ISAAC.2022.35},
  URN =		{urn:nbn:de:0030-drops-173204},
  doi =		{10.4230/LIPIcs.ISAAC.2022.35},
  annote =	{Keywords: motion planning, optimal motion planning, approximation, complexity, NP-hardness}
}

Document

DOI: 10.4230/LIPIcs.FSTTCS.2020.8

Clustering Under Perturbation Stability in Near-Linear Time

Authors: Pankaj K. Agarwal, Hsien-Chih Chang, Kamesh Munagala, Erin Taylor, and Emo Welzl

Published in: LIPIcs, Volume 182, 40th IARCS Annual Conference on Foundations of Software Technology and Theoretical Computer Science (FSTTCS 2020)

Abstract

We consider the problem of center-based clustering in low-dimensional Euclidean spaces under the perturbation stability assumption. An instance is α-stable if the underlying optimal clustering continues to remain optimal even when all pairwise distances are arbitrarily perturbed by a factor of at most α. Our main contribution is in presenting efficient exact algorithms for α-stable clustering instances whose running times depend near-linearly on the size of the data set when α ≥ 2 + √3. For k-center and k-means problems, our algorithms also achieve polynomial dependence on the number of clusters, k, when α ≥ 2 + √3 + ε for any constant ε > 0 in any fixed dimension. For k-median, our algorithms have polynomial dependence on k for α > 5 in any fixed dimension; and for α ≥ 2 + √3 in two dimensions. Our algorithms are simple, and only require applying techniques such as local search or dynamic programming to a suitably modified metric space, combined with careful choice of data structures.

Cite as

Pankaj K. Agarwal, Hsien-Chih Chang, Kamesh Munagala, Erin Taylor, and Emo Welzl. Clustering Under Perturbation Stability in Near-Linear Time. In 40th IARCS Annual Conference on Foundations of Software Technology and Theoretical Computer Science (FSTTCS 2020). Leibniz International Proceedings in Informatics (LIPIcs), Volume 182, pp. 8:1-8:16, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2020)

Copy BibTex To Clipboard

@InProceedings{agarwal_et_al:LIPIcs.FSTTCS.2020.8,
  author =	{Agarwal, Pankaj K. and Chang, Hsien-Chih and Munagala, Kamesh and Taylor, Erin and Welzl, Emo},
  title =	{{Clustering Under Perturbation Stability in Near-Linear Time}},
  booktitle =	{40th IARCS Annual Conference on Foundations of Software Technology and Theoretical Computer Science (FSTTCS 2020)},
  pages =	{8:1--8:16},
  series =	{Leibniz International Proceedings in Informatics (LIPIcs)},
  ISBN =	{978-3-95977-174-0},
  ISSN =	{1868-8969},
  year =	{2020},
  volume =	{182},
  editor =	{Saxena, Nitin and Simon, Sunil},
  publisher =	{Schloss Dagstuhl -- Leibniz-Zentrum f{\"u}r Informatik},
  address =	{Dagstuhl, Germany},
  URL =		{https://drops.dagstuhl.de/entities/document/10.4230/LIPIcs.FSTTCS.2020.8},
  URN =		{urn:nbn:de:0030-drops-132492},
  doi =		{10.4230/LIPIcs.FSTTCS.2020.8},
  annote =	{Keywords: clustering, stability, local search, dynamic programming, coreset, polyhedral metric, trapezoid decomposition, range query}
}

@InProceedings{agarwal_et_al:LIPIcs.FSTTCS.2020.8,
  author =	{Agarwal, Pankaj K. and Chang, Hsien-Chih and Munagala, Kamesh and Taylor, Erin and Welzl, Emo},
  title =	{{Clustering Under Perturbation Stability in Near-Linear Time}},
  booktitle =	{40th IARCS Annual Conference on Foundations of Software Technology and Theoretical Computer Science (FSTTCS 2020)},
  pages =	{8:1--8:16},
  series =	{Leibniz International Proceedings in Informatics (LIPIcs)},
  ISBN =	{978-3-95977-174-0},
  ISSN =	{1868-8969},
  year =	{2020},
  volume =	{182},
  editor =	{Saxena, Nitin and Simon, Sunil},
  publisher =	{Schloss Dagstuhl -- Leibniz-Zentrum f{\"u}r Informatik},
  address =	{Dagstuhl, Germany},
  URL =		{https://drops.dagstuhl.de/entities/document/10.4230/LIPIcs.FSTTCS.2020.8},
  URN =		{urn:nbn:de:0030-drops-132492},
  doi =		{10.4230/LIPIcs.FSTTCS.2020.8},
  annote =	{Keywords: clustering, stability, local search, dynamic programming, coreset, polyhedral metric, trapezoid decomposition, range query}
}

Document

DOI: 10.4230/LIPIcs.SoCG.2020.58

k-Median Clustering Under Discrete Fréchet and Hausdorff Distances

Authors: Abhinandan Nath and Erin Taylor

Published in: LIPIcs, Volume 164, 36th International Symposium on Computational Geometry (SoCG 2020)

Abstract

We give the first near-linear time (1+ε)-approximation algorithm for k-median clustering of polygonal trajectories under the discrete Fréchet distance, and the first polynomial time (1+ε)-approximation algorithm for k-median clustering of finite point sets under the Hausdorff distance, provided the cluster centers, ambient dimension, and k are bounded by a constant. The main technique is a general framework for solving clustering problems where the cluster centers are restricted to come from a simpler metric space. We precisely characterize conditions on the simpler metric space of the cluster centers that allow faster (1+ε)-approximations for the k-median problem. We also show that the k-median problem under Hausdorff distance is NP-Hard.

Cite as

Abhinandan Nath and Erin Taylor. k-Median Clustering Under Discrete Fréchet and Hausdorff Distances. In 36th International Symposium on Computational Geometry (SoCG 2020). Leibniz International Proceedings in Informatics (LIPIcs), Volume 164, pp. 58:1-58:15, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2020)

Copy BibTex To Clipboard

@InProceedings{nath_et_al:LIPIcs.SoCG.2020.58,
  author =	{Nath, Abhinandan and Taylor, Erin},
  title =	{{k-Median Clustering Under Discrete Fr\'{e}chet and Hausdorff Distances}},
  booktitle =	{36th International Symposium on Computational Geometry (SoCG 2020)},
  pages =	{58:1--58:15},
  series =	{Leibniz International Proceedings in Informatics (LIPIcs)},
  ISBN =	{978-3-95977-143-6},
  ISSN =	{1868-8969},
  year =	{2020},
  volume =	{164},
  editor =	{Cabello, Sergio and Chen, Danny Z.},
  publisher =	{Schloss Dagstuhl -- Leibniz-Zentrum f{\"u}r Informatik},
  address =	{Dagstuhl, Germany},
  URL =		{https://drops.dagstuhl.de/entities/document/10.4230/LIPIcs.SoCG.2020.58},
  URN =		{urn:nbn:de:0030-drops-122161},
  doi =		{10.4230/LIPIcs.SoCG.2020.58},
  annote =	{Keywords: Clustering, k-median, trajectories, point sets, discrete Fr\'{e}chet distance, Hausdorff distance}
}

15 Search Results for "Taylor, Erin"

Abstract

Cite as

Abstract

Cite as

Abstract

Cite as

Abstract

Cite as

Abstract

Cite as

Abstract

Cite as

Abstract

Cite as

Abstract

Cite as

Abstract

Cite as

Abstract

Cite as

Abstract

Cite as

Abstract

Cite as

Abstract

Cite as

Abstract

Cite as

Abstract

Cite as

Thanks for your feedback!

Could not send message