26 Search Results for "Gu, Yan"


Document
Parallel Joinable B-Trees in the Fork-Join I/O Model

Authors: Michael T. Goodrich, Yan Gu, Ryuto Kitagawa, and Yihan Sun

Published in: LIPIcs, Volume 359, 36th International Symposium on Algorithms and Computation (ISAAC 2025)


Abstract
Balanced search trees are widely used in computer science to efficiently maintain dynamic ordered data. To support efficient set operations (e.g., union, intersection, difference) on trees, the join-based framework is widely studied. This framework has received particular attention in the parallel setting, where it has been shown to enable simple and theoretically efficient set operations on trees. Despite the widespread adoption of parallel join-based trees, a major drawback of previous work on such data structures is the inefficiency of their input/output (I/O) access patterns. Some recent work (e.g., C-trees and PaC-trees) focused on more I/O-friendly implementations of these algorithms. Surprisingly, however, there have been no results on bounding the I/O costs of these algorithms, and it remains open whether they can provide tight, provable guarantees on I/O costs for trees. This paper studies efficient parallel algorithms for set operations on search trees in the join-based framework, with a special focus on achieving I/O efficiency. To better capture the I/O efficiency of these algorithms in the parallel setting, we introduce a new computational model, the Fork-Join I/O Model, which measures the total block transfers (I/O work) and their critical path (I/O span). Under this model, we propose a new solution based on B-trees. Our parallel algorithm computes the union, intersection, and difference of two B-trees with O(m log_B(n/m)) I/O work and O(log_B m ⋅ log₂ log_B n + log_B n) I/O span, where n and m ≤ n are the sizes of the two trees and B is the block size.
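
The join-based framework mentioned above builds all set operations from a single join primitive (plus split). As a rough illustration only, and not the paper's B-tree algorithm or its I/O analysis, the sketch below shows join-based union on plain binary search trees; Node, split, and join here are simplified stand-ins.

```python
# Minimal sketch of the join-based framework on plain binary search trees.
# This is NOT the paper's parallel B-tree algorithm; it only illustrates how
# union can be expressed with split and join.

class Node:
    def __init__(self, key, left=None, right=None):
        self.key, self.left, self.right = key, left, right

def join(left, key, right):
    # Assumes all keys in `left` < key < all keys in `right`.
    # A balanced implementation (e.g., on B-trees) would rebalance here.
    return Node(key, left, right)

def split(t, key):
    # Split t into (keys < key, key present?, keys > key).
    if t is None:
        return None, False, None
    if key < t.key:
        l, found, r = split(t.left, key)
        return l, found, join(r, t.key, t.right)
    if key > t.key:
        l, found, r = split(t.right, key)
        return join(t.left, t.key, l), found, r
    return t.left, True, t.right

def union(t1, t2):
    # The two recursive calls are independent, which is what makes the
    # framework amenable to fork-join parallelism.
    if t1 is None:
        return t2
    if t2 is None:
        return t1
    l2, _, r2 = split(t2, t1.key)
    return join(union(t1.left, l2), t1.key, union(t1.right, r2))
```

The paper instantiates this kind of recursion on B-trees and analyzes it in the Fork-Join I/O Model, counting block transfers (I/O work) and their critical path (I/O span) rather than comparisons.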

Cite as

Michael T. Goodrich, Yan Gu, Ryuto Kitagawa, and Yihan Sun. Parallel Joinable B-Trees in the Fork-Join I/O Model. In 36th International Symposium on Algorithms and Computation (ISAAC 2025). Leibniz International Proceedings in Informatics (LIPIcs), Volume 359, pp. 37:1-37:21, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2025)


BibTeX

@InProceedings{goodrich_et_al:LIPIcs.ISAAC.2025.37,
  author =	{Goodrich, Michael T. and Gu, Yan and Kitagawa, Ryuto and Sun, Yihan},
  title =	{{Parallel Joinable B-Trees in the Fork-Join I/O Model}},
  booktitle =	{36th International Symposium on Algorithms and Computation (ISAAC 2025)},
  pages =	{37:1--37:21},
  series =	{Leibniz International Proceedings in Informatics (LIPIcs)},
  ISBN =	{978-3-95977-408-6},
  ISSN =	{1868-8969},
  year =	{2025},
  volume =	{359},
  editor =	{Chen, Ho-Lin and Hon, Wing-Kai and Tsai, Meng-Tsung},
  publisher =	{Schloss Dagstuhl -- Leibniz-Zentrum f{\"u}r Informatik},
  address =	{Dagstuhl, Germany},
  URL =		{https://drops.dagstuhl.de/entities/document/10.4230/LIPIcs.ISAAC.2025.37},
  URN =		{urn:nbn:de:0030-drops-249451},
  doi =		{10.4230/LIPIcs.ISAAC.2025.37},
  annote =	{Keywords: Parallel algorithm, I/O efficiency, search trees, B-trees}
}
Document
Research
GraphRAG on Technical Documents - Impact of Knowledge Graph Schema

Authors: Henri Scaffidi, Melinda Hodkiewicz, Caitlin Woods, and Nicole Roocke

Published in: TGDK, Volume 3, Issue 2 (2025), Transactions on Graph Data and Knowledge


Abstract
Retrieval Augmented Generation (RAG) is seeing rapid adoption in industry to enable employees to query information captured in proprietary data for their organisation. In this work, we test the impact of domain-relevant knowledge graph schemas on the results of Microsoft’s GraphRAG pipeline. Our approach aims to address the poor quality of GraphRAG responses on technical reports rich in domain-specific terms. The use case involves technical reports about geology, chemistry and mineral processing published by the Minerals Research Institute of Western Australia (MRIWA). Four schemas are considered: a simple five-class minerals domain expert-developed schema, an expanded minerals domain schema, the Microsoft GraphRAG auto-generated schema, and a schema-less GraphRAG. These are compared to a conventional baseline RAG. Performance is evaluated using a scoring approach that accounts for the mix of correct, incorrect, additional, and missing content in RAG responses. The results show that the simple five-class minerals domain schema extracts approximately 10% more entities from the MRIWA reports than the other schema options. Additionally, both the five-class and the expanded eight-class minerals domain schemas produce the most factually correct answers and the fewest hallucinations. We attribute this to the minerals-specific schemas extracting more relevant, domain-specific information during the Indexing stage. As a result, the Query stage’s context window includes more high-value content. This contributes to the observed improvement in answer quality compared to the other pipelines. In contrast, pipelines with fewer domain-related entities in the KG retrieve less valuable information, leaving more room for irrelevant content in the context window. Baseline RAG responses were typically shorter, less complete, and contained more hallucinations compared to our GraphRAG pipelines. We provide a complete set of resources at https://github.com/nlp-tlp/GraphRAG-on-Minerals-Domain/tree/main. These resources include links to the MRIWA reports, a set of questions (from simple to challenging) along with domain-expert curated answers, schemas, and evaluations of the pipelines.
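
The evaluation above scores responses by the mix of correct, incorrect, additional, and missing content. The paper's exact scoring function is not reproduced here; the snippet below is a hypothetical illustration of that style of accounting, with made-up weights and field names.

```python
from dataclasses import dataclass

@dataclass
class ScoredResponse:
    # Counts of statements in a RAG answer, as judged against a
    # domain-expert reference answer (hypothetical breakdown).
    correct: int      # facts matching the reference
    incorrect: int    # hallucinated or contradicted facts
    additional: int   # true but not asked-for content
    missing: int      # reference facts absent from the answer

def answer_score(r: ScoredResponse,
                 w_incorrect=1.0, w_additional=0.25, w_missing=0.5):
    """Toy score: reward correct content, penalize the rest.
    The weights are illustrative, not the paper's."""
    total = r.correct + r.incorrect + r.additional + r.missing
    if total == 0:
        return 0.0
    penalty = (w_incorrect * r.incorrect
               + w_additional * r.additional
               + w_missing * r.missing)
    return max(0.0, (r.correct - penalty) / total)

# Example: an answer with 6 correct facts, 1 hallucination, 2 missing facts.
print(answer_score(ScoredResponse(correct=6, incorrect=1, additional=0, missing=2)))
```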

Cite as

Henri Scaffidi, Melinda Hodkiewicz, Caitlin Woods, and Nicole Roocke. GraphRAG on Technical Documents - Impact of Knowledge Graph Schema. In Transactions on Graph Data and Knowledge (TGDK), Volume 3, Issue 2, pp. 3:1-3:24, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2025)


BibTeX

@Article{scaffidi_et_al:TGDK.3.2.3,
  author =	{Scaffidi, Henri and Hodkiewicz, Melinda and Woods, Caitlin and Roocke, Nicole},
  title =	{{GraphRAG on Technical Documents - Impact of Knowledge Graph Schema}},
  journal =	{Transactions on Graph Data and Knowledge},
  pages =	{3:1--3:24},
  ISSN =	{2942-7517},
  year =	{2025},
  volume =	{3},
  number =	{2},
  publisher =	{Schloss Dagstuhl -- Leibniz-Zentrum f{\"u}r Informatik},
  address =	{Dagstuhl, Germany},
  URL =		{https://drops.dagstuhl.de/entities/document/10.4230/TGDK.3.2.3},
  URN =		{urn:nbn:de:0030-drops-248131},
  doi =		{10.4230/TGDK.3.2.3},
  annote =	{Keywords: RAG, minerals, local search, global search, entity extraction, competency questions}
}
Document
Extended Abstract
Towards a Java Virtual Machine for Processing-In-Memory (Extended Abstract)

Authors: Kazuki Ichinose, Shigeyuki Sato, and Tomoharu Ugawa

Published in: OASIcs, Volume 134, Companion Proceedings of the 9th International Conference on the Art, Science, and Engineering of Programming (Programming 2025)


Abstract
Processing-in-Memory (PIM) is a computing paradigm in which computation takes place in or near memory devices, offering high-bandwidth yet energy-efficient data-parallel processing. Real-world PIM systems have recently emerged, and they support SPMD-style programming in C. However, high-level object-oriented programming in managed languages has never been studied in this setting. Pursuing high-level programming for offloading Java applications to PIM processors, we are developing a Java framework for this purpose. As a status report on our project, we present our prototype Java VM built upon a real-world PIM system and experimentally demonstrate its scalability. The experimental results show the potential of our Java VM on a PIM system with thousands of PIM processors.

Cite as

Kazuki Ichinose, Shigeyuki Sato, and Tomoharu Ugawa. Towards a Java Virtual Machine for Processing-In-Memory (Extended Abstract). In Companion Proceedings of the 9th International Conference on the Art, Science, and Engineering of Programming (Programming 2025). Open Access Series in Informatics (OASIcs), Volume 134, pp. 2:1-2:5, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2025)


BibTeX

@InProceedings{ichinose_et_al:OASIcs.Programming.2025.2,
  author =	{Ichinose, Kazuki and Sato, Shigeyuki and Ugawa, Tomoharu},
  title =	{{Towards a Java Virtual Machine for Processing-In-Memory}},
  booktitle =	{Companion Proceedings of the 9th International Conference on the Art, Science, and Engineering of Programming (Programming 2025)},
  pages =	{2:1--2:5},
  series =	{Open Access Series in Informatics (OASIcs)},
  ISBN =	{978-3-95977-382-9},
  ISSN =	{2190-6807},
  year =	{2025},
  volume =	{134},
  editor =	{Edwards, Jonathan and Perera, Roly and Petricek, Tomas},
  publisher =	{Schloss Dagstuhl -- Leibniz-Zentrum f{\"u}r Informatik},
  address =	{Dagstuhl, Germany},
  URL =		{https://drops.dagstuhl.de/entities/document/10.4230/OASIcs.Programming.2025.2},
  URN =		{urn:nbn:de:0030-drops-242862},
  doi =		{10.4230/OASIcs.Programming.2025.2},
  annote =	{Keywords: Java VM, Processing-in-Memory, Offloading, Data parallelism}
}
Document
B-Treaps Revised: Write Efficient Randomized Block Search Trees with High Load

Authors: Roodabeh Safavi and Martin P. Seybold

Published in: LIPIcs, Volume 349, 19th International Symposium on Algorithms and Data Structures (WADS 2025)


Abstract
Uniquely represented (UR) data structures represent each logical state with a unique storage state. We study the problem of maintaining a dynamic set of n keys from a totally ordered universe in this context. UR structures are also called "strongly history independent" structures in the literature. We introduce a two-layer data structure called (α,ε)-Randomized Block Search Tree (RBST) that is uniquely represented and suitable for external memory (EM). Though RBSTs naturally generalize the well-known binary Treaps, several new ideas are needed to analyze the expected search, update, and storage efficiency in terms of block-reads, block-writes, and blocks stored. We prove that searches have O(ε^{-1} + log_α n) block-reads, that dynamic updates perform O(ε^{-1} + log_α(n)/α) block-writes and O(ε^{-2}+(1+(ε^{-1}+log n)/α)log_α n) block-reads, and that (α, ε)-RBSTs have an asymptotic load-factor of at least (1-ε) for every ε ∈ (0,1/2]. Thus (α, ε)-RBSTs improve on the known, uniquely represented B-Treap [Golovin; ICALP'09]. Compared with non-UR structures, the RBST is also, to the best of our knowledge, the first external memory structure that is storage-efficient and has a non-amortized, write-efficient update bound.

Cite as

Roodabeh Safavi and Martin P. Seybold. B-Treaps Revised: Write Efficient Randomized Block Search Trees with High Load. In 19th International Symposium on Algorithms and Data Structures (WADS 2025). Leibniz International Proceedings in Informatics (LIPIcs), Volume 349, pp. 47:1-47:23, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2025)


BibTeX

@InProceedings{safavi_et_al:LIPIcs.WADS.2025.47,
  author =	{Safavi, Roodabeh and Seybold, Martin P.},
  title =	{{B-Treaps Revised: Write Efficient Randomized Block Search Trees with High Load}},
  booktitle =	{19th International Symposium on Algorithms and Data Structures (WADS 2025)},
  pages =	{47:1--47:23},
  series =	{Leibniz International Proceedings in Informatics (LIPIcs)},
  ISBN =	{978-3-95977-398-0},
  ISSN =	{1868-8969},
  year =	{2025},
  volume =	{349},
  editor =	{Morin, Pat and Oh, Eunjin},
  publisher =	{Schloss Dagstuhl -- Leibniz-Zentrum f{\"u}r Informatik},
  address =	{Dagstuhl, Germany},
  URL =		{https://drops.dagstuhl.de/entities/document/10.4230/LIPIcs.WADS.2025.47},
  URN =		{urn:nbn:de:0030-drops-242786},
  doi =		{10.4230/LIPIcs.WADS.2025.47},
  annote =	{Keywords: Unique Representation, Randomization, Top-Down Analysis, Block Search Tree, Write-Efficiency, Storage-Efficiency}
}
Document
On the I/O Complexity of the Cocke-Younger-Kasami Algorithm and of a Family of Related Dynamic Programming Algorithms

Authors: Lorenzo De Stefani and Vedant Gupta

Published in: LIPIcs, Volume 349, 19th International Symposium on Algorithms and Data Structures (WADS 2025)


Abstract
Asymptotically tight lower bounds are derived for the Input/Output (I/O) complexity of a class of dynamic programming algorithms, including matrix chain multiplication, optimal polygon triangulation, and the construction of optimal binary search trees. Assuming no recomputation of intermediate values, we establish an Ω(n³/(√M B)) I/O lower bound, where n denotes the size of the input and M denotes the size of the available fast memory (cache). When recomputation is allowed, we show that the same bound holds for M < cn, where c is a positive constant. In the case where M ≥ 2n, we show an Ω(n/B) I/O lower bound. We also discuss algorithms for which the number of executed I/O operations matches asymptotically each of the presented lower bounds, which are thus asymptotically tight. Additionally, we refine our general method to obtain a lower bound for the I/O complexity of the Cocke-Younger-Kasami algorithm, where the size of the grammar impacts the I/O complexity. An upper bound with asymptotically matching performance in many cases is also provided.

Cite as

Lorenzo De Stefani and Vedant Gupta. On the I/O Complexity of the Cocke-Younger-Kasami Algorithm and of a Family of Related Dynamic Programming Algorithms. In 19th International Symposium on Algorithms and Data Structures (WADS 2025). Leibniz International Proceedings in Informatics (LIPIcs), Volume 349, pp. 49:1-49:24, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2025)


BibTeX

@InProceedings{destefani_et_al:LIPIcs.WADS.2025.49,
  author =	{De Stefani, Lorenzo and Gupta, Vedant},
  title =	{{On the I/O Complexity of the Cocke-Younger-Kasami Algorithm and of a Family of Related Dynamic Programming Algorithms}},
  booktitle =	{19th International Symposium on Algorithms and Data Structures (WADS 2025)},
  pages =	{49:1--49:24},
  series =	{Leibniz International Proceedings in Informatics (LIPIcs)},
  ISBN =	{978-3-95977-398-0},
  ISSN =	{1868-8969},
  year =	{2025},
  volume =	{349},
  editor =	{Morin, Pat and Oh, Eunjin},
  publisher =	{Schloss Dagstuhl -- Leibniz-Zentrum f{\"u}r Informatik},
  address =	{Dagstuhl, Germany},
  URL =		{https://drops.dagstuhl.de/entities/document/10.4230/LIPIcs.WADS.2025.49},
  URN =		{urn:nbn:de:0030-drops-242800},
  doi =		{10.4230/LIPIcs.WADS.2025.49},
  annote =	{Keywords: I/O complexity, Dynamic Programming Algorithms, Lower Bounds, Recomputation, Cocke-Younger-Kasami}
}
Document
Optimal Concolic Dynamic Partial Order Reduction

Authors: Mohammad Hossein Khoshechin Jorshari, Michalis Kokologiannakis, Rupak Majumdar, and Srinidhi Nagendra

Published in: LIPIcs, Volume 348, 36th International Conference on Concurrency Theory (CONCUR 2025)


Abstract
Stateless model checking (SMC) of software implementations requires exploring both concurrency and data nondeterminism. Unfortunately, most SMC algorithms focus on efficient exploration of concurrency nondeterminism, thereby neglecting an important source of bugs. We present ConDpor, an SMC algorithm for unmodified Java programs that combines optimal dynamic partial order reduction (DPOR) for concurrency nondeterminism with concolic execution for data nondeterminism. ConDpor is sound, complete, optimal, and parametric w.r.t. the memory consistency model. Our experiments confirm that ConDpor is exponentially faster than DPOR with small-domain enumeration. Overall, ConDpor opens the door to efficient exploration of concurrent programs with data nondeterminism.

Cite as

Mohammad Hossein Khoshechin Jorshari, Michalis Kokologiannakis, Rupak Majumdar, and Srinidhi Nagendra. Optimal Concolic Dynamic Partial Order Reduction. In 36th International Conference on Concurrency Theory (CONCUR 2025). Leibniz International Proceedings in Informatics (LIPIcs), Volume 348, pp. 26:1-26:22, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2025)


BibTeX

@InProceedings{khoshechinjorshari_et_al:LIPIcs.CONCUR.2025.26,
  author =	{Khoshechin Jorshari, Mohammad Hossein and Kokologiannakis, Michalis and Majumdar, Rupak and Nagendra, Srinidhi},
  title =	{{Optimal Concolic Dynamic Partial Order Reduction}},
  booktitle =	{36th International Conference on Concurrency Theory (CONCUR 2025)},
  pages =	{26:1--26:22},
  series =	{Leibniz International Proceedings in Informatics (LIPIcs)},
  ISBN =	{978-3-95977-389-8},
  ISSN =	{1868-8969},
  year =	{2025},
  volume =	{348},
  editor =	{Bouyer, Patricia and van de Pol, Jaco},
  publisher =	{Schloss Dagstuhl -- Leibniz-Zentrum f{\"u}r Informatik},
  address =	{Dagstuhl, Germany},
  URL =		{https://drops.dagstuhl.de/entities/document/10.4230/LIPIcs.CONCUR.2025.26},
  URN =		{urn:nbn:de:0030-drops-239765},
  doi =		{10.4230/LIPIcs.CONCUR.2025.26},
  annote =	{Keywords: Stateless model checking, dynamic symbolic execution}
}
Document
What, When, and Where Do You Mean? Detecting Spatio-Temporal Concept Drift in Scientific Texts

Authors: Meilin Shi, Krzysztof Janowicz, Zilong Liu, Mina Karimi, Ivan Majic, and Alexandra Fortacz

Published in: LIPIcs, Volume 346, 13th International Conference on Geographic Information Science (GIScience 2025)


Abstract
Inundated by rapidly expanding AI research, the research community requires more effective research data management than ever. A key challenge lies in the evolving nature of concepts embedded in the growing body of research publications. As concepts evolve over time (e.g., keywords like global warming become more commonly referred to as climate change), past research may become harder to find and interpret in a modern context. This phenomenon, known as concept drift, affects how research topics and keywords are understood, categorized, and retrieved. Beyond temporal drift, such variations also occur across geographic space, reflecting differences in local policies, research priorities, and so forth. In this work, we introduce the notion of spatio-temporal concept drift to capture how concepts in scientific texts evolve across both space and time. Using a scientometric dataset in geographic information science, we detect how research keywords drifted across countries and years using word embeddings. By detecting spatio-temporal concept drift, we can better align archival research and bridge regional differences, ensuring scientific knowledge remains findable and interoperable within evolving research landscapes.
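
The abstract detects keyword drift across countries and years using word embeddings. As a rough sketch of one common way to quantify such drift, and not the authors' pipeline, one can compare a term's vectors from two corpus slices via cosine similarity, assuming the two embedding spaces have already been aligned (e.g., with orthogonal Procrustes).

```python
import numpy as np

def cosine(u, v):
    return float(np.dot(u, v) / (np.linalg.norm(u) * np.linalg.norm(v)))

def drift_score(emb_slice_a: dict, emb_slice_b: dict, term: str) -> float:
    """Hypothetical drift score for `term` between two corpus slices
    (e.g., publications from different countries or years):
    1 - cosine similarity of its vectors in pre-aligned embedding spaces."""
    return 1.0 - cosine(emb_slice_a[term], emb_slice_b[term])

# Toy example with random 50-d vectors standing in for trained embeddings.
rng = np.random.default_rng(0)
emb_2005 = {"global warming": rng.normal(size=50)}
emb_2020 = {"global warming": rng.normal(size=50)}
print(drift_score(emb_2005, emb_2020, "global warming"))
```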

Cite as

Meilin Shi, Krzysztof Janowicz, Zilong Liu, Mina Karimi, Ivan Majic, and Alexandra Fortacz. What, When, and Where Do You Mean? Detecting Spatio-Temporal Concept Drift in Scientific Texts. In 13th International Conference on Geographic Information Science (GIScience 2025). Leibniz International Proceedings in Informatics (LIPIcs), Volume 346, pp. 16:1-16:18, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2025)


BibTeX

@InProceedings{shi_et_al:LIPIcs.GIScience.2025.16,
  author =	{Shi, Meilin and Janowicz, Krzysztof and Liu, Zilong and Karimi, Mina and Majic, Ivan and Fortacz, Alexandra},
  title =	{{What, When, and Where Do You Mean? Detecting Spatio-Temporal Concept Drift in Scientific Texts}},
  booktitle =	{13th International Conference on Geographic Information Science (GIScience 2025)},
  pages =	{16:1--16:18},
  series =	{Leibniz International Proceedings in Informatics (LIPIcs)},
  ISBN =	{978-3-95977-378-2},
  ISSN =	{1868-8969},
  year =	{2025},
  volume =	{346},
  editor =	{Sila-Nowicka, Katarzyna and Moore, Antoni and O'Sullivan, David and Adams, Benjamin and Gahegan, Mark},
  publisher =	{Schloss Dagstuhl -- Leibniz-Zentrum f{\"u}r Informatik},
  address =	{Dagstuhl, Germany},
  URL =		{https://drops.dagstuhl.de/entities/document/10.4230/LIPIcs.GIScience.2025.16},
  URN =		{urn:nbn:de:0030-drops-238450},
  doi =		{10.4230/LIPIcs.GIScience.2025.16},
  annote =	{Keywords: Concept Drift, Ontology, Large Language Models, Research Data Management}
}
Document
Precomputed Topological Relations for Integrated Geospatial Analysis Across Knowledge Graphs

Authors: Katrina Schweikert, David K. Kedrowski, Shirly Stephen, and Torsten Hahmann

Published in: LIPIcs, Volume 346, 13th International Conference on Geographic Information Science (GIScience 2025)


Abstract
Geospatial Knowledge Graphs (GeoKGs) represent a significant advancement in the integration of AI-driven geographic information, facilitating interoperable and semantically rich geospatial analytics across various domains. This paper explores the use of topologically enriched GeoKGs, built on an explicit representation of S2 Geometry alongside precomputed topological relations, for constructing efficient geospatial analysis workflows within and across knowledge graphs (KGs). Using the SAWGraph knowledge graph as a case study focused on environmental contamination by PFAS, we demonstrate how this framework supports fundamental GIS operations - such as spatial filtering, proximity analysis, overlay operations, and network analysis - in a GeoKG setting while allowing for the easy linking of these operations with one another and with semantic filters. This enables the efficient execution of complex geospatial analyses as semantically explicit queries and enhances the usability of geospatial data across graphs. Additionally, the framework eliminates the need for explicit support for GeoSPARQL’s topological operations in the utilized graph databases and better integrates spatial knowledge into the overall semantic inference process supported by RDFS and OWL ontologies.

Cite as

Katrina Schweikert, David K. Kedrowski, Shirly Stephen, and Torsten Hahmann. Precomputed Topological Relations for Integrated Geospatial Analysis Across Knowledge Graphs. In 13th International Conference on Geographic Information Science (GIScience 2025). Leibniz International Proceedings in Informatics (LIPIcs), Volume 346, pp. 4:1-4:22, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2025)


BibTeX

@InProceedings{schweikert_et_al:LIPIcs.GIScience.2025.4,
  author =	{Schweikert, Katrina and Kedrowski, David K. and Stephen, Shirly and Hahmann, Torsten},
  title =	{{Precomputed Topological Relations for Integrated Geospatial Analysis Across Knowledge Graphs}},
  booktitle =	{13th International Conference on Geographic Information Science (GIScience 2025)},
  pages =	{4:1--4:22},
  series =	{Leibniz International Proceedings in Informatics (LIPIcs)},
  ISBN =	{978-3-95977-378-2},
  ISSN =	{1868-8969},
  year =	{2025},
  volume =	{346},
  editor =	{Sila-Nowicka, Katarzyna and Moore, Antoni and O'Sullivan, David and Adams, Benjamin and Gahegan, Mark},
  publisher =	{Schloss Dagstuhl -- Leibniz-Zentrum f{\"u}r Informatik},
  address =	{Dagstuhl, Germany},
  URL =		{https://drops.dagstuhl.de/entities/document/10.4230/LIPIcs.GIScience.2025.4},
  URN =		{urn:nbn:de:0030-drops-238332},
  doi =		{10.4230/LIPIcs.GIScience.2025.4},
  annote =	{Keywords: knowledge graph, GeoKG, spatial analysis, ontology, SPARQL, GeoSPARQL, discrete global grid system, S2 geometry, GeoAI, PFAS}
}
Document
Enriching Location Representation with Detailed Semantic Information

Authors: Junyuan Liu, Xinglei Wang, and Tao Cheng

Published in: LIPIcs, Volume 346, 13th International Conference on Geographic Information Science (GIScience 2025)


Abstract
Spatial representations that capture both structural and semantic characteristics of urban environments are essential for urban modeling. Traditional spatial embeddings often prioritize spatial proximity while underutilizing fine-grained contextual information from places. To address this limitation, we introduce CaLLiPer+, an extension of the CaLLiPer model that systematically integrates Point-of-Interest (POI) names alongside categorical labels within a multimodal contrastive learning framework. We evaluate its effectiveness on two downstream tasks - land use classification and socioeconomic status distribution mapping - demonstrating consistent performance gains of 4% to 11% over baseline methods. Additionally, we show that incorporating POI names enhances location retrieval, enabling models to capture complex urban concepts with greater precision. Ablation studies further reveal the complementary role of POI names and the advantages of leveraging pretrained text encoders for spatial representations. Overall, our findings highlight the potential of integrating fine-grained semantic attributes and multimodal learning techniques to advance the development of urban foundation models.
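
CaLLiPer+ aligns location encodings with text encodings of POI names and categories through multimodal contrastive learning. The exact objective is not given here; the sketch below shows a generic symmetric InfoNCE-style loss over a batch of paired location/text embeddings, a standard choice for this kind of alignment and illustrative only.

```python
import numpy as np

def info_nce(loc_emb: np.ndarray, txt_emb: np.ndarray, temperature: float = 0.07):
    """Generic symmetric InfoNCE loss over a batch of paired embeddings
    (row i of loc_emb and row i of txt_emb are a matched pair).
    Illustrative only; not necessarily the exact objective of CaLLiPer+."""
    # L2-normalize both modalities.
    loc = loc_emb / np.linalg.norm(loc_emb, axis=1, keepdims=True)
    txt = txt_emb / np.linalg.norm(txt_emb, axis=1, keepdims=True)
    logits = loc @ txt.T / temperature          # pairwise similarities
    labels = np.arange(len(loc))                # i-th location matches i-th text

    def xent(lg):
        # Cross-entropy with the diagonal as the positive class.
        lg = lg - lg.max(axis=1, keepdims=True)
        logp = lg - np.log(np.exp(lg).sum(axis=1, keepdims=True))
        return -logp[np.arange(len(lg)), labels].mean()

    # Symmetric loss: location -> text and text -> location.
    return 0.5 * (xent(logits) + xent(logits.T))

# Toy batch of 4 paired 32-d embeddings.
rng = np.random.default_rng(1)
print(info_nce(rng.normal(size=(4, 32)), rng.normal(size=(4, 32))))
```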

Cite as

Junyuan Liu, Xinglei Wang, and Tao Cheng. Enriching Location Representation with Detailed Semantic Information. In 13th International Conference on Geographic Information Science (GIScience 2025). Leibniz International Proceedings in Informatics (LIPIcs), Volume 346, pp. 3:1-3:15, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2025)


BibTeX

@InProceedings{liu_et_al:LIPIcs.GIScience.2025.3,
  author =	{Liu, Junyuan and Wang, Xinglei and Cheng, Tao},
  title =	{{Enriching Location Representation with Detailed Semantic Information}},
  booktitle =	{13th International Conference on Geographic Information Science (GIScience 2025)},
  pages =	{3:1--3:15},
  series =	{Leibniz International Proceedings in Informatics (LIPIcs)},
  ISBN =	{978-3-95977-378-2},
  ISSN =	{1868-8969},
  year =	{2025},
  volume =	{346},
  editor =	{Sila-Nowicka, Katarzyna and Moore, Antoni and O'Sullivan, David and Adams, Benjamin and Gahegan, Mark},
  publisher =	{Schloss Dagstuhl -- Leibniz-Zentrum f{\"u}r Informatik},
  address =	{Dagstuhl, Germany},
  URL =		{https://drops.dagstuhl.de/entities/document/10.4230/LIPIcs.GIScience.2025.3},
  URN =		{urn:nbn:de:0030-drops-238322},
  doi =		{10.4230/LIPIcs.GIScience.2025.3},
  annote =	{Keywords: Location Embedding, Contrastive Learning, Pretrained Model}
}
Document
LoRaHART: Hardware-Aware Real-Time Scheduling for LoRa

Authors: Soumya Ranjan Sahoo, Amalinda Gamage, Niraj Kumar, and Arvind Easwaran

Published in: LIPIcs, Volume 335, 37th Euromicro Conference on Real-Time Systems (ECRTS 2025)


Abstract
Time-sensitive data acquisition is critical for many Low-Power Wide-Area Network (LPWAN) applications, such as healthcare monitoring and industrial Internet of Things. Among the available LPWAN technologies, LoRa (Long Range) has emerged as a leading choice, offering kilometer-scale communication with minimal power consumption and enabling high-density deployments across large areas. However, the conventional ALOHA-based Medium Access Control (MAC) in LoRa is not designed to support real-time communication over large-scale networks. This paper introduces LoRaHART, a novel approach that overcomes two critical, under-explored limitations in Commercial Off The Shelf (COTS) LoRa gateways that impact real-time performance. LoRa gateways have limited capacity for demodulation of parallel transmissions and their antenna can either transmit or receive at any time instant. LoRaHART incorporates a hardware-aware super-frame structure, comprising both Time Division Multiple Access (TDMA) slots as well as opportunistic retransmissions using Carrier Sense Multiple Access (CSMA), designed to mitigate the above constraints. We use a partial packing and makespan minimization algorithm to schedule periodic real-time transmissions efficiently within the TDMA slots, and also develop a probabilistic node contention model for CSMA retransmissions, providing analytical guarantees for deadline satisfaction under ideal channel conditions. Our evaluation of LoRaHART on a 40-node LoRa testbed demonstrates significant improvements over existing solutions in practice, achieving an average Packet Reception Ratio of 98% and a 45% higher airtime utilization than the best performing baseline.
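
LoRaHART packs periodic transmissions into TDMA slots with a partial packing and makespan minimization algorithm. The paper's algorithm is not reproduced here; to convey the flavor of makespan minimization under a limited number of parallel demodulation channels, the sketch below uses the classic longest-processing-time (LPT) greedy with made-up airtimes and channel count.

```python
import heapq

def lpt_schedule(airtimes, num_channels):
    """Classic LPT greedy for makespan minimization: assign each job
    (here, a node's transmission airtime in ms) to the currently
    least-loaded channel, longest jobs first. Illustrative only;
    not LoRaHART's actual slot-assignment algorithm."""
    channels = [(0.0, c) for c in range(num_channels)]  # (load, channel id)
    heapq.heapify(channels)
    assignment = {}
    for job, t in sorted(enumerate(airtimes), key=lambda x: -x[1]):
        load, c = heapq.heappop(channels)
        assignment[job] = c
        heapq.heappush(channels, (load + t, c))
    makespan = max(load for load, _ in channels)
    return assignment, makespan

# Eight transmissions scheduled on a gateway that can demodulate
# at most 4 packets in parallel (made-up numbers).
print(lpt_schedule([370, 185, 185, 92, 92, 46, 46, 46], num_channels=4))
```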

Cite as

Soumya Ranjan Sahoo, Amalinda Gamage, Niraj Kumar, and Arvind Easwaran. LoRaHART: Hardware-Aware Real-Time Scheduling for LoRa. In 37th Euromicro Conference on Real-Time Systems (ECRTS 2025). Leibniz International Proceedings in Informatics (LIPIcs), Volume 335, pp. 10:1-10:28, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2025)


BibTeX

@InProceedings{sahoo_et_al:LIPIcs.ECRTS.2025.10,
  author =	{Sahoo, Soumya Ranjan and Gamage, Amalinda and Kumar, Niraj and Easwaran, Arvind},
  title =	{{LoRaHART: Hardware-Aware Real-Time Scheduling for LoRa}},
  booktitle =	{37th Euromicro Conference on Real-Time Systems (ECRTS 2025)},
  pages =	{10:1--10:28},
  series =	{Leibniz International Proceedings in Informatics (LIPIcs)},
  ISBN =	{978-3-95977-377-5},
  ISSN =	{1868-8969},
  year =	{2025},
  volume =	{335},
  editor =	{Mancuso, Renato},
  publisher =	{Schloss Dagstuhl -- Leibniz-Zentrum f{\"u}r Informatik},
  address =	{Dagstuhl, Germany},
  URL =		{https://drops.dagstuhl.de/entities/document/10.4230/LIPIcs.ECRTS.2025.10},
  URN =		{urn:nbn:de:0030-drops-235880},
  doi =		{10.4230/LIPIcs.ECRTS.2025.10},
  annote =	{Keywords: LoRa, LPWAN, Real-time Scheduling, Hardware Constraints}
}
Document
Track A: Algorithms, Complexity and Games
Fitting Tree Metrics and Ultrametrics in Data Streams

Authors: Amir Carmel, Debarati Das, Evangelos Kipouridis, and Evangelos Pipis

Published in: LIPIcs, Volume 334, 52nd International Colloquium on Automata, Languages, and Programming (ICALP 2025)


Abstract
Fitting distances to tree metrics and ultrametrics are two widely used methods in hierarchical clustering, primarily explored within the context of numerical taxonomy. Formally, given a positive distance function D: binom(V,2) → ℝ_{>0}, the goal is to find a tree (or an ultrametric) T including all elements of set V, such that the difference between the distances among vertices in T and those specified by D is minimized. Numerical taxonomy was first introduced by Sneath and Sokal [Nature 1962], and since then it has been studied extensively in both biology and computer science. In this paper, we initiate the study of ultrametric and tree metric fitting problems in the semi-streaming model, where the distances between pairs of elements from V (with |V| = n), defined by the function D, can arrive in an arbitrary order. We study these problems under various distance norms; namely the 𝓁₀ objective, which aims to minimize the number of modified entries in D to fit a tree metric or an ultrametric; the 𝓁₁ objective, which seeks to minimize the total sum of distance errors across all pairs of points in V; and the 𝓁_∞ objective, which focuses on minimizing the maximum error incurred by any entry in D.
- Our first result addresses the 𝓁₀ objective. We provide a single-pass polynomial-time Õ(n)-space O(1) approximation algorithm for ultrametrics and prove that no single-pass exact algorithm exists, even with exponential time.
- Next, we show that the algorithm for 𝓁₀ implies an O(Δ/δ) approximation for the 𝓁₁ objective, where Δ is the maximum, and δ is the minimum, absolute difference between distances in the input. This bound matches the best-known approximation for the RAM model using a combinatorial algorithm when Δ/δ = O(n).
- For the 𝓁_∞ objective, we provide a complete characterization of the ultrametric fitting problem. First, we present a single-pass polynomial-time Õ(n)-space 2-approximation algorithm and show that no better than 2-approximation is possible, even with exponential time. Furthermore, we show that with an additional pass, it is possible to achieve a polynomial-time exact algorithm for ultrametrics.
- Finally, we extend all these results to tree metrics by using only one additional pass through the stream and without asymptotically increasing the approximation factor.
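
For readers less familiar with the objects being fitted above: an ultrametric is a distance function satisfying the three-point condition d(x,z) ≤ max(d(x,y), d(y,z)) for all triples. The brute-force check below only illustrates this definition; it is unrelated to the paper's streaming algorithms.

```python
import itertools
import numpy as np

def is_ultrametric(D: np.ndarray, tol: float = 1e-9) -> bool:
    """Check the three-point ultrametric condition
    D[x, z] <= max(D[x, y], D[y, z]) for all triples.
    Brute-force O(n^3) check, purely for illustration."""
    n = len(D)
    for x, y, z in itertools.permutations(range(n), 3):
        if D[x, z] > max(D[x, y], D[y, z]) + tol:
            return False
    return True

# A 3-point ultrametric: two close points, both equally far from the third.
D = np.array([[0, 1, 4],
              [1, 0, 4],
              [4, 4, 0]], dtype=float)
print(is_ultrametric(D))  # True
```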

Cite as

Amir Carmel, Debarati Das, Evangelos Kipouridis, and Evangelos Pipis. Fitting Tree Metrics and Ultrametrics in Data Streams. In 52nd International Colloquium on Automata, Languages, and Programming (ICALP 2025). Leibniz International Proceedings in Informatics (LIPIcs), Volume 334, pp. 42:1-42:21, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2025)


BibTeX

@InProceedings{carmel_et_al:LIPIcs.ICALP.2025.42,
  author =	{Carmel, Amir and Das, Debarati and Kipouridis, Evangelos and Pipis, Evangelos},
  title =	{{Fitting Tree Metrics and Ultrametrics in Data Streams}},
  booktitle =	{52nd International Colloquium on Automata, Languages, and Programming (ICALP 2025)},
  pages =	{42:1--42:21},
  series =	{Leibniz International Proceedings in Informatics (LIPIcs)},
  ISBN =	{978-3-95977-372-0},
  ISSN =	{1868-8969},
  year =	{2025},
  volume =	{334},
  editor =	{Censor-Hillel, Keren and Grandoni, Fabrizio and Ouaknine, Jo\"{e}l and Puppis, Gabriele},
  publisher =	{Schloss Dagstuhl -- Leibniz-Zentrum f{\"u}r Informatik},
  address =	{Dagstuhl, Germany},
  URL =		{https://drops.dagstuhl.de/entities/document/10.4230/LIPIcs.ICALP.2025.42},
  URN =		{urn:nbn:de:0030-drops-234197},
  doi =		{10.4230/LIPIcs.ICALP.2025.42},
  annote =	{Keywords: Streaming, Clustering, Ultrametrics, Tree metrics, Distance fitting}
}
Document
Chain of Grounded Objectives: Concise Goal-Oriented Prompting for Code Generation

Authors: Sangyeop Yeo, Seung-Won Hwang, and Yu-Seung Ma

Published in: LIPIcs, Volume 333, 39th European Conference on Object-Oriented Programming (ECOOP 2025)


Abstract
The use of Large Language Models (LLMs) for code generation has gained significant attention in recent years. Existing methods often aim to improve the quality of generated code by incorporating additional contextual information or guidance into input prompts. Many of these approaches adopt process-oriented reasoning strategies, mimicking human-like step-by-step thinking; however, they may not always align with the structured nature of programming languages. This paper introduces Chain of Grounded Objectives (CGO), a concise goal-oriented prompting approach that embeds functional objectives into prompts to enhance code generation. By focusing on precisely defined objectives rather than explicit procedural steps, CGO aligns more naturally with programming tasks while retaining flexibility. Empirical evaluations on HumanEval, MBPP, their extended versions, and LiveCodeBench show that CGO achieves accuracy comparable to or better than existing methods while using fewer tokens, making it a more efficient approach to LLM-based code generation.
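
CGO embeds functional objectives into the prompt instead of step-by-step procedural reasoning. The exact prompt template is not shown in the abstract, so the snippet below is a purely hypothetical illustration of what a goal-oriented prompt of this kind might look like.

```python
# Hypothetical illustration of a goal-oriented prompt in the spirit of CGO:
# the prompt states functional objectives rather than procedural steps.
# This is NOT the paper's actual prompt template.
task = ("Write a function two_sum(nums, target) returning the indices of "
        "two numbers in nums that add up to target.")
objectives = [
    "Return the indices of exactly one valid pair, in any order.",
    "Run in O(n) time using a single pass with a hash map.",
    "Return an empty list if no pair exists.",
]
prompt = task + "\n\nObjectives:\n" + "\n".join(f"- {o}" for o in objectives)
print(prompt)
```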

Cite as

Sangyeop Yeo, Seung-Won Hwang, and Yu-Seung Ma. Chain of Grounded Objectives: Concise Goal-Oriented Prompting for Code Generation. In 39th European Conference on Object-Oriented Programming (ECOOP 2025). Leibniz International Proceedings in Informatics (LIPIcs), Volume 333, pp. 35:1-35:25, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2025)


BibTeX

@InProceedings{yeo_et_al:LIPIcs.ECOOP.2025.35,
  author =	{Yeo, Sangyeop and Hwang, Seung-Won and Ma, Yu-Seung},
  title =	{{Chain of Grounded Objectives: Concise Goal-Oriented Prompting for Code Generation}},
  booktitle =	{39th European Conference on Object-Oriented Programming (ECOOP 2025)},
  pages =	{35:1--35:25},
  series =	{Leibniz International Proceedings in Informatics (LIPIcs)},
  ISBN =	{978-3-95977-373-7},
  ISSN =	{1868-8969},
  year =	{2025},
  volume =	{333},
  editor =	{Aldrich, Jonathan and Silva, Alexandra},
  publisher =	{Schloss Dagstuhl -- Leibniz-Zentrum f{\"u}r Informatik},
  address =	{Dagstuhl, Germany},
  URL =		{https://drops.dagstuhl.de/entities/document/10.4230/LIPIcs.ECOOP.2025.35},
  URN =		{urn:nbn:de:0030-drops-233271},
  doi =		{10.4230/LIPIcs.ECOOP.2025.35},
  annote =	{Keywords: Artificial Intelligence, Natural Language Processing, Prompt Design, Large Language Models, Code Generation}
}
Document
Repairing Databases over Metric Spaces with Coincidence Constraints

Authors: Youri Kaminsky, Benny Kimelfeld, Ester Livshits, Felix Naumann, and David Wajc

Published in: LIPIcs, Volume 328, 28th International Conference on Database Theory (ICDT 2025)


Abstract
Datasets often contain values that naturally reside in a metric space: numbers, strings, geographical locations, machine-learned embeddings in a vector space, and so on. We study the computational complexity of repairing inconsistent databases that violate integrity constraints, where the database values belong to an underlying metric space. The goal is to update the database values to retain consistency while minimizing the total distance between the original values and the repaired ones. We consider what we refer to as coincidence constraints, which include unary key constraints, inclusion constraints, foreign keys, and generally any restriction on the relationship between the numbers of cells of different labels (attributes) coinciding in a single value, for a fixed attribute set. We begin by showing that the problem is APX-hard for general metric spaces. We then present an algorithm solving the problem optimally for tree metrics, which generalize both the line metric (i.e., where repaired values are numbers) and the discrete metric (i.e., where we simply count the number of changed values). Combining our algorithm for tree metrics and a classic result on probabilistic tree embeddings, we design a (high probability) logarithmic-ratio approximation for general metrics. We also study the variant of the problem where we limit the allowed change of each individual value. In this variant, it is already NP-complete to decide the existence of any legal repair for a general metric, and we present a polynomial-time repairing algorithm for the case of a line metric.

Cite as

Youri Kaminsky, Benny Kimelfeld, Ester Livshits, Felix Naumann, and David Wajc. Repairing Databases over Metric Spaces with Coincidence Constraints. In 28th International Conference on Database Theory (ICDT 2025). Leibniz International Proceedings in Informatics (LIPIcs), Volume 328, pp. 14:1-14:18, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2025)


BibTeX

@InProceedings{kaminsky_et_al:LIPIcs.ICDT.2025.14,
  author =	{Kaminsky, Youri and Kimelfeld, Benny and Livshits, Ester and Naumann, Felix and Wajc, David},
  title =	{{Repairing Databases over Metric Spaces with Coincidence Constraints}},
  booktitle =	{28th International Conference on Database Theory (ICDT 2025)},
  pages =	{14:1--14:18},
  series =	{Leibniz International Proceedings in Informatics (LIPIcs)},
  ISBN =	{978-3-95977-364-5},
  ISSN =	{1868-8969},
  year =	{2025},
  volume =	{328},
  editor =	{Roy, Sudeepa and Kara, Ahmet},
  publisher =	{Schloss Dagstuhl -- Leibniz-Zentrum f{\"u}r Informatik},
  address =	{Dagstuhl, Germany},
  URL =		{https://drops.dagstuhl.de/entities/document/10.4230/LIPIcs.ICDT.2025.14},
  URN =		{urn:nbn:de:0030-drops-229554},
  doi =		{10.4230/LIPIcs.ICDT.2025.14},
  annote =	{Keywords: Database repairs, metric spaces, coincidence constraints, inclusion constraints, foreign-key constraints}
}
Document
Academic Track
A View on Vulnerabilites: The Security Challenges of XAI (Academic Track)

Authors: Elisabeth Pachl, Fabian Langer, Thora Markert, and Jeanette Miriam Lorenz

Published in: OASIcs, Volume 126, Symposium on Scaling AI Assessments (SAIA 2024)


Abstract
Modern deep learning methods have long been considered as black-boxes due to their opaque decision-making processes. Explainable Artificial Intelligence (XAI), however, has turned the tables: it provides insight into how these models work, promoting transparency that is crucial for accountability. Yet, recent developments in adversarial machine learning have highlighted vulnerabilities in XAI methods, raising concerns about security, reliability, and trustworthiness, particularly in sensitive areas like healthcare and autonomous systems. Awareness of the potential risks associated with XAI is needed as its adoption increases, driven in part by the need to enhance compliance with regulations. This survey provides a holistic perspective on the security and safety landscape surrounding XAI, categorizing research on adversarial attacks against XAI and on the misuse of explainability to enhance attacks on AI systems, such as evasion and privacy breaches. Our contribution includes identifying current insecurities in XAI and outlining future research directions in adversarial XAI. This work serves as an accessible foundation and outlook for recognizing potential research gaps and defining future directions. It identifies data modalities, such as time-series or graph data, and XAI methods that have not been extensively investigated for vulnerabilities in current research.

Cite as

Elisabeth Pachl, Fabian Langer, Thora Markert, and Jeanette Miriam Lorenz. A View on Vulnerabilites: The Security Challenges of XAI (Academic Track). In Symposium on Scaling AI Assessments (SAIA 2024). Open Access Series in Informatics (OASIcs), Volume 126, pp. 12:1-12:23, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2025)


BibTeX

@InProceedings{pachl_et_al:OASIcs.SAIA.2024.12,
  author =	{Pachl, Elisabeth and Langer, Fabian and Markert, Thora and Lorenz, Jeanette Miriam},
  title =	{{A View on Vulnerabilites: The Security Challenges of XAI}},
  booktitle =	{Symposium on Scaling AI Assessments (SAIA 2024)},
  pages =	{12:1--12:23},
  series =	{Open Access Series in Informatics (OASIcs)},
  ISBN =	{978-3-95977-357-7},
  ISSN =	{2190-6807},
  year =	{2025},
  volume =	{126},
  editor =	{G\"{o}rge, Rebekka and Haedecke, Elena and Poretschkin, Maximilian and Schmitz, Anna},
  publisher =	{Schloss Dagstuhl -- Leibniz-Zentrum f{\"u}r Informatik},
  address =	{Dagstuhl, Germany},
  URL =		{https://drops.dagstuhl.de/entities/document/10.4230/OASIcs.SAIA.2024.12},
  URN =		{urn:nbn:de:0030-drops-227523},
  doi =		{10.4230/OASIcs.SAIA.2024.12},
  annote =	{Keywords: Explainability, XAI, Transparency, Adversarial Machine Learning, Security, Vulnerabilities}
}
Document
Resource Paper
Whelk: An OWL EL+RL Reasoner Enabling New Use Cases

Authors: James P. Balhoff and Christopher J. Mungall

Published in: TGDK, Volume 2, Issue 2 (2024), Transactions on Graph Data and Knowledge - Special Issue on Resources for Graph Data and Knowledge


Abstract
Many tasks in the biosciences rely on reasoning with large OWL terminologies (Tboxes), often combined with even larger databases. In particular, a common task is retrieval queries that utilize relational expressions; for example, “find all genes expressed in the brain or any part of the brain”. Automated reasoning on these ontologies typically relies on scalable reasoners targeting the EL subset of OWL, such as ELK. While the introduction of ELK has been transformative in the incorporation of reasoning into bio-ontology quality control and production pipelines, we have encountered limitations when applying it to use cases involving high-throughput query answering or reasoning about datasets describing instances (Aboxes). Whelk is a fast OWL reasoner for combined EL+RL reasoning. As such, it is particularly useful for many biological ontology tasks, especially those characterized by large Tboxes using the EL subset of OWL combined with Aboxes targeting the RL subset of OWL. Whelk is implemented in Scala and utilizes immutable functional data structures, which provide advantages when performing incremental or dynamic reasoning tasks. Whelk supports querying complex class expressions at a substantially greater rate than ELK and can answer queries or perform incremental reasoning tasks in parallel, enabling novel applications of OWL reasoning.

Cite as

James P. Balhoff and Christopher J. Mungall. Whelk: An OWL EL+RL Reasoner Enabling New Use Cases. In Special Issue on Resources for Graph Data and Knowledge. Transactions on Graph Data and Knowledge (TGDK), Volume 2, Issue 2, pp. 7:1-7:17, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2024)


BibTeX

@Article{balhoff_et_al:TGDK.2.2.7,
  author =	{Balhoff, James P. and Mungall, Christopher J.},
  title =	{{Whelk: An OWL EL+RL Reasoner Enabling New Use Cases}},
  journal =	{Transactions on Graph Data and Knowledge},
  pages =	{7:1--7:17},
  ISSN =	{2942-7517},
  year =	{2024},
  volume =	{2},
  number =	{2},
  publisher =	{Schloss Dagstuhl -- Leibniz-Zentrum f{\"u}r Informatik},
  address =	{Dagstuhl, Germany},
  URL =		{https://drops.dagstuhl.de/entities/document/10.4230/TGDK.2.2.7},
  URN =		{urn:nbn:de:0030-drops-225918},
  doi =		{10.4230/TGDK.2.2.7},
  annote =	{Keywords: Web Ontology Language, OWL, Semantic Web, ontology, reasoner}
}
  • Refine by Type
  • 26 Document/PDF
  • 18 Document/HTML

  • Refine by Publication Year
  • 14 2025
  • 3 2024
  • 4 2023
  • 1 2022
  • 1 2021
  • Show More...

  • Refine by Author
  • 7 Gu, Yan
  • 4 Sun, Yihan
  • 3 Blelloch, Guy E.
  • 3 Shun, Julian
  • 2 Wang, Yiqiu
  • Show More...

  • Refine by Series/Journal
  • 17 LIPIcs
  • 2 OASIcs
  • 7 TGDK

  • Refine by Classification
  • 3 Theory of computation → Design and analysis of algorithms
  • 2 Applied computing → Life and medical sciences
  • 2 Computing methodologies → Knowledge representation and reasoning
  • 2 Computing methodologies → Machine learning
  • 2 Computing methodologies → Shared memory algorithms
  • Show More...

  • Refine by Keyword
  • 3 Parallel Algorithms
  • 2 Large Language Models
  • 2 Lower Bounds
  • 2 ontology
  • 1 Adversarial Machine Learning
  • Show More...
