10 Search Results for "Zhou, Yuchen"


Document
Research
Mining Inter-Document Argument Structures in Scientific Papers for an Argument Web

Authors: Florian Ruosch, Cristina Sarasua, and Abraham Bernstein

Published in: TGDK, Volume 3, Issue 3 (2025). Transactions on Graph Data and Knowledge, Volume 3, Issue 3


Abstract
In Argument Mining, predicting argumentative relations between texts (or spans) remains one of the most challenging aspects, even more so in the cross-document setting. This paper makes three key contributions to advance research in this domain. We first extend an existing dataset, the Sci-Arg corpus, by annotating it with explicit inter-document argumentative relations, thereby allowing arguments to be distributed over several documents forming an Argument Web; these new annotations are published using Semantic Web technologies (RDF, OWL). Second, we explore and evaluate three automated approaches for predicting these inter-document argumentative relations, establishing critical baselines on the new dataset. We find that a simple classifier based on discourse indicators with access to context outperforms neural methods. Third, we conduct a comparative analysis of these approaches for both intra- and inter-document settings, identifying statistically significant differences in results that indicate the necessity of distinguishing between these two scenarios. Our findings highlight significant challenges in this complex domain and open crucial avenues for future research on the Argument Web of Science, particularly for those interested in leveraging Semantic Web technologies and knowledge graphs to understand scholarly discourse. With this, we provide the first stepping stones in the form of a benchmark dataset, three baseline methods, and an initial analysis for a systematic exploration of this field relevant to the Web of Data and Science.

Cite as

Florian Ruosch, Cristina Sarasua, and Abraham Bernstein. Mining Inter-Document Argument Structures in Scientific Papers for an Argument Web. In Transactions on Graph Data and Knowledge (TGDK), Volume 3, Issue 3, pp. 4:1-4:33, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2025)


@Article{ruosch_et_al:TGDK.3.3.4,
  author =	{Ruosch, Florian and Sarasua, Cristina and Bernstein, Abraham},
  title =	{{Mining Inter-Document Argument Structures in Scientific Papers for an Argument Web}},
  journal =	{Transactions on Graph Data and Knowledge},
  pages =	{4:1--4:33},
  ISSN =	{2942-7517},
  year =	{2025},
  volume =	{3},
  number =	{3},
  publisher =	{Schloss Dagstuhl -- Leibniz-Zentrum f{\"u}r Informatik},
  address =	{Dagstuhl, Germany},
  URL =		{https://drops.dagstuhl.de/entities/document/10.4230/TGDK.3.3.4},
  URN =		{urn:nbn:de:0030-drops-252159},
  doi =		{10.4230/TGDK.3.3.4},
  annote =	{Keywords: Argument Mining, Large Language Models, Knowledge Graphs, Link Prediction}
}
Document
Approximation Schemes for k-Subset Sum Ratio and k-Way Number Partitioning Ratio

Authors: Sotiris Kanellopoulos, Giorgos Mitropoulos, Antonis Antonopoulos, Nikos Leonardos, Aris Pagourtzis, Christos Pergaminelis, Stavros Petsalakis, and Kanellos Tsitouras

Published in: LIPIcs, Volume 359, 36th International Symposium on Algorithms and Computation (ISAAC 2025)


Abstract
The Subset Sum Ratio problem (SSR) asks, given a multiset A of positive integers, to find two disjoint subsets of A such that the largest-to-smallest ratio of their sums is minimized. In this paper we study the k-version of SSR, namely k-Subset Sum Ratio (k-SSR), which asks to minimize the largest-to-smallest ratio of sums of k disjoint subsets of A. We develop an approximation scheme for k-SSR running in O(n^{2k}/ε^{k-1}) time, where n = |A| and ε is the error parameter. To the best of our knowledge, this is the first FPTAS for k-SSR for fixed k > 2. We also study the k-way Number Partitioning Ratio (k-PART) problem, which differs from k-SSR in that the k subsets must constitute a partition of A; this problem in fact corresponds to the objective of minimizing the largest-to-smallest sum ratio in the family of Multiway Number Partitioning problems. We present a more involved FPTAS for k-PART, also achieving O(n^{2k}/ε^{k-1}) time complexity. Notably, k-PART is also equivalent to the Minimum Envy-Ratio problem with identical valuation functions, which has been studied in the context of fair division of indivisible goods. Thus, for the case of identical valuations, our FPTAS represents a significant improvement over the O(n^{4k²+1}/ε^{2k²}) bound obtained by Nguyen and Rothe’s FPTAS [Trung Thanh Nguyen and Jörg Rothe, 2014] for Minimum Envy-Ratio with general additive valuations. Lastly, we propose a second FPTAS for k-SSR, which employs carefully designed calls to the first one; the new scheme has a time complexity of Õ(n/ε^{3k-1}), thus being much faster when n≫ 1/ ε.
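To make the two objectives in this abstract concrete, here is a minimal exhaustive-search sketch. It is not the FPTAS from the paper: it simply enumerates every assignment, so it is only usable on tiny inputs. The function name `k_ssr_brute_force` and the `partition` flag are hypothetical, and the sketch assumes every one of the k subsets must be non-empty so that the ratio is well defined.

```python
from fractions import Fraction
from itertools import product


def k_ssr_brute_force(A, k, partition=False):
    """Minimise (largest subset sum) / (smallest subset sum) by exhaustive search.

    With partition=False (k-SSR) elements may be left out of all subsets.
    With partition=True (k-PART) every element must be placed in some subset.
    Returns (best_ratio, subsets), or (None, None) if no valid choice exists.
    """
    best_ratio, best_subsets = None, None
    # Label 0 means "left out"; labels 1..k select one of the k subsets.
    first_label = 1 if partition else 0
    for labels in product(range(first_label, k + 1), repeat=len(A)):
        subsets = [[] for _ in range(k)]
        for value, label in zip(A, labels):
            if label:
                subsets[label - 1].append(value)
        if any(not s for s in subsets):
            continue  # assumption: all k subsets must be non-empty
        sums = [sum(s) for s in subsets]
        ratio = Fraction(max(sums), min(sums))  # exact arithmetic, no rounding
        if best_ratio is None or ratio < best_ratio:
            best_ratio, best_subsets = ratio, subsets
    return best_ratio, best_subsets


if __name__ == "__main__":
    A = [1, 2, 3, 10]
    print(k_ssr_brute_force(A, 2))                  # may drop the element 10
    print(k_ssr_brute_force(A, 2, partition=True))  # must place every element
```

On the input `[1, 2, 3, 10]` with k = 2, the k-SSR variant can leave out 10 and reach a ratio of 1, while the k-PART variant is forced to place 10 and cannot do better than 10/6. The search visits (k+1)^n assignments, which is the exponential cost the approximation schemes are designed to avoid.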

Cite as

Sotiris Kanellopoulos, Giorgos Mitropoulos, Antonis Antonopoulos, Nikos Leonardos, Aris Pagourtzis, Christos Pergaminelis, Stavros Petsalakis, and Kanellos Tsitouras. Approximation Schemes for k-Subset Sum Ratio and k-Way Number Partitioning Ratio. In 36th International Symposium on Algorithms and Computation (ISAAC 2025). Leibniz International Proceedings in Informatics (LIPIcs), Volume 359, pp. 44:1-44:22, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2025)


@InProceedings{kanellopoulos_et_al:LIPIcs.ISAAC.2025.44,
  author =	{Kanellopoulos, Sotiris and Mitropoulos, Giorgos and Antonopoulos, Antonis and Leonardos, Nikos and Pagourtzis, Aris and Pergaminelis, Christos and Petsalakis, Stavros and Tsitouras, Kanellos},
  title =	{{Approximation Schemes for k-Subset Sum Ratio and k-Way Number Partitioning Ratio}},
  booktitle =	{36th International Symposium on Algorithms and Computation (ISAAC 2025)},
  pages =	{44:1--44:22},
  series =	{Leibniz International Proceedings in Informatics (LIPIcs)},
  ISBN =	{978-3-95977-408-6},
  ISSN =	{1868-8969},
  year =	{2025},
  volume =	{359},
  editor =	{Chen, Ho-Lin and Hon, Wing-Kai and Tsai, Meng-Tsung},
  publisher =	{Schloss Dagstuhl -- Leibniz-Zentrum f{\"u}r Informatik},
  address =	{Dagstuhl, Germany},
  URL =		{https://drops.dagstuhl.de/entities/document/10.4230/LIPIcs.ISAAC.2025.44},
  URN =		{urn:nbn:de:0030-drops-249521},
  doi =		{10.4230/LIPIcs.ISAAC.2025.44},
  annote =	{Keywords: Fully polynomial-time approximation schemes, Subset Sum Ratio, Number Partitioning, Fair division, Envy minimization, Pseudo-polynomial time algorithms}
}
Document
PIPQ: Strict Insert-Optimized Concurrent Priority Queue

Authors: Olivia Grimes, Ahmed Hassan, Panagiota Fatourou, and Roberto Palmieri

Published in: LIPIcs, Volume 356, 39th International Symposium on Distributed Computing (DISC 2025)


Abstract
This paper presents PIPQ, a strict and linearizable concurrent priority queue whose design differs from existing solutions in literature because it focuses on enabling parallelism of insert operations as opposed to accelerating delete-min operations, as traditionally done. In a nutshell, PIPQ’s structure includes two levels: the worker level and the leader level. The worker level provides per-thread data structures enabling fast and parallel insertions. The leader level contains the highest priority elements in the priority queue and can thus serve delete-min operations. Our evaluation, which includes an exploration of different data access patterns, operation mixes, runtime settings, and an integration into a graph-based application, shows that PIPQ outperforms competitors in a variety of cases, especially with insert-dominant workloads.

Cite as

Olivia Grimes, Ahmed Hassan, Panagiota Fatourou, and Roberto Palmieri. PIPQ: Strict Insert-Optimized Concurrent Priority Queue. In 39th International Symposium on Distributed Computing (DISC 2025). Leibniz International Proceedings in Informatics (LIPIcs), Volume 356, pp. 35:1-35:23, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2025)


@InProceedings{grimes_et_al:LIPIcs.DISC.2025.35,
  author =	{Grimes, Olivia and Hassan, Ahmed and Fatourou, Panagiota and Palmieri, Roberto},
  title =	{{PIPQ: Strict Insert-Optimized Concurrent Priority Queue}},
  booktitle =	{39th International Symposium on Distributed Computing (DISC 2025)},
  pages =	{35:1--35:23},
  series =	{Leibniz International Proceedings in Informatics (LIPIcs)},
  ISBN =	{978-3-95977-402-4},
  ISSN =	{1868-8969},
  year =	{2025},
  volume =	{356},
  editor =	{Kowalski, Dariusz R.},
  publisher =	{Schloss Dagstuhl -- Leibniz-Zentrum f{\"u}r Informatik},
  address =	{Dagstuhl, Germany},
  URL =		{https://drops.dagstuhl.de/entities/document/10.4230/LIPIcs.DISC.2025.35},
  URN =		{urn:nbn:de:0030-drops-248525},
  doi =		{10.4230/LIPIcs.DISC.2025.35},
  annote =	{Keywords: Priority Queue, Concurrent Data Structures, Synchronization}
}
Document
Survey
Resilience in Knowledge Graph Embeddings

Authors: Arnab Sharma, N'Dah Jean Kouagou, and Axel-Cyrille Ngonga Ngomo

Published in: TGDK, Volume 3, Issue 2 (2025). Transactions on Graph Data and Knowledge, Volume 3, Issue 2


Abstract
In recent years, knowledge graphs have gained interest and witnessed widespread applications in various domains, such as information retrieval, question-answering, recommendation systems, amongst others. Large-scale knowledge graphs to this end have demonstrated their utility in effectively representing structured knowledge. To further facilitate the application of machine learning techniques, knowledge graph embedding models have been developed. Such models can transform entities and relationships within knowledge graphs into vectors. However, these embedding models often face challenges related to noise, missing information, distribution shift, adversarial attacks, etc. This can lead to sub-optimal embeddings and incorrect inferences, thereby negatively impacting downstream applications. While the existing literature has focused so far on adversarial attacks on KGE models, the challenges related to the other critical aspects remain unexplored. In this paper, we, first of all, give a unified definition of resilience, encompassing several factors such as generalisation, in-distribution generalization, distribution adaption, and robustness. After formalizing these concepts for machine learning in general, we define them in the context of knowledge graphs. To find the gap in the existing works on resilience in the context of knowledge graphs, we perform a systematic survey, taking into account all these aspects mentioned previously. Our survey results show that most of the existing works focus on a specific aspect of resilience, namely robustness. After categorizing such works based on their respective aspects of resilience, we discuss the challenges and future research directions.

Cite as

Arnab Sharma, N'Dah Jean Kouagou, and Axel-Cyrille Ngonga Ngomo. Resilience in Knowledge Graph Embeddings. In Transactions on Graph Data and Knowledge (TGDK), Volume 3, Issue 2, pp. 1:1-1:38, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2025)


@Article{sharma_et_al:TGDK.3.2.1,
  author =	{Sharma, Arnab and Kouagou, N'Dah Jean and Ngomo, Axel-Cyrille Ngonga},
  title =	{{Resilience in Knowledge Graph Embeddings}},
  journal =	{Transactions on Graph Data and Knowledge},
  pages =	{1:1--1:38},
  ISSN =	{2942-7517},
  year =	{2025},
  volume =	{3},
  number =	{2},
  publisher =	{Schloss Dagstuhl -- Leibniz-Zentrum f{\"u}r Informatik},
  address =	{Dagstuhl, Germany},
  URL =		{https://drops.dagstuhl.de/entities/document/10.4230/TGDK.3.2.1},
  URN =		{urn:nbn:de:0030-drops-248117},
  doi =		{10.4230/TGDK.3.2.1},
  annote =	{Keywords: Knowledge graphs, Resilience, Robustness}
}
Document
Bridging Language Models and Symbolic Solvers via the Model Context Protocol

Authors: Stefan Szeider

Published in: LIPIcs, Volume 341, 28th International Conference on Theory and Applications of Satisfiability Testing (SAT 2025)


Abstract
This paper presents the MCP Solver, a system that bridges large language models with symbolic solvers through the Model Context Protocol (MCP). The system includes a server and a client component. The server provides an interface to constraint programming (via MiniZinc Python), propositional satisfiability and maximum satisfiability (both via PySAT), and SAT modulo Theories (via Python Z3). The client contains an agent that connects to the server via MCP and uses a language model to autonomously translate problem statements (given in English) into encodings through an incremental editing process and runs the solver. Our experiments demonstrate that this neurosymbolic integration effectively combines the natural language understanding of language models with robust solving capabilities across multiple solving paradigms.

Cite as

Stefan Szeider. Bridging Language Models and Symbolic Solvers via the Model Context Protocol. In 28th International Conference on Theory and Applications of Satisfiability Testing (SAT 2025). Leibniz International Proceedings in Informatics (LIPIcs), Volume 341, pp. 30:1-30:12, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2025)


@InProceedings{szeider:LIPIcs.SAT.2025.30,
  author =	{Szeider, Stefan},
  title =	{{Bridging Language Models and Symbolic Solvers via the Model Context Protocol}},
  booktitle =	{28th International Conference on Theory and Applications of Satisfiability Testing (SAT 2025)},
  pages =	{30:1--30:12},
  series =	{Leibniz International Proceedings in Informatics (LIPIcs)},
  ISBN =	{978-3-95977-381-2},
  ISSN =	{1868-8969},
  year =	{2025},
  volume =	{341},
  editor =	{Berg, Jeremias and Nordstr\"{o}m, Jakob},
  publisher =	{Schloss Dagstuhl -- Leibniz-Zentrum f{\"u}r Informatik},
  address =	{Dagstuhl, Germany},
  URL =		{https://drops.dagstuhl.de/entities/document/10.4230/LIPIcs.SAT.2025.30},
  URN =		{urn:nbn:de:0030-drops-237649},
  doi =		{10.4230/LIPIcs.SAT.2025.30},
  annote =	{Keywords: Large Language Models, Agents, Constraint Programming, Satisfiability Solvers, Maximum Satisfiability, SAT Modulo Theories, Model Context Protocol}
}
Document
Track A: Algorithms, Complexity and Games
Fourier Analysis of Iterative Algorithms

Authors: Chris Jones and Lucas Pesenti

Published in: LIPIcs, Volume 334, 52nd International Colloquium on Automata, Languages, and Programming (ICALP 2025)


Abstract
We study a general class of nonlinear iterative algorithms which includes power iteration, belief propagation and approximate message passing, and many forms of gradient descent. When the input is a random matrix with i.i.d. entries, we use Boolean Fourier analysis to analyze these algorithms as low-degree polynomials in the entries of the input matrix. Each symmetrized Fourier character represents all monomials with a certain shape as specified by a small graph, which we call a Fourier diagram. We prove fundamental asymptotic properties of the Fourier diagrams: over the randomness of the input, all diagrams with cycles are negligible; the tree-shaped diagrams form a basis of asymptotically independent Gaussian vectors; and, when restricted to the trees, iterative algorithms exactly follow an idealized Gaussian dynamic. We use this to prove a state evolution formula, giving a "complete" asymptotic description of the algorithm’s trajectory. The restriction to tree-shaped monomials mirrors the assumption of the cavity method, a 40-year-old non-rigorous technique in statistical physics which has served as one of the most important techniques in the field. We demonstrate how to implement cavity method derivations by 1) restricting the iteration to its tree approximation, and 2) observing that heuristic cavity method-type arguments hold rigorously on the simplified iteration. Our proofs use combinatorial arguments similar to the trace method from random matrix theory. Finally, we push the diagram analysis to a number of iterations that scales with the dimension n of the input matrix, proving that the tree approximation still holds for a simple variant of power iteration all the way up to n^{Ω(1)} iterations.

Cite as

Chris Jones and Lucas Pesenti. Fourier Analysis of Iterative Algorithms. In 52nd International Colloquium on Automata, Languages, and Programming (ICALP 2025). Leibniz International Proceedings in Informatics (LIPIcs), Volume 334, pp. 102:1-102:21, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2025)


@InProceedings{jones_et_al:LIPIcs.ICALP.2025.102,
  author =	{Jones, Chris and Pesenti, Lucas},
  title =	{{Fourier Analysis of Iterative Algorithms}},
  booktitle =	{52nd International Colloquium on Automata, Languages, and Programming (ICALP 2025)},
  pages =	{102:1--102:21},
  series =	{Leibniz International Proceedings in Informatics (LIPIcs)},
  ISBN =	{978-3-95977-372-0},
  ISSN =	{1868-8969},
  year =	{2025},
  volume =	{334},
  editor =	{Censor-Hillel, Keren and Grandoni, Fabrizio and Ouaknine, Jo\"{e}l and Puppis, Gabriele},
  publisher =	{Schloss Dagstuhl -- Leibniz-Zentrum f{\"u}r Informatik},
  address =	{Dagstuhl, Germany},
  URL =		{https://drops.dagstuhl.de/entities/document/10.4230/LIPIcs.ICALP.2025.102},
  URN =		{urn:nbn:de:0030-drops-234791},
  doi =		{10.4230/LIPIcs.ICALP.2025.102},
  annote =	{Keywords: Iterative Algorithms, Message-passing Algorithms, Random Matrix Theory}
}
Document
Chain of Grounded Objectives: Concise Goal-Oriented Prompting for Code Generation

Authors: Sangyeop Yeo, Seung-Won Hwang, and Yu-Seung Ma

Published in: LIPIcs, Volume 333, 39th European Conference on Object-Oriented Programming (ECOOP 2025)


Abstract
The use of Large Language Models (LLMs) for code generation has gained significant attention in recent years. Existing methods often aim to improve the quality of generated code by incorporating additional contextual information or guidance into input prompts. Many of these approaches adopt process-oriented reasoning strategies, mimicking human-like step-by-step thinking; however, they may not always align with the structured nature of programming languages. This paper introduces Chain of Grounded Objectives (CGO), a concise goal-oriented prompting approach that embeds functional objectives into prompts to enhance code generation. By focusing on precisely defined objectives rather than explicit procedural steps, CGO aligns more naturally with programming tasks while retaining flexibility. Empirical evaluations on HumanEval, MBPP, their extended versions, and LiveCodeBench show that CGO achieves accuracy comparable to or better than existing methods while using fewer tokens, making it a more efficient approach to LLM-based code generation.

Cite as

Sangyeop Yeo, Seung-Won Hwang, and Yu-Seung Ma. Chain of Grounded Objectives: Concise Goal-Oriented Prompting for Code Generation. In 39th European Conference on Object-Oriented Programming (ECOOP 2025). Leibniz International Proceedings in Informatics (LIPIcs), Volume 333, pp. 35:1-35:25, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2025)


@InProceedings{yeo_et_al:LIPIcs.ECOOP.2025.35,
  author =	{Yeo, Sangyeop and Hwang, Seung-Won and Ma, Yu-Seung},
  title =	{{Chain of Grounded Objectives: Concise Goal-Oriented Prompting for Code Generation}},
  booktitle =	{39th European Conference on Object-Oriented Programming (ECOOP 2025)},
  pages =	{35:1--35:25},
  series =	{Leibniz International Proceedings in Informatics (LIPIcs)},
  ISBN =	{978-3-95977-373-7},
  ISSN =	{1868-8969},
  year =	{2025},
  volume =	{333},
  editor =	{Aldrich, Jonathan and Silva, Alexandra},
  publisher =	{Schloss Dagstuhl -- Leibniz-Zentrum f{\"u}r Informatik},
  address =	{Dagstuhl, Germany},
  URL =		{https://drops.dagstuhl.de/entities/document/10.4230/LIPIcs.ECOOP.2025.35},
  URN =		{urn:nbn:de:0030-drops-233271},
  doi =		{10.4230/LIPIcs.ECOOP.2025.35},
  annote =	{Keywords: Artificial Intelligence, Natural Language Processing, Prompt Design, Large Language Models, Code Generation}
}
Document
In-Memory Object Graph Stores

Authors: Aditya Thimmaiah, Zijian Yi, Joseph Kenis, Christopher J Rossbach, and Milos Gligoric

Published in: LIPIcs, Volume 333, 39th European Conference on Object-Oriented Programming (ECOOP 2025)


Abstract
We present a design and implementation of an in-memory object graph store, dubbed εStore. Our key innovation is a storage model - epsilon store - that equates an object on the heap to a node in a graph store. Thus any object on the heap (without changes) can be a part of one, or multiple, graph stores, and vice versa, any node in a graph store can be accessed like any other object on the heap. Specifically, each node in a graph is an object (i.e., instance of a class), and its properties and its edges are the primitive and reference fields declared in its class, respectively. Necessary classes, which are instantiated to represent nodes, are created dynamically by εStore. εStore uses a subset of the Cypher query language to query the graph store. By design, the result of any query is a table (ResultSet) of references to objects on the heap, which users can manipulate the same way as any other object on the heap in their programs. Moreover, a developer can include (transitively) an arbitrary object to become a part of a graph store. Finally, εStore introduces compile-time rewriting of Cypher queries into imperative code to improve the runtime performance. εStore can be used for a number of tasks including implementing methods for complex in-memory structures, writing complex assertions, or a stripped down version of a graph database that can conveniently be used during testing. We implement εStore in Java and show its application using the aforementioned tasks.

Cite as

Aditya Thimmaiah, Zijian Yi, Joseph Kenis, Christopher J Rossbach, and Milos Gligoric. In-Memory Object Graph Stores. In 39th European Conference on Object-Oriented Programming (ECOOP 2025). Leibniz International Proceedings in Informatics (LIPIcs), Volume 333, pp. 30:1-30:30, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2025)


@InProceedings{thimmaiah_et_al:LIPIcs.ECOOP.2025.30,
  author =	{Thimmaiah, Aditya and Yi, Zijian and Kenis, Joseph and Rossbach, Christopher J and Gligoric, Milos},
  title =	{{In-Memory Object Graph Stores}},
  booktitle =	{39th European Conference on Object-Oriented Programming (ECOOP 2025)},
  pages =	{30:1--30:30},
  series =	{Leibniz International Proceedings in Informatics (LIPIcs)},
  ISBN =	{978-3-95977-373-7},
  ISSN =	{1868-8969},
  year =	{2025},
  volume =	{333},
  editor =	{Aldrich, Jonathan and Silva, Alexandra},
  publisher =	{Schloss Dagstuhl -- Leibniz-Zentrum f{\"u}r Informatik},
  address =	{Dagstuhl, Germany},
  URL =		{https://drops.dagstuhl.de/entities/document/10.4230/LIPIcs.ECOOP.2025.30},
  URN =		{urn:nbn:de:0030-drops-233225},
  doi =		{10.4230/LIPIcs.ECOOP.2025.30},
  annote =	{Keywords: Object stores, Graph stores, Cypher}
}
Document
Survey
Semantic Web: Past, Present, and Future

Authors: Ansgar Scherp, Gerd Groener, Petr Škoda, Katja Hose, and Maria-Esther Vidal

Published in: TGDK, Volume 2, Issue 1 (2024): Special Issue on Trends in Graph Data and Knowledge - Part 2. Transactions on Graph Data and Knowledge, Volume 2, Issue 1


Abstract
Ever since the vision was formulated, the Semantic Web has inspired many generations of innovations. Semantic technologies have been used to share vast amounts of information on the Web, enhance them with semantics to give them meaning, and enable inference and reasoning on them. Throughout the years, semantic technologies, and in particular knowledge graphs, have been used in search engines, data integration, enterprise settings, and machine learning. In this paper, we recap the classical concepts and foundations of the Semantic Web as well as modern and recent concepts and applications, building upon these foundations. The classical topics we cover include knowledge representation, creating and validating knowledge on the Web, reasoning and linking, and distributed querying. We enhance this classical view of the so-called "Semantic Web Layer Cake" with an update of recent concepts that include provenance, security and trust, as well as a discussion of practical impacts from industry-led contributions. We conclude with an outlook on the future directions of the Semantic Web. This is a living document. If you would like to contribute, please contact the first author and visit: https://github.com/ascherp/semantic-web-primer

Cite as

Ansgar Scherp, Gerd Groener, Petr Škoda, Katja Hose, and Maria-Esther Vidal. Semantic Web: Past, Present, and Future. In Special Issue on Trends in Graph Data and Knowledge - Part 2. Transactions on Graph Data and Knowledge (TGDK), Volume 2, Issue 1, pp. 3:1-3:37, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2024)


@Article{scherp_et_al:TGDK.2.1.3,
  author =	{Scherp, Ansgar and Groener, Gerd and \v{S}koda, Petr and Hose, Katja and Vidal, Maria-Esther},
  title =	{{Semantic Web: Past, Present, and Future}},
  journal =	{Transactions on Graph Data and Knowledge},
  pages =	{3:1--3:37},
  ISSN =	{2942-7517},
  year =	{2024},
  volume =	{2},
  number =	{1},
  publisher =	{Schloss Dagstuhl -- Leibniz-Zentrum f{\"u}r Informatik},
  address =	{Dagstuhl, Germany},
  URL =		{https://drops.dagstuhl.de/entities/document/10.4230/TGDK.2.1.3},
  URN =		{urn:nbn:de:0030-drops-198607},
  doi =		{10.4230/TGDK.2.1.3},
  annote =	{Keywords: Linked Open Data, Semantic Web Graphs, Knowledge Graphs}
}
Document
Low Rank Approximation of Binary Matrices: Column Subset Selection and Generalizations

Authors: Chen Dan, Kristoffer Arnsfelt Hansen, He Jiang, Liwei Wang, and Yuchen Zhou

Published in: LIPIcs, Volume 117, 43rd International Symposium on Mathematical Foundations of Computer Science (MFCS 2018)


Abstract
Low rank approximation of matrices is an important tool in machine learning. Given a data matrix, low rank approximation helps to find factors, patterns, and provides concise representations for the data. Research on low rank approximation usually focuses on real matrices. However, in many applications data are binary (categorical) rather than continuous. This leads to the problem of low rank approximation of binary matrices. Here we are given a d x n binary matrix A and a small integer k < d. The goal is to find two binary matrices U and V of sizes d x k and k x n respectively, so that the Frobenius norm of A - U V is minimized. There are two models of this problem, depending on the definition of the dot product of binary vectors: The GF(2) model and the Boolean semiring model. Unlike low rank approximation of a real matrix which can be efficiently solved by Singular Value Decomposition, we show that approximation of a binary matrix is NP-hard, even for k=1. In this paper, our main concern is the problem of Column Subset Selection (CSS), in which the low rank matrix U must be formed by k columns of the data matrix, and we are interested in the approximation ratio achievable by CSS for binary matrices. For the GF(2) model, we show that CSS has approximation ratio bounded by k/2+1+k/(2(2^k-1)) and this is asymptotically tight. For the Boolean model, it turns out that CSS is no longer sufficient to obtain a bound. We then develop a Generalized CSS (GCSS) procedure in which the columns of U are generated from Boolean formulas operating bitwise on selected columns of the data matrix. We show that the approximation ratio achieved by GCSS is bounded by 2^(k-1)+1, and argue that an exponential dependency on k seems inherent.
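The Column Subset Selection formulation in the GF(2) model can be illustrated with a small exhaustive search. This is only a sketch of the problem statement, not the analysis or any algorithm from the paper. The function name `css_gf2` is hypothetical, and the error is counted as the number of mismatched entries, which for 0/1 matrices equals the squared Frobenius norm of the difference.

```python
from itertools import combinations, product


def css_gf2(A, k):
    """Brute-force Column Subset Selection over GF(2).

    A is a list of rows with 0/1 entries (d rows, n columns).
    U is restricted to k columns of A; V is chosen optimally for that U.
    Returns (error, chosen_column_indices, V) with V as a k x n 0/1 matrix.
    """
    d, n = len(A), len(A[0])
    cols = [tuple(A[i][j] for i in range(d)) for j in range(n)]
    best = None
    for chosen in combinations(range(n), k):
        U_cols = [cols[j] for j in chosen]
        total, V_cols = 0, []
        for target in cols:
            # Each column of A is fitted independently: try all 2^k
            # coefficient vectors and keep the one with fewest mismatches.
            best_col = None
            for coeffs in product((0, 1), repeat=k):
                approx = tuple(
                    sum(c * u[i] for c, u in zip(coeffs, U_cols)) % 2
                    for i in range(d)
                )
                err = sum(a != b for a, b in zip(approx, target))
                if best_col is None or err < best_col[0]:
                    best_col = (err, coeffs)
            total += best_col[0]
            V_cols.append(best_col[1])
        if best is None or total < best[0]:
            # Transpose the per-column coefficients into a k x n matrix V.
            V = [[v[r] for v in V_cols] for r in range(k)]
            best = (total, chosen, V)
    return best


if __name__ == "__main__":
    # The third column is the GF(2) sum of the first two.
    A = [[1, 0, 1],
         [0, 1, 1],
         [1, 1, 0]]
    print(css_gf2(A, 1))
    print(css_gf2(A, 2))
```

In the example matrix, two selected columns reproduce A exactly under GF(2) addition, whereas any single column leaves four mismatched entries. The search costs on the order of C(n, k) · n · 2^k · d · k operations, so it is meant for checking small cases by hand rather than for real data.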

Cite as

Chen Dan, Kristoffer Arnsfelt Hansen, He Jiang, Liwei Wang, and Yuchen Zhou. Low Rank Approximation of Binary Matrices: Column Subset Selection and Generalizations. In 43rd International Symposium on Mathematical Foundations of Computer Science (MFCS 2018). Leibniz International Proceedings in Informatics (LIPIcs), Volume 117, pp. 41:1-41:16, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2018)


@InProceedings{dan_et_al:LIPIcs.MFCS.2018.41,
  author =	{Dan, Chen and Hansen, Kristoffer Arnsfelt and Jiang, He and Wang, Liwei and Zhou, Yuchen},
  title =	{{Low Rank Approximation of Binary Matrices: Column Subset Selection and Generalizations}},
  booktitle =	{43rd International Symposium on Mathematical Foundations of Computer Science (MFCS 2018)},
  pages =	{41:1--41:16},
  series =	{Leibniz International Proceedings in Informatics (LIPIcs)},
  ISBN =	{978-3-95977-086-6},
  ISSN =	{1868-8969},
  year =	{2018},
  volume =	{117},
  editor =	{Potapov, Igor and Spirakis, Paul and Worrell, James},
  publisher =	{Schloss Dagstuhl -- Leibniz-Zentrum f{\"u}r Informatik},
  address =	{Dagstuhl, Germany},
  URL =		{https://drops.dagstuhl.de/entities/document/10.4230/LIPIcs.MFCS.2018.41},
  URN =		{urn:nbn:de:0030-drops-96239},
  doi =		{10.4230/LIPIcs.MFCS.2018.41},
  annote =	{Keywords: Approximation Algorithms, Low Rank Approximation, Binary Matrices}
}