2 Search Results for "Lam, Jenny"


Document
Research
Mining Inter-Document Argument Structures in Scientific Papers for an Argument Web

Authors: Florian Ruosch, Cristina Sarasua, and Abraham Bernstein

Published in: TGDK, Volume 3, Issue 3 (2025). Transactions on Graph Data and Knowledge, Volume 3, Issue 3


Abstract
In Argument Mining, predicting argumentative relations between texts (or spans) remains one of the most challenging aspects, even more so in the cross-document setting. This paper makes three key contributions to advance research in this domain. We first extend an existing dataset, the Sci-Arg corpus, by annotating it with explicit inter-document argumentative relations, thereby allowing arguments to be distributed over several documents forming an Argument Web; these new annotations are published using Semantic Web technologies (RDF, OWL). Second, we explore and evaluate three automated approaches for predicting these inter-document argumentative relations, establishing critical baselines on the new dataset. We find that a simple classifier based on discourse indicators with access to context outperforms neural methods. Third, we conduct a comparative analysis of these approaches for both intra- and inter-document settings, identifying statistically significant differences in results that indicate the necessity of distinguishing between these two scenarios. Our findings highlight significant challenges in this complex domain and open crucial avenues for future research on the Argument Web of Science, particularly for those interested in leveraging Semantic Web technologies and knowledge graphs to understand scholarly discourse. With this, we provide the first stepping stones in the form of a benchmark dataset, three baseline methods, and an initial analysis for a systematic exploration of this field relevant to the Web of Data and Science.

Cite as

Florian Ruosch, Cristina Sarasua, and Abraham Bernstein. Mining Inter-Document Argument Structures in Scientific Papers for an Argument Web. In Transactions on Graph Data and Knowledge (TGDK), Volume 3, Issue 3, pp. 4:1-4:33, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2025)


Copy BibTex To Clipboard

@Article{ruosch_et_al:TGDK.3.3.4,
  author =	{Ruosch, Florian and Sarasua, Cristina and Bernstein, Abraham},
  title =	{{Mining Inter-Document Argument Structures in Scientific Papers for an Argument Web}},
  journal =	{Transactions on Graph Data and Knowledge},
  pages =	{4:1--4:33},
  ISSN =	{2942-7517},
  year =	{2025},
  volume =	{3},
  number =	{3},
  publisher =	{Schloss Dagstuhl -- Leibniz-Zentrum f{\"u}r Informatik},
  address =	{Dagstuhl, Germany},
  URL =		{https://drops.dagstuhl.de/entities/document/10.4230/TGDK.3.3.4},
  URN =		{urn:nbn:de:0030-drops-252159},
  doi =		{10.4230/TGDK.3.3.4},
  annote =	{Keywords: Argument Mining, Large Language Models, Knowledge Graphs, Link Prediction}
}
Document
The Subset Assignment Problem for Data Placement in Caches

Authors: Shahram Ghandeharizadeh, Sandy Irani, and Jenny Lam

Published in: LIPIcs, Volume 64, 27th International Symposium on Algorithms and Computation (ISAAC 2016)


Abstract
We introduce the subset assignment problem in which items of varying sizes are placed in a set of bins with limited capacity. Items can be replicated and placed in any subset of the bins. Each (item, subset) pair has an associated cost. Not assigning an item to any of the bins is not free in general and can potentially be the most expensive option. The goal is to minimize the total cost of assigning items to subsets without exceeding the bin capacities. This problem is motivated by the design of caching systems composed of banks of memory with varying cost/performance specifications. The ability to replicate a data item in more than one memory bank can benefit the overall performance of the system with a faster recovery time in the event of a memory failure. For this setting, the number n of data objects (items) is very large and the number d of memory banks (bins) is a small constant (on the order of 3 or 4). Therefore, the goal is to determine an optimal assignment in time that minimizes dependence on n. The integral version of this problem is NP-hard since it is a generalization of the knapsack problem. We focus on an efficient solution to the LP relaxation as the number of fractionally assigned items will be at most d. If the data objects are small with respect to the size of the memory banks, the effect of excluding the fractionally assigned data items from the cache will be small. We give an algorithm that solves the LP relaxation and runs in time O(binom{3^d}{d+1} poly(d) n log(n) log(nC) log(Z)), where Z is the maximum item size and C the maximum storage cost.

Cite as

Shahram Ghandeharizadeh, Sandy Irani, and Jenny Lam. The Subset Assignment Problem for Data Placement in Caches. In 27th International Symposium on Algorithms and Computation (ISAAC 2016). Leibniz International Proceedings in Informatics (LIPIcs), Volume 64, pp. 35:1-35:12, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2016)


Copy BibTex To Clipboard

@InProceedings{ghandeharizadeh_et_al:LIPIcs.ISAAC.2016.35,
  author =	{Ghandeharizadeh, Shahram and Irani, Sandy and Lam, Jenny},
  title =	{{The Subset Assignment Problem for Data Placement in Caches}},
  booktitle =	{27th International Symposium on Algorithms and Computation (ISAAC 2016)},
  pages =	{35:1--35:12},
  series =	{Leibniz International Proceedings in Informatics (LIPIcs)},
  ISBN =	{978-3-95977-026-2},
  ISSN =	{1868-8969},
  year =	{2016},
  volume =	{64},
  editor =	{Hong, Seok-Hee},
  publisher =	{Schloss Dagstuhl -- Leibniz-Zentrum f{\"u}r Informatik},
  address =	{Dagstuhl, Germany},
  URL =		{https://drops.dagstuhl.de/entities/document/10.4230/LIPIcs.ISAAC.2016.35},
  URN =		{urn:nbn:de:0030-drops-68058},
  doi =		{10.4230/LIPIcs.ISAAC.2016.35},
  annote =	{Keywords: Memory management, caching, simplex method, linear programming, min-cost flow}
}
  • Refine by Type
  • 2 Document/PDF
  • 1 Document/HTML

  • Refine by Publication Year
  • 1 2025
  • 1 2016

  • Refine by Author
  • 1 Bernstein, Abraham
  • 1 Ghandeharizadeh, Shahram
  • 1 Irani, Sandy
  • 1 Lam, Jenny
  • 1 Ruosch, Florian
  • Show More...

  • Refine by Series/Journal
  • 1 LIPIcs
  • 1 TGDK

  • Refine by Classification
  • 1 Computing methodologies → Information extraction
  • 1 Computing methodologies → Language resources
  • 1 Computing methodologies → Semantic networks
  • 1 Information systems → Graph-based database models

  • Refine by Keyword
  • 1 Argument Mining
  • 1 Knowledge Graphs
  • 1 Large Language Models
  • 1 Link Prediction
  • 1 Memory management
  • Show More...

Any Issues?
X

Feedback on the Current Page

CAPTCHA

Thanks for your feedback!

Feedback submitted to Dagstuhl Publishing

Could not send message

Please try again later or send an E-mail