16 Search Results for "Tan, Jane"


Document
Samplability Makes Learning Easier

Authors: Guy Blanc, Caleb Koch, Jane Lange, Carmen Strassle, and Li-Yang Tan

Published in: LIPIcs, Volume 362, 17th Innovations in Theoretical Computer Science Conference (ITCS 2026)


Abstract
The standard definition of PAC learning (Valiant 1984) requires learners to succeed under all distributions - even ones that are intractable to sample from. This stands in contrast to samplable PAC learning (Blum, Furst, Kearns, and Lipton 1993), where learners only have to succeed under samplable distributions. We study this distinction and show that samplable PAC substantially expands the power of efficient learners. We first construct a concept class that requires exponential sample complexity in standard PAC but is learnable with polynomial sample complexity in samplable PAC. We then lift this statistical separation to the computational setting and obtain a separation relative to a random oracle. Our proofs center around a new complexity primitive, explicit evasive sets, that we introduce and study. These are sets for which membership is easy to determine but are extremely hard to sample from. Our results extend to the online setting to similarly show that its landscape changes when the adversary is assumed to be efficient instead of computationally unbounded.

Cite as

Guy Blanc, Caleb Koch, Jane Lange, Carmen Strassle, and Li-Yang Tan. Samplability Makes Learning Easier. In 17th Innovations in Theoretical Computer Science Conference (ITCS 2026). Leibniz International Proceedings in Informatics (LIPIcs), Volume 362, pp. 20:1-20:12, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2026)


Copy BibTex To Clipboard

@InProceedings{blanc_et_al:LIPIcs.ITCS.2026.20,
  author =	{Blanc, Guy and Koch, Caleb and Lange, Jane and Strassle, Carmen and Tan, Li-Yang},
  title =	{{Samplability Makes Learning Easier}},
  booktitle =	{17th Innovations in Theoretical Computer Science Conference (ITCS 2026)},
  pages =	{20:1--20:12},
  series =	{Leibniz International Proceedings in Informatics (LIPIcs)},
  ISBN =	{978-3-95977-410-9},
  ISSN =	{1868-8969},
  year =	{2026},
  volume =	{362},
  editor =	{Saraf, Shubhangi},
  publisher =	{Schloss Dagstuhl -- Leibniz-Zentrum f{\"u}r Informatik},
  address =	{Dagstuhl, Germany},
  URL =		{https://drops.dagstuhl.de/entities/document/10.4230/LIPIcs.ITCS.2026.20},
  URN =		{urn:nbn:de:0030-drops-253071},
  doi =		{10.4230/LIPIcs.ITCS.2026.20},
  annote =	{Keywords: PAC learning, Samplable distributions}
}
Document
Cloning Games, Black Holes and Cryptography

Authors: Alexander Poremba, Seyoon Ragavan, and Vinod Vaikuntanathan

Published in: LIPIcs, Volume 362, 17th Innovations in Theoretical Computer Science Conference (ITCS 2026)


Abstract
In this work, we introduce a new toolkit for analyzing cloning games, a notion that captures stronger and more quantitative versions of the celebrated quantum no-cloning theorem. This framework allows us to analyze a new cloning game based on binary phase states. Our results provide evidence that these games may be able to overcome important limitations of previous candidates based on BB84 states and subspace coset states: in a model where the adversaries are restricted to making a single oracle query, we show that the binary phase variant is t-copy secure when t = o(n/log n). Moreover, for constant t, we obtain the first optimal bounds of O(2^{-n}), asymptotically matching the value attained by a trivial adversarial strategy. We also show a worst-case to average-case reduction which allows us to show the same quantitative results for the new and natural notion of Haar cloning games. Our analytic toolkit, which we believe will find further applications, is based on binary subtypes and uses novel bounds on the operator norms of block-wise tensor products of matrices. To illustrate the effectiveness of these new techniques, we present two applications: first, in black-hole physics, where our asymptotically optimal bound offers quantitative insights into information scrambling in idealized models of black holes; and second, in unclonable cryptography, where we (a) construct succinct unclonable encryption schemes from the existence of pseudorandom unitaries, and (b) propose and provide evidence for the security of multi-copy unclonable encryption schemes.

Cite as

Alexander Poremba, Seyoon Ragavan, and Vinod Vaikuntanathan. Cloning Games, Black Holes and Cryptography. In 17th Innovations in Theoretical Computer Science Conference (ITCS 2026). Leibniz International Proceedings in Informatics (LIPIcs), Volume 362, pp. 109:1-109:21, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2026)


Copy BibTex To Clipboard

@InProceedings{poremba_et_al:LIPIcs.ITCS.2026.109,
  author =	{Poremba, Alexander and Ragavan, Seyoon and Vaikuntanathan, Vinod},
  title =	{{Cloning Games, Black Holes and Cryptography}},
  booktitle =	{17th Innovations in Theoretical Computer Science Conference (ITCS 2026)},
  pages =	{109:1--109:21},
  series =	{Leibniz International Proceedings in Informatics (LIPIcs)},
  ISBN =	{978-3-95977-410-9},
  ISSN =	{1868-8969},
  year =	{2026},
  volume =	{362},
  editor =	{Saraf, Shubhangi},
  publisher =	{Schloss Dagstuhl -- Leibniz-Zentrum f{\"u}r Informatik},
  address =	{Dagstuhl, Germany},
  URL =		{https://drops.dagstuhl.de/entities/document/10.4230/LIPIcs.ITCS.2026.109},
  URN =		{urn:nbn:de:0030-drops-253961},
  doi =		{10.4230/LIPIcs.ITCS.2026.109},
  annote =	{Keywords: Unclonable cryptography, quantum pseudorandomness, black hole physics}
}
Document
Limitations of Membership Queries in Testable Learning

Authors: Jane Lange and Mingda Qiao

Published in: LIPIcs, Volume 362, 17th Innovations in Theoretical Computer Science Conference (ITCS 2026)


Abstract
Membership queries (MQ) often yield speedups for learning tasks, particularly in the distribution-specific setting. We show that in the testable learning model of Rubinfeld and Vasilyan [Rubinfeld and Vasilyan, 2023], membership queries cannot decrease the time complexity of testable learning algorithms beyond the complexity of sample-only distribution-specific learning. In the testable learning model, the learner must output a hypothesis whenever the data distribution satisfies a desired property, and if it outputs a hypothesis, the hypothesis must be near-optimal. We give a general reduction from sample-based refutation of boolean concept classes, as presented in [Vadhan, 2017; Kothari and Livni, 2018], to testable learning with queries (TL-Q). This yields lower bounds for TL-Q via the reduction from learning to refutation given in [Kothari and Livni, 2018]. The result is that, relative to a concept class and a distribution family, no m-sample TL-Q algorithm can be super-polynomially more time-efficient than the best m-sample PAC learner. Finally, we define a class of "statistical" MQ algorithms that encompasses many known distribution-specific MQ learners, such as those based on influence estimation or subcube-conditional statistical queries. We show that TL-Q algorithms in this class imply efficient statistical-query refutation and learning algorithms. Thus, combined with known SQ dimension lower bounds, our results imply that these efficient membership query learners cannot be made testable.

Cite as

Jane Lange and Mingda Qiao. Limitations of Membership Queries in Testable Learning. In 17th Innovations in Theoretical Computer Science Conference (ITCS 2026). Leibniz International Proceedings in Informatics (LIPIcs), Volume 362, pp. 91:1-91:23, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2026)


Copy BibTex To Clipboard

@InProceedings{lange_et_al:LIPIcs.ITCS.2026.91,
  author =	{Lange, Jane and Qiao, Mingda},
  title =	{{Limitations of Membership Queries in Testable Learning}},
  booktitle =	{17th Innovations in Theoretical Computer Science Conference (ITCS 2026)},
  pages =	{91:1--91:23},
  series =	{Leibniz International Proceedings in Informatics (LIPIcs)},
  ISBN =	{978-3-95977-410-9},
  ISSN =	{1868-8969},
  year =	{2026},
  volume =	{362},
  editor =	{Saraf, Shubhangi},
  publisher =	{Schloss Dagstuhl -- Leibniz-Zentrum f{\"u}r Informatik},
  address =	{Dagstuhl, Germany},
  URL =		{https://drops.dagstuhl.de/entities/document/10.4230/LIPIcs.ITCS.2026.91},
  URN =		{urn:nbn:de:0030-drops-253785},
  doi =		{10.4230/LIPIcs.ITCS.2026.91},
  annote =	{Keywords: Testable learning, PAC learning}
}
Document
From Prediction to Precision: Leveraging LLMs for Equitable and Data-Driven Writing Placement in Developmental Education

Authors: Miguel Da Corte and Jorge Baptista

Published in: OASIcs, Volume 135, 14th Symposium on Languages, Applications and Technologies (SLATE 2025)


Abstract
Accurate text classification and placement remain challenges in U.S. higher education, with traditional automated systems like Accuplacer functioning as "black-box" models with limited assessment transparency. This study evaluates Large Language Models (LLMs) as complementary placement tools by comparing their classification performance against a human-rated gold standard and Accuplacer. A 450-essay corpus was classified using Claude, Gemini, GPT-3.5-turbo, and GPT-4o across four prompting strategies: Zero-shot, Few-shot, Enhanced, and Enhanced+ (definitions with examples). Two classification approaches were tested: (i) a 1-step, 3 class classification task, distinguishing DevEd Level 1, DevEd Level 2, and College-level texts in one single run; and (ii) a 2-step classification task, first separating College vs. Non-College texts before further classifying Non-College texts into DevEd sublevels. The results show that structured prompt refinement improves the precision of LLMs' classification, with Claude Enhanced + achieving 62.22% precision (1 step) and Gemini Enhanced + reaching 69.33% (2 step), both surpassing Accuplacer (58.22%). Gemini and Claude also demonstrated strong correlation with human ratings, with Claude achieving the highest Pearson scores (ρ = 0.75; 1-step, ρ = 0.73; 2-step) vs. Accuplacer (ρ = 0.67). While LLMs show promise for DevEd placement, their precision remains a work in progress, highlighting the need for further refinement and safeguards to ensure ethical and equitable placement.

Cite as

Miguel Da Corte and Jorge Baptista. From Prediction to Precision: Leveraging LLMs for Equitable and Data-Driven Writing Placement in Developmental Education. In 14th Symposium on Languages, Applications and Technologies (SLATE 2025). Open Access Series in Informatics (OASIcs), Volume 135, pp. 1:1-1:18, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2025)


Copy BibTex To Clipboard

@InProceedings{dacorte_et_al:OASIcs.SLATE.2025.1,
  author =	{Da Corte, Miguel and Baptista, Jorge},
  title =	{{From Prediction to Precision: Leveraging LLMs for Equitable and Data-Driven Writing Placement in Developmental Education}},
  booktitle =	{14th Symposium on Languages, Applications and Technologies (SLATE 2025)},
  pages =	{1:1--1:18},
  series =	{Open Access Series in Informatics (OASIcs)},
  ISBN =	{978-3-95977-387-4},
  ISSN =	{2190-6807},
  year =	{2025},
  volume =	{135},
  editor =	{Baptista, Jorge and Barateiro, Jos\'{e}},
  publisher =	{Schloss Dagstuhl -- Leibniz-Zentrum f{\"u}r Informatik},
  address =	{Dagstuhl, Germany},
  URL =		{https://drops.dagstuhl.de/entities/document/10.4230/OASIcs.SLATE.2025.1},
  URN =		{urn:nbn:de:0030-drops-236817},
  doi =		{10.4230/OASIcs.SLATE.2025.1},
  annote =	{Keywords: Large Language Models (LLMs), Developmental Education (DevEd), writing assessment, text classification, English writing proficiency}
}
Document
Recognizing 2-Layer and Outer k-Planar Graphs

Authors: Yasuaki Kobayashi, Yuto Okada, and Alexander Wolff

Published in: LIPIcs, Volume 332, 41st International Symposium on Computational Geometry (SoCG 2025)


Abstract
The crossing number of a graph is the least number of crossings over all drawings of the graph in the plane. Computing the crossing number of a given graph is NP-hard, but fixed-parameter tractable (FPT) with respect to the natural parameter. Two well-known variants of the problem are 2-layer crossing minimization and circular crossing minimization, where every vertex must lie on one of two layers, namely two parallel lines, or a circle, respectively. In both cases, edges are drawn as straight-line segments. Both variants are NP-hard, but admit FPT-algorithms with respect to the natural parameter. In recent years, in the context of beyond-planar graphs, a local version of the crossing number has also received considerable attention. A graph is k-planar if it admits a drawing with at most k crossings per edge. In contrast to the crossing number, recognizing k-planar graphs is NP-hard even if k = 1 and hence not likely to be FPT with respect to the natural parameter k. In this paper, we consider the two above variants in the local setting. The k-planar graphs that admit a straight-line drawing with vertices on two layers or on a circle are called 2-layer k-planar and outer k-planar graphs, respectively. We study the parameterized complexity of the two recognition problems with respect to the natural parameter k. For k = 0, the two classes of graphs are exactly the caterpillars and outerplanar graphs, respectively, which can be recognized in linear time. Two groups of researchers independently showed that outer 1-planar graphs can also be recognized in linear time [Hong et al., Algorithmica 2015; Auer et al., Algorithmica 2016]. One group asked explicitly whether outer 2-planar graphs can be recognized in polynomial time. Our main contribution consists of XP-algorithms for recognizing 2-layer k-planar graphs and outer k-planar graphs, which implies that both recognition problems can be solved in polynomial time for every fixed k. We complement these results by showing that recognizing 2-layer k-planar graphs is XNLP-complete and that recognizing outer k-planar graphs is XNLP-hard. This implies that both problems are W[t]-hard for every t and that it is unlikely that they admit FPT-algorithms. On the other hand, we present an FPT-algorithm for recognizing 2-layer k-planar graphs where the order of the vertices on one layer is specified.

Cite as

Yasuaki Kobayashi, Yuto Okada, and Alexander Wolff. Recognizing 2-Layer and Outer k-Planar Graphs. In 41st International Symposium on Computational Geometry (SoCG 2025). Leibniz International Proceedings in Informatics (LIPIcs), Volume 332, pp. 65:1-65:16, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2025)


Copy BibTex To Clipboard

@InProceedings{kobayashi_et_al:LIPIcs.SoCG.2025.65,
  author =	{Kobayashi, Yasuaki and Okada, Yuto and Wolff, Alexander},
  title =	{{Recognizing 2-Layer and Outer k-Planar Graphs}},
  booktitle =	{41st International Symposium on Computational Geometry (SoCG 2025)},
  pages =	{65:1--65:16},
  series =	{Leibniz International Proceedings in Informatics (LIPIcs)},
  ISBN =	{978-3-95977-370-6},
  ISSN =	{1868-8969},
  year =	{2025},
  volume =	{332},
  editor =	{Aichholzer, Oswin and Wang, Haitao},
  publisher =	{Schloss Dagstuhl -- Leibniz-Zentrum f{\"u}r Informatik},
  address =	{Dagstuhl, Germany},
  URL =		{https://drops.dagstuhl.de/entities/document/10.4230/LIPIcs.SoCG.2025.65},
  URN =		{urn:nbn:de:0030-drops-232170},
  doi =		{10.4230/LIPIcs.SoCG.2025.65},
  annote =	{Keywords: 2-layer k-planar graphs, outer k-planar graphs, recognition algorithms, local crossing number, bandwidth, FPT, XNLP, XP, W\lbrackt\rbrack}
}
Document
Academic Track
A View on Vulnerabilites: The Security Challenges of XAI (Academic Track)

Authors: Elisabeth Pachl, Fabian Langer, Thora Markert, and Jeanette Miriam Lorenz

Published in: OASIcs, Volume 126, Symposium on Scaling AI Assessments (SAIA 2024)


Abstract
Modern deep learning methods have long been considered as black-boxes due to their opaque decision-making processes. Explainable Artificial Intelligence (XAI), however, has turned the tables: it provides insight into how these models work, promoting transparency that is crucial for accountability. Yet, recent developments in adversarial machine learning have highlighted vulnerabilities in XAI methods, raising concerns about security, reliability and trustworthiness, particularly in sensitive areas like healthcare and autonomous systems. Awareness of the potential risks associated with XAI is needed as its adoption increases, driven in part by the need to enhance compliance to regulations. This survey provides a holistic perspective on the security and safety landscape surrounding XAI, categorizing research on adversarial attacks against XAI and the misuse of explainability to enhance attacks on AI systems, such as evasion and privacy breaches. Our contribution includes identifying current insecurities in XAI and outlining future research directions in adversarial XAI. This work serves as an accessible foundation and outlook to recognize potential research gaps and define future directions. It identifies data modalities, such as time-series or graph data, and XAI methods that have not been extensively investigated for vulnerabilities in current research.

Cite as

Elisabeth Pachl, Fabian Langer, Thora Markert, and Jeanette Miriam Lorenz. A View on Vulnerabilites: The Security Challenges of XAI (Academic Track). In Symposium on Scaling AI Assessments (SAIA 2024). Open Access Series in Informatics (OASIcs), Volume 126, pp. 12:1-12:23, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2025)


Copy BibTex To Clipboard

@InProceedings{pachl_et_al:OASIcs.SAIA.2024.12,
  author =	{Pachl, Elisabeth and Langer, Fabian and Markert, Thora and Lorenz, Jeanette Miriam},
  title =	{{A View on Vulnerabilites: The Security Challenges of XAI}},
  booktitle =	{Symposium on Scaling AI Assessments (SAIA 2024)},
  pages =	{12:1--12:23},
  series =	{Open Access Series in Informatics (OASIcs)},
  ISBN =	{978-3-95977-357-7},
  ISSN =	{2190-6807},
  year =	{2025},
  volume =	{126},
  editor =	{G\"{o}rge, Rebekka and Haedecke, Elena and Poretschkin, Maximilian and Schmitz, Anna},
  publisher =	{Schloss Dagstuhl -- Leibniz-Zentrum f{\"u}r Informatik},
  address =	{Dagstuhl, Germany},
  URL =		{https://drops.dagstuhl.de/entities/document/10.4230/OASIcs.SAIA.2024.12},
  URN =		{urn:nbn:de:0030-drops-227523},
  doi =		{10.4230/OASIcs.SAIA.2024.12},
  annote =	{Keywords: Explainability, XAI, Transparency, Adversarial Machine Learning, Security, Vulnerabilities}
}
Document
Resource Paper
Whelk: An OWL EL+RL Reasoner Enabling New Use Cases

Authors: James P. Balhoff and Christopher J. Mungall

Published in: TGDK, Volume 2, Issue 2 (2024): Special Issue on Resources for Graph Data and Knowledge. Transactions on Graph Data and Knowledge, Volume 2, Issue 2


Abstract
Many tasks in the biosciences rely on reasoning with large OWL terminologies (Tboxes), often combined with even larger databases. In particular, a common task is retrieval queries that utilize relational expressions; for example, “find all genes expressed in the brain or any part of the brain”. Automated reasoning on these ontologies typically relies on scalable reasoners targeting the EL subset of OWL, such as ELK. While the introduction of ELK has been transformative in the incorporation of reasoning into bio-ontology quality control and production pipelines, we have encountered limitations when applying it to use cases involving high throughput query answering or reasoning about datasets describing instances (Aboxes). Whelk is a fast OWL reasoner for combined EL+RL reasoning. As such, it is particularly useful for many biological ontology tasks, particularly those characterized by large Tboxes using the EL subset of OWL, combined with Aboxes targeting the RL subset of OWL. Whelk is implemented in Scala and utilizes immutable functional data structures, which provides advantages when performing incremental or dynamic reasoning tasks. Whelk supports querying complex class expressions at a substantially greater rate than ELK, and can answer queries or perform incremental reasoning tasks in parallel, enabling novel applications of OWL reasoning.

Cite as

James P. Balhoff and Christopher J. Mungall. Whelk: An OWL EL+RL Reasoner Enabling New Use Cases. In Special Issue on Resources for Graph Data and Knowledge. Transactions on Graph Data and Knowledge (TGDK), Volume 2, Issue 2, pp. 7:1-7:17, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2024)


Copy BibTex To Clipboard

@Article{balhoff_et_al:TGDK.2.2.7,
  author =	{Balhoff, James P. and Mungall, Christopher J.},
  title =	{{Whelk: An OWL EL+RL Reasoner Enabling New Use Cases}},
  journal =	{Transactions on Graph Data and Knowledge},
  pages =	{7:1--7:17},
  ISSN =	{2942-7517},
  year =	{2024},
  volume =	{2},
  number =	{2},
  publisher =	{Schloss Dagstuhl -- Leibniz-Zentrum f{\"u}r Informatik},
  address =	{Dagstuhl, Germany},
  URL =		{https://drops.dagstuhl.de/entities/document/10.4230/TGDK.2.2.7},
  URN =		{urn:nbn:de:0030-drops-225918},
  doi =		{10.4230/TGDK.2.2.7},
  annote =	{Keywords: Web Ontology Language, OWL, Semantic Web, ontology, reasoner}
}
Document
A Strong Direct Sum Theorem for Distributional Query Complexity

Authors: Guy Blanc, Caleb Koch, Carmen Strassle, and Li-Yang Tan

Published in: LIPIcs, Volume 300, 39th Computational Complexity Conference (CCC 2024)


Abstract
Consider the expected query complexity of computing the k-fold direct product f^{⊗ k} of a function f to error ε with respect to a distribution μ^k. One strategy is to sequentially compute each of the k copies to error ε/k with respect to μ and apply the union bound. We prove a strong direct sum theorem showing that this naive strategy is essentially optimal. In particular, computing a direct product necessitates a blowup in both query complexity and error. Strong direct sum theorems contrast with results that only show a blowup in query complexity or error but not both. There has been a long line of such results for distributional query complexity, dating back to (Impagliazzo, Raz, Wigderson 1994) and (Nisan, Rudich, Saks 1994), but a strong direct sum theorem that holds for all functions in the standard query model had been elusive. A key idea in our work is the first use of the Hardcore Theorem (Impagliazzo 1995) in the context of query complexity. We prove a new resilience lemma that accompanies it, showing that the hardcore of f^{⊗k} is likely to remain dense under arbitrary partitions of the input space.

Cite as

Guy Blanc, Caleb Koch, Carmen Strassle, and Li-Yang Tan. A Strong Direct Sum Theorem for Distributional Query Complexity. In 39th Computational Complexity Conference (CCC 2024). Leibniz International Proceedings in Informatics (LIPIcs), Volume 300, pp. 16:1-16:30, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2024)


Copy BibTex To Clipboard

@InProceedings{blanc_et_al:LIPIcs.CCC.2024.16,
  author =	{Blanc, Guy and Koch, Caleb and Strassle, Carmen and Tan, Li-Yang},
  title =	{{A Strong Direct Sum Theorem for Distributional Query Complexity}},
  booktitle =	{39th Computational Complexity Conference (CCC 2024)},
  pages =	{16:1--16:30},
  series =	{Leibniz International Proceedings in Informatics (LIPIcs)},
  ISBN =	{978-3-95977-331-7},
  ISSN =	{1868-8969},
  year =	{2024},
  volume =	{300},
  editor =	{Santhanam, Rahul},
  publisher =	{Schloss Dagstuhl -- Leibniz-Zentrum f{\"u}r Informatik},
  address =	{Dagstuhl, Germany},
  URL =		{https://drops.dagstuhl.de/entities/document/10.4230/LIPIcs.CCC.2024.16},
  URN =		{urn:nbn:de:0030-drops-204123},
  doi =		{10.4230/LIPIcs.CCC.2024.16},
  annote =	{Keywords: Query complexity, direct product theorem, hardcore theorem}
}
Document
Position
Large Language Models and Knowledge Graphs: Opportunities and Challenges

Authors: Jeff Z. Pan, Simon Razniewski, Jan-Christoph Kalo, Sneha Singhania, Jiaoyan Chen, Stefan Dietze, Hajira Jabeen, Janna Omeliyanenko, Wen Zhang, Matteo Lissandrini, Russa Biswas, Gerard de Melo, Angela Bonifati, Edlira Vakaj, Mauro Dragoni, and Damien Graux

Published in: TGDK, Volume 1, Issue 1 (2023): Special Issue on Trends in Graph Data and Knowledge. Transactions on Graph Data and Knowledge, Volume 1, Issue 1


Abstract
Large Language Models (LLMs) have taken Knowledge Representation - and the world - by storm. This inflection point marks a shift from explicit knowledge representation to a renewed focus on the hybrid representation of both explicit knowledge and parametric knowledge. In this position paper, we will discuss some of the common debate points within the community on LLMs (parametric knowledge) and Knowledge Graphs (explicit knowledge) and speculate on opportunities and visions that the renewed focus brings, as well as related research topics and challenges.

Cite as

Jeff Z. Pan, Simon Razniewski, Jan-Christoph Kalo, Sneha Singhania, Jiaoyan Chen, Stefan Dietze, Hajira Jabeen, Janna Omeliyanenko, Wen Zhang, Matteo Lissandrini, Russa Biswas, Gerard de Melo, Angela Bonifati, Edlira Vakaj, Mauro Dragoni, and Damien Graux. Large Language Models and Knowledge Graphs: Opportunities and Challenges. In Special Issue on Trends in Graph Data and Knowledge. Transactions on Graph Data and Knowledge (TGDK), Volume 1, Issue 1, pp. 2:1-2:38, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2023)


Copy BibTex To Clipboard

@Article{pan_et_al:TGDK.1.1.2,
  author =	{Pan, Jeff Z. and Razniewski, Simon and Kalo, Jan-Christoph and Singhania, Sneha and Chen, Jiaoyan and Dietze, Stefan and Jabeen, Hajira and Omeliyanenko, Janna and Zhang, Wen and Lissandrini, Matteo and Biswas, Russa and de Melo, Gerard and Bonifati, Angela and Vakaj, Edlira and Dragoni, Mauro and Graux, Damien},
  title =	{{Large Language Models and Knowledge Graphs: Opportunities and Challenges}},
  journal =	{Transactions on Graph Data and Knowledge},
  pages =	{2:1--2:38},
  year =	{2023},
  volume =	{1},
  number =	{1},
  publisher =	{Schloss Dagstuhl -- Leibniz-Zentrum f{\"u}r Informatik},
  address =	{Dagstuhl, Germany},
  URL =		{https://drops.dagstuhl.de/entities/document/10.4230/TGDK.1.1.2},
  URN =		{urn:nbn:de:0030-drops-194766},
  doi =		{10.4230/TGDK.1.1.2},
  annote =	{Keywords: Large Language Models, Pre-trained Language Models, Knowledge Graphs, Ontology, Retrieval Augmented Language Models}
}
Document
Vision
Knowledge Engineering Using Large Language Models

Authors: Bradley P. Allen, Lise Stork, and Paul Groth

Published in: TGDK, Volume 1, Issue 1 (2023): Special Issue on Trends in Graph Data and Knowledge. Transactions on Graph Data and Knowledge, Volume 1, Issue 1


Abstract
Knowledge engineering is a discipline that focuses on the creation and maintenance of processes that generate and apply knowledge. Traditionally, knowledge engineering approaches have focused on knowledge expressed in formal languages. The emergence of large language models and their capabilities to effectively work with natural language, in its broadest sense, raises questions about the foundations and practice of knowledge engineering. Here, we outline the potential role of LLMs in knowledge engineering, identifying two central directions: 1) creating hybrid neuro-symbolic knowledge systems; and 2) enabling knowledge engineering in natural language. Additionally, we formulate key open research questions to tackle these directions.

Cite as

Bradley P. Allen, Lise Stork, and Paul Groth. Knowledge Engineering Using Large Language Models. In Special Issue on Trends in Graph Data and Knowledge. Transactions on Graph Data and Knowledge (TGDK), Volume 1, Issue 1, pp. 3:1-3:19, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2023)


Copy BibTex To Clipboard

@Article{allen_et_al:TGDK.1.1.3,
  author =	{Allen, Bradley P. and Stork, Lise and Groth, Paul},
  title =	{{Knowledge Engineering Using Large Language Models}},
  journal =	{Transactions on Graph Data and Knowledge},
  pages =	{3:1--3:19},
  ISSN =	{2942-7517},
  year =	{2023},
  volume =	{1},
  number =	{1},
  publisher =	{Schloss Dagstuhl -- Leibniz-Zentrum f{\"u}r Informatik},
  address =	{Dagstuhl, Germany},
  URL =		{https://drops.dagstuhl.de/entities/document/10.4230/TGDK.1.1.3},
  URN =		{urn:nbn:de:0030-drops-194777},
  doi =		{10.4230/TGDK.1.1.3},
  annote =	{Keywords: knowledge engineering, large language models}
}
Document
Certification with an NP Oracle

Authors: Guy Blanc, Caleb Koch, Jane Lange, Carmen Strassle, and Li-Yang Tan

Published in: LIPIcs, Volume 251, 14th Innovations in Theoretical Computer Science Conference (ITCS 2023)


Abstract
In the certification problem, the algorithm is given a function f with certificate complexity k and an input x^⋆, and the goal is to find a certificate of size ≤ poly(k) for f’s value at x^⋆. This problem is in NP^NP, and assuming 𝖯 ≠ NP, is not in 𝖯. Prior works, dating back to Valiant in 1984, have therefore sought to design efficient algorithms by imposing assumptions on f such as monotonicity. Our first result is a BPP^NP algorithm for the general problem. The key ingredient is a new notion of the balanced influence of variables, a natural variant of influence that corrects for the bias of the function. Balanced influences can be accurately estimated via uniform generation, and classic BPP^NP algorithms are known for the latter task. We then consider certification with stricter instance-wise guarantees: for each x^⋆, find a certificate whose size scales with that of the smallest certificate for x^⋆. In sharp contrast with our first result, we show that this problem is NP^NP-hard even to approximate. We obtain an optimal inapproximability ratio, adding to a small handful of problems in the higher levels of the polynomial hierarchy for which optimal inapproximability is known. Our proof involves the novel use of bit-fixing dispersers for gap amplification.

Cite as

Guy Blanc, Caleb Koch, Jane Lange, Carmen Strassle, and Li-Yang Tan. Certification with an NP Oracle. In 14th Innovations in Theoretical Computer Science Conference (ITCS 2023). Leibniz International Proceedings in Informatics (LIPIcs), Volume 251, pp. 18:1-18:22, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2023)


Copy BibTex To Clipboard

@InProceedings{blanc_et_al:LIPIcs.ITCS.2023.18,
  author =	{Blanc, Guy and Koch, Caleb and Lange, Jane and Strassle, Carmen and Tan, Li-Yang},
  title =	{{Certification with an NP Oracle}},
  booktitle =	{14th Innovations in Theoretical Computer Science Conference (ITCS 2023)},
  pages =	{18:1--18:22},
  series =	{Leibniz International Proceedings in Informatics (LIPIcs)},
  ISBN =	{978-3-95977-263-1},
  ISSN =	{1868-8969},
  year =	{2023},
  volume =	{251},
  editor =	{Tauman Kalai, Yael},
  publisher =	{Schloss Dagstuhl -- Leibniz-Zentrum f{\"u}r Informatik},
  address =	{Dagstuhl, Germany},
  URL =		{https://drops.dagstuhl.de/entities/document/10.4230/LIPIcs.ITCS.2023.18},
  URN =		{urn:nbn:de:0030-drops-175217},
  doi =		{10.4230/LIPIcs.ITCS.2023.18},
  annote =	{Keywords: Certificate complexity, Boolean functions, polynomial hierarchy, hardness of approximation}
}
Document
Track A: Algorithms, Complexity and Games
Reconstructing Decision Trees

Authors: Guy Blanc, Jane Lange, and Li-Yang Tan

Published in: LIPIcs, Volume 229, 49th International Colloquium on Automata, Languages, and Programming (ICALP 2022)


Abstract
We give the first reconstruction algorithm for decision trees: given queries to a function f that is opt-close to a size-s decision tree, our algorithm provides query access to a decision tree T where: - T has size S := s^O((log s)²/ε³); - dist(f,T) ≤ O(opt)+ε; - Every query to T is answered with poly((log s)/ε)⋅ log n queries to f and in poly((log s)/ε)⋅ n log n time. This yields a tolerant tester that distinguishes functions that are close to size-s decision trees from those that are far from size-S decision trees. The polylogarithmic dependence on s in the efficiency of our tester is exponentially smaller than that of existing testers. Since decision tree complexity is well known to be related to numerous other boolean function properties, our results also provide a new algorithm for reconstructing and testing these properties.

Cite as

Guy Blanc, Jane Lange, and Li-Yang Tan. Reconstructing Decision Trees. In 49th International Colloquium on Automata, Languages, and Programming (ICALP 2022). Leibniz International Proceedings in Informatics (LIPIcs), Volume 229, pp. 24:1-24:17, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2022)


Copy BibTex To Clipboard

@InProceedings{blanc_et_al:LIPIcs.ICALP.2022.24,
  author =	{Blanc, Guy and Lange, Jane and Tan, Li-Yang},
  title =	{{Reconstructing Decision Trees}},
  booktitle =	{49th International Colloquium on Automata, Languages, and Programming (ICALP 2022)},
  pages =	{24:1--24:17},
  series =	{Leibniz International Proceedings in Informatics (LIPIcs)},
  ISBN =	{978-3-95977-235-8},
  ISSN =	{1868-8969},
  year =	{2022},
  volume =	{229},
  editor =	{Boja\'{n}czyk, Miko{\l}aj and Merelli, Emanuela and Woodruff, David P.},
  publisher =	{Schloss Dagstuhl -- Leibniz-Zentrum f{\"u}r Informatik},
  address =	{Dagstuhl, Germany},
  URL =		{https://drops.dagstuhl.de/entities/document/10.4230/LIPIcs.ICALP.2022.24},
  URN =		{urn:nbn:de:0030-drops-163653},
  doi =		{10.4230/LIPIcs.ICALP.2022.24},
  annote =	{Keywords: Property reconstruction, property testing, tolerant testing, decision trees}
}
Document
On Comparable Box Dimension

Authors: Zdeněk Dvořák, Daniel Gonçalves, Abhiruk Lahiri, Jane Tan, and Torsten Ueckerdt

Published in: LIPIcs, Volume 224, 38th International Symposium on Computational Geometry (SoCG 2022)


Abstract
Two boxes in ℝ^d are comparable if one of them is a subset of a translation of the other one. The comparable box dimension of a graph G is the minimum integer d such that G can be represented as a touching graph of comparable axis-aligned boxes in ℝ^d. We show that proper minor-closed classes have bounded comparable box dimension and explore further properties of this notion.

Cite as

Zdeněk Dvořák, Daniel Gonçalves, Abhiruk Lahiri, Jane Tan, and Torsten Ueckerdt. On Comparable Box Dimension. In 38th International Symposium on Computational Geometry (SoCG 2022). Leibniz International Proceedings in Informatics (LIPIcs), Volume 224, pp. 38:1-38:14, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2022)


Copy BibTex To Clipboard

@InProceedings{dvorak_et_al:LIPIcs.SoCG.2022.38,
  author =	{Dvo\v{r}\'{a}k, Zden\v{e}k and Gon\c{c}alves, Daniel and Lahiri, Abhiruk and Tan, Jane and Ueckerdt, Torsten},
  title =	{{On Comparable Box Dimension}},
  booktitle =	{38th International Symposium on Computational Geometry (SoCG 2022)},
  pages =	{38:1--38:14},
  series =	{Leibniz International Proceedings in Informatics (LIPIcs)},
  ISBN =	{978-3-95977-227-3},
  ISSN =	{1868-8969},
  year =	{2022},
  volume =	{224},
  editor =	{Goaoc, Xavier and Kerber, Michael},
  publisher =	{Schloss Dagstuhl -- Leibniz-Zentrum f{\"u}r Informatik},
  address =	{Dagstuhl, Germany},
  URL =		{https://drops.dagstuhl.de/entities/document/10.4230/LIPIcs.SoCG.2022.38},
  URN =		{urn:nbn:de:0030-drops-160461},
  doi =		{10.4230/LIPIcs.SoCG.2022.38},
  annote =	{Keywords: geometric graphs, minor-closed graph classes, treewidth fragility}
}
Document
RANDOM
Decision Tree Heuristics Can Fail, Even in the Smoothed Setting

Authors: Guy Blanc, Jane Lange, Mingda Qiao, and Li-Yang Tan

Published in: LIPIcs, Volume 207, Approximation, Randomization, and Combinatorial Optimization. Algorithms and Techniques (APPROX/RANDOM 2021)


Abstract
Greedy decision tree learning heuristics are mainstays of machine learning practice, but theoretical justification for their empirical success remains elusive. In fact, it has long been known that there are simple target functions for which they fail badly (Kearns and Mansour, STOC 1996). Recent work of Brutzkus, Daniely, and Malach (COLT 2020) considered the smoothed analysis model as a possible avenue towards resolving this disconnect. Within the smoothed setting and for targets f that are k-juntas, they showed that these heuristics successfully learn f with depth-k decision tree hypotheses. They conjectured that the same guarantee holds more generally for targets that are depth-k decision trees. We provide a counterexample to this conjecture: we construct targets that are depth-k decision trees and show that even in the smoothed setting, these heuristics build trees of depth 2^{Ω(k)} before achieving high accuracy. We also show that the guarantees of Brutzkus et al. cannot extend to the agnostic setting: there are targets that are very close to k-juntas, for which these heuristics build trees of depth 2^{Ω(k)} before achieving high accuracy.

Cite as

Guy Blanc, Jane Lange, Mingda Qiao, and Li-Yang Tan. Decision Tree Heuristics Can Fail, Even in the Smoothed Setting. In Approximation, Randomization, and Combinatorial Optimization. Algorithms and Techniques (APPROX/RANDOM 2021). Leibniz International Proceedings in Informatics (LIPIcs), Volume 207, pp. 45:1-45:16, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2021)


Copy BibTex To Clipboard

@InProceedings{blanc_et_al:LIPIcs.APPROX/RANDOM.2021.45,
  author =	{Blanc, Guy and Lange, Jane and Qiao, Mingda and Tan, Li-Yang},
  title =	{{Decision Tree Heuristics Can Fail, Even in the Smoothed Setting}},
  booktitle =	{Approximation, Randomization, and Combinatorial Optimization. Algorithms and Techniques (APPROX/RANDOM 2021)},
  pages =	{45:1--45:16},
  series =	{Leibniz International Proceedings in Informatics (LIPIcs)},
  ISBN =	{978-3-95977-207-5},
  ISSN =	{1868-8969},
  year =	{2021},
  volume =	{207},
  editor =	{Wootters, Mary and Sanit\`{a}, Laura},
  publisher =	{Schloss Dagstuhl -- Leibniz-Zentrum f{\"u}r Informatik},
  address =	{Dagstuhl, Germany},
  URL =		{https://drops.dagstuhl.de/entities/document/10.4230/LIPIcs.APPROX/RANDOM.2021.45},
  URN =		{urn:nbn:de:0030-drops-147386},
  doi =		{10.4230/LIPIcs.APPROX/RANDOM.2021.45},
  annote =	{Keywords: decision trees, learning theory, smoothed analysis}
}
Document
Track A: Algorithms, Complexity and Games
Learning Stochastic Decision Trees

Authors: Guy Blanc, Jane Lange, and Li-Yang Tan

Published in: LIPIcs, Volume 198, 48th International Colloquium on Automata, Languages, and Programming (ICALP 2021)


Abstract
We give a quasipolynomial-time algorithm for learning stochastic decision trees that is optimally resilient to adversarial noise. Given an η-corrupted set of uniform random samples labeled by a size-s stochastic decision tree, our algorithm runs in time n^{O(log(s/ε)/ε²)} and returns a hypothesis with error within an additive 2η + ε of the Bayes optimal. An additive 2η is the information-theoretic minimum. Previously no non-trivial algorithm with a guarantee of O(η) + ε was known, even for weaker noise models. Our algorithm is furthermore proper, returning a hypothesis that is itself a decision tree; previously no such algorithm was known even in the noiseless setting.

Cite as

Guy Blanc, Jane Lange, and Li-Yang Tan. Learning Stochastic Decision Trees. In 48th International Colloquium on Automata, Languages, and Programming (ICALP 2021). Leibniz International Proceedings in Informatics (LIPIcs), Volume 198, pp. 30:1-30:16, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2021)


Copy BibTex To Clipboard

@InProceedings{blanc_et_al:LIPIcs.ICALP.2021.30,
  author =	{Blanc, Guy and Lange, Jane and Tan, Li-Yang},
  title =	{{Learning Stochastic Decision Trees}},
  booktitle =	{48th International Colloquium on Automata, Languages, and Programming (ICALP 2021)},
  pages =	{30:1--30:16},
  series =	{Leibniz International Proceedings in Informatics (LIPIcs)},
  ISBN =	{978-3-95977-195-5},
  ISSN =	{1868-8969},
  year =	{2021},
  volume =	{198},
  editor =	{Bansal, Nikhil and Merelli, Emanuela and Worrell, James},
  publisher =	{Schloss Dagstuhl -- Leibniz-Zentrum f{\"u}r Informatik},
  address =	{Dagstuhl, Germany},
  URL =		{https://drops.dagstuhl.de/entities/document/10.4230/LIPIcs.ICALP.2021.30},
  URN =		{urn:nbn:de:0030-drops-140994},
  doi =		{10.4230/LIPIcs.ICALP.2021.30},
  annote =	{Keywords: Learning theory, decision trees, proper learning algorithms, adversarial noise}
}
  • Refine by Type
  • 16 Document/PDF
  • 9 Document/HTML

  • Refine by Publication Year
  • 3 2026
  • 3 2025
  • 2 2024
  • 3 2023
  • 2 2022
  • Show More...

  • Refine by Author
  • 7 Blanc, Guy
  • 7 Lange, Jane
  • 7 Tan, Li-Yang
  • 3 Koch, Caleb
  • 3 Strassle, Carmen
  • Show More...

  • Refine by Series/Journal
  • 11 LIPIcs
  • 2 OASIcs
  • 3 TGDK

  • Refine by Classification
  • 2 Computing methodologies → Machine learning
  • 2 Computing methodologies → Natural language processing
  • 2 Theory of computation → Boolean function learning
  • 2 Theory of computation → Computational complexity and cryptography
  • 1 Applied computing → Life and medical sciences
  • Show More...

  • Refine by Keyword
  • 3 decision trees
  • 2 Learning theory
  • 2 PAC learning
  • 1 2-layer k-planar graphs
  • 1 Adversarial Machine Learning
  • Show More...

Any Issues?
X

Feedback on the Current Page

CAPTCHA

Thanks for your feedback!

Feedback submitted to Dagstuhl Publishing

Could not send message

Please try again later or send an E-mail