DROPS

Document

Survey

Resilience in Knowledge Graph Embeddings

Authors: Arnab Sharma, N'Dah Jean Kouagou, and Axel-Cyrille Ngonga Ngomo

Published in: TGDK, Volume 3, Issue 2 (2025). Transactions on Graph Data and Knowledge, Volume 3, Issue 2

Abstract

In recent years, knowledge graphs have gained interest and witnessed widespread applications in various domains, such as information retrieval, question-answering, recommendation systems, amongst others. Large-scale knowledge graphs to this end have demonstrated their utility in effectively representing structured knowledge. To further facilitate the application of machine learning techniques, knowledge graph embedding models have been developed. Such models can transform entities and relationships within knowledge graphs into vectors. However, these embedding models often face challenges related to noise, missing information, distribution shift, adversarial attacks, etc. This can lead to sub-optimal embeddings and incorrect inferences, thereby negatively impacting downstream applications. While the existing literature has focused so far on adversarial attacks on KGE models, the challenges related to the other critical aspects remain unexplored. In this paper, we, first of all, give a unified definition of resilience, encompassing several factors such as generalisation, in-distribution generalization, distribution adaption, and robustness. After formalizing these concepts for machine learning in general, we define them in the context of knowledge graphs. To find the gap in the existing works on resilience in the context of knowledge graphs, we perform a systematic survey, taking into account all these aspects mentioned previously. Our survey results show that most of the existing works focus on a specific aspect of resilience, namely robustness. After categorizing such works based on their respective aspects of resilience, we discuss the challenges and future research directions.

Cite as

Arnab Sharma, N'Dah Jean Kouagou, and Axel-Cyrille Ngonga Ngomo. Resilience in Knowledge Graph Embeddings. In Transactions on Graph Data and Knowledge (TGDK), Volume 3, Issue 2, pp. 1:1-1:38, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2025)

Copy BibTex To Clipboard

@Article{sharma_et_al:TGDK.3.2.1,
  author =	{Sharma, Arnab and Kouagou, N'Dah Jean and Ngomo, Axel-Cyrille Ngonga},
  title =	{{Resilience in Knowledge Graph Embeddings}},
  journal =	{Transactions on Graph Data and Knowledge},
  pages =	{1:1--1:38},
  ISSN =	{2942-7517},
  year =	{2025},
  volume =	{3},
  number =	{2},
  publisher =	{Schloss Dagstuhl -- Leibniz-Zentrum f{\"u}r Informatik},
  address =	{Dagstuhl, Germany},
  URL =		{https://drops.dagstuhl.de/entities/document/10.4230/TGDK.3.2.1},
  URN =		{urn:nbn:de:0030-drops-248117},
  doi =		{10.4230/TGDK.3.2.1},
  annote =	{Keywords: Knowledge graphs, Resilience, Robustness}
}

Document

Research

DOI: 10.4230/TGDK.3.2.3

GraphRAG on Technical Documents - Impact of Knowledge Graph Schema

Authors: Henri Scaffidi, Melinda Hodkiewicz, Caitlin Woods, and Nicole Roocke

Published in: TGDK, Volume 3, Issue 2 (2025). Transactions on Graph Data and Knowledge, Volume 3, Issue 2

Abstract

Retrieval Augmented Generation (RAG) is seeing rapid adoption in industry to enable employees to query information captured in proprietary data for their organisation. In this work, we test the impact of domain-relevant knowledge graph schemas on the results of Microsoft’s GraphRAG pipeline. Our approach aims to address the poor quality of GraphRAG responses on technical reports rich in domain-specific terms. The use case involves technical reports about geology, chemistry and mineral processing published by the Minerals Research Institute of Western Australia (MRIWA). Four schemas are considered: a simple five-class minerals domain expert-developed schema, an expanded minerals domain schema, the Microsoft GraphRAG auto-generated schema, and a schema-less GraphRAG. These are compared to a conventional baseline RAG. Performance is evaluated using a scoring approach that accounts for the mix of correct, incorrect, additional, and missing content in RAG responses. The results show that the simple five-class minerals domain schema extracts approximately 10% more entities from the MRIWA reports than the other schema options. Additionally, both the five-class and the expanded eight-class minerals domain schemas produce the most factually correct answers and the fewest hallucinations. We attribute this to the minerals-specific schemas extracting more relevant, domain-specific information during the Indexing stage. As a result, the Query stage’s context window includes more high-value content. This contributes to the observed improvement in answer quality compared to the other pipelines. In contrast, pipelines with fewer domain-related entities in the KG retrieve less valuable information, leaving more room for irrelevant content in the context window. Baseline RAG responses were typically shorter, less complete, and contained more hallucinations compared to our GraphRAG pipelines. We provide a complete set of resources at https://github.com/nlp-tlp/GraphRAG-on-Minerals-Domain/tree/main. These resources include links to the MRIWA reports, a set of questions (from simple to challenging) along with domain-expert curated answers, schemas, and evaluations of the pipelines.

Cite as

Henri Scaffidi, Melinda Hodkiewicz, Caitlin Woods, and Nicole Roocke. GraphRAG on Technical Documents - Impact of Knowledge Graph Schema. In Transactions on Graph Data and Knowledge (TGDK), Volume 3, Issue 2, pp. 3:1-3:24, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2025)

Copy BibTex To Clipboard

@Article{scaffidi_et_al:TGDK.3.2.3,
  author =	{Scaffidi, Henri and Hodkiewicz, Melinda and Woods, Caitlin and Roocke, Nicole},
  title =	{{GraphRAG on Technical Documents - Impact of Knowledge Graph Schema}},
  journal =	{Transactions on Graph Data and Knowledge},
  pages =	{3:1--3:24},
  ISSN =	{2942-7517},
  year =	{2025},
  volume =	{3},
  number =	{2},
  publisher =	{Schloss Dagstuhl -- Leibniz-Zentrum f{\"u}r Informatik},
  address =	{Dagstuhl, Germany},
  URL =		{https://drops.dagstuhl.de/entities/document/10.4230/TGDK.3.2.3},
  URN =		{urn:nbn:de:0030-drops-248131},
  doi =		{10.4230/TGDK.3.2.3},
  annote =	{Keywords: RAG, minerals, local search, global search, entity extraction, competency questions}
}

Document

Invited Talk

DOI: 10.4230/LIPIcs.ESA.2025.2

Securing Dynamic Data: A Primer on Differentially Private Data Structures (Invited Talk)

Authors: Monika Henzinger and Roodabeh Safavi

Published in: LIPIcs, Volume 351, 33rd Annual European Symposium on Algorithms (ESA 2025)

Abstract

We give an introduction into differential privacy in the dynamic setting, called the continual observation setting.

Cite as

Monika Henzinger and Roodabeh Safavi. Securing Dynamic Data: A Primer on Differentially Private Data Structures (Invited Talk). In 33rd Annual European Symposium on Algorithms (ESA 2025). Leibniz International Proceedings in Informatics (LIPIcs), Volume 351, pp. 2:1-2:14, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2025)

Copy BibTex To Clipboard

@InProceedings{henzinger_et_al:LIPIcs.ESA.2025.2,
  author =	{Henzinger, Monika and Safavi, Roodabeh},
  title =	{{Securing Dynamic Data: A Primer on Differentially Private Data Structures}},
  booktitle =	{33rd Annual European Symposium on Algorithms (ESA 2025)},
  pages =	{2:1--2:14},
  series =	{Leibniz International Proceedings in Informatics (LIPIcs)},
  ISBN =	{978-3-95977-395-9},
  ISSN =	{1868-8969},
  year =	{2025},
  volume =	{351},
  editor =	{Benoit, Anne and Kaplan, Haim and Wild, Sebastian and Herman, Grzegorz},
  publisher =	{Schloss Dagstuhl -- Leibniz-Zentrum f{\"u}r Informatik},
  address =	{Dagstuhl, Germany},
  URL =		{https://drops.dagstuhl.de/entities/document/10.4230/LIPIcs.ESA.2025.2},
  URN =		{urn:nbn:de:0030-drops-244702},
  doi =		{10.4230/LIPIcs.ESA.2025.2},
  annote =	{Keywords: Differential privacy, continual observation}
}

Document

DOI: 10.4230/LIPIcs.ITP.2025.14

Canonical for Automated Theorem Proving in Lean

Authors: Chase Norman and Jeremy Avigad

Published in: LIPIcs, Volume 352, 16th International Conference on Interactive Theorem Proving (ITP 2025)

Abstract

Canonical is a solver for type inhabitation in dependent type theory, that is, the problem of producing a term of a given type. We present a Lean tactic which invokes Canonical to generate proof terms and synthesize programs. The tactic supports higher-order and dependently-typed goals, structural recursion over indexed inductive types, and definitional equality. Canonical finds proofs for 84% of Natural Number Game problems in 51 seconds total.

Cite as

Chase Norman and Jeremy Avigad. Canonical for Automated Theorem Proving in Lean. In 16th International Conference on Interactive Theorem Proving (ITP 2025). Leibniz International Proceedings in Informatics (LIPIcs), Volume 352, pp. 14:1-14:20, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2025)

Copy BibTex To Clipboard

@InProceedings{norman_et_al:LIPIcs.ITP.2025.14,
  author =	{Norman, Chase and Avigad, Jeremy},
  title =	{{Canonical for Automated Theorem Proving in Lean}},
  booktitle =	{16th International Conference on Interactive Theorem Proving (ITP 2025)},
  pages =	{14:1--14:20},
  series =	{Leibniz International Proceedings in Informatics (LIPIcs)},
  ISBN =	{978-3-95977-396-6},
  ISSN =	{1868-8969},
  year =	{2025},
  volume =	{352},
  editor =	{Forster, Yannick and Keller, Chantal},
  publisher =	{Schloss Dagstuhl -- Leibniz-Zentrum f{\"u}r Informatik},
  address =	{Dagstuhl, Germany},
  URL =		{https://drops.dagstuhl.de/entities/document/10.4230/LIPIcs.ITP.2025.14},
  URN =		{urn:nbn:de:0030-drops-246128},
  doi =		{10.4230/LIPIcs.ITP.2025.14},
  annote =	{Keywords: Automated Reasoning, Interactive Theorem Proving, Dependent Type Theory, Inhabitation, Unification, Program Synthesis, Formal Methods}
}

Document

DOI: 10.4230/OASIcs.SpaceCHI.2025.6

Toward an Earth-Independent System for EVA Mission Planning: Integrating Physical Models, Domain Knowledge, and Agentic RAG to Provide Explainable LLM-Based Decision Support

Authors: Kaisheng Li and Richard S. Whittle

Published in: OASIcs, Volume 130, Advancing Human-Computer Interaction for Space Exploration (SpaceCHI 2025)

Abstract

We propose a unified framework for an Earth‑independent AI system that provides explainable, context‑aware decision support for EVA mission planning by integrating six core components: a fine‑tuned EVA domain LLM, a retrieval‑augmented knowledge base, a short-term memory store, physical simulation models, an agentic orchestration layer, and a multimodal user interface. To ground our design, we analyze the current roles and substitution potential of the Mission Control Center - identifying which procedural and analytical functions can be automated onboard while preserving human oversight for experiential and strategic tasks. Building on this framework, we introduce RASAGE (Retrieval & Simulation Augmented Guidance Agent for Exploration), a proof‑of‑concept toolset that combines Microsoft Phi‑4‑mini‑instruct with a FAISS (Facebook AI Similarity Search)‑powered EVA knowledge base and custom A* path planning and hypogravity metabolic models to generate grounded, traceable EVA plans. We outline a staged validation strategy to evaluate improvements in route efficiency, metabolic prediction accuracy, anomaly response effectiveness, and crew trust under realistic communication delays. Our findings demonstrate the feasibility of replicating key Mission Control functions onboard, enhancing crew autonomy, reducing cognitive load, and improving safety for deep‑space exploration missions.

Cite as

Kaisheng Li and Richard S. Whittle. Toward an Earth-Independent System for EVA Mission Planning: Integrating Physical Models, Domain Knowledge, and Agentic RAG to Provide Explainable LLM-Based Decision Support. In Advancing Human-Computer Interaction for Space Exploration (SpaceCHI 2025). Open Access Series in Informatics (OASIcs), Volume 130, pp. 6:1-6:17, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2025)

Copy BibTex To Clipboard

@InProceedings{li_et_al:OASIcs.SpaceCHI.2025.6,
  author =	{Li, Kaisheng and Whittle, Richard S.},
  title =	{{Toward an Earth-Independent System for EVA Mission Planning: Integrating Physical Models, Domain Knowledge, and Agentic RAG to Provide Explainable LLM-Based Decision Support}},
  booktitle =	{Advancing Human-Computer Interaction for Space Exploration (SpaceCHI 2025)},
  pages =	{6:1--6:17},
  series =	{Open Access Series in Informatics (OASIcs)},
  ISBN =	{978-3-95977-384-3},
  ISSN =	{2190-6807},
  year =	{2025},
  volume =	{130},
  editor =	{Bensch, Leonie and Nilsson, Tommy and Nisser, Martin and Pataranutaporn, Pat and Schmidt, Albrecht and Sumini, Valentina},
  publisher =	{Schloss Dagstuhl -- Leibniz-Zentrum f{\"u}r Informatik},
  address =	{Dagstuhl, Germany},
  URL =		{https://drops.dagstuhl.de/entities/document/10.4230/OASIcs.SpaceCHI.2025.6},
  URN =		{urn:nbn:de:0030-drops-239967},
  doi =		{10.4230/OASIcs.SpaceCHI.2025.6},
  annote =	{Keywords: Human-AI Interaction for Space Exploration, Extravehicular Activities, Cognitive load and Human Performance Issues, Human Systems Exploration, Lunar Exploration, LLM}
}

@InProceedings{li_et_al:OASIcs.SpaceCHI.2025.6,
  author =	{Li, Kaisheng and Whittle, Richard S.},
  title =	{{Toward an Earth-Independent System for EVA Mission Planning: Integrating Physical Models, Domain Knowledge, and Agentic RAG to Provide Explainable LLM-Based Decision Support}},
  booktitle =	{Advancing Human-Computer Interaction for Space Exploration (SpaceCHI 2025)},
  pages =	{6:1--6:17},
  series =	{Open Access Series in Informatics (OASIcs)},
  ISBN =	{978-3-95977-384-3},
  ISSN =	{2190-6807},
  year =	{2025},
  volume =	{130},
  editor =	{Bensch, Leonie and Nilsson, Tommy and Nisser, Martin and Pataranutaporn, Pat and Schmidt, Albrecht and Sumini, Valentina},
  publisher =	{Schloss Dagstuhl -- Leibniz-Zentrum f{\"u}r Informatik},
  address =	{Dagstuhl, Germany},
  URL =		{https://drops.dagstuhl.de/entities/document/10.4230/OASIcs.SpaceCHI.2025.6},
  URN =		{urn:nbn:de:0030-drops-239967},
  doi =		{10.4230/OASIcs.SpaceCHI.2025.6},
  annote =	{Keywords: Human-AI Interaction for Space Exploration, Extravehicular Activities, Cognitive load and Human Performance Issues, Human Systems Exploration, Lunar Exploration, LLM}
}

Document

DOI: 10.4230/OASIcs.SpaceCHI.2025.29

Gaze Beyond Limits: Integrating Eye-Tracking and Augmented Reality for Next-Generation Spacesuit Interaction

Authors: Jiayu He, Yifan Li, Oliver R. Runswick, Peter D. Hodkinson, Jarle Steinberg, Felix Gorbatsevich, and Yang Gao

Published in: OASIcs, Volume 130, Advancing Human-Computer Interaction for Space Exploration (SpaceCHI 2025)

Abstract

Extravehicular activities (EVAs) are increasingly frequent in human spaceflight, particularly in spacecraft maintenance, scientific research, and planetary exploration. Spacesuits are essential for sustaining astronauts in the harsh environment of space, making their design a key factor in the success of EVA missions. The development of spacesuit technology has traditionally been driven by highly engineered solutions focused on life support, mission adaptability and operational efficiency. Modern spacesuits prioritize maintaining optimal internal temperature, humidity and pressure, as well as withstanding extreme temperature fluctuations and providing robust protection against micrometeoroid impacts and space debris. However, their bulkiness and rigidity impose significant physical strain on astronauts, reducing mobility and dexterity, particularly in tasks requiring fine motor control. The restricted field of view further complicates situational awareness, increasing the cognitive load during high-precision operations. While traditional spacesuits support basic EVA tasks, future space exploration shifting toward long-duration lunar and Martian surface missions demand more adaptive, intelligent, and astronaut-centric designs to overcome current constraints. To explore a next-generation spacesuit, this paper proposed an in-process eye-tracking embedded Augmented Reality (AR) Spacesuit System to enhance astronaut-environment interactions. By leveraging Segment-Anything Models (SAM) and Vision-Language Models (VLMs), we demonstrate a four-step approach to enable top-down gaze detection to minimize erroneous fixation data, gaze-based segmentation of objects of interest, real-time contextual assistance via AR overlays and hands-free operation within the spacesuit. This approach enhances real-time situational awareness and improves EVA task efficiency. We conclude with an exploration of the AR Helmet System’s potential in revolutionizing human-space interaction paradigms for future long-duration deep-space missions and discuss the further optimization of eye-tracking interactions using VLMs to predict astronaut intent and highlight relevant objects preemptively.

Cite as

Jiayu He, Yifan Li, Oliver R. Runswick, Peter D. Hodkinson, Jarle Steinberg, Felix Gorbatsevich, and Yang Gao. Gaze Beyond Limits: Integrating Eye-Tracking and Augmented Reality for Next-Generation Spacesuit Interaction. In Advancing Human-Computer Interaction for Space Exploration (SpaceCHI 2025). Open Access Series in Informatics (OASIcs), Volume 130, pp. 29:1-29:15, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2025)

Copy BibTex To Clipboard

@InProceedings{he_et_al:OASIcs.SpaceCHI.2025.29,
  author =	{He, Jiayu and Li, Yifan and Runswick, Oliver R. and Hodkinson, Peter D. and Steinberg, Jarle and Gorbatsevich, Felix and Gao, Yang},
  title =	{{Gaze Beyond Limits: Integrating Eye-Tracking and Augmented Reality for Next-Generation Spacesuit Interaction}},
  booktitle =	{Advancing Human-Computer Interaction for Space Exploration (SpaceCHI 2025)},
  pages =	{29:1--29:15},
  series =	{Open Access Series in Informatics (OASIcs)},
  ISBN =	{978-3-95977-384-3},
  ISSN =	{2190-6807},
  year =	{2025},
  volume =	{130},
  editor =	{Bensch, Leonie and Nilsson, Tommy and Nisser, Martin and Pataranutaporn, Pat and Schmidt, Albrecht and Sumini, Valentina},
  publisher =	{Schloss Dagstuhl -- Leibniz-Zentrum f{\"u}r Informatik},
  address =	{Dagstuhl, Germany},
  URL =		{https://drops.dagstuhl.de/entities/document/10.4230/OASIcs.SpaceCHI.2025.29},
  URN =		{urn:nbn:de:0030-drops-240197},
  doi =		{10.4230/OASIcs.SpaceCHI.2025.29},
  annote =	{Keywords: Augmented Reality (AR), Eye-Tracking, Cognitive Load/Workload, Segment Anything Model (SAM), Visual Language Models (VLMs)}
}

@InProceedings{he_et_al:OASIcs.SpaceCHI.2025.29,
  author =	{He, Jiayu and Li, Yifan and Runswick, Oliver R. and Hodkinson, Peter D. and Steinberg, Jarle and Gorbatsevich, Felix and Gao, Yang},
  title =	{{Gaze Beyond Limits: Integrating Eye-Tracking and Augmented Reality for Next-Generation Spacesuit Interaction}},
  booktitle =	{Advancing Human-Computer Interaction for Space Exploration (SpaceCHI 2025)},
  pages =	{29:1--29:15},
  series =	{Open Access Series in Informatics (OASIcs)},
  ISBN =	{978-3-95977-384-3},
  ISSN =	{2190-6807},
  year =	{2025},
  volume =	{130},
  editor =	{Bensch, Leonie and Nilsson, Tommy and Nisser, Martin and Pataranutaporn, Pat and Schmidt, Albrecht and Sumini, Valentina},
  publisher =	{Schloss Dagstuhl -- Leibniz-Zentrum f{\"u}r Informatik},
  address =	{Dagstuhl, Germany},
  URL =		{https://drops.dagstuhl.de/entities/document/10.4230/OASIcs.SpaceCHI.2025.29},
  URN =		{urn:nbn:de:0030-drops-240197},
  doi =		{10.4230/OASIcs.SpaceCHI.2025.29},
  annote =	{Keywords: Augmented Reality (AR), Eye-Tracking, Cognitive Load/Workload, Segment Anything Model (SAM), Visual Language Models (VLMs)}
}

Document

RANDOM

DOI: 10.4230/LIPIcs.APPROX/RANDOM.2025.28

Implications of Better PRGs for Permutation Branching Programs

Authors: Dean Doron and William M. Hoza

Published in: LIPIcs, Volume 353, Approximation, Randomization, and Combinatorial Optimization. Algorithms and Techniques (APPROX/RANDOM 2025)

Abstract

We study the challenge of derandomizing constant-width standard-order read-once branching programs (ROBPs). Let c ∈ [1, 2) be any constant. We prove that if there are explicit pseudorandom generators (PRGs) for width-6 length-n permutation ROBPs with error 1/n and seed length Õ(log^c n), then there are explicit hitting set generators (HSGs) for width-4 length-n ROBPs with threshold 1/polylog(n) and seed length Õ(log^c n). For context, there are known explicit PRGs that fool constant-width permutation ROBPs with error ε and seed length O(log(n)⋅log(1/ε)) (Koucký, Nimbhorkar, and Pudlák STOC 2011; De CCC 2011; Steinke ECCC 2012). When ε = 1/n, there are known constructions of weighted pseudorandom generators (WPRGs) that fool polynomial-width permutation ROBPs with seed length Õ(log^{3/2} n) (Pyne and Vadhan CCC 2021; Chen, Hoza, Lyu, Tal, and Wu FOCS 2023; Chattopadhyay and Liao ITCS 2024), but unweighted PRGs with seed length o(log² n) remain elusive. Meanwhile, for width-4 ROBPs, there are no known explicit PRGs, WPRGs, or HSGs with seed length o(log²n). Our reduction can be divided into two parts. First, we show that explicit low-error PRGs for width-6 permutation ROBPs with seed length Õ(log^c n) would imply explicit low-error PRGs for width-3 ROBPs with seed length Õ(log^c n). This would improve Meka, Reingold, and Tal’s PRG (STOC 2019), which has seed length o(log²n) only when the error parameter is relatively large. Second, we show that for any w, n, s, and ε, an explicit PRG for width-w ROBPs with error 0.01/n and seed length s would imply an explicit ε-HSG for width-(w + 1) ROBPs with seed length O(s + log(n)⋅log(1/ε)).

Cite as

Dean Doron and William M. Hoza. Implications of Better PRGs for Permutation Branching Programs. In Approximation, Randomization, and Combinatorial Optimization. Algorithms and Techniques (APPROX/RANDOM 2025). Leibniz International Proceedings in Informatics (LIPIcs), Volume 353, pp. 28:1-28:20, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2025)

Copy BibTex To Clipboard

@InProceedings{doron_et_al:LIPIcs.APPROX/RANDOM.2025.28,
  author =	{Doron, Dean and Hoza, William M.},
  title =	{{Implications of Better PRGs for Permutation Branching Programs}},
  booktitle =	{Approximation, Randomization, and Combinatorial Optimization. Algorithms and Techniques (APPROX/RANDOM 2025)},
  pages =	{28:1--28:20},
  series =	{Leibniz International Proceedings in Informatics (LIPIcs)},
  ISBN =	{978-3-95977-397-3},
  ISSN =	{1868-8969},
  year =	{2025},
  volume =	{353},
  editor =	{Ene, Alina and Chattopadhyay, Eshan},
  publisher =	{Schloss Dagstuhl -- Leibniz-Zentrum f{\"u}r Informatik},
  address =	{Dagstuhl, Germany},
  URL =		{https://drops.dagstuhl.de/entities/document/10.4230/LIPIcs.APPROX/RANDOM.2025.28},
  URN =		{urn:nbn:de:0030-drops-243946},
  doi =		{10.4230/LIPIcs.APPROX/RANDOM.2025.28},
  annote =	{Keywords: hitting set generators, pseudorandom generators, read-once branching programs}
}

Document

RANDOM

DOI: 10.4230/LIPIcs.APPROX/RANDOM.2025.67

On Sums of INW Pseudorandom Generators

Authors: William M. Hoza and Zelin Lv

Published in: LIPIcs, Volume 353, Approximation, Randomization, and Combinatorial Optimization. Algorithms and Techniques (APPROX/RANDOM 2025)

Abstract

We study a new approach for constructing pseudorandom generators (PRGs) that fool constant-width standard-order read-once branching programs (ROBPs). Let X be the n-bit output distribution of the INW PRG (Impagliazzo, Nisan, and Wigderson, STOC 1994), instantiated using expansion parameter λ. We prove that the bitwise XOR of t independent copies of X fools width-w programs with error n^{log(w + 1)} ⋅ (λ⋅log n)^t. Notably, this error bound is meaningful even for relatively large values of λ such as λ = 1/O(log n). Admittedly, our analysis does not yet imply any improvement in the bottom-line overall seed length required for fooling such programs - it just gives a new way of re-proving the well-known O(log² n) bound. Furthermore, we prove that this shortcoming is not an artifact of our analysis, but rather is an intrinsic limitation of our "XOR of INW" approach. That is, no matter how many copies of the INW generator we XOR together, and no matter how we set the expansion parameters, if the generator fools width-3 programs and the proof of correctness does not use any properties of the expander graphs except their spectral expansion, then we prove that the seed length of the generator is inevitably Ω(log² n). Still, we hope that our work might be a step toward constructing near-optimal PRGs fooling constant-width ROBPs. We suggest that one could try running the INW PRG on t correlated seeds, sampled via another PRG, and taking the bitwise XOR of the outputs.

Cite as

William M. Hoza and Zelin Lv. On Sums of INW Pseudorandom Generators. In Approximation, Randomization, and Combinatorial Optimization. Algorithms and Techniques (APPROX/RANDOM 2025). Leibniz International Proceedings in Informatics (LIPIcs), Volume 353, pp. 67:1-67:24, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2025)

Copy BibTex To Clipboard

@InProceedings{hoza_et_al:LIPIcs.APPROX/RANDOM.2025.67,
  author =	{Hoza, William M. and Lv, Zelin},
  title =	{{On Sums of INW Pseudorandom Generators}},
  booktitle =	{Approximation, Randomization, and Combinatorial Optimization. Algorithms and Techniques (APPROX/RANDOM 2025)},
  pages =	{67:1--67:24},
  series =	{Leibniz International Proceedings in Informatics (LIPIcs)},
  ISBN =	{978-3-95977-397-3},
  ISSN =	{1868-8969},
  year =	{2025},
  volume =	{353},
  editor =	{Ene, Alina and Chattopadhyay, Eshan},
  publisher =	{Schloss Dagstuhl -- Leibniz-Zentrum f{\"u}r Informatik},
  address =	{Dagstuhl, Germany},
  URL =		{https://drops.dagstuhl.de/entities/document/10.4230/LIPIcs.APPROX/RANDOM.2025.67},
  URN =		{urn:nbn:de:0030-drops-244330},
  doi =		{10.4230/LIPIcs.APPROX/RANDOM.2025.67},
  annote =	{Keywords: INW generator, pseudorandomness, space-bounded computation, XOR Lemmas}
}

Document

DOI: 10.4230/LIPIcs.ITC.2025.1

Amortized Locally Decodable Codes for Insertions and Deletions

Authors: Jeremiah Blocki and Justin Zhang

Published in: LIPIcs, Volume 343, 6th Conference on Information-Theoretic Cryptography (ITC 2025)

Abstract

Locally Decodable Codes (LDCs) are error correcting codes which permit the recovery of any single message symbol with a low number of queries to the codeword (the locality). Traditional LDC tradeoffs between the rate, locality, and error tolerance are undesirable even in relaxed settings where the encoder/decoder share randomness or where the channel is resource-bounded. Recent work by Blocki and Zhang initiated the study of Hamming amortized Locally Decodable Codes (aLDCs), which allow the local decoder to amortize their number of queries over the recovery of a small subset of message symbols. Surprisingly, Blocki and Zhang construct asymptotically ideal (constant rate, constant amortized locality, and constant error tolerance) Hamming aLDCs in private-key and resource-bounded settings. While this result overcame previous barriers and impossibility results for Hamming LDCs, it is not clear whether the techniques extend to Insdel LDCs. Constructing Insdel LDCs which are resilient to insertion and/or deletion errors is known to be even more challenging. For example, Gupta (STOC'24) proved that no Insdel LDC with constant rate and error tolerance exists even in relaxed settings. Our first contribution is to provide a Hamming-to-Insdel compiler which transforms any amortized Hamming LDC that satisfies a particular property (consecutive interval querying) to amortized Insdel LDC while asymptotically preserving the rate, error tolerance and amortized locality. Prior Hamming-to-Insdel compilers of Ostrovsky and Paskin-Cherniavsky (ICITS'15) and Block et al. (FSTTCS'20) worked for arbitrary Hamming LDCs, but incurred an undesirable polylogarithmic blow-up in the locality. Our second contribution is a construction of an ideal amortized Hamming LDC which satisfies our special property (consecutive interval querying) in the relaxed settings where the sender/receiver share randomness or where the channel is resource bounded. Taken together, we obtain ideal Insdel aLDCs in private-key and resource-bounded settings with constant amortized locality, constant rate and constant error tolerance. This result is surprising in light of Gupta’s (STOC'24) impossibility result which demonstrates a strong separation between locality and amortized locality for Insdel LDCs.

Cite as

Jeremiah Blocki and Justin Zhang. Amortized Locally Decodable Codes for Insertions and Deletions. In 6th Conference on Information-Theoretic Cryptography (ITC 2025). Leibniz International Proceedings in Informatics (LIPIcs), Volume 343, pp. 1:1-1:23, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2025)

Copy BibTex To Clipboard

@InProceedings{blocki_et_al:LIPIcs.ITC.2025.1,
  author =	{Blocki, Jeremiah and Zhang, Justin},
  title =	{{Amortized Locally Decodable Codes for Insertions and Deletions}},
  booktitle =	{6th Conference on Information-Theoretic Cryptography (ITC 2025)},
  pages =	{1:1--1:23},
  series =	{Leibniz International Proceedings in Informatics (LIPIcs)},
  ISBN =	{978-3-95977-385-0},
  ISSN =	{1868-8969},
  year =	{2025},
  volume =	{343},
  editor =	{Gilboa, Niv},
  publisher =	{Schloss Dagstuhl -- Leibniz-Zentrum f{\"u}r Informatik},
  address =	{Dagstuhl, Germany},
  URL =		{https://drops.dagstuhl.de/entities/document/10.4230/LIPIcs.ITC.2025.1},
  URN =		{urn:nbn:de:0030-drops-243518},
  doi =		{10.4230/LIPIcs.ITC.2025.1},
  annote =	{Keywords: Amortized Locally Decodable Codes, Insertion and Deletion Errors}
}

Document

Research

DOI: 10.4230/OASIcs.Grossi.9

On String and Graph Sanitization

Authors: Giulia Bernardini, Huiping Chen, Grigorios Loukides, and Solon P. Pissis

Published in: OASIcs, Volume 132, From Strings to Graphs, and Back Again: A Festschrift for Roberto Grossi's 60th Birthday (2025)

Abstract

Data sanitization is a process that conceals sensitive patterns from a given dataset. A secondary goal is to not severely harm the utility of the underlying data along this process. We survey some recent advancements on two related data sanitization topics: string and graph sanitization. In particular, we highlight the important contributions of our friend Prof. Roberto Grossi along this journey.

Cite as

Giulia Bernardini, Huiping Chen, Grigorios Loukides, and Solon P. Pissis. On String and Graph Sanitization. In From Strings to Graphs, and Back Again: A Festschrift for Roberto Grossi's 60th Birthday. Open Access Series in Informatics (OASIcs), Volume 132, pp. 9:1-9:10, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2025)

Copy BibTex To Clipboard

@InProceedings{bernardini_et_al:OASIcs.Grossi.9,
  author =	{Bernardini, Giulia and Chen, Huiping and Loukides, Grigorios and Pissis, Solon P.},
  title =	{{On String and Graph Sanitization}},
  booktitle =	{From Strings to Graphs, and Back Again: A Festschrift for Roberto Grossi's 60th Birthday},
  pages =	{9:1--9:10},
  series =	{Open Access Series in Informatics (OASIcs)},
  ISBN =	{978-3-95977-391-1},
  ISSN =	{2190-6807},
  year =	{2025},
  volume =	{132},
  editor =	{Conte, Alessio and Marino, Andrea and Rosone, Giovanna and Vitter, Jeffrey Scott},
  publisher =	{Schloss Dagstuhl -- Leibniz-Zentrum f{\"u}r Informatik},
  address =	{Dagstuhl, Germany},
  URL =		{https://drops.dagstuhl.de/entities/document/10.4230/OASIcs.Grossi.9},
  URN =		{urn:nbn:de:0030-drops-238086},
  doi =		{10.4230/OASIcs.Grossi.9},
  annote =	{Keywords: data privacy, data sanitization, string algorithms, graph algorithms}
}

Document

DOI: 10.4230/LIPIcs.CP.2025.3

Solving the Agile Earth Observation Satellite Scheduling Problem with CP and Local Search

Authors: Valentin Antuori, Damien T. Wojtowicz, and Emmanuel Hebrard

Published in: LIPIcs, Volume 340, 31st International Conference on Principles and Practice of Constraint Programming (CP 2025)

Abstract

The increasing hunger for remote sensing data fuels a boom in satellite imagery, leading to larger agile Earth observation satellite (AEOS) constellations. Therefore, instances of the AEOS scheduling problem (AEOSSP) has become harder to solve. As most existing approaches to solve AEOSSP are designed for a single spacecraft or smaller constellations in mind, they are not tailored to the need of our industrial partner that is about to launch a constellation of 20 AEOSs. Hence, we designed a local search solver able to schedule observations and downloads at such a scale. It relies on solving a series of sub-problems as travelling salesman problem with time windows (TSPTW), first greedily, then using a CP-SAT exact solver in order to find a solution when the greedy insertion fails. Lastly, it schedules downloads and enforces memory constraints with greedy algorithms. Experiments were carried out on instances from the literature as well as generated instances from a simulator we designed. Our experiments show that using CP to solve the sub-problem significantly improve the solutions, and overall our method is slightly better than state-of-the-art approaches.

Cite as

Valentin Antuori, Damien T. Wojtowicz, and Emmanuel Hebrard. Solving the Agile Earth Observation Satellite Scheduling Problem with CP and Local Search. In 31st International Conference on Principles and Practice of Constraint Programming (CP 2025). Leibniz International Proceedings in Informatics (LIPIcs), Volume 340, pp. 3:1-3:22, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2025)

Copy BibTex To Clipboard

@InProceedings{antuori_et_al:LIPIcs.CP.2025.3,
  author =	{Antuori, Valentin and Wojtowicz, Damien T. and Hebrard, Emmanuel},
  title =	{{Solving the Agile Earth Observation Satellite Scheduling Problem with CP and Local Search}},
  booktitle =	{31st International Conference on Principles and Practice of Constraint Programming (CP 2025)},
  pages =	{3:1--3:22},
  series =	{Leibniz International Proceedings in Informatics (LIPIcs)},
  ISBN =	{978-3-95977-380-5},
  ISSN =	{1868-8969},
  year =	{2025},
  volume =	{340},
  editor =	{de la Banda, Maria Garcia},
  publisher =	{Schloss Dagstuhl -- Leibniz-Zentrum f{\"u}r Informatik},
  address =	{Dagstuhl, Germany},
  URL =		{https://drops.dagstuhl.de/entities/document/10.4230/LIPIcs.CP.2025.3},
  URN =		{urn:nbn:de:0030-drops-238647},
  doi =		{10.4230/LIPIcs.CP.2025.3},
  annote =	{Keywords: Local Search, Greedy Algorithms, Aerospace Applications}
}

Document

DOI: 10.4230/LIPIcs.CCC.2025.6

Near-Optimal Averaging Samplers and Matrix Samplers

Authors: Zhiyang Xun and David Zuckerman

Published in: LIPIcs, Volume 339, 40th Computational Complexity Conference (CCC 2025)

Abstract

We present the first efficient averaging sampler that achieves asymptotically optimal randomness complexity and near-optimal sample complexity. For any δ < ε and any constant α > 0, our sampler uses m + O(log (1 / δ)) random bits to output t = O((1/ε² log 1/δ)^{1 + α}) samples Z_1, … , Z_t ∈ {0, 1}^m such that for any function f: {0, 1}^m → [0, 1], Pr[|1/t∑_{i=1}^t f(Z_i) - 𝔼[f]| ≤ ε] ≥ 1 - δ. The randomness complexity is optimal up to a constant factor, and the sample complexity is optimal up to the O((1/(ε²) log 1/(δ))^α) factor. Our technique generalizes to matrix samplers. A matrix sampler is defined similarly, except that f: {0, 1}^m → ℂ^{d×d} and the absolute value is replaced by the spectral norm. Our matrix sampler achieves randomness complexity m + Õ(log(d / δ)) and sample complexity O((1/ε² log d/δ)^{1 + α}) for any constant α > 0, both near-optimal with only a logarithmic factor in randomness complexity and an additional α exponent on the sample complexity. We use known connections with randomness extractors and list-decodable codes to give applications to these objects. Specifically, we give the first extractor construction with optimal seed length up to an arbitrarily small constant factor above 1, when the min-entropy k = β n for a large enough constant β < 1. Finally, we generalize the definition of averaging sampler to any normed vector space.

Cite as

Zhiyang Xun and David Zuckerman. Near-Optimal Averaging Samplers and Matrix Samplers. In 40th Computational Complexity Conference (CCC 2025). Leibniz International Proceedings in Informatics (LIPIcs), Volume 339, pp. 6:1-6:28, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2025)

Copy BibTex To Clipboard

@InProceedings{xun_et_al:LIPIcs.CCC.2025.6,
  author =	{Xun, Zhiyang and Zuckerman, David},
  title =	{{Near-Optimal Averaging Samplers and Matrix Samplers}},
  booktitle =	{40th Computational Complexity Conference (CCC 2025)},
  pages =	{6:1--6:28},
  series =	{Leibniz International Proceedings in Informatics (LIPIcs)},
  ISBN =	{978-3-95977-379-9},
  ISSN =	{1868-8969},
  year =	{2025},
  volume =	{339},
  editor =	{Srinivasan, Srikanth},
  publisher =	{Schloss Dagstuhl -- Leibniz-Zentrum f{\"u}r Informatik},
  address =	{Dagstuhl, Germany},
  URL =		{https://drops.dagstuhl.de/entities/document/10.4230/LIPIcs.CCC.2025.6},
  URN =		{urn:nbn:de:0030-drops-237001},
  doi =		{10.4230/LIPIcs.CCC.2025.6},
  annote =	{Keywords: Pseudorandomness, Averaging Samplers, Randomness Extractors}
}

Document

Track A: Algorithms, Complexity and Games

DOI: 10.4230/LIPIcs.ICALP.2025.104

Improved Approximation Algorithms for Three-Dimensional Bin Packing

Authors: Debajyoti Kar, Arindam Khan, and Malin Rau

Published in: LIPIcs, Volume 334, 52nd International Colloquium on Automata, Languages, and Programming (ICALP 2025)

Abstract

We study three fundamental three-dimensional (3D) geometric packing problems: 3D (Geometric) Bin Packing (3D-BP), 3D Strip Packing (3D-SP), and Minimum Volume Bounding Box (3D-MVBB), where given a set of 3D (rectangular) cuboids, the goal is to find an axis-aligned nonoverlapping packing of all cuboids. In 3D-BP, we need to pack the given cuboids into the minimum number of unit cube bins. In 3D-SP, we need to pack them into a 3D cuboid with a unit square base and minimum height. Finally, in 3D-MVBB, the goal is to pack into a cuboid box of minimum volume. It is NP-hard to even decide whether a set of rectangles can be packed into a unit square bin - giving an (absolute) approximation hardness of 2 for 3D-BP and 3D-SP. The previous best (absolute) approximation for all three problems is by Li and Cheng (SICOMP, 1990), who gave algorithms with approximation ratios of 13, 46/7, and 46/7+ε, respectively, for 3D-BP, 3D-SP, and 3D-MVBB. We provide improved approximation ratios of 6, 6, and 3+ε, respectively, for the three problems, for any constant ε > 0. For 3D-BP, in the asymptotic regime, Bansal, Correa, Kenyon, and Sviridenko (Math. Oper. Res., 2006) showed that there is no asymptotic polynomial-time approximation scheme (APTAS) even when all items have the same height. Caprara (Math. Oper. Res., 2008) gave an asymptotic approximation ratio of T_{∞}² + ε ≈ 2.86, where T_{∞} is the well-known Harmonic constant in Bin Packing. We provide an algorithm with an improved asymptotic approximation ratio of 3 T_{∞}/2 + ε ≈ 2.54. Further, we show that unlike 3D-BP (and 3D-SP), 3D-MVBB admits an APTAS.

Cite as

Debajyoti Kar, Arindam Khan, and Malin Rau. Improved Approximation Algorithms for Three-Dimensional Bin Packing. In 52nd International Colloquium on Automata, Languages, and Programming (ICALP 2025). Leibniz International Proceedings in Informatics (LIPIcs), Volume 334, pp. 104:1-104:20, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2025)

Copy BibTex To Clipboard

@InProceedings{kar_et_al:LIPIcs.ICALP.2025.104,
  author =	{Kar, Debajyoti and Khan, Arindam and Rau, Malin},
  title =	{{Improved Approximation Algorithms for Three-Dimensional Bin Packing}},
  booktitle =	{52nd International Colloquium on Automata, Languages, and Programming (ICALP 2025)},
  pages =	{104:1--104:20},
  series =	{Leibniz International Proceedings in Informatics (LIPIcs)},
  ISBN =	{978-3-95977-372-0},
  ISSN =	{1868-8969},
  year =	{2025},
  volume =	{334},
  editor =	{Censor-Hillel, Keren and Grandoni, Fabrizio and Ouaknine, Jo\"{e}l and Puppis, Gabriele},
  publisher =	{Schloss Dagstuhl -- Leibniz-Zentrum f{\"u}r Informatik},
  address =	{Dagstuhl, Germany},
  URL =		{https://drops.dagstuhl.de/entities/document/10.4230/LIPIcs.ICALP.2025.104},
  URN =		{urn:nbn:de:0030-drops-234814},
  doi =		{10.4230/LIPIcs.ICALP.2025.104},
  annote =	{Keywords: Approximation Algorithms, Geometric Packing, Multidimensional Packing}
}

Document

Track A: Algorithms, Complexity and Games

DOI: 10.4230/LIPIcs.ICALP.2025.60

Random Reed-Solomon Codes Achieve the Half-Singleton Bound for Insertions and Deletions over Linear-Sized Alphabets

Authors: Roni Con, Zeyu Guo, Ray Li, and Zihan Zhang

Published in: LIPIcs, Volume 334, 52nd International Colloquium on Automata, Languages, and Programming (ICALP 2025)

Abstract

In this paper, we prove that with high probability, random Reed-Solomon codes approach the half-Singleton bound - the optimal rate versus error tradeoff for linear insdel codes - with linear-sized alphabets. More precisely, we prove that, for any ε > 0 and positive integers n and k, with high probability, random Reed-Solomon codes of length n and dimension k can correct (1-ε)n-2k+1 adversarial insdel errors over alphabets of size n+2^{poly(1/ε)}k. This significantly improves upon the alphabet size demonstrated in the work of Con, Shpilka, and Tamo (IEEE TIT, 2023), who showed the existence of Reed-Solomon codes with exponential alphabet size Õ(binom(n,2k-1)²) precisely achieving the half-Singleton bound. Our methods are inspired by recent works on list-decoding Reed-Solomon codes. Brakensiek-Gopi-Makam (STOC 2023) showed that random Reed-Solomon codes are list-decodable up to capacity with exponential-sized alphabets, and Guo-Zhang (FOCS 2023) and Alrabiah-Guruswami-Li (STOC 2024) improved the alphabet-size to linear. We achieve a similar alphabet-size reduction by similarly establishing strong bounds on the probability that certain random rectangular matrices are full rank. To accomplish this in our insdel context, our proof combines the random matrix techniques from list-decoding with structural properties of Longest Common Subsequences.

Cite as

Roni Con, Zeyu Guo, Ray Li, and Zihan Zhang. Random Reed-Solomon Codes Achieve the Half-Singleton Bound for Insertions and Deletions over Linear-Sized Alphabets. In 52nd International Colloquium on Automata, Languages, and Programming (ICALP 2025). Leibniz International Proceedings in Informatics (LIPIcs), Volume 334, pp. 60:1-60:21, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2025)

Copy BibTex To Clipboard

@InProceedings{con_et_al:LIPIcs.ICALP.2025.60,
  author =	{Con, Roni and Guo, Zeyu and Li, Ray and Zhang, Zihan},
  title =	{{Random Reed-Solomon Codes Achieve the Half-Singleton Bound for Insertions and Deletions over Linear-Sized Alphabets}},
  booktitle =	{52nd International Colloquium on Automata, Languages, and Programming (ICALP 2025)},
  pages =	{60:1--60:21},
  series =	{Leibniz International Proceedings in Informatics (LIPIcs)},
  ISBN =	{978-3-95977-372-0},
  ISSN =	{1868-8969},
  year =	{2025},
  volume =	{334},
  editor =	{Censor-Hillel, Keren and Grandoni, Fabrizio and Ouaknine, Jo\"{e}l and Puppis, Gabriele},
  publisher =	{Schloss Dagstuhl -- Leibniz-Zentrum f{\"u}r Informatik},
  address =	{Dagstuhl, Germany},
  URL =		{https://drops.dagstuhl.de/entities/document/10.4230/LIPIcs.ICALP.2025.60},
  URN =		{urn:nbn:de:0030-drops-234372},
  doi =		{10.4230/LIPIcs.ICALP.2025.60},
  annote =	{Keywords: coding theory, error-correcting codes, Reed-Solomon codes, insdel, insertion-deletion errors, half-Singleton bound}
}

@InProceedings{con_et_al:LIPIcs.ICALP.2025.60,
  author =	{Con, Roni and Guo, Zeyu and Li, Ray and Zhang, Zihan},
  title =	{{Random Reed-Solomon Codes Achieve the Half-Singleton Bound for Insertions and Deletions over Linear-Sized Alphabets}},
  booktitle =	{52nd International Colloquium on Automata, Languages, and Programming (ICALP 2025)},
  pages =	{60:1--60:21},
  series =	{Leibniz International Proceedings in Informatics (LIPIcs)},
  ISBN =	{978-3-95977-372-0},
  ISSN =	{1868-8969},
  year =	{2025},
  volume =	{334},
  editor =	{Censor-Hillel, Keren and Grandoni, Fabrizio and Ouaknine, Jo\"{e}l and Puppis, Gabriele},
  publisher =	{Schloss Dagstuhl -- Leibniz-Zentrum f{\"u}r Informatik},
  address =	{Dagstuhl, Germany},
  URL =		{https://drops.dagstuhl.de/entities/document/10.4230/LIPIcs.ICALP.2025.60},
  URN =		{urn:nbn:de:0030-drops-234372},
  doi =		{10.4230/LIPIcs.ICALP.2025.60},
  annote =	{Keywords: coding theory, error-correcting codes, Reed-Solomon codes, insdel, insertion-deletion errors, half-Singleton bound}
}

Document

DOI: 10.4230/LIPIcs.ECOOP.2025.6

Detecting Functionality-Specific Vulnerabilities via Retrieving Individual Functionality-Equivalent APIs in Open-Source Repositories

Authors: Tianyu Chen, Zeyu Wang, Lin Li, Ding Li, Zongyang Li, Xiaoning Chang, Pan Bian, Guangtai Liang, Qianxiang Wang, and Tao Xie

Published in: LIPIcs, Volume 333, 39th European Conference on Object-Oriented Programming (ECOOP 2025)

Abstract

Functionality-specific vulnerabilities, which mainly occur in Application Programming Interfaces (APIs) with specific functionalities, are crucial for software developers to detect and avoid. When detecting individual functionality-specific vulnerabilities, the existing two categories of approaches are ineffective because they consider only the API bodies and are unable to handle diverse implementations of functionality-equivalent APIs. To effectively detect functionality-specific vulnerabilities, we propose APISS, the first approach to utilize API doc strings and signatures instead of API bodies. APISS first retrieves functionality-equivalent APIs for APIs with existing vulnerabilities and then migrates Proof-of-Concepts (PoCs) of the existing vulnerabilities for newly detected vulnerable APIs. To retrieve functionality-equivalent APIs, we leverage a Large Language Model for API embedding to improve the accuracy and address the effectiveness and scalability issues suffered by the existing approaches. To migrate PoCs of the existing vulnerabilities for newly detected vulnerable APIs, we design a semi-automatic schema to substantially reduce manual costs. We conduct a comprehensive evaluation to empirically compare APISS with four state-of-the-art approaches of detecting vulnerabilities and two state-of-the-art approaches of retrieving functionality-equivalent APIs. The evaluation subjects include 180 widely used Java repositories using 10 existing vulnerabilities, along with their PoCs. The results show that APISS effectively retrieves functionality-equivalent APIs, achieving a Top-1 Accuracy of 0.81 while the best of the baselines under comparison achieves only 0.55. APISS is highly efficient: the manual costs are within 10 minutes per vulnerability and the end-to-end runtime overhead of testing one candidate API is less than 2 hours. APISS detects 179 new vulnerabilities and receives 60 new CVE IDs, bringing high value to security practice.

Cite as

Tianyu Chen, Zeyu Wang, Lin Li, Ding Li, Zongyang Li, Xiaoning Chang, Pan Bian, Guangtai Liang, Qianxiang Wang, and Tao Xie. Detecting Functionality-Specific Vulnerabilities via Retrieving Individual Functionality-Equivalent APIs in Open-Source Repositories. In 39th European Conference on Object-Oriented Programming (ECOOP 2025). Leibniz International Proceedings in Informatics (LIPIcs), Volume 333, pp. 6:1-6:27, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2025)

Copy BibTex To Clipboard

@InProceedings{chen_et_al:LIPIcs.ECOOP.2025.6,
  author =	{Chen, Tianyu and Wang, Zeyu and Li, Lin and Li, Ding and Li, Zongyang and Chang, Xiaoning and Bian, Pan and Liang, Guangtai and Wang, Qianxiang and Xie, Tao},
  title =	{{Detecting Functionality-Specific Vulnerabilities via Retrieving Individual Functionality-Equivalent APIs in Open-Source Repositories}},
  booktitle =	{39th European Conference on Object-Oriented Programming (ECOOP 2025)},
  pages =	{6:1--6:27},
  series =	{Leibniz International Proceedings in Informatics (LIPIcs)},
  ISBN =	{978-3-95977-373-7},
  ISSN =	{1868-8969},
  year =	{2025},
  volume =	{333},
  editor =	{Aldrich, Jonathan and Silva, Alexandra},
  publisher =	{Schloss Dagstuhl -- Leibniz-Zentrum f{\"u}r Informatik},
  address =	{Dagstuhl, Germany},
  URL =		{https://drops.dagstuhl.de/entities/document/10.4230/LIPIcs.ECOOP.2025.6},
  URN =		{urn:nbn:de:0030-drops-232999},
  doi =		{10.4230/LIPIcs.ECOOP.2025.6},
  annote =	{Keywords: Application Security, Vulnerability Detection, Large Language Model}
}

@InProceedings{chen_et_al:LIPIcs.ECOOP.2025.6,
  author =	{Chen, Tianyu and Wang, Zeyu and Li, Lin and Li, Ding and Li, Zongyang and Chang, Xiaoning and Bian, Pan and Liang, Guangtai and Wang, Qianxiang and Xie, Tao},
  title =	{{Detecting Functionality-Specific Vulnerabilities via Retrieving Individual Functionality-Equivalent APIs in Open-Source Repositories}},
  booktitle =	{39th European Conference on Object-Oriented Programming (ECOOP 2025)},
  pages =	{6:1--6:27},
  series =	{Leibniz International Proceedings in Informatics (LIPIcs)},
  ISBN =	{978-3-95977-373-7},
  ISSN =	{1868-8969},
  year =	{2025},
  volume =	{333},
  editor =	{Aldrich, Jonathan and Silva, Alexandra},
  publisher =	{Schloss Dagstuhl -- Leibniz-Zentrum f{\"u}r Informatik},
  address =	{Dagstuhl, Germany},
  URL =		{https://drops.dagstuhl.de/entities/document/10.4230/LIPIcs.ECOOP.2025.6},
  URN =		{urn:nbn:de:0030-drops-232999},
  doi =		{10.4230/LIPIcs.ECOOP.2025.6},
  annote =	{Keywords: Application Security, Vulnerability Detection, Large Language Model}
}

41 Search Results for "Xin, Cheng"

Abstract

Cite as

Abstract

Cite as

Abstract

Cite as

Abstract

Cite as

Abstract

Cite as

Abstract

Cite as

Abstract

Cite as

Abstract

Cite as

Abstract

Cite as

Abstract

Cite as

Abstract

Cite as

Abstract

Cite as

Abstract

Cite as

Abstract

Cite as

Abstract

Cite as

Thanks for your feedback!

Could not send message