DROPS

Document

Invited Talk

DOI: 10.4230/LIPIcs.ESA.2024.1

Recent Progress on Correlation Clustering: From Local Algorithms to Better Approximation Algorithms and Back (Invited Talk)

Authors: Vincent Cohen-Addad

Published in: LIPIcs, Volume 308, 32nd Annual European Symposium on Algorithms (ESA 2024)

Abstract

Correlation clustering is a classic model for clustering problems arising in machine learning and data mining. Given a set of data elements represented as vertices of a graph and pairwise similarity represented as edges, the goal is to find a partition of the vertex set so as to minimize the total number of edges across the parts plus the total number of non-edges within the parts. Introduced in the early 2000s [Bansal et al., 2004], correlation clustering has received a large amount of attention through the years. A natural linear programming relaxation was shown to have an integrality gap of at least 2 and at most 2.5 [Ailon et al., 2008] in 2005, and in 2015 at most 2.06 [Chawla et al., 2015]. In 2021, motivated by large-scale application new structural insights allowed to derive a simple, practical algorithm that achieved an O(1)-approximation in a variety of models (Massively Parallel, Sublinear, Streaming or Differentially-private) [Vincent Cohen{-}Addad et al., 2021; Cohen-Addad et al., 2022]. These new insights turned out to be a key building block in designing better algorithms: It serves as a pre-clustering of the input graph that enables algorithm with approximation guarantees significantly better than 2 [Vincent Cohen{-}Addad et al., 2023; Vincent Cohen{-}Addad et al., 2022]. It is a key component in the new algorithm that achieves a 1.44-approximation [Nairen Cao et al., 2024] and in the new local-search based 1.84-approximation for the Massively Parallel, Sublinear, and Streaming models [Vincent Cohen{-}Addad et al., 2024]. This talk will review the above recent development and what are the main open research directions. A collection of joint works with Nairen Cao, Silvio Lattanzi, Euiwoong Lee, Shi Li, David Rasmussen Lolck, Slobodan Mitrovic, Alantha Newman, Ashkan Norouzi-Fard, Nikos Parotsidis, Marcin Pilipczuk, Jakub Tarnawski, Mikkel Thorup, Lukas Vogl, Shuyi Yan, Hanwen Zhang.

Cite as

Vincent Cohen-Addad. Recent Progress on Correlation Clustering: From Local Algorithms to Better Approximation Algorithms and Back (Invited Talk). In 32nd Annual European Symposium on Algorithms (ESA 2024). Leibniz International Proceedings in Informatics (LIPIcs), Volume 308, pp. 1:1-1:2, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2024)

Copy BibTex To Clipboard

@InProceedings{cohenaddad:LIPIcs.ESA.2024.1,
  author =	{Cohen-Addad, Vincent},
  title =	{{Recent Progress on Correlation Clustering: From Local Algorithms to Better Approximation Algorithms and Back}},
  booktitle =	{32nd Annual European Symposium on Algorithms (ESA 2024)},
  pages =	{1:1--1:2},
  series =	{Leibniz International Proceedings in Informatics (LIPIcs)},
  ISBN =	{978-3-95977-338-6},
  ISSN =	{1868-8969},
  year =	{2024},
  volume =	{308},
  editor =	{Chan, Timothy and Fischer, Johannes and Iacono, John and Herman, Grzegorz},
  publisher =	{Schloss Dagstuhl -- Leibniz-Zentrum f{\"u}r Informatik},
  address =	{Dagstuhl, Germany},
  URL =		{https://drops.dagstuhl.de/entities/document/10.4230/LIPIcs.ESA.2024.1},
  URN =		{urn:nbn:de:0030-drops-210728},
  doi =		{10.4230/LIPIcs.ESA.2024.1},
  annote =	{Keywords: Approximation Algorithms, Clustering, Local Model}
}

Document

DOI: 10.4230/LIPIcs.AFT.2024.7

DeFiAligner: Leveraging Symbolic Analysis and Large Language Models for Inconsistency Detection in Decentralized Finance

Authors: Rundong Gan, Liyi Zhou, Le Wang, Kaihua Qin, and Xiaodong Lin

Published in: LIPIcs, Volume 316, 6th Conference on Advances in Financial Technologies (AFT 2024)

Abstract

Decentralized Finance (DeFi) has witnessed a monumental surge, reaching 53.039 billion USD in total value locked. As this sector continues to expand, ensuring the reliability of DeFi smart contracts becomes increasingly crucial. While some users are adept at reading code or the compiled bytecode to understand smart contracts, many rely on documentation. Therefore, discrepancies between the documentation and the deployed code can pose significant risks, whether these discrepancies are due to errors or intentional fraud. To tackle these challenges, we developed DeFiAligner, an end-to-end system to identify inconsistencies between documentation and smart contracts. DeFiAligner incorporates a symbolic execution tool, SEVM, which explores execution paths of on-chain binary code, recording memory and stack states. It automatically generates symbolic expressions for token balance changes and branch conditions, which, along with related project documents, are processed by LLMs. Using structured prompts, the LLMs evaluate the alignment between the symbolic expressions and the documentation. Our tests across three distinct scenarios demonstrate DeFiAligner’s capability to automate inconsistency detection in DeFi, achieving recall rates of 92% and 90% on two public datasets respectively.

Cite as

Rundong Gan, Liyi Zhou, Le Wang, Kaihua Qin, and Xiaodong Lin. DeFiAligner: Leveraging Symbolic Analysis and Large Language Models for Inconsistency Detection in Decentralized Finance. In 6th Conference on Advances in Financial Technologies (AFT 2024). Leibniz International Proceedings in Informatics (LIPIcs), Volume 316, pp. 7:1-7:24, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2024)

Copy BibTex To Clipboard

@InProceedings{gan_et_al:LIPIcs.AFT.2024.7,
  author =	{Gan, Rundong and Zhou, Liyi and Wang, Le and Qin, Kaihua and Lin, Xiaodong},
  title =	{{DeFiAligner: Leveraging Symbolic Analysis and Large Language Models for Inconsistency Detection in Decentralized Finance}},
  booktitle =	{6th Conference on Advances in Financial Technologies (AFT 2024)},
  pages =	{7:1--7:24},
  series =	{Leibniz International Proceedings in Informatics (LIPIcs)},
  ISBN =	{978-3-95977-345-4},
  ISSN =	{1868-8969},
  year =	{2024},
  volume =	{316},
  editor =	{B\"{o}hme, Rainer and Kiffer, Lucianna},
  publisher =	{Schloss Dagstuhl -- Leibniz-Zentrum f{\"u}r Informatik},
  address =	{Dagstuhl, Germany},
  URL =		{https://drops.dagstuhl.de/entities/document/10.4230/LIPIcs.AFT.2024.7},
  URN =		{urn:nbn:de:0030-drops-209431},
  doi =		{10.4230/LIPIcs.AFT.2024.7},
  annote =	{Keywords: Decentralized Finance Security, Large Language Models, Project Review, Symbolic Analysis, Smart Contracts}
}

Document

DOI: 10.4230/LIPIcs.ECOOP.2024.7

HOBBIT: Hashed OBject Based InTegrity

Authors: Matthias Bernad and Stefan Brunthaler

Published in: LIPIcs, Volume 313, 38th European Conference on Object-Oriented Programming (ECOOP 2024)

Abstract

C vulnerabilities usually hold verbatim for C++ programs. The counterfeit-object-oriented programming attack demonstrated that this relation is asymmetric, i.e., it only applies to C++. The problem pinpointed by this COOP attack is that C++ does not validate the integrity of its objects. By injecting malicious objects with manipulated virtual function table pointers, attackers can hijack control-flow of programs. The software security community addressed the COOP-problem in the years following its discovery, but together with the emergence of transient-execution attacks, such as Spectre, researchers also shifted their attention. We present Hobbit, a software-only solution to prevent COOP attacks by validating object integrity for virtual function pointer tables. Hobbit does not require any hardware specific features, scales to multi-million lines of C++ source code, and our LLVM-based implementation offers a configurable performance impact between 121.63% and 2.80% on compute-intensive SPEC CPU C++ benchmarks. Hobbit’s security analysis indicates strong resistance to brute forcing attacks and demonstrates additional benefits of using execute-only memory.

Cite as

Matthias Bernad and Stefan Brunthaler. HOBBIT: Hashed OBject Based InTegrity. In 38th European Conference on Object-Oriented Programming (ECOOP 2024). Leibniz International Proceedings in Informatics (LIPIcs), Volume 313, pp. 7:1-7:25, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2024)

Copy BibTex To Clipboard

@InProceedings{bernad_et_al:LIPIcs.ECOOP.2024.7,
  author =	{Bernad, Matthias and Brunthaler, Stefan},
  title =	{{HOBBIT: Hashed OBject Based InTegrity}},
  booktitle =	{38th European Conference on Object-Oriented Programming (ECOOP 2024)},
  pages =	{7:1--7:25},
  series =	{Leibniz International Proceedings in Informatics (LIPIcs)},
  ISBN =	{978-3-95977-341-6},
  ISSN =	{1868-8969},
  year =	{2024},
  volume =	{313},
  editor =	{Aldrich, Jonathan and Salvaneschi, Guido},
  publisher =	{Schloss Dagstuhl -- Leibniz-Zentrum f{\"u}r Informatik},
  address =	{Dagstuhl, Germany},
  URL =		{https://drops.dagstuhl.de/entities/document/10.4230/LIPIcs.ECOOP.2024.7},
  URN =		{urn:nbn:de:0030-drops-208566},
  doi =		{10.4230/LIPIcs.ECOOP.2024.7},
  annote =	{Keywords: software security, code-reuse attacks, language-based security, counterfeit-object-oriented programming, object integrity, compiler security}
}

Document

DOI: 10.4230/LIPIcs.DNA.30.4

Designing 3D RNA Origami Nanostructures with a Minimum Number of Kissing Loops

Authors: Antti Elonen and Pekka Orponen

Published in: LIPIcs, Volume 314, 30th International Conference on DNA Computing and Molecular Programming (DNA 30) (2024)

Abstract

We present a general design technique for rendering any 3D wireframe model, that is any connected graph linearly embedded in 3D space, as an RNA origami nanostructure with a minimum number of kissing loops. The design algorithm, which applies some ideas and methods from topological graph theory, produces renderings that contain at most one kissing-loop pair for many interesting model families, including for instance all fully triangulated wireframes and the wireframes of all Platonic solids. The design method is already implemented and available for use in the design tool DNAforge (https://dnaforge.org).

Cite as

Antti Elonen and Pekka Orponen. Designing 3D RNA Origami Nanostructures with a Minimum Number of Kissing Loops. In 30th International Conference on DNA Computing and Molecular Programming (DNA 30). Leibniz International Proceedings in Informatics (LIPIcs), Volume 314, pp. 4:1-4:12, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2024)

Copy BibTex To Clipboard

@InProceedings{elonen_et_al:LIPIcs.DNA.30.4,
  author =	{Elonen, Antti and Orponen, Pekka},
  title =	{{Designing 3D RNA Origami Nanostructures with a Minimum Number of Kissing Loops}},
  booktitle =	{30th International Conference on DNA Computing and Molecular Programming (DNA 30)},
  pages =	{4:1--4:12},
  series =	{Leibniz International Proceedings in Informatics (LIPIcs)},
  ISBN =	{978-3-95977-344-7},
  ISSN =	{1868-8969},
  year =	{2024},
  volume =	{314},
  editor =	{Seki, Shinnosuke and Stewart, Jaimie Marie},
  publisher =	{Schloss Dagstuhl -- Leibniz-Zentrum f{\"u}r Informatik},
  address =	{Dagstuhl, Germany},
  URL =		{https://drops.dagstuhl.de/entities/document/10.4230/LIPIcs.DNA.30.4},
  URN =		{urn:nbn:de:0030-drops-209325},
  doi =		{10.4230/LIPIcs.DNA.30.4},
  annote =	{Keywords: RNA origami, wireframe nanostructures, polyhedra, kissing loops, topological graph embeddings, self-assembly}
}

Document

DOI: 10.4230/LIPIcs.CP.2024.19

An Efficient Local Search Solver for Mixed Integer Programming

Authors: Peng Lin, Mengchuan Zou, and Shaowei Cai

Published in: LIPIcs, Volume 307, 30th International Conference on Principles and Practice of Constraint Programming (CP 2024)

Abstract

Mixed integer programming (MIP) is a fundamental model in operations research. Local search is a powerful method for solving hard problems, but the development of local search solvers for MIP still needs to be explored. This work develops an efficient local search solver for solving MIP, called Local-MIP. We propose two new operators for MIP to adaptively modify variables for optimizing the objective function and satisfying constraints, respectively. Furthermore, we design a new weighting scheme to dynamically balance the priority between the objective function and each constraint, and propose a two-level scoring function structure to hierarchically guide the search for high-quality feasible solutions. Experiments are conducted on seven public benchmarks to compare Local-MIP with state-of-the-art MIP solvers, which demonstrate that Local-MIP significantly outperforms CPLEX, HiGHS, SCIP and Feasibility Jump, and is competitive with the most powerful commercial solver Gurobi. Moreover, Local-MIP establishes 4 new records for MIPLIB open instances.

Cite as

Peng Lin, Mengchuan Zou, and Shaowei Cai. An Efficient Local Search Solver for Mixed Integer Programming. In 30th International Conference on Principles and Practice of Constraint Programming (CP 2024). Leibniz International Proceedings in Informatics (LIPIcs), Volume 307, pp. 19:1-19:19, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2024)

Copy BibTex To Clipboard

@InProceedings{lin_et_al:LIPIcs.CP.2024.19,
  author =	{Lin, Peng and Zou, Mengchuan and Cai, Shaowei},
  title =	{{An Efficient Local Search Solver for Mixed Integer Programming}},
  booktitle =	{30th International Conference on Principles and Practice of Constraint Programming (CP 2024)},
  pages =	{19:1--19:19},
  series =	{Leibniz International Proceedings in Informatics (LIPIcs)},
  ISBN =	{978-3-95977-336-2},
  ISSN =	{1868-8969},
  year =	{2024},
  volume =	{307},
  editor =	{Shaw, Paul},
  publisher =	{Schloss Dagstuhl -- Leibniz-Zentrum f{\"u}r Informatik},
  address =	{Dagstuhl, Germany},
  URL =		{https://drops.dagstuhl.de/entities/document/10.4230/LIPIcs.CP.2024.19},
  URN =		{urn:nbn:de:0030-drops-207041},
  doi =		{10.4230/LIPIcs.CP.2024.19},
  annote =	{Keywords: Mixed Integer Programming, Local Search, Operator, Scoring Function}
}

Document

DOI: 10.4230/LIPIcs.WABI.2024.5

Memoization on Shared Subtrees Accelerates Computations on Genealogical Forests

Authors: Lukas Hübner and Alexandros Stamatakis

Published in: LIPIcs, Volume 312, 24th International Workshop on Algorithms in Bioinformatics (WABI 2024)

Abstract

The field of population genetics attempts to advance our understanding of evolutionary processes. It has applications, for example, in medical research, wildlife conservation, and - in conjunction with recent advances in ancient DNA sequencing technology - studying human migration patterns over the past few thousand years. The basic toolbox of population genetics includes genealogical trees, which describe the shared evolutionary history among individuals of the same species. They are calculated on the basis of genetic variations. However, in recombining organisms, a single tree is insufficient to describe the evolutionary history of the whole genome. Instead, a collection of correlated trees can be used, where each describes the evolutionary history of a consecutive region of the genome. The current corresponding state of-the-art data structure, tree sequences, compresses these genealogical trees via edit operations when moving from one tree to the next along the genome instead of storing the full, often redundant, description for each tree. We propose a new data structure, genealogical forests, which compresses the set of genealogical trees into a DAG. In this DAG identical subtrees that are shared across the input trees are encoded only once, thereby allowing for straight-forward memoization of intermediate results. Additionally, we provide a C++ implementation of our proposed data structure, called gfkit, which is 2.1 to 11.2 (median 4.0) times faster than the state-of-the-art tool on empirical and simulated datasets at computing important population genetics statistics such as the Allele Frequency Spectrum, Patterson’s f, the Fixation Index, Tajima’s D, pairwise Lowest Common Ancestors, and others. On Lowest Common Ancestor queries with more than two samples as input, gfkit scales asymptotically better than the state-of-the-art, and is thus up to 990 times faster. In conclusion, our proposed data structure compresses genealogical trees by storing shared subtrees only once, thereby enabling straight-forward memoization of intermediate results, yielding a substantial runtime reduction and a potentially more intuitive data representation over the state-of-the-art. Our improvements will boost the development of novel analyses and models in the field of population genetics and increases scalability to ever-growing genomic datasets.

Cite as

Lukas Hübner and Alexandros Stamatakis. Memoization on Shared Subtrees Accelerates Computations on Genealogical Forests. In 24th International Workshop on Algorithms in Bioinformatics (WABI 2024). Leibniz International Proceedings in Informatics (LIPIcs), Volume 312, pp. 5:1-5:22, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2024)

Copy BibTex To Clipboard

@InProceedings{hubner_et_al:LIPIcs.WABI.2024.5,
  author =	{H\"{u}bner, Lukas and Stamatakis, Alexandros},
  title =	{{Memoization on Shared Subtrees Accelerates Computations on Genealogical Forests}},
  booktitle =	{24th International Workshop on Algorithms in Bioinformatics (WABI 2024)},
  pages =	{5:1--5:22},
  series =	{Leibniz International Proceedings in Informatics (LIPIcs)},
  ISBN =	{978-3-95977-340-9},
  ISSN =	{1868-8969},
  year =	{2024},
  volume =	{312},
  editor =	{Pissis, Solon P. and Sung, Wing-Kin},
  publisher =	{Schloss Dagstuhl -- Leibniz-Zentrum f{\"u}r Informatik},
  address =	{Dagstuhl, Germany},
  URL =		{https://drops.dagstuhl.de/entities/document/10.4230/LIPIcs.WABI.2024.5},
  URN =		{urn:nbn:de:0030-drops-206499},
  doi =		{10.4230/LIPIcs.WABI.2024.5},
  annote =	{Keywords: bioinformatics, population genetics, algorithms}
}

Document

DOI: 10.4230/LIPIcs.WABI.2024.23

AlfaPang: Alignment Free Algorithm for Pangenome Graph Construction

Authors: Adam Cicherski, Anna Lisiecka, and Norbert Dojer

Published in: LIPIcs, Volume 312, 24th International Workshop on Algorithms in Bioinformatics (WABI 2024)

Abstract

The success of pangenome-based approaches to genomics analysis depends largely on the existence of efficient methods for constructing pangenome graphs that are applicable to large genome collections. In the current paper we present AlfaPang, a new pangenome graph building algorithm. AlfaPang is based on a novel alignment-free approach that allows to construct pangenome graphs using significantly less computational resources than state-of-the-art tools. The code of AlfaPang is freely available at https://github.com/AdamCicherski/AlfaPang.

Cite as

Adam Cicherski, Anna Lisiecka, and Norbert Dojer. AlfaPang: Alignment Free Algorithm for Pangenome Graph Construction. In 24th International Workshop on Algorithms in Bioinformatics (WABI 2024). Leibniz International Proceedings in Informatics (LIPIcs), Volume 312, pp. 23:1-23:18, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2024)

Copy BibTex To Clipboard

@InProceedings{cicherski_et_al:LIPIcs.WABI.2024.23,
  author =	{Cicherski, Adam and Lisiecka, Anna and Dojer, Norbert},
  title =	{{AlfaPang: Alignment Free Algorithm for Pangenome Graph Construction}},
  booktitle =	{24th International Workshop on Algorithms in Bioinformatics (WABI 2024)},
  pages =	{23:1--23:18},
  series =	{Leibniz International Proceedings in Informatics (LIPIcs)},
  ISBN =	{978-3-95977-340-9},
  ISSN =	{1868-8969},
  year =	{2024},
  volume =	{312},
  editor =	{Pissis, Solon P. and Sung, Wing-Kin},
  publisher =	{Schloss Dagstuhl -- Leibniz-Zentrum f{\"u}r Informatik},
  address =	{Dagstuhl, Germany},
  URL =		{https://drops.dagstuhl.de/entities/document/10.4230/LIPIcs.WABI.2024.23},
  URN =		{urn:nbn:de:0030-drops-206673},
  doi =		{10.4230/LIPIcs.WABI.2024.23},
  annote =	{Keywords: pangenome, variation graph, genome alignment, population genomics}
}

Document

DOI: 10.4230/LIPIcs.WABI.2024.4

MEM-Based Pangenome Indexing for k-mer Queries

Authors: Stephen Hwang, Nathaniel K. Brown, Omar Y. Ahmed, Katharine M. Jenike, Sam Kovaka, Michael C. Schatz, and Ben Langmead

Published in: LIPIcs, Volume 312, 24th International Workshop on Algorithms in Bioinformatics (WABI 2024)

Abstract

Pangenomes are growing in number and size, thanks to the prevalence of high-quality long-read assemblies. However, current methods for studying sequence composition and conservation within pangenomes have limitations. Methods based on graph pangenomes require a computationally expensive multiple-alignment step, which can leave out some variation. Indexes based on k-mers and de Bruijn graphs are limited to answering questions at a specific substring length k. We present Maximal Exact Match Ordered (MEMO), a pangenome indexing method based on maximal exact matches (MEMs) between sequences. A single MEMO index can handle arbitrary-length queries over pangenomic windows. MEMO enables both queries that test k-mer presence/absence (membership queries) and that count the number of genomes containing k-mers in a window (conservation queries). MEMO’s index for a pangenome of 89 human autosomal haplotypes fits in 2.04 GB, 8.8× smaller than a comparable KMC3 index and 11.4× smaller than a PanKmer index. MEMO indexes can be made smaller by sacrificing some counting resolution, with our decile-resolution HPRC index reaching 0.67 GB. MEMO can conduct a conservation query for 31-mers over the human leukocyte antigen locus in 13.89 seconds, 2.5× faster than other approaches. MEMO’s small index size, lack of k-mer length dependence, and efficient queries make it a flexible tool for studying and visualizing substring conservation in pangenomes.

Cite as

Stephen Hwang, Nathaniel K. Brown, Omar Y. Ahmed, Katharine M. Jenike, Sam Kovaka, Michael C. Schatz, and Ben Langmead. MEM-Based Pangenome Indexing for k-mer Queries. In 24th International Workshop on Algorithms in Bioinformatics (WABI 2024). Leibniz International Proceedings in Informatics (LIPIcs), Volume 312, pp. 4:1-4:17, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2024)

Copy BibTex To Clipboard

@InProceedings{hwang_et_al:LIPIcs.WABI.2024.4,
  author =	{Hwang, Stephen and Brown, Nathaniel K. and Ahmed, Omar Y. and Jenike, Katharine M. and Kovaka, Sam and Schatz, Michael C. and Langmead, Ben},
  title =	{{MEM-Based Pangenome Indexing for k-mer Queries}},
  booktitle =	{24th International Workshop on Algorithms in Bioinformatics (WABI 2024)},
  pages =	{4:1--4:17},
  series =	{Leibniz International Proceedings in Informatics (LIPIcs)},
  ISBN =	{978-3-95977-340-9},
  ISSN =	{1868-8969},
  year =	{2024},
  volume =	{312},
  editor =	{Pissis, Solon P. and Sung, Wing-Kin},
  publisher =	{Schloss Dagstuhl -- Leibniz-Zentrum f{\"u}r Informatik},
  address =	{Dagstuhl, Germany},
  URL =		{https://drops.dagstuhl.de/entities/document/10.4230/LIPIcs.WABI.2024.4},
  URN =		{urn:nbn:de:0030-drops-206482},
  doi =		{10.4230/LIPIcs.WABI.2024.4},
  annote =	{Keywords: Pangenomics, Comparative genomics, Compressed indexing}
}

Document

DOI: 10.4230/LIPIcs.CCC.2024.10

Explicit Directional Affine Extractors and Improved Hardness for Linear Branching Programs

Authors: Xin Li and Yan Zhong

Published in: LIPIcs, Volume 300, 39th Computational Complexity Conference (CCC 2024)

Abstract

Affine extractors give some of the best-known lower bounds for various computational models, such as AC⁰ circuits, parity decision trees, and general Boolean circuits. However, they are not known to give strong lower bounds for read-once branching programs (ROBPs). In a recent work, Gryaznov, Pudlák, and Talebanfard (CCC' 22) introduced a stronger version of affine extractors known as directional affine extractors, together with a generalization of ROBPs where each node can make linear queries, and showed that the former implies strong lower bound for a certain type of the latter known as strongly read-once linear branching programs (SROLBPs). Their main result gives explicit constructions of directional affine extractors for entropy k > 2n/3, which implies average-case complexity 2^{n/3-o(n)} against SROLBPs with exponentially small correlation. A follow-up work by Chattopadhyay and Liao (CCC' 23) improves the hardness to 2^{n-o(n)} at the price of increasing the correlation to polynomially large, via a new connection to sumset extractors introduced by Chattopadhyay and Li (STOC' 16) and explicit constructions of such extractors by Chattopadhyay and Liao (STOC' 22). Both works left open the questions of better constructions of directional affine extractors and improved average-case complexity against SROLBPs in the regime of small correlation. This paper provides a much more in-depth study of directional affine extractors, SROLBPs, and ROBPs. Our main results include: - An explicit construction of directional affine extractors with k = o(n) and exponentially small error, which gives average-case complexity 2^{n-o(n)} against SROLBPs with exponentially small correlation, thus answering the two open questions raised in previous works. - An explicit function in AC⁰ that gives average-case complexity 2^{(1-δ)n} against ROBPs with negligible correlation, for any constant δ > 0. Previously, no such average-case hardness is known, and the best size lower bound for any function in AC⁰ against ROBPs is 2^Ω(n). One of the key ingredients in our constructions is a new linear somewhere condenser for affine sources, which is based on dimension expanders. The condenser also leads to an unconditional improvement of the entropy requirement of explicit affine extractors with negligible error. We further show that the condenser also works for general weak random sources, under the Polynomial Freiman-Ruzsa Theorem in 𝖥₂ⁿ, recently proved by Gowers, Green, Manners, and Tao (arXiv' 23).

Cite as

Xin Li and Yan Zhong. Explicit Directional Affine Extractors and Improved Hardness for Linear Branching Programs. In 39th Computational Complexity Conference (CCC 2024). Leibniz International Proceedings in Informatics (LIPIcs), Volume 300, pp. 10:1-10:14, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2024)

Copy BibTex To Clipboard

@InProceedings{li_et_al:LIPIcs.CCC.2024.10,
  author =	{Li, Xin and Zhong, Yan},
  title =	{{Explicit Directional Affine Extractors and Improved Hardness for Linear Branching Programs}},
  booktitle =	{39th Computational Complexity Conference (CCC 2024)},
  pages =	{10:1--10:14},
  series =	{Leibniz International Proceedings in Informatics (LIPIcs)},
  ISBN =	{978-3-95977-331-7},
  ISSN =	{1868-8969},
  year =	{2024},
  volume =	{300},
  editor =	{Santhanam, Rahul},
  publisher =	{Schloss Dagstuhl -- Leibniz-Zentrum f{\"u}r Informatik},
  address =	{Dagstuhl, Germany},
  URL =		{https://drops.dagstuhl.de/entities/document/10.4230/LIPIcs.CCC.2024.10},
  URN =		{urn:nbn:de:0030-drops-204060},
  doi =		{10.4230/LIPIcs.CCC.2024.10},
  annote =	{Keywords: Randomness Extractors, Affine, Read-once Linear Branching Programs, Low-degree polynomials, AC⁰ circuits}
}

Document

Track A: Algorithms, Complexity and Games

DOI: 10.4230/LIPIcs.ICALP.2024.108

Two-Source and Affine Non-Malleable Extractors for Small Entropy

Authors: Xin Li and Yan Zhong

Published in: LIPIcs, Volume 297, 51st International Colloquium on Automata, Languages, and Programming (ICALP 2024)

Abstract

Non-malleable extractors are generalizations and strengthening of standard randomness extractors, that are resilient to adversarial tampering. Such extractors have wide applications in cryptography and have become important cornerstones in recent breakthroughs of explicit constructions of two-source extractors and affine extractors for small entropy. However, explicit constructions of non-malleable extractors appear to be much harder than standard extractors. Indeed, in the well-studied models of two-source and affine non-malleable extractors, the previous best constructions only work for entropy rate > 2/3 and 1-γ for some small constant γ > 0 respectively by Li (FOCS' 23). In this paper, we present explicit constructions of two-source and affine non-malleable extractors that match the state-of-the-art constructions of standard ones for small entropy. Our main results include: - Two-source and affine non-malleable extractors (over 𝖥₂) for sources on n bits with min-entropy k ≥ log^C n and polynomially small error, matching the parameters of standard extractors by Chattopadhyay and Zuckerman (STOC' 16, Annals of Mathematics' 19) and Li (FOCS' 16). - Two-source and affine non-malleable extractors (over 𝖥₂) for sources on n bits with min-entropy k = O(log n) and constant error, matching the parameters of standard extractors by Li (FOCS' 23). Our constructions significantly improve previous results, and the parameters (entropy requirement and error) are the best possible without first improving the constructions of standard extractors. In addition, our improved affine non-malleable extractors give strong lower bounds for a certain kind of read-once linear branching programs, recently introduced by Gryaznov, Pudlák, and Talebanfard (CCC' 22) as a generalization of several well studied computational models. These bounds match the previously best-known average-case hardness results given by Chattopadhyay and Liao (CCC' 23) and Li (FOCS' 23), where the branching program size lower bounds are close to optimal, but the explicit functions we use here are different. Our results also suggest a possible deeper connection between non-malleable extractors and standard ones.

Cite as

Xin Li and Yan Zhong. Two-Source and Affine Non-Malleable Extractors for Small Entropy. In 51st International Colloquium on Automata, Languages, and Programming (ICALP 2024). Leibniz International Proceedings in Informatics (LIPIcs), Volume 297, pp. 108:1-108:15, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2024)

Copy BibTex To Clipboard

@InProceedings{li_et_al:LIPIcs.ICALP.2024.108,
  author =	{Li, Xin and Zhong, Yan},
  title =	{{Two-Source and Affine Non-Malleable Extractors for Small Entropy}},
  booktitle =	{51st International Colloquium on Automata, Languages, and Programming (ICALP 2024)},
  pages =	{108:1--108:15},
  series =	{Leibniz International Proceedings in Informatics (LIPIcs)},
  ISBN =	{978-3-95977-322-5},
  ISSN =	{1868-8969},
  year =	{2024},
  volume =	{297},
  editor =	{Bringmann, Karl and Grohe, Martin and Puppis, Gabriele and Svensson, Ola},
  publisher =	{Schloss Dagstuhl -- Leibniz-Zentrum f{\"u}r Informatik},
  address =	{Dagstuhl, Germany},
  URL =		{https://drops.dagstuhl.de/entities/document/10.4230/LIPIcs.ICALP.2024.108},
  URN =		{urn:nbn:de:0030-drops-202512},
  doi =		{10.4230/LIPIcs.ICALP.2024.108},
  annote =	{Keywords: Randomness Extractors, Non-malleable, Two-source, Affine}
}

Document

DOI: 10.4230/LIPIcs.ESA.2023.40

Efficient Parallel Output-Sensitive Edit Distance

Authors: Xiangyun Ding, Xiaojun Dong, Yan Gu, Youzhe Liu, and Yihan Sun

Published in: LIPIcs, Volume 274, 31st Annual European Symposium on Algorithms (ESA 2023)

Abstract

In this paper, we study efficient parallel edit distance algorithms, both in theory and in practice. Given two strings A[1..n] and B[1..m], and a set of operations allowed to edit the strings, the edit distance between A and B is the minimum number of operations required to transform A into B. In this paper, we use edit distance to refer to the Levenshtein distance, which allows for unit-cost single-character edits (insertions, deletions, substitutions). Sequentially, a standard Dynamic Programming (DP) algorithm solves edit distance with Θ(nm) cost. In many real-world applications, the strings to be compared are similar to each other and have small edit distances. To achieve highly practical implementations, we focus on output-sensitive parallel edit-distance algorithms, i.e., to achieve asymptotically better cost bounds than the standard Θ(nm) algorithm when the edit distance is small. We study four algorithms in the paper, including three algorithms based on Breadth-First Search (BFS), and one algorithm based on Divide-and-Conquer (DaC). Our BFS-based solution is based on the Landau-Vishkin algorithm. We implement three different data structures for the longest common prefix (LCP) queries needed in the algorithm: the classic solution using parallel suffix array, and two hash-based solutions proposed in this paper. Our DaC-based solution is inspired by the output-insensitive solution proposed by Apostolico et al., and we propose a non-trivial adaption to make it output-sensitive. All of the algorithms studied in this paper have good theoretical guarantees, and they achieve different tradeoffs between work (total number of operations), span (longest dependence chain in the computation), and space. We test and compare our algorithms on both synthetic data and real-world data, including DNA sequences, Wikipedia texts, GitHub repositories, etc. Our BFS-based algorithms outperform the existing parallel edit-distance implementation in ParlayLib in all test cases. On cases with fewer than 10⁵ edits, our algorithm can process input sequences of size 10⁹ in about ten seconds, while ParlayLib can only process sequences of sizes up to 10⁶ in the same amount of time. By comparing our algorithms, we also provide a better understanding of the choice of algorithms for different input patterns. We believe that our paper is the first systematic study in the theory and practice of parallel edit distance.

Cite as

Xiangyun Ding, Xiaojun Dong, Yan Gu, Youzhe Liu, and Yihan Sun. Efficient Parallel Output-Sensitive Edit Distance. In 31st Annual European Symposium on Algorithms (ESA 2023). Leibniz International Proceedings in Informatics (LIPIcs), Volume 274, pp. 40:1-40:20, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2023)

Copy BibTex To Clipboard

@InProceedings{ding_et_al:LIPIcs.ESA.2023.40,
  author =	{Ding, Xiangyun and Dong, Xiaojun and Gu, Yan and Liu, Youzhe and Sun, Yihan},
  title =	{{Efficient Parallel Output-Sensitive Edit Distance}},
  booktitle =	{31st Annual European Symposium on Algorithms (ESA 2023)},
  pages =	{40:1--40:20},
  series =	{Leibniz International Proceedings in Informatics (LIPIcs)},
  ISBN =	{978-3-95977-295-2},
  ISSN =	{1868-8969},
  year =	{2023},
  volume =	{274},
  editor =	{G{\o}rtz, Inge Li and Farach-Colton, Martin and Puglisi, Simon J. and Herman, Grzegorz},
  publisher =	{Schloss Dagstuhl -- Leibniz-Zentrum f{\"u}r Informatik},
  address =	{Dagstuhl, Germany},
  URL =		{https://drops.dagstuhl.de/entities/document/10.4230/LIPIcs.ESA.2023.40},
  URN =		{urn:nbn:de:0030-drops-186935},
  doi =		{10.4230/LIPIcs.ESA.2023.40},
  annote =	{Keywords: Edit Distance, Parallel Algorithms, String Algorithms, Dynamic Programming, Pattern Matching}
}

Document

DOI: 10.4230/LIPIcs.GISCIENCE.2018.10

Local Co-location Pattern Detection: A Summary of Results

Authors: Yan Li and Shashi Shekhar

Published in: LIPIcs, Volume 114, 10th International Conference on Geographic Information Science (GIScience 2018)

Abstract

Given a set of spatial objects of different features (e.g., mall, hospital) and a spatial relation (e.g., geographic proximity), the problem of local co-location pattern detection (LCPD) pairs co-location patterns and localities such that the co-location patterns tend to exist inside the paired localities. A co-location pattern is a set of spatial features, the objects of which are often related to each other. Local co-location patterns are common in many fields, such as public security, and public health. For example, assault crimes and drunk driving events co-locate near bars. The problem is computationally challenging because of the exponential number of potential co-location patterns and candidate localities. The related work applies data-unaware or clustering heuristics to partition the study area, which results in incomplete enumeration of possible localities. In this study, we formally defined the LCPD problem where the candidate locality was defined using minimum orthogonal bounding rectangles (MOBRs). Then, we proposed a Quadruplet & Grid Filter-Refine (QGFR) algorithm that leveraged an MOBR enumeration lemma, and a novel upper bound on the participation index to efficiently prune the search space. The experimental evaluation showed that the QGFR algorithm reduced the computation cost substantially. One case study using the North American Atlas-Hydrography and U.S. Major City Datasets was conducted to discover local co-location patterns which would be missed if the entire dataset was analyzed or methods proposed by the related work were applied.

Cite as

Yan Li and Shashi Shekhar. Local Co-location Pattern Detection: A Summary of Results. In 10th International Conference on Geographic Information Science (GIScience 2018). Leibniz International Proceedings in Informatics (LIPIcs), Volume 114, pp. 10:1-10:15, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2018)

Copy BibTex To Clipboard

@InProceedings{li_et_al:LIPIcs.GISCIENCE.2018.10,
  author =	{Li, Yan and Shekhar, Shashi},
  title =	{{Local Co-location Pattern Detection: A Summary of Results}},
  booktitle =	{10th International Conference on Geographic Information Science (GIScience 2018)},
  pages =	{10:1--10:15},
  series =	{Leibniz International Proceedings in Informatics (LIPIcs)},
  ISBN =	{978-3-95977-083-5},
  ISSN =	{1868-8969},
  year =	{2018},
  volume =	{114},
  editor =	{Winter, Stephan and Griffin, Amy and Sester, Monika},
  publisher =	{Schloss Dagstuhl -- Leibniz-Zentrum f{\"u}r Informatik},
  address =	{Dagstuhl, Germany},
  URL =		{https://drops.dagstuhl.de/entities/document/10.4230/LIPIcs.GISCIENCE.2018.10},
  URN =		{urn:nbn:de:0030-drops-93387},
  doi =		{10.4230/LIPIcs.GISCIENCE.2018.10},
  annote =	{Keywords: Co-location pattern, Participation index, Spatial heterogeneity}
}

Document

Short Paper

DOI: 10.4230/LIPIcs.GISCIENCE.2018.40

Estimating Building Age from Google Street View Images Using Deep Learning (Short Paper)

Authors: Yan Li, Yiqun Chen, Abbas Rajabifard, Kourosh Khoshelham, and Mitko Aleksandrov

Published in: LIPIcs, Volume 114, 10th International Conference on Geographic Information Science (GIScience 2018)

Abstract

Building databases are a fundamental component of urban analysis. However such databases usually lack detailed attributes such as building age. With a large volume of building images being accessible online via API (such as Google Street View), as well as the fast development of image processing techniques such as deep learning, it becomes feasible to extract information from images to enrich building databases. This paper proposes a novel method to estimate building age based on the convolutional neural network for image features extraction and support vector machine for construction year regression. The contributions of this paper are two-fold: First, to our knowledge, this is the first attempt for estimating building age from images by using deep learning techniques. It provides new insight for planners to apply image processing and deep learning techniques for building database enrichment. Second, an image-base building age estimation framework is proposed which doesn't require information on building height, floor area, construction materials and therefore makes the analysis process simpler and more efficient.

Cite as

Yan Li, Yiqun Chen, Abbas Rajabifard, Kourosh Khoshelham, and Mitko Aleksandrov. Estimating Building Age from Google Street View Images Using Deep Learning (Short Paper). In 10th International Conference on Geographic Information Science (GIScience 2018). Leibniz International Proceedings in Informatics (LIPIcs), Volume 114, pp. 40:1-40:7, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2018)

Copy BibTex To Clipboard

@InProceedings{li_et_al:LIPIcs.GISCIENCE.2018.40,
  author =	{Li, Yan and Chen, Yiqun and Rajabifard, Abbas and Khoshelham, Kourosh and Aleksandrov, Mitko},
  title =	{{Estimating Building Age from Google Street View Images Using Deep Learning}},
  booktitle =	{10th International Conference on Geographic Information Science (GIScience 2018)},
  pages =	{40:1--40:7},
  series =	{Leibniz International Proceedings in Informatics (LIPIcs)},
  ISBN =	{978-3-95977-083-5},
  ISSN =	{1868-8969},
  year =	{2018},
  volume =	{114},
  editor =	{Winter, Stephan and Griffin, Amy and Sester, Monika},
  publisher =	{Schloss Dagstuhl -- Leibniz-Zentrum f{\"u}r Informatik},
  address =	{Dagstuhl, Germany},
  URL =		{https://drops.dagstuhl.de/entities/document/10.4230/LIPIcs.GISCIENCE.2018.40},
  URN =		{urn:nbn:de:0030-drops-93682},
  doi =		{10.4230/LIPIcs.GISCIENCE.2018.40},
  annote =	{Keywords: Building database, deep learning, CNN, SVM, Google Street View}
}

Document

DOI: 10.4230/DagSemProc.07271.14

Social Comparisons and Contributions to Online Communities: A Field Experiment on MovieLens

Authors: Yan Chen, Maxwell Harper, Joseph Konstan, and Sherry Li

Published in: Dagstuhl Seminar Proceedings, Volume 7271, Computational Social Systems and the Internet (2007)

Abstract

We explore the use of social comparison theory as a natural mechanism to increase contributions to an online movie recommendation community by investigating the effects of social information on user behavior in an online field experiment. We find that, after receiving behavioral information about the median user's total number of movie ratings, users below the median demonstrate a 530% increase in the number of monthly movie ratings, while those above the median decrease their monthly ratings by 62%. Movements from both ends converge towards the median, indicating conformity towards a newly-established social norm in a community where such a norm had been absent. Furthermore, the social information has a more dramatic effect on those below the median, suggesting an interaction between conformity and competitive preferences. When given outcome information about the average user's net benefit score from the system, consistent with social preference theory, users with net benefit scores above average contribute 94% of the new updates in the database. In both treatments, we find a highly significant Red Queen Effect.

Cite as

Yan Chen, Maxwell Harper, Joseph Konstan, and Sherry Li. Social Comparisons and Contributions to Online Communities: A Field Experiment on MovieLens. In Computational Social Systems and the Internet. Dagstuhl Seminar Proceedings, Volume 7271, pp. 1-7, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2007)

Copy BibTex To Clipboard

@InProceedings{chen_et_al:DagSemProc.07271.14,
  author =	{Chen, Yan and Harper, Maxwell and Konstan, Joseph and Li, Sherry},
  title =	{{Social Comparisons and Contributions to Online Communities: A Field Experiment on MovieLens}},
  booktitle =	{Computational Social Systems and the Internet},
  pages =	{1--7},
  series =	{Dagstuhl Seminar Proceedings (DagSemProc)},
  ISSN =	{1862-4405},
  year =	{2007},
  volume =	{7271},
  editor =	{Peter Cramton and Rudolf M\"{u}ller and Eva Tardos and Moshe Tennenholtz},
  publisher =	{Schloss Dagstuhl -- Leibniz-Zentrum f{\"u}r Informatik},
  address =	{Dagstuhl, Germany},
  URL =		{https://drops.dagstuhl.de/entities/document/10.4230/DagSemProc.07271.14},
  URN =		{urn:nbn:de:0030-drops-11550},
  doi =		{10.4230/DagSemProc.07271.14},
  annote =	{Keywords: Social comparison, conformity, public goods, embedded online field experiment}
}

14 Search Results for "Li, Yan"

Abstract

Cite as

Abstract

Cite as

Abstract

Cite as

Abstract

Cite as

Abstract

Cite as

Abstract

Cite as

Abstract

Cite as

Abstract

Cite as

Abstract

Cite as

Abstract

Cite as

Abstract

Cite as

Abstract

Cite as

Abstract

Cite as

Abstract

Cite as

Thanks for your feedback!

Could not send message