30 Search Results for "Joseph, Michael"


Document
Storylines with a Protagonist

Authors: Tim Hegemann and Alexander Wolff

Published in: LIPIcs, Volume 320, 32nd International Symposium on Graph Drawing and Network Visualization (GD 2024)


Abstract
Storyline visualizations show interactions between a given set of characters over time. Each character is represented by an x-monotone curve. A meeting is represented by a vertical bar that is crossed by the curves of exactly those characters that participate in the meeting. Therefore, character curves may have to cross each other. In the context of publication networks, we consider storylines where the characters are authors and the meetings are joint publications. We are especially interested in visualizing a group of colleagues centered around an author, the protagonist, who participates in all selected publications. For such instances, we propose a drawing style where the protagonist’s curve is drawn at a prominent position and never crossed by any other author’s curve. We consider two variants of storylines with a protagonist. In the one-sided variant, the protagonist is required to be drawn at the top position. In this restricted setting, we can efficiently compute a drawing with the minimum number of pairwise crossings, whereas we show that it is NP-hard to minimize the number of block crossings (i.e., pairs of blocks of parallel curves that intersect each other). In the two-sided variant, the task is to split the set of co-authors of the protagonist into two groups, and to place the curves of one group above and the curves of the other group below the protagonist’s curve such that the total number of (block) crossings is minimized. As our main result, we present an algorithm for bundling a sequence of pairwise crossings into a sequence of few block crossings (in the absence of meetings). It exploits a connection to a rectangle dissection problem. In the presence of meetings, it yields results that are very close to a lower bound. Based on this bundling algorithm and our exact algorithm for the one-sided variant, we present a new heuristic for computing two-sided storylines with few block crossings. 
We perform an extensive experimental study using publication data of 81 protagonists from GD 2023 and their most frequent collaborators over the last ten years. Our study shows that, for two-sided storylines with a protagonist, our new heuristic uses fewer block crossings (and fewer pairwise crossings) than two heuristics for block crossing minimization in general storylines.

Cite as

Tim Hegemann and Alexander Wolff. Storylines with a Protagonist. In 32nd International Symposium on Graph Drawing and Network Visualization (GD 2024). Leibniz International Proceedings in Informatics (LIPIcs), Volume 320, pp. 26:1-26:22, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2024)


@InProceedings{hegemann_et_al:LIPIcs.GD.2024.26,
  author =	{Hegemann, Tim and Wolff, Alexander},
  title =	{{Storylines with a Protagonist}},
  booktitle =	{32nd International Symposium on Graph Drawing and Network Visualization (GD 2024)},
  pages =	{26:1--26:22},
  series =	{Leibniz International Proceedings in Informatics (LIPIcs)},
  ISBN =	{978-3-95977-343-0},
  ISSN =	{1868-8969},
  year =	{2024},
  volume =	{320},
  editor =	{Felsner, Stefan and Klein, Karsten},
  publisher =	{Schloss Dagstuhl -- Leibniz-Zentrum f{\"u}r Informatik},
  address =	{Dagstuhl, Germany},
  URL =		{https://drops.dagstuhl.de/entities/document/10.4230/LIPIcs.GD.2024.26},
  URN =		{urn:nbn:de:0030-drops-213109},
  doi =		{10.4230/LIPIcs.GD.2024.26},
  annote =	{Keywords: Storyline visualization, storyline with a protagonist, crossing minimization, block crossings}
}
Document
Optimizing Layout of Recursive Datatypes with Marmoset: Or, Algorithms + Data Layouts = Efficient Programs

Authors: Vidush Singhal, Chaitanya Koparkar, Joseph Zullo, Artem Pelenitsyn, Michael Vollmer, Mike Rainey, Ryan Newton, and Milind Kulkarni

Published in: LIPIcs, Volume 313, 38th European Conference on Object-Oriented Programming (ECOOP 2024)


Abstract
While programmers know that memory representation of data structures can have significant effects on performance, compiler support to optimize the layout of those structures is an under-explored field. Prior work has optimized the layout of individual, non-recursive structures without considering how collections of those objects in linked or recursive data structures are laid out. This work introduces Marmoset, a compiler that optimizes the layouts of algebraic datatypes, with a special focus on producing highly optimized, packed data layouts where recursive structures can be traversed with minimal pointer chasing. Marmoset performs an analysis of how a recursive ADT is used across functions to choose a global layout that promotes simple, strided access for that ADT in memory. It does so by building and solving a constraint system to minimize an abstract cost model, yielding a predicted efficient layout for the ADT. Marmoset then builds on top of Gibbon, a prior compiler for packed, mostly-serial representations, to synthesize optimized ADTs. We show experimentally that Marmoset is able to choose optimal layouts across a series of microbenchmarks and case studies, outperforming both Gibbon’s baseline approach, as well as MLton, a Standard ML compiler that uses traditional pointer-heavy representations.

Cite as

Vidush Singhal, Chaitanya Koparkar, Joseph Zullo, Artem Pelenitsyn, Michael Vollmer, Mike Rainey, Ryan Newton, and Milind Kulkarni. Optimizing Layout of Recursive Datatypes with Marmoset: Or, Algorithms + Data Layouts = Efficient Programs. In 38th European Conference on Object-Oriented Programming (ECOOP 2024). Leibniz International Proceedings in Informatics (LIPIcs), Volume 313, pp. 38:1-38:28, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2024)


@InProceedings{singhal_et_al:LIPIcs.ECOOP.2024.38,
  author =	{Singhal, Vidush and Koparkar, Chaitanya and Zullo, Joseph and Pelenitsyn, Artem and Vollmer, Michael and Rainey, Mike and Newton, Ryan and Kulkarni, Milind},
  title =	{{Optimizing Layout of Recursive Datatypes with Marmoset: Or, Algorithms + Data Layouts = Efficient Programs}},
  booktitle =	{38th European Conference on Object-Oriented Programming (ECOOP 2024)},
  pages =	{38:1--38:28},
  series =	{Leibniz International Proceedings in Informatics (LIPIcs)},
  ISBN =	{978-3-95977-341-6},
  ISSN =	{1868-8969},
  year =	{2024},
  volume =	{313},
  editor =	{Aldrich, Jonathan and Salvaneschi, Guido},
  publisher =	{Schloss Dagstuhl -- Leibniz-Zentrum f{\"u}r Informatik},
  address =	{Dagstuhl, Germany},
  URL =		{https://drops.dagstuhl.de/entities/document/10.4230/LIPIcs.ECOOP.2024.38},
  URN =		{urn:nbn:de:0030-drops-208875},
  doi =		{10.4230/LIPIcs.ECOOP.2024.38},
  annote =	{Keywords: Tree traversals, Compilers, Data layout optimization, Dense data layout}
}
Document
Artifact
Optimizing Layout of Recursive Datatypes with Marmoset (Artifact)

Authors: Vidush Singhal, Chaitanya Koparkar, Joseph Zullo, Artem Pelenitsyn, Michael Vollmer, Mike Rainey, Ryan Newton, and Milind Kulkarni

Published in: DARTS, Volume 10, Issue 2, Special Issue of the 38th European Conference on Object-Oriented Programming (ECOOP 2024)


Abstract
While programmers know that memory representation of data structures can have significant effects on performance, compiler support to optimize the layout of those structures is an under-explored field. Prior work has optimized the layout of individual, non-recursive structures without considering how collections of those objects in linked or recursive data structures are laid out. This work introduces Marmoset, a compiler that optimizes the layouts of algebraic datatypes, with a special focus on producing highly optimized, packed data layouts where recursive structures can be traversed with minimal pointer chasing. Marmoset performs an analysis of how a recursive ADT is used across functions to choose a global layout that promotes simple, strided access for that ADT in memory. It does so by building and solving a constraint system to minimize an abstract cost model, yielding a predicted efficient layout for the ADT. Marmoset then builds on top of Gibbon, a prior compiler for packed, mostly-serial representations, to synthesize optimized ADTs. We show experimentally that Marmoset is able to choose optimal layouts across a series of microbenchmarks and case studies, outperforming both Gibbon’s baseline approach, as well as MLton, a Standard ML compiler that uses traditional pointer-heavy representations.

Cite as

Vidush Singhal, Chaitanya Koparkar, Joseph Zullo, Artem Pelenitsyn, Michael Vollmer, Mike Rainey, Ryan Newton, and Milind Kulkarni. Optimizing Layout of Recursive Datatypes with Marmoset (Artifact). In Special Issue of the 38th European Conference on Object-Oriented Programming (ECOOP 2024). Dagstuhl Artifacts Series (DARTS), Volume 10, Issue 2, pp. 21:1-21:10, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2024)


@Article{singhal_et_al:DARTS.10.2.21,
  author =	{Singhal, Vidush and Koparkar, Chaitanya and Zullo, Joseph and Pelenitsyn, Artem and Vollmer, Michael and Rainey, Mike and Newton, Ryan and Kulkarni, Milind},
  title =	{{Optimizing Layout of Recursive Datatypes with Marmoset (Artifact)}},
  pages =	{21:1--21:10},
  journal =	{Dagstuhl Artifacts Series},
  ISBN =	{978-3-95977-342-3},
  ISSN =	{2509-8195},
  year =	{2024},
  volume =	{10},
  number =	{2},
  editor =	{Singhal, Vidush and Koparkar, Chaitanya and Zullo, Joseph and Pelenitsyn, Artem and Vollmer, Michael and Rainey, Mike and Newton, Ryan and Kulkarni, Milind},
  publisher =	{Schloss Dagstuhl -- Leibniz-Zentrum f{\"u}r Informatik},
  address =	{Dagstuhl, Germany},
  URL =		{https://drops.dagstuhl.de/entities/document/10.4230/DARTS.10.2.21},
  URN =		{urn:nbn:de:0030-drops-209199},
  doi =		{10.4230/DARTS.10.2.21},
  annote =	{Keywords: Tree traversals, Compilers, Data layout optimization, Dense data layout}
}
Document
HOBBIT: Hashed OBject Based InTegrity

Authors: Matthias Bernad and Stefan Brunthaler

Published in: LIPIcs, Volume 313, 38th European Conference on Object-Oriented Programming (ECOOP 2024)


Abstract
C vulnerabilities usually hold verbatim for C++ programs. The counterfeit-object-oriented programming attack demonstrated that this relation is asymmetric, i.e., it only applies to C++. The problem pinpointed by this COOP attack is that C++ does not validate the integrity of its objects. By injecting malicious objects with manipulated virtual function table pointers, attackers can hijack control-flow of programs. The software security community addressed the COOP-problem in the years following its discovery, but together with the emergence of transient-execution attacks, such as Spectre, researchers also shifted their attention. We present Hobbit, a software-only solution to prevent COOP attacks by validating object integrity for virtual function pointer tables. Hobbit does not require any hardware specific features, scales to multi-million lines of C++ source code, and our LLVM-based implementation offers a configurable performance impact between 121.63% and 2.80% on compute-intensive SPEC CPU C++ benchmarks. Hobbit’s security analysis indicates strong resistance to brute forcing attacks and demonstrates additional benefits of using execute-only memory.

Cite as

Matthias Bernad and Stefan Brunthaler. HOBBIT: Hashed OBject Based InTegrity. In 38th European Conference on Object-Oriented Programming (ECOOP 2024). Leibniz International Proceedings in Informatics (LIPIcs), Volume 313, pp. 7:1-7:25, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2024)


@InProceedings{bernad_et_al:LIPIcs.ECOOP.2024.7,
  author =	{Bernad, Matthias and Brunthaler, Stefan},
  title =	{{HOBBIT: Hashed OBject Based InTegrity}},
  booktitle =	{38th European Conference on Object-Oriented Programming (ECOOP 2024)},
  pages =	{7:1--7:25},
  series =	{Leibniz International Proceedings in Informatics (LIPIcs)},
  ISBN =	{978-3-95977-341-6},
  ISSN =	{1868-8969},
  year =	{2024},
  volume =	{313},
  editor =	{Aldrich, Jonathan and Salvaneschi, Guido},
  publisher =	{Schloss Dagstuhl -- Leibniz-Zentrum f{\"u}r Informatik},
  address =	{Dagstuhl, Germany},
  URL =		{https://drops.dagstuhl.de/entities/document/10.4230/LIPIcs.ECOOP.2024.7},
  URN =		{urn:nbn:de:0030-drops-208566},
  doi =		{10.4230/LIPIcs.ECOOP.2024.7},
  annote =	{Keywords: software security, code-reuse attacks, language-based security, counterfeit-object-oriented programming, object integrity, compiler security}
}
Document
Designing 3D RNA Origami Nanostructures with a Minimum Number of Kissing Loops

Authors: Antti Elonen and Pekka Orponen

Published in: LIPIcs, Volume 314, 30th International Conference on DNA Computing and Molecular Programming (DNA 30) (2024)


Abstract
We present a general design technique for rendering any 3D wireframe model, that is any connected graph linearly embedded in 3D space, as an RNA origami nanostructure with a minimum number of kissing loops. The design algorithm, which applies some ideas and methods from topological graph theory, produces renderings that contain at most one kissing-loop pair for many interesting model families, including for instance all fully triangulated wireframes and the wireframes of all Platonic solids. The design method is already implemented and available for use in the design tool DNAforge (https://dnaforge.org).

Cite as

Antti Elonen and Pekka Orponen. Designing 3D RNA Origami Nanostructures with a Minimum Number of Kissing Loops. In 30th International Conference on DNA Computing and Molecular Programming (DNA 30). Leibniz International Proceedings in Informatics (LIPIcs), Volume 314, pp. 4:1-4:12, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2024)


@InProceedings{elonen_et_al:LIPIcs.DNA.30.4,
  author =	{Elonen, Antti and Orponen, Pekka},
  title =	{{Designing 3D RNA Origami Nanostructures with a Minimum Number of Kissing Loops}},
  booktitle =	{30th International Conference on DNA Computing and Molecular Programming (DNA 30)},
  pages =	{4:1--4:12},
  series =	{Leibniz International Proceedings in Informatics (LIPIcs)},
  ISBN =	{978-3-95977-344-7},
  ISSN =	{1868-8969},
  year =	{2024},
  volume =	{314},
  editor =	{Seki, Shinnosuke and Stewart, Jaimie Marie},
  publisher =	{Schloss Dagstuhl -- Leibniz-Zentrum f{\"u}r Informatik},
  address =	{Dagstuhl, Germany},
  URL =		{https://drops.dagstuhl.de/entities/document/10.4230/LIPIcs.DNA.30.4},
  URN =		{urn:nbn:de:0030-drops-209325},
  doi =		{10.4230/LIPIcs.DNA.30.4},
  annote =	{Keywords: RNA origami, wireframe nanostructures, polyhedra, kissing loops, topological graph embeddings, self-assembly}
}
Document
Completeness of Asynchronous Session Tree Subtyping in Coq

Authors: Burak Ekici and Nobuko Yoshida

Published in: LIPIcs, Volume 309, 15th International Conference on Interactive Theorem Proving (ITP 2024)


Abstract
Multiparty session types (MPST) serve as a foundational framework for formally specifying and verifying message passing protocols. Asynchronous subtyping in MPST allows for typing optimised programs preserving type safety and deadlock freedom under asynchronous interactions where the message order is preserved and sending is non-blocking. The optimisation is obtained by message reordering, which allows for sending messages earlier or receiving them later. Sound subtyping algorithms have been extensively studied and implemented as part of various programming languages and tools including C, Rust and C-MPI. However, formalising all such permutations under sequencing, selection, branching and recursion in session types is an intricate task. Additionally, checking asynchronous subtyping has been proven to be undecidable. This paper introduces the first formalisation of asynchronous subtyping in MPST within the Coq proof assistant. We first decompose session types into session trees that do not involve branching and selection, and then establish a coinductive refinement relation over them to govern subtyping. To showcase our formalisation, we prove example subtyping schemas that appear in the literature, all of which cannot be verified, at the same time, by any of the existing decidable sound algorithms. Additionally, we take the (inductive) negation of the refinement relation from a prior work by Ghilezan et al. [Ghilezan et al., 2023] and re-implement it, significantly reducing the number of rules (from eighteen to eight). We establish the completeness of subtyping with respect to its negation in Coq, addressing the issues concerning the negation rules outlined in the previous work [Ghilezan et al., 2023]. In the formalisation, we use the greatest fixed point of the least fixed point technique, facilitated by the paco library, to define coinductive predicates. We employ parametrised coinduction to prove their properties. 
The formalisation consists of roughly 10K lines of Coq code, accessible at: https://github.com/ekiciburak/sessionTreeST/tree/itp2024.

Cite as

Burak Ekici and Nobuko Yoshida. Completeness of Asynchronous Session Tree Subtyping in Coq. In 15th International Conference on Interactive Theorem Proving (ITP 2024). Leibniz International Proceedings in Informatics (LIPIcs), Volume 309, pp. 13:1-13:20, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2024)


@InProceedings{ekici_et_al:LIPIcs.ITP.2024.13,
  author =	{Ekici, Burak and Yoshida, Nobuko},
  title =	{{Completeness of Asynchronous Session Tree Subtyping in Coq}},
  booktitle =	{15th International Conference on Interactive Theorem Proving (ITP 2024)},
  pages =	{13:1--13:20},
  series =	{Leibniz International Proceedings in Informatics (LIPIcs)},
  ISBN =	{978-3-95977-337-9},
  ISSN =	{1868-8969},
  year =	{2024},
  volume =	{309},
  editor =	{Bertot, Yves and Kutsia, Temur and Norrish, Michael},
  publisher =	{Schloss Dagstuhl -- Leibniz-Zentrum f{\"u}r Informatik},
  address =	{Dagstuhl, Germany},
  URL =		{https://drops.dagstuhl.de/entities/document/10.4230/LIPIcs.ITP.2024.13},
  URN =		{urn:nbn:de:0030-drops-207418},
  doi =		{10.4230/LIPIcs.ITP.2024.13},
  annote =	{Keywords: asynchronous multiparty session types, session trees, subtyping, Coq}
}
Document
Constraint Modelling with LLMs Using In-Context Learning

Authors: Kostis Michailidis, Dimos Tsouros, and Tias Guns

Published in: LIPIcs, Volume 307, 30th International Conference on Principles and Practice of Constraint Programming (CP 2024)


Abstract
Constraint Programming (CP) allows for the modelling and solving of a wide range of combinatorial problems. However, modelling such problems using constraints over decision variables still requires significant expertise, both in conceptual thinking and syntactic use of modelling languages. In this work, we explore the potential of using pre-trained Large Language Models (LLMs) as coding assistants, to transform textual problem descriptions into concrete and executable CP specifications. We present different transformation pipelines with explicit intermediate representations, and we investigate the potential benefit of various retrieval-augmented example selection strategies for in-context learning. We evaluate our approach on 2 datasets from the literature, namely NL4Opt (optimisation) and Logic Grid Puzzles (satisfaction), and a heterogeneous set of exercises from a CP course. The results show that pre-trained LLMs have promising potential for initialising the modelling process, with retrieval-augmented in-context learning significantly enhancing their modelling capabilities.

Cite as

Kostis Michailidis, Dimos Tsouros, and Tias Guns. Constraint Modelling with LLMs Using In-Context Learning. In 30th International Conference on Principles and Practice of Constraint Programming (CP 2024). Leibniz International Proceedings in Informatics (LIPIcs), Volume 307, pp. 20:1-20:27, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2024)


@InProceedings{michailidis_et_al:LIPIcs.CP.2024.20,
  author =	{Michailidis, Kostis and Tsouros, Dimos and Guns, Tias},
  title =	{{Constraint Modelling with LLMs Using In-Context Learning}},
  booktitle =	{30th International Conference on Principles and Practice of Constraint Programming (CP 2024)},
  pages =	{20:1--20:27},
  series =	{Leibniz International Proceedings in Informatics (LIPIcs)},
  ISBN =	{978-3-95977-336-2},
  ISSN =	{1868-8969},
  year =	{2024},
  volume =	{307},
  editor =	{Shaw, Paul},
  publisher =	{Schloss Dagstuhl -- Leibniz-Zentrum f{\"u}r Informatik},
  address =	{Dagstuhl, Germany},
  URL =		{https://drops.dagstuhl.de/entities/document/10.4230/LIPIcs.CP.2024.20},
  URN =		{urn:nbn:de:0030-drops-207053},
  doi =		{10.4230/LIPIcs.CP.2024.20},
  annote =	{Keywords: Constraint Modelling, Constraint Acquisition, Constraint Programming, Large Language Models, In-Context Learning, Natural Language Processing, Named Entity Recognition, Retrieval-Augmented Generation, Optimisation}
}
Document
Memoization on Shared Subtrees Accelerates Computations on Genealogical Forests

Authors: Lukas Hübner and Alexandros Stamatakis

Published in: LIPIcs, Volume 312, 24th International Workshop on Algorithms in Bioinformatics (WABI 2024)


Abstract
The field of population genetics attempts to advance our understanding of evolutionary processes. It has applications, for example, in medical research, wildlife conservation, and - in conjunction with recent advances in ancient DNA sequencing technology - studying human migration patterns over the past few thousand years. The basic toolbox of population genetics includes genealogical trees, which describe the shared evolutionary history among individuals of the same species. They are calculated on the basis of genetic variations. However, in recombining organisms, a single tree is insufficient to describe the evolutionary history of the whole genome. Instead, a collection of correlated trees can be used, where each describes the evolutionary history of a consecutive region of the genome. The current corresponding state-of-the-art data structure, tree sequences, compresses these genealogical trees via edit operations when moving from one tree to the next along the genome instead of storing the full, often redundant, description for each tree. We propose a new data structure, genealogical forests, which compresses the set of genealogical trees into a DAG. In this DAG identical subtrees that are shared across the input trees are encoded only once, thereby allowing for straight-forward memoization of intermediate results. Additionally, we provide a C++ implementation of our proposed data structure, called gfkit, which is 2.1 to 11.2 (median 4.0) times faster than the state-of-the-art tool on empirical and simulated datasets at computing important population genetics statistics such as the Allele Frequency Spectrum, Patterson’s f, the Fixation Index, Tajima’s D, pairwise Lowest Common Ancestors, and others. On Lowest Common Ancestor queries with more than two samples as input, gfkit scales asymptotically better than the state-of-the-art, and is thus up to 990 times faster. 
In conclusion, our proposed data structure compresses genealogical trees by storing shared subtrees only once, thereby enabling straight-forward memoization of intermediate results, yielding a substantial runtime reduction and a potentially more intuitive data representation over the state-of-the-art. Our improvements will boost the development of novel analyses and models in the field of population genetics and increase scalability to ever-growing genomic datasets.
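
The idea of encoding shared subtrees only once can be illustrated with a small sketch. It is not gfkit's actual representation: the nested-tuple input format and the names `intern_forest` and `leaf_counts` are assumptions made for this example. Identical subtrees from several trees are mapped to one DAG node, so a per-node quantity (here, the number of samples below a node) is computed once per distinct subtree.

```python
def intern_forest(trees):
    """Merge structurally identical subtrees of several trees into one DAG.

    Assumed input format: a leaf is a sample name (str), an inner node is a
    tuple of child trees. Child order is ignored when comparing subtrees.
    """
    ids = {}        # canonical subtree key -> node id
    children = []   # node id -> tuple of child ids (empty for a leaf)

    def visit(tree):
        if isinstance(tree, str):
            key, kids = ("leaf", tree), ()
        else:
            kids = tuple(sorted(visit(child) for child in tree))
            key = ("node", kids)
        if key not in ids:               # first time this subtree is seen
            ids[key] = len(children)
            children.append(kids)
        return ids[key]

    roots = [visit(tree) for tree in trees]
    return roots, children


def leaf_counts(children):
    """Number of samples below every DAG node, computed once per node.

    Children are always created before their parent, so a single pass in
    id order sees every child count before it is needed.
    """
    counts = [0] * len(children)
    for node, kids in enumerate(children):
        counts[node] = 1 if not kids else sum(counts[k] for k in kids)
    return counts


# Two trees over the same four samples that share the subtree ("a", "b").
tree_1 = (("a", "b"), ("c", "d"))
tree_2 = ((("a", "b"), "c"), "d")
roots, children = intern_forest([tree_1, tree_2])
counts = leaf_counts(children)
```

Stored separately, the two trees have 14 nodes in total. The shared DAG has 9, and the count for the subtree `("a", "b")` is evaluated once and reused by both roots.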

Cite as

Lukas Hübner and Alexandros Stamatakis. Memoization on Shared Subtrees Accelerates Computations on Genealogical Forests. In 24th International Workshop on Algorithms in Bioinformatics (WABI 2024). Leibniz International Proceedings in Informatics (LIPIcs), Volume 312, pp. 5:1-5:22, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2024)


@InProceedings{hubner_et_al:LIPIcs.WABI.2024.5,
  author =	{H\"{u}bner, Lukas and Stamatakis, Alexandros},
  title =	{{Memoization on Shared Subtrees Accelerates Computations on Genealogical Forests}},
  booktitle =	{24th International Workshop on Algorithms in Bioinformatics (WABI 2024)},
  pages =	{5:1--5:22},
  series =	{Leibniz International Proceedings in Informatics (LIPIcs)},
  ISBN =	{978-3-95977-340-9},
  ISSN =	{1868-8969},
  year =	{2024},
  volume =	{312},
  editor =	{Pissis, Solon P. and Sung, Wing-Kin},
  publisher =	{Schloss Dagstuhl -- Leibniz-Zentrum f{\"u}r Informatik},
  address =	{Dagstuhl, Germany},
  URL =		{https://drops.dagstuhl.de/entities/document/10.4230/LIPIcs.WABI.2024.5},
  URN =		{urn:nbn:de:0030-drops-206499},
  doi =		{10.4230/LIPIcs.WABI.2024.5},
  annote =	{Keywords: bioinformatics, population genetics, algorithms}
}
Document
PLA-index: A k-mer Index Exploiting Rank Curve Linearity

Authors: Md. Hasin Abrar and Paul Medvedev

Published in: LIPIcs, Volume 312, 24th International Workshop on Algorithms in Bioinformatics (WABI 2024)


Abstract
Given a sorted list of k-mers S, the rank curve of S is the function mapping a k-mer from the k-mer universe to the location in S where it either first appears or would be inserted. An exciting recent development is the observation that, for certain datasets, the rank curve is predictable and can be exploited to create small search indices. In this paper, we develop a novel search index that first estimates a k-mer’s rank using a piece-wise linear approximation of the rank curve and then does a local search to determine the precise location of the k-mer in the list. We combine ideas from previous approaches and supplement them with an innovative data representation strategy that substantially reduces space usage. Our PLA-index uses an order of magnitude less space than Sapling and uses less than half the space of the PGM-index, for roughly the same query time. For example, using only 9 MiB of memory, it can narrow down the position of a k-mer in the suffix array of the human genome to within 255 positions. Furthermore, we demonstrate the potential of our approach to impact a variety of downstream applications. First, the PLA-index halves the time of binary search on the suffix array of the human genome. Second, the PLA-index reduces the space of a direct-access lookup table by 76 percent, without increasing the run time. Third, we plug the PLA-index into a state-of-the-art read aligner Strobealign and replace a 2 GiB component with a PLA-index of size 1.5 MiB, without significantly affecting runtime. The software and reproducibility information is freely available at https://github.com/medvedevgroup/pla-index.
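
The two-step lookup described above (estimate the rank from a linear model, then search locally) can be sketched as follows. This is a deliberately simplified illustration, not the PLA-index itself: it fits a single line through the endpoints of the rank curve instead of a piece-wise linear approximation, and it uses none of the paper's space-saving data representation. The 2-bit encoding and the names `encode` and `OneSegmentIndex` are assumptions for this example, and the key list is assumed to be non-empty with distinct entries.

```python
from bisect import bisect_left


def encode(kmer):
    """Pack a DNA k-mer into an integer, 2 bits per base (assumed encoding)."""
    code = {"A": 0, "C": 1, "G": 2, "T": 3}
    value = 0
    for base in kmer:
        value = value * 4 + code[base]
    return value


class OneSegmentIndex:
    """Toy learned index: one line through the endpoints of the rank curve.

    Assumes a non-empty, sorted list of distinct integer keys.
    """

    def __init__(self, sorted_keys):
        self.keys = sorted_keys
        self.lo = sorted_keys[0]
        hi = sorted_keys[-1]
        self.slope = (len(sorted_keys) - 1) / (hi - self.lo) if hi > self.lo else 0.0
        # Largest gap between predicted and true rank over the stored keys.
        self.max_err = max(
            abs(self._predict(key) - bisect_left(sorted_keys, key))
            for key in sorted_keys
        )

    def _predict(self, key):
        raw = round((key - self.lo) * self.slope)
        return min(max(raw, 0), len(self.keys))   # clamp to a valid rank

    def query(self, key):
        """Rank of key: estimate it, then binary-search only the error window."""
        guess = self._predict(key)
        start = max(0, guess - self.max_err)
        stop = min(len(self.keys), guess + self.max_err + 1)
        return bisect_left(self.keys, key, start, stop)


keys = sorted(encode(k) for k in ["ACGT", "AAAA", "TTTT", "CGTA", "GGCC", "ACGA"])
index = OneSegmentIndex(keys)
position = index.query(encode("CGTA"))   # same result as bisect_left(keys, ...)
```

The window searched by `query` has at most `2 * max_err + 2` candidate positions, so the closer the rank curve is to the fitted line, the less work the local search does. A piece-wise approximation shrinks that error by fitting many short segments instead of one.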

Cite as

Md. Hasin Abrar and Paul Medvedev. PLA-index: A k-mer Index Exploiting Rank Curve Linearity. In 24th International Workshop on Algorithms in Bioinformatics (WABI 2024). Leibniz International Proceedings in Informatics (LIPIcs), Volume 312, pp. 13:1-13:18, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2024)


@InProceedings{abrar_et_al:LIPIcs.WABI.2024.13,
  author =	{Abrar, Md. Hasin and Medvedev, Paul},
  title =	{{PLA-index: A k-mer Index Exploiting Rank Curve Linearity}},
  booktitle =	{24th International Workshop on Algorithms in Bioinformatics (WABI 2024)},
  pages =	{13:1--13:18},
  series =	{Leibniz International Proceedings in Informatics (LIPIcs)},
  ISBN =	{978-3-95977-340-9},
  ISSN =	{1868-8969},
  year =	{2024},
  volume =	{312},
  editor =	{Pissis, Solon P. and Sung, Wing-Kin},
  publisher =	{Schloss Dagstuhl -- Leibniz-Zentrum f{\"u}r Informatik},
  address =	{Dagstuhl, Germany},
  URL =		{https://drops.dagstuhl.de/entities/document/10.4230/LIPIcs.WABI.2024.13},
  URN =		{urn:nbn:de:0030-drops-206578},
  doi =		{10.4230/LIPIcs.WABI.2024.13},
  annote =	{Keywords: K-mer index, Piece-wise linear approximation, Learned index}
}
Document
AlfaPang: Alignment Free Algorithm for Pangenome Graph Construction

Authors: Adam Cicherski, Anna Lisiecka, and Norbert Dojer

Published in: LIPIcs, Volume 312, 24th International Workshop on Algorithms in Bioinformatics (WABI 2024)


Abstract
The success of pangenome-based approaches to genomics analysis depends largely on the existence of efficient methods for constructing pangenome graphs that are applicable to large genome collections. In the current paper we present AlfaPang, a new pangenome graph building algorithm. AlfaPang is based on a novel alignment-free approach that allows constructing pangenome graphs using significantly fewer computational resources than state-of-the-art tools. The code of AlfaPang is freely available at https://github.com/AdamCicherski/AlfaPang.

Cite as

Adam Cicherski, Anna Lisiecka, and Norbert Dojer. AlfaPang: Alignment Free Algorithm for Pangenome Graph Construction. In 24th International Workshop on Algorithms in Bioinformatics (WABI 2024). Leibniz International Proceedings in Informatics (LIPIcs), Volume 312, pp. 23:1-23:18, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2024)


@InProceedings{cicherski_et_al:LIPIcs.WABI.2024.23,
  author =	{Cicherski, Adam and Lisiecka, Anna and Dojer, Norbert},
  title =	{{AlfaPang: Alignment Free Algorithm for Pangenome Graph Construction}},
  booktitle =	{24th International Workshop on Algorithms in Bioinformatics (WABI 2024)},
  pages =	{23:1--23:18},
  series =	{Leibniz International Proceedings in Informatics (LIPIcs)},
  ISBN =	{978-3-95977-340-9},
  ISSN =	{1868-8969},
  year =	{2024},
  volume =	{312},
  editor =	{Pissis, Solon P. and Sung, Wing-Kin},
  publisher =	{Schloss Dagstuhl -- Leibniz-Zentrum f{\"u}r Informatik},
  address =	{Dagstuhl, Germany},
  URL =		{https://drops.dagstuhl.de/entities/document/10.4230/LIPIcs.WABI.2024.23},
  URN =		{urn:nbn:de:0030-drops-206673},
  doi =		{10.4230/LIPIcs.WABI.2024.23},
  annote =	{Keywords: pangenome, variation graph, genome alignment, population genomics}
}
Document
A*PA2: Up to 19× Faster Exact Global Alignment

Authors: Ragnar Groot Koerkamp

Published in: LIPIcs, Volume 312, 24th International Workshop on Algorithms in Bioinformatics (WABI 2024)


Abstract
Motivation. Pairwise alignment is at the core of computational biology. Most commonly used exact methods are either based on O(ns) band doubling or O(n+s²) diagonal transition, where n is the sequence length and s the number of errors. However, as the length of sequences has grown, these exact methods are often replaced by approximate methods based on e.g. seed-and-extend and heuristics to bound the computed region. We would like to develop an exact method that matches the performance of these approximate methods. Recently, Astarix introduced the A* shortest path algorithm with the seed heuristic for exact sequence-to-graph alignment. A*PA adapted and improved this for pairwise sequence alignment and achieves near-linear runtime when divergence (error rate) is low, at the cost of being very slow when divergence is high.
Methods. We introduce A*PA2, an exact global pairwise aligner with respect to edit distance. The goal of A*PA2 is to unify the near-linear runtime of A*PA on similar sequences with the efficiency of dynamic programming (DP) based methods. Like Edlib, A*PA2 uses Ukkonen’s band doubling in combination with Myers' bitpacking. A*PA2 1) uses large block sizes inspired by Block Aligner, 2) extends this with SIMD (single instruction, multiple data), 3) introduces a new profile for efficient computations, 4) introduces a new optimistic technique for traceback based on diagonal transition, 5) avoids recomputation of states where possible, and 6) applies the heuristics developed in A*PA and improves them using pre-pruning.
Results. With the first 4 engineering optimizations, A*PA2-simple has complexity O(ns) and is 6× to 8× faster than Edlib for sequences ≥ 10 kbp. A*PA2-full also includes the heuristic and is often near-linear in practice for sequences with small divergence. The average runtime of A*PA2 is 19× faster than the exact aligners BiWFA and Edlib on >500 kbp long ONT (Oxford Nanopore Technologies) reads of a human genome having 6% divergence on average. On shorter ONT reads of 11% average divergence the speedup is 5.6× (avg. length 11 kbp) and 0.81× (avg. length 800 bp). On all tested datasets, A*PA2 is competitive with or faster than approximate methods.
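
The O(ns) band doubling mentioned in the abstract can be illustrated with a minimal sketch. This is not the A*PA2 implementation: it omits bitpacking, SIMD, blocks, and the A* heuristic, and the function names `banded_edit_distance` and `edit_distance_band_doubling` are hypothetical, chosen only for this example. It assumes unit-cost edit distance and only evaluates DP cells within a band |i - j| <= t, doubling t until the result fits inside the band.

```python
from typing import Optional


def banded_edit_distance(a: str, b: str, t: int) -> Optional[int]:
    """Return the edit distance of a and b if it is <= t, else None.

    Only DP cells (i, j) with |i - j| <= t are computed (hypothetical helper).
    """
    n, m = len(a), len(b)
    if abs(n - m) > t:
        return None  # the end cell lies outside the band
    inf = t + 1  # any value above t means "too far"
    # Row 0: distance from the empty prefix of a to b[:j].
    prev = {j: j for j in range(0, min(m, t) + 1)}
    for i in range(1, n + 1):
        cur = {}
        for j in range(max(0, i - t), min(m, i + t) + 1):
            best = i if j == 0 else inf
            if j > 0 and j - 1 in prev:  # match or substitution
                best = min(best, prev[j - 1] + (a[i - 1] != b[j - 1]))
            if j > 0 and j - 1 in cur:  # insertion
                best = min(best, cur[j - 1] + 1)
            if j in prev:  # deletion
                best = min(best, prev[j] + 1)
            cur[j] = min(best, inf)
        prev = cur
    d = prev.get(m, inf)
    return d if d <= t else None


def edit_distance_band_doubling(a: str, b: str) -> int:
    """Double the threshold t until the banded DP certifies the distance."""
    t = max(1, abs(len(a) - len(b)))
    while True:
        d = banded_edit_distance(a, b, t)
        if d is not None:
            return d
        t *= 2
```

Because the threshold doubles, the work of the final pass dominates the earlier ones, which is where the O(ns) total in this simplified setting comes from.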

Cite as

Ragnar Groot Koerkamp. A*PA2: Up to 19× Faster Exact Global Alignment. In 24th International Workshop on Algorithms in Bioinformatics (WABI 2024). Leibniz International Proceedings in Informatics (LIPIcs), Volume 312, pp. 17:1-17:25, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2024)



@InProceedings{grootkoerkamp:LIPIcs.WABI.2024.17,
  author =	{Groot Koerkamp, Ragnar},
  title =	{{A*PA2: Up to 19$\times$ Faster Exact Global Alignment}},
  booktitle =	{24th International Workshop on Algorithms in Bioinformatics (WABI 2024)},
  pages =	{17:1--17:25},
  series =	{Leibniz International Proceedings in Informatics (LIPIcs)},
  ISBN =	{978-3-95977-340-9},
  ISSN =	{1868-8969},
  year =	{2024},
  volume =	{312},
  editor =	{Pissis, Solon P. and Sung, Wing-Kin},
  publisher =	{Schloss Dagstuhl -- Leibniz-Zentrum f{\"u}r Informatik},
  address =	{Dagstuhl, Germany},
  URL =		{https://drops.dagstuhl.de/entities/document/10.4230/LIPIcs.WABI.2024.17},
  URN =		{urn:nbn:de:0030-drops-206610},
  doi =		{10.4230/LIPIcs.WABI.2024.17},
  annote =	{Keywords: Edit distance, Pairwise alignment, A*, Shortest path, Dynamic programming}
}
Document
RNA Inverse Folding Can Be Solved in Linear Time for Structures Without Isolated Stacks or Base Pairs

Authors: Théo Boury, Laurent Bulteau, and Yann Ponty

Published in: LIPIcs, Volume 312, 24th International Workshop on Algorithms in Bioinformatics (WABI 2024)


Abstract
Inverse folding is a classic instance of negative RNA design, which consists in finding a sequence that uniquely folds into a target secondary structure with respect to energy minimization. A breakthrough result of Bonnet et al. shows that, even in simple base pairs-based (BP) models, the decision version of a mildly constrained version of inverse folding is NP-hard. In this work, we show that inverse folding can be solved in linear time for a large collection of targets, including every structure that contains no isolated BP and no isolated stack (or, equivalently, when all helices consist of at least 3 base pairs). For structures featuring shorter helices, our linear algorithm is no longer guaranteed to produce a solution, but still does so for a large proportion of instances. Our approach introduces a notion of modulo m-separability, generalizing a property pioneered by Hales et al. Separability is a sufficient condition for the existence of a solution to the inverse folding problem. We show that, for any input secondary structure of length n, a modulo m-separated sequence can be produced in time 𝒪(n 2^m) whenever such a sequence exists. Meanwhile, we show that any structure whose helices all consist of at least 3 base pairs is either trivially non-designable, or always admits a modulo-2 separated solution (m = 2). Solution sequences can thus be produced in linear time, and even be uniformly generated within the set of modulo-2 separable sequences.

Cite as

Théo Boury, Laurent Bulteau, and Yann Ponty. RNA Inverse Folding Can Be Solved in Linear Time for Structures Without Isolated Stacks or Base Pairs. In 24th International Workshop on Algorithms in Bioinformatics (WABI 2024). Leibniz International Proceedings in Informatics (LIPIcs), Volume 312, pp. 19:1-19:23, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2024)



@InProceedings{boury_et_al:LIPIcs.WABI.2024.19,
  author =	{Boury, Th\'{e}o and Bulteau, Laurent and Ponty, Yann},
  title =	{{RNA Inverse Folding Can Be Solved in Linear Time for Structures Without Isolated Stacks or Base Pairs}},
  booktitle =	{24th International Workshop on Algorithms in Bioinformatics (WABI 2024)},
  pages =	{19:1--19:23},
  series =	{Leibniz International Proceedings in Informatics (LIPIcs)},
  ISBN =	{978-3-95977-340-9},
  ISSN =	{1868-8969},
  year =	{2024},
  volume =	{312},
  editor =	{Pissis, Solon P. and Sung, Wing-Kin},
  publisher =	{Schloss Dagstuhl -- Leibniz-Zentrum f{\"u}r Informatik},
  address =	{Dagstuhl, Germany},
  URL =		{https://drops.dagstuhl.de/entities/document/10.4230/LIPIcs.WABI.2024.19},
  URN =		{urn:nbn:de:0030-drops-206632},
  doi =		{10.4230/LIPIcs.WABI.2024.19},
  annote =	{Keywords: RNA structure, String Design, Parameterized Complexity, Uniform Sampling}
}
Document
Generalizing Roberts' Characterization of Unit Interval Graphs

Authors: Virginia Ardévol Martínez, Romeo Rizzi, Abdallah Saffidine, Florian Sikora, and Stéphane Vialette

Published in: LIPIcs, Volume 306, 49th International Symposium on Mathematical Foundations of Computer Science (MFCS 2024)


Abstract
For any natural number d, a graph G is a (disjoint) d-interval graph if it is the intersection graph of (disjoint) d-intervals, the union of d (disjoint) intervals on the real line. Two important subclasses of d-interval graphs are unit and balanced d-interval graphs (where every interval has unit length or all the intervals associated with the same vertex have the same length, respectively). A celebrated result by Roberts gives a simple characterization of unit interval graphs as being exactly the claw-free interval graphs. Here, we study the generalization of this characterization for d-interval graphs. In particular, we prove that for any d ⩾ 2, if G is a K_{1,2d+1}-free interval graph, then G is a unit d-interval graph. However, somewhat surprisingly, under the same assumptions, G is not always a disjoint unit d-interval graph. This implies that the class of disjoint unit d-interval graphs is strictly included in the class of unit d-interval graphs. Finally, we study the relationships between the classes obtained under disjoint and non-disjoint d-intervals in the balanced case and show that the classes of disjoint balanced 2-intervals and balanced 2-intervals coincide, but this is no longer true for d > 2.

Cite as

Virginia Ardévol Martínez, Romeo Rizzi, Abdallah Saffidine, Florian Sikora, and Stéphane Vialette. Generalizing Roberts' Characterization of Unit Interval Graphs. In 49th International Symposium on Mathematical Foundations of Computer Science (MFCS 2024). Leibniz International Proceedings in Informatics (LIPIcs), Volume 306, pp. 12:1-12:15, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2024)



@InProceedings{ardevolmartinez_et_al:LIPIcs.MFCS.2024.12,
  author =	{Ard\'{e}vol Mart{\'\i}nez, Virginia and Rizzi, Romeo and Saffidine, Abdallah and Sikora, Florian and Vialette, St\'{e}phane},
  title =	{{Generalizing Roberts' Characterization of Unit Interval Graphs}},
  booktitle =	{49th International Symposium on Mathematical Foundations of Computer Science (MFCS 2024)},
  pages =	{12:1--12:15},
  series =	{Leibniz International Proceedings in Informatics (LIPIcs)},
  ISBN =	{978-3-95977-335-5},
  ISSN =	{1868-8969},
  year =	{2024},
  volume =	{306},
  editor =	{Kr\'{a}lovi\v{c}, Rastislav and Ku\v{c}era, Anton{\'\i}n},
  publisher =	{Schloss Dagstuhl -- Leibniz-Zentrum f{\"u}r Informatik},
  address =	{Dagstuhl, Germany},
  URL =		{https://drops.dagstuhl.de/entities/document/10.4230/LIPIcs.MFCS.2024.12},
  URN =		{urn:nbn:de:0030-drops-205687},
  doi =		{10.4230/LIPIcs.MFCS.2024.12},
  annote =	{Keywords: Interval graphs, Multiple Interval Graphs, Unit Interval Graphs, Characterization}
}
Document
Shared Resource Contention in MCUs: A Reality Check and the Quest for Timeliness

Authors: Daniel Oliveira, Weifan Chen, Sandro Pinto, and Renato Mancuso

Published in: LIPIcs, Volume 298, 36th Euromicro Conference on Real-Time Systems (ECRTS 2024)


Abstract
Microcontrollers (MCUs) are steadily embracing multi-core technology to meet growing performance demands. This trend marks a shift from their traditionally simple, deterministic designs to more complex and inherently less predictable architectures. While shared resource contention is well-studied in mid to high-end embedded systems, the emergence of multi-core architectures in MCUs introduces unique challenges and characteristics that existing research has not fully explored. In this paper, we conduct an in-depth investigation of both mainstream and next-generation MCU-based platforms, aiming to identify the sources of contention on systems typically lacking these problems. We empirically demonstrate substantial contention effects across different MCU architectures (i.e., from single- to multi-core configurations), highlighting significant application slowdowns. Notably, we observe that slowdowns can reach several orders of magnitude, with the most extreme cases showing up to a 3800x (times, not percent) increase in execution time. To address these issues, we propose and evaluate muTPArtc, a novel mechanism designed for Timely Progress Assessment (TPA) and TPA-based runtime control specifically tailored to MCUs. muTPArtc is an MCU-specialized TPA-based mechanism that leverages hardware facilities widely available in commercial off-the-shelf MCUs (i.e., hardware breakpoints and cycle counters) to successfully monitor applications' progress, detect, and mitigate timing violations. Our results demonstrate that muTPArtc effectively manages performance degradation due to interference, requiring only minimal modifications to the build pipeline and no changes to the source code of the target application, while incurring minor overheads.

Cite as

Daniel Oliveira, Weifan Chen, Sandro Pinto, and Renato Mancuso. Shared Resource Contention in MCUs: A Reality Check and the Quest for Timeliness. In 36th Euromicro Conference on Real-Time Systems (ECRTS 2024). Leibniz International Proceedings in Informatics (LIPIcs), Volume 298, pp. 5:1-5:25, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2024)



@InProceedings{oliveira_et_al:LIPIcs.ECRTS.2024.5,
  author =	{Oliveira, Daniel and Chen, Weifan and Pinto, Sandro and Mancuso, Renato},
  title =	{{Shared Resource Contention in MCUs: A Reality Check and the Quest for Timeliness}},
  booktitle =	{36th Euromicro Conference on Real-Time Systems (ECRTS 2024)},
  pages =	{5:1--5:25},
  series =	{Leibniz International Proceedings in Informatics (LIPIcs)},
  ISBN =	{978-3-95977-324-9},
  ISSN =	{1868-8969},
  year =	{2024},
  volume =	{298},
  editor =	{Pellizzoni, Rodolfo},
  publisher =	{Schloss Dagstuhl -- Leibniz-Zentrum f{\"u}r Informatik},
  address =	{Dagstuhl, Germany},
  URL =		{https://drops.dagstuhl.de/entities/document/10.4230/LIPIcs.ECRTS.2024.5},
  URN =		{urn:nbn:de:0030-drops-203088},
  doi =		{10.4230/LIPIcs.ECRTS.2024.5},
  annote =	{Keywords: multi-core microcontrollers, shared resources contention, progress-aware regulation}
}
Document
Dynamic L-Budget Clustering of Curves

Authors: Kevin Buchin, Maike Buchin, Joachim Gudmundsson, Lukas Plätz, Lea Thiel, and Sampson Wong

Published in: LIPIcs, Volume 294, 19th Scandinavian Symposium and Workshops on Algorithm Theory (SWAT 2024)


Abstract
A key goal of clustering is data reduction. In center-based clustering of complex objects, therefore, not only the number of clusters but also the complexity of the centers plays a crucial role. We propose L-Budget Clustering as a unifying perspective on this task, optimizing the clustering under the constraint that the summed complexity of all centers is at most L. We present algorithms for clustering planar curves under the Fréchet distance, but note that our algorithms more generally apply to objects in metric spaces if a notion of simplification of objects is applicable. A scenario in which data reduction is of particular importance is when space is limited. Our main result is an efficient (8 + ε)-approximation algorithm with a (1 + ε)-resource augmentation that maintains an L-budget clustering under insertion of curves using only O(Lε^{-1}) space and O^*(L³log(L) + L²log(r^*/r₀)) time, where O^* hides factors of ε^{-1}.

Cite as

Kevin Buchin, Maike Buchin, Joachim Gudmundsson, Lukas Plätz, Lea Thiel, and Sampson Wong. Dynamic L-Budget Clustering of Curves. In 19th Scandinavian Symposium and Workshops on Algorithm Theory (SWAT 2024). Leibniz International Proceedings in Informatics (LIPIcs), Volume 294, pp. 18:1-18:17, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2024)



@InProceedings{buchin_et_al:LIPIcs.SWAT.2024.18,
  author =	{Buchin, Kevin and Buchin, Maike and Gudmundsson, Joachim and Pl\"{a}tz, Lukas and Thiel, Lea and Wong, Sampson},
  title =	{{Dynamic L-Budget Clustering of Curves}},
  booktitle =	{19th Scandinavian Symposium and Workshops on Algorithm Theory (SWAT 2024)},
  pages =	{18:1--18:17},
  series =	{Leibniz International Proceedings in Informatics (LIPIcs)},
  ISBN =	{978-3-95977-318-8},
  ISSN =	{1868-8969},
  year =	{2024},
  volume =	{294},
  editor =	{Bodlaender, Hans L.},
  publisher =	{Schloss Dagstuhl -- Leibniz-Zentrum f{\"u}r Informatik},
  address =	{Dagstuhl, Germany},
  URL =		{https://drops.dagstuhl.de/entities/document/10.4230/LIPIcs.SWAT.2024.18},
  URN =		{urn:nbn:de:0030-drops-200588},
  doi =		{10.4230/LIPIcs.SWAT.2024.18},
  annote =	{Keywords: clustering, streaming algorithm, polygonal curves, Fr\'{e}chet distance, storage efficiency, simplification, approximation algorithms}
}