DROPS

Document

DOI: 10.4230/LIPIcs.WABI.2024.17

A*PA2: Up to 19× Faster Exact Global Alignment

Authors: Ragnar Groot Koerkamp

Published in: LIPIcs, Volume 312, 24th International Workshop on Algorithms in Bioinformatics (WABI 2024)

Abstract

Motivation. Pairwise alignment is at the core of computational biology. Most commonly used exact methods are either based on O(ns) band doubling or O(n+s²) diagonal transition, where n is the sequence length and s the number of errors. However, as the length of sequences has grown, these exact methods are often replaced by approximate methods based on e.g. seed-and-extend and heuristics to bound the computed region. We would like to develop an exact method that matches the performance of these approximate methods. Recently, Astarix introduced the A* shortest path algorithm with the seed heuristic for exact sequence-to-graph alignment. A*PA adapted and improved this for pairwise sequence alignment and achieves near-linear runtime when divergence (error rate) is low, at the cost of being very slow when divergence is high. Methods. We introduce A*PA2, an exact global pairwise aligner with respect to edit distance. The goal of A*PA2 is to unify the near-linear runtime of A*PA on similar sequences with the efficiency of dynamic programming (DP) based methods. Like Edlib, A*PA2 uses Ukkonen’s band doubling in combination with Myers' bitpacking. A*PA2 1) uses large block sizes inspired by Block Aligner, 2) extends this with SIMD (single instruction, multiple data), 3) introduces a new profile for efficient computations, 4) introduces a new optimistic technique for traceback based on diagonal transition, 5) avoids recomputation of states where possible, and 6) applies the heuristics developed in A*PA and improves them using pre-pruning. Results. With the first 4 engineering optimizations, A*PA2-simple has complexity O(ns) and is 6× to 8× faster than Edlib for sequences ≥ 10 kbp. A*PA2-full also includes the heuristic and is often near-linear in practice for sequences with small divergence. The average runtime of A*PA2 is 19× faster than the exact aligners BiWFA and Edlib on >500 kbp long ONT (Oxford Nanopore Technologies) reads of a human genome having 6% divergence on average. On shorter ONT reads of 11% average divergence the speedup is 5.6× (avg. length 11 kbp) and 0.81× (avg. length 800 bp). On all tested datasets, A*PA2 is competitive with or faster than approximate methods.

Cite as

Ragnar Groot Koerkamp. A*PA2: Up to 19× Faster Exact Global Alignment. In 24th International Workshop on Algorithms in Bioinformatics (WABI 2024). Leibniz International Proceedings in Informatics (LIPIcs), Volume 312, pp. 17:1-17:25, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2024)

Copy BibTex To Clipboard

@InProceedings{grootkoerkamp:LIPIcs.WABI.2024.17,
  author =	{Groot Koerkamp, Ragnar},
  title =	{{A*PA2: Up to 19× Faster Exact Global Alignment}},
  booktitle =	{24th International Workshop on Algorithms in Bioinformatics (WABI 2024)},
  pages =	{17:1--17:25},
  series =	{Leibniz International Proceedings in Informatics (LIPIcs)},
  ISBN =	{978-3-95977-340-9},
  ISSN =	{1868-8969},
  year =	{2024},
  volume =	{312},
  editor =	{Pissis, Solon P. and Sung, Wing-Kin},
  publisher =	{Schloss Dagstuhl -- Leibniz-Zentrum f{\"u}r Informatik},
  address =	{Dagstuhl, Germany},
  URL =		{https://drops.dagstuhl.de/entities/document/10.4230/LIPIcs.WABI.2024.17},
  URN =		{urn:nbn:de:0030-drops-206610},
  doi =		{10.4230/LIPIcs.WABI.2024.17},
  annote =	{Keywords: Edit distance, Pairwise alignment, A*, Shortest path, Dynamic programming}
}

Document

DOI: 10.4230/DagSemProc.06201.3

Local Minimax Learning of Approximately Polynomial Functions

Authors: Lee Jones and Konstantin Rybnikov

Published in: Dagstuhl Seminar Proceedings, Volume 6201, Combinatorial and Algorithmic Foundations of Pattern and Association Discovery (2006)

Abstract

Suppose we have a number of noisy measurements of an unknown real-valued function $f$ near point of interest $mathbf{x}_0 in mathbb{R}^d$. Suppose also that nothing can be assumed about the noise distribution, except for zero mean and bounded covariance matrix. We want to estimate $f$ at $mathbf{x=x}_0$ using a general linear parametric family $f(mathbf{x};mathbf{a}) = a_0 h_0 (mathbf{x}) ++ a_q h_q (mathbf{x})$, where $mathbf{a} in mathbb{R}^q$ and $h_i$'s are bounded functions on a neighborhood $B$ of $mathbf{x}_0$ which contains all points of measurement. Typically, $B$ is a Euclidean ball or cube in $mathbb{R}^d$ (more generally, a ball in an $l_p$-norm). In the case when the $h_i$'s are polynomial functions in $x_1,ldots,x_d$ the model is called locally-polynomial. In particular, if the $h_i$'s form a basis of the linear space of polynomials of degree at most two, the model is called locally-quadratic (if the degree is at most three, the model is locally-cubic, etc.). Often, there is information, which is called context, about the function $f$ (restricted to $B$ ) available, such as that it takes values in a known interval, or that it satisfies a Lipschitz condition. The theory of local minimax estimation with context for locally-polynomial models and approximately locally polynomial models has been recently initiated by Jones. In the case of local linearity and a bound on the change of $f$ on $B$, where $B$ is a ball, the solution for squared error loss is in the form of ridge regression, where the ridge parameter is identified; hence, minimax justification for ridge regression is given together with explicit best error bounds. The analysis of polynomial models of degree above 1 leads to interesting and difficult questions in real algebraic geometry and non-linear optimization. We show that in the case when $f$ is a probability function, the optimal (in the minimax sense) estimator is effectively computable (with any given precision), thanks to Tarski's elimination principle.

Cite as

Lee Jones and Konstantin Rybnikov. Local Minimax Learning of Approximately Polynomial Functions. In Combinatorial and Algorithmic Foundations of Pattern and Association Discovery. Dagstuhl Seminar Proceedings, Volume 6201, pp. 1-12, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2007)

Copy BibTex To Clipboard

@InProceedings{jones_et_al:DagSemProc.06201.3,
  author =	{Jones, Lee and Rybnikov, Konstantin},
  title =	{{Local Minimax Learning of Approximately Polynomial Functions}},
  booktitle =	{Combinatorial and Algorithmic Foundations of Pattern and Association Discovery},
  pages =	{1--12},
  series =	{Dagstuhl Seminar Proceedings (DagSemProc)},
  ISSN =	{1862-4405},
  year =	{2007},
  volume =	{6201},
  editor =	{Rudolf Ahlswede and Alberto Apostolico and Vladimir I. Levenshtein},
  publisher =	{Schloss Dagstuhl -- Leibniz-Zentrum f{\"u}r Informatik},
  address =	{Dagstuhl, Germany},
  URL =		{https://drops.dagstuhl.de/entities/document/10.4230/DagSemProc.06201.3},
  URN =		{urn:nbn:de:0030-drops-8912},
  doi =		{10.4230/DagSemProc.06201.3},
  annote =	{Keywords: Local learning, statistical learning, estimator, minimax, convex optimization, quantifier elimination, semialgebraic, ridge regression, polynomial}
}

Document

DOI: 10.4230/DagSemProc.06201.7

Solving Classical String Problems an Compressed Texts

Authors: Yury Lifshits

Published in: Dagstuhl Seminar Proceedings, Volume 6201, Combinatorial and Algorithmic Foundations of Pattern and Association Discovery (2006)

Abstract

How to solve string problems, if instead of input string we get only program generating it? Is it possible to solve problems faster than just "generate text + apply classical algorithm"? In this paper we consider strings generated by straight-line programs (SLP). These are programs using only assignment operator. We show new algorithms for equivalence, pattern matching, finding periods and covers, computing fingerprint table on SLP-generated strings. From the other hand, computing the Hamming distance is NP-hard. Main corollary is an $O(n2*m)$ algorithm for pattern matching in LZ-compressed texts.

Cite as

Yury Lifshits. Solving Classical String Problems an Compressed Texts. In Combinatorial and Algorithmic Foundations of Pattern and Association Discovery. Dagstuhl Seminar Proceedings, Volume 6201, pp. 1-10, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2006)

Copy BibTex To Clipboard

@InProceedings{lifshits:DagSemProc.06201.7,
  author =	{Lifshits, Yury},
  title =	{{Solving Classical String Problems an Compressed Texts}},
  booktitle =	{Combinatorial and Algorithmic Foundations of Pattern and Association Discovery},
  pages =	{1--10},
  series =	{Dagstuhl Seminar Proceedings (DagSemProc)},
  ISSN =	{1862-4405},
  year =	{2006},
  volume =	{6201},
  editor =	{Rudolf Ahlswede and Alberto Apostolico and Vladimir I. Levenshtein},
  publisher =	{Schloss Dagstuhl -- Leibniz-Zentrum f{\"u}r Informatik},
  address =	{Dagstuhl, Germany},
  URL =		{https://drops.dagstuhl.de/entities/document/10.4230/DagSemProc.06201.7},
  URN =		{urn:nbn:de:0030-drops-7984},
  doi =		{10.4230/DagSemProc.06201.7},
  annote =	{Keywords: Pattern matching, Compressed text}
}

Document

DOI: 10.4230/DagSemProc.06201.1

06201 Abstracts Collection – Combinatorial and Algorithmic Foundations of Pattern and Association Discovery

Authors: Rudolf Ahlswede, Alberto Apostolico, and Vladimir I. Levenshtein

Published in: Dagstuhl Seminar Proceedings, Volume 6201, Combinatorial and Algorithmic Foundations of Pattern and Association Discovery (2006)

Abstract

From 15.05.06 to 20.05.06, the Dagstuhl Seminar 06201 ``Combinatorial and Algorithmic Foundations of Pattern and Association Discovery'' was held in the International Conference and Research Center (IBFI), Schloss Dagstuhl. During the seminar, several participants presented their current research, and ongoing work and open problems were discussed. Abstracts of the presentations given during the seminar as well as abstracts of seminar results and ideas are put together in this paper. The first section describes the seminar topics and goals in general. Links to extended abstracts or full papers are provided, if available.

Cite as

Rudolf Ahlswede, Alberto Apostolico, and Vladimir I. Levenshtein. 06201 Abstracts Collection – Combinatorial and Algorithmic Foundations of Pattern and Association Discovery. In Combinatorial and Algorithmic Foundations of Pattern and Association Discovery. Dagstuhl Seminar Proceedings, Volume 6201, pp. 1-15, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2006)

Copy BibTex To Clipboard

@InProceedings{ahlswede_et_al:DagSemProc.06201.1,
  author =	{Ahlswede, Rudolf and Apostolico, Alberto and Levenshtein, Vladimir I.},
  title =	{{06201 Abstracts Collection – Combinatorial and Algorithmic Foundations of Pattern and Association Discovery}},
  booktitle =	{Combinatorial and Algorithmic Foundations of Pattern and Association Discovery},
  pages =	{1--15},
  series =	{Dagstuhl Seminar Proceedings (DagSemProc)},
  ISSN =	{1862-4405},
  year =	{2006},
  volume =	{6201},
  editor =	{Rudolf Ahlswede and Alberto Apostolico and Vladimir I. Levenshtein},
  publisher =	{Schloss Dagstuhl -- Leibniz-Zentrum f{\"u}r Informatik},
  address =	{Dagstuhl, Germany},
  URL =		{https://drops.dagstuhl.de/entities/document/10.4230/DagSemProc.06201.1},
  URN =		{urn:nbn:de:0030-drops-7873},
  doi =		{10.4230/DagSemProc.06201.1},
  annote =	{Keywords: Data compression, pattern matching, pattern discovery, search, sorting, molecular biology, reconstruction, genome rearrangements}
}

Document

DOI: 10.4230/DagSemProc.06201.2

06201 Executive Summary – Combinatorial and Algorithmic Foundations of Pattern and Association Discovery

Authors: Rudolf Ahlswede, Alberto Apostolico, and Vladimir I. Levenshtein

Published in: Dagstuhl Seminar Proceedings, Volume 6201, Combinatorial and Algorithmic Foundations of Pattern and Association Discovery (2006)

Abstract

The goals of this seminar have been (1) to identify and match recently developed methods to specific tasks and data sets in a core of application areas; next, based on feedback from the specific applied domain, (2) to fine tune and personalize those applications, and finally (3) to identify and tackle novel combinatorial and algorithmic problems, in some cases all the way to the development of novel software tools.

Cite as

Rudolf Ahlswede, Alberto Apostolico, and Vladimir I. Levenshtein. 06201 Executive Summary – Combinatorial and Algorithmic Foundations of Pattern and Association Discovery. In Combinatorial and Algorithmic Foundations of Pattern and Association Discovery. Dagstuhl Seminar Proceedings, Volume 6201, pp. 1-2, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2006)

Copy BibTex To Clipboard

@InProceedings{ahlswede_et_al:DagSemProc.06201.2,
  author =	{Ahlswede, Rudolf and Apostolico, Alberto and Levenshtein, Vladimir I.},
  title =	{{06201 Executive Summary – Combinatorial and Algorithmic Foundations of Pattern and Association Discovery}},
  booktitle =	{Combinatorial and Algorithmic Foundations of Pattern and Association Discovery},
  pages =	{1--2},
  series =	{Dagstuhl Seminar Proceedings (DagSemProc)},
  ISSN =	{1862-4405},
  year =	{2006},
  volume =	{6201},
  editor =	{Rudolf Ahlswede and Alberto Apostolico and Vladimir I. Levenshtein},
  publisher =	{Schloss Dagstuhl -- Leibniz-Zentrum f{\"u}r Informatik},
  address =	{Dagstuhl, Germany},
  URL =		{https://drops.dagstuhl.de/entities/document/10.4230/DagSemProc.06201.2},
  URN =		{urn:nbn:de:0030-drops-7926},
  doi =		{10.4230/DagSemProc.06201.2},
  annote =	{Keywords: Data compression, pattern matching, pattern discovery, search, sorting, molecular biology, reconstruction, genome rearrangements}
}

Document

DOI: 10.4230/DagSemProc.06201.4

Non--binary error correcting codes with noiseless feedback, localized errors, or both

Authors: Rudolf Ahlswede, Christian Deppe, and Vladimir Lebedev

Published in: Dagstuhl Seminar Proceedings, Volume 6201, Combinatorial and Algorithmic Foundations of Pattern and Association Discovery (2006)

Abstract

We investigate non--binary error correcting codes with noiseless feedback, localized errors, or both. It turns out that the Hamming bound is a central concept. For block codes with feedback we present here a coding scheme based on an idea of erasions, which we call the {\bf rubber method}. It gives an optimal rate for big error correcting fraction $\tau$ ($>{1\over q}$) and infinitely many points on the Hamming bound for small $\tau$. We also consider variable length codes with all lengths bounded from above by $n$ and the end of a word carries the symbol $\Box$ and is thus recognizable by the decoder. For both, the $\Box$-model with feedback and the $\Box$-model with localized errors, the Hamming bound is the exact capacity curve for $\tau <1/2.$ Somewhat surprisingly, whereas with feedback the capacity curve coincides with the Hamming bound also for $1/2\leq \tau \leq 1$, in this range for localized errors the capacity curve equals 0. Also we give constructions for the models with both, feedback and localized errors.

Cite as

Rudolf Ahlswede, Christian Deppe, and Vladimir Lebedev. Non--binary error correcting codes with noiseless feedback, localized errors, or both. In Combinatorial and Algorithmic Foundations of Pattern and Association Discovery. Dagstuhl Seminar Proceedings, Volume 6201, pp. 1-4, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2006)

Copy BibTex To Clipboard

@InProceedings{ahlswede_et_al:DagSemProc.06201.4,
  author =	{Ahlswede, Rudolf and Deppe, Christian and Lebedev, Vladimir},
  title =	{{Non--binary error correcting codes with noiseless feedback, localized errors, or both}},
  booktitle =	{Combinatorial and Algorithmic Foundations of Pattern and Association Discovery},
  pages =	{1--4},
  series =	{Dagstuhl Seminar Proceedings (DagSemProc)},
  ISSN =	{1862-4405},
  year =	{2006},
  volume =	{6201},
  editor =	{Rudolf Ahlswede and Alberto Apostolico and Vladimir I. Levenshtein},
  publisher =	{Schloss Dagstuhl -- Leibniz-Zentrum f{\"u}r Informatik},
  address =	{Dagstuhl, Germany},
  URL =		{https://drops.dagstuhl.de/entities/document/10.4230/DagSemProc.06201.4},
  URN =		{urn:nbn:de:0030-drops-7849},
  doi =		{10.4230/DagSemProc.06201.4},
  annote =	{Keywords: Error-correcting codes, localized errors, feedback, variable length codes}
}

Document

DOI: 10.4230/DagSemProc.06201.5

On the Monotonicity of the String Correction Factor for Words with Mismatches

Authors: Alberto Apostolico and Cinzia Pizzi

Published in: Dagstuhl Seminar Proceedings, Volume 6201, Combinatorial and Algorithmic Foundations of Pattern and Association Discovery (2006)

Abstract

The string correction factor is the term by which the probability of a word $w$ needs to be multiplied in order to account for character changes or ``errors'' occurring in at most $k$ arbitrary positions in that word. The behavior of this factor, as a function of $k$ and of the word length, has implications on the number of candidates that need to be considered and weighted when looking for subwords of a sequence that present unusually recurrent replicas within some bounded number of mismatches. Specifically, it is seen that over intervals of mono- or bi-tonicity for the correction factor, only some of the candidates need be considered. This mitigates the computation and leads to tables of over-represented words that are more compact to represent and inspect. In recent work, expectation and score monotonicity has been established for a number of cases of interest, under {em i.i.d.} probabilistic assumptions. The present paper reviews the cases of bi-tonic behavior for the correction factor, concentrating on the instance in which the question is still open.

Cite as

Alberto Apostolico and Cinzia Pizzi. On the Monotonicity of the String Correction Factor for Words with Mismatches. In Combinatorial and Algorithmic Foundations of Pattern and Association Discovery. Dagstuhl Seminar Proceedings, Volume 6201, pp. 1-9, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2006)

Copy BibTex To Clipboard

@InProceedings{apostolico_et_al:DagSemProc.06201.5,
  author =	{Apostolico, Alberto and Pizzi, Cinzia},
  title =	{{On the Monotonicity of the String Correction Factor for Words with Mismatches}},
  booktitle =	{Combinatorial and Algorithmic Foundations of Pattern and Association Discovery},
  pages =	{1--9},
  series =	{Dagstuhl Seminar Proceedings (DagSemProc)},
  ISSN =	{1862-4405},
  year =	{2006},
  volume =	{6201},
  editor =	{Rudolf Ahlswede and Alberto Apostolico and Vladimir I. Levenshtein},
  publisher =	{Schloss Dagstuhl -- Leibniz-Zentrum f{\"u}r Informatik},
  address =	{Dagstuhl, Germany},
  URL =		{https://drops.dagstuhl.de/entities/document/10.4230/DagSemProc.06201.5},
  URN =		{urn:nbn:de:0030-drops-7899},
  doi =		{10.4230/DagSemProc.06201.5},
  annote =	{Keywords: Pattern discovery, Motif, Over-represented word, Monotone score, Correction Factor}
}

Document

DOI: 10.4230/DagSemProc.06201.6

Sequence prediction for non-stationary processes

Authors: Daniil Ryabko and Marcus Hutter

Published in: Dagstuhl Seminar Proceedings, Volume 6201, Combinatorial and Algorithmic Foundations of Pattern and Association Discovery (2006)

Abstract

We address the problem of sequence prediction for nonstationary stochastic processes. In particular, given two measures on the set of one-way infinite sequences over a finite alphabet, consider the question whether one of the measures predicts the other. We find some conditions on local absolute continuity under which prediction is possible.

Cite as

Daniil Ryabko and Marcus Hutter. Sequence prediction for non-stationary processes. In Combinatorial and Algorithmic Foundations of Pattern and Association Discovery. Dagstuhl Seminar Proceedings, Volume 6201, pp. 1-12, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2006)

Copy BibTex To Clipboard

@InProceedings{ryabko_et_al:DagSemProc.06201.6,
  author =	{Ryabko, Daniil and Hutter, Marcus},
  title =	{{Sequence prediction for non-stationary processes}},
  booktitle =	{Combinatorial and Algorithmic Foundations of Pattern and Association Discovery},
  pages =	{1--12},
  series =	{Dagstuhl Seminar Proceedings (DagSemProc)},
  ISSN =	{1862-4405},
  year =	{2006},
  volume =	{6201},
  editor =	{Rudolf Ahlswede and Alberto Apostolico and Vladimir I. Levenshtein},
  publisher =	{Schloss Dagstuhl -- Leibniz-Zentrum f{\"u}r Informatik},
  address =	{Dagstuhl, Germany},
  URL =		{https://drops.dagstuhl.de/entities/document/10.4230/DagSemProc.06201.6},
  URN =		{urn:nbn:de:0030-drops-7900},
  doi =		{10.4230/DagSemProc.06201.6},
  annote =	{Keywords: Sequence prediction, probability forecasting, local absolute continuity}
}

Document

DOI: 10.4230/DagSemProc.06201.8

Some Results for Identification for Sources and its Extension to Liar Models

Authors: Zlatko Varbanov

Published in: Dagstuhl Seminar Proceedings, Volume 6201, Combinatorial and Algorithmic Foundations of Pattern and Association Discovery (2006)

Abstract

Let (${cal U}, P$) be a source, where ${cal U} = {1,2,dots,N}, P = {P_1, P_2, dots, P_N}$, and let ${cal C} = {c_1,c_2,dots,c_N}$ be a binary prefix code (PC) for this source with $||c_u||$ as length of $c_u$. Introduce the random variable $U$ with Prob($U=u$) = $p_u$ for $u = 1,2,dots,N$ and the random variable $C$ with $C = c_u = (c_1,c_2,dots,c_{u||c_u||})$ if $U=u$. We use the PC for noiseless identification, that is user $u$ wants to know whether the source output equals $u$, that is, whether $C$ equals $c_u$ or not. The user iteratively checks whether $C$ coincides with $c_u$ in the first, second, etc. letter and stops when the first different letter occurs or when $C = c_u$. What is the expected number $L_{cal C}(P,u)$ of checkings? In order to calculate this quantity we introduce for the binary tree $T_{cal C}$, whose leaves are the codewords $c_1,c_2,dots,c_N$, the sets of leaves ${cal C}_{ik} (1 leq i leq N; 1 leq k)$, where ${cal C}_{ik} = {c in {cal C}: c$ coincides with $c_i$ exactly until the $k$'th letter of $c_i}$. If $C$ takes a value in ${cal C}_{uk}, 0 leq k leq ||c_u||-1$, the answers are $k$ times "Yes" and 1 time "No". For $C = c_u$ the $$ L_{cal C}(P,u) = sum_{k=0}^{||c_u||-1}P(C in {cal C}_{uk})(k+1) + ||c_u||P_u. $$ For code ${cal C}$,~ $L_{cal C}(P) = max L_{cal C}(P,u)$, $1 geq u geq N$, is the expected number of checkings in the worst case and $L(P) = min L_{cal C}(P)$ is this number for the best code ${cal C}$. Let $P = P^N = {frac{1}{N}, dots, frac{1}{N}}$. We construct a prefix code ${cal C}$ in the following way. In each node (starting at the root) we split the number of remaining codewords in proportion as close as possible to $(frac{1}{2},frac{1}{2})$. It is known that $$ lim_{N ightarrow infty} L_{cal C}(P^N) = 2 $$ (Ahlswede, Balkenhol, Kleinewachter, 2003) We know that $L(P) leq 3$ for all $P$ (Ahlswede, Balkenhol, Kleinewachter, 2003). Also, the problem to estimate an universal constant $A = sup L(P)$ for general $P = (P_1,dots, P_N)$ was stated (Ahlswede, 2004). We compute this constant for uniform distribution and this code ${cal C}$. $$ sup_N L_{cal C}(P^N) = 2+frac{log_2(N-1)-2}{N} $$ Also, we consider the average number of checkings, if code ${cal C}$ is used: $ L_{cal C}(P,P) = sum P_u L_{cal C}(P,u)$, for ${u in {cal U}}$. We calculate the exact values of $L_{cal C}(P^N)$ and $L_{cal C}(P^N,P^N)$ for some $N$. Other problem is the extension of identification for sources to liar models. We obtain a upper bound for the expected number of checkings $L_{cal C}(P^N;e)$, where $e$ is the maximum number of lies. $$ L_{cal C}(P^N;e) leq M_{cal C}(P^N;e) = (e+1)L_{cal C}(P^N) + e; ~~ lim_{N ightarrow infty} M_{cal C}(P^N;e) = 3e+2 $$

Cite as

Zlatko Varbanov. Some Results for Identification for Sources and its Extension to Liar Models. In Combinatorial and Algorithmic Foundations of Pattern and Association Discovery. Dagstuhl Seminar Proceedings, Volume 6201, pp. 1-4, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2006)

Copy BibTex To Clipboard

@InProceedings{varbanov:DagSemProc.06201.8,
  author =	{Varbanov, Zlatko},
  title =	{{Some Results for Identification for Sources and its Extension to Liar Models}},
  booktitle =	{Combinatorial and Algorithmic Foundations of Pattern and Association Discovery},
  pages =	{1--4},
  series =	{Dagstuhl Seminar Proceedings (DagSemProc)},
  ISSN =	{1862-4405},
  year =	{2006},
  volume =	{6201},
  editor =	{Rudolf Ahlswede and Alberto Apostolico and Vladimir I. Levenshtein},
  publisher =	{Schloss Dagstuhl -- Leibniz-Zentrum f{\"u}r Informatik},
  address =	{Dagstuhl, Germany},
  URL =		{https://drops.dagstuhl.de/entities/document/10.4230/DagSemProc.06201.8},
  URN =		{urn:nbn:de:0030-drops-7820},
  doi =		{10.4230/DagSemProc.06201.8},
  annote =	{Keywords: Identification for sources, lies, prefix code}
}

Document

DOI: 10.4230/DagSemProc.06201.9

Suballowable sequences of permutations

Authors: Andrei Asinowski

Published in: Dagstuhl Seminar Proceedings, Volume 6201, Combinatorial and Algorithmic Foundations of Pattern and Association Discovery (2006)

Abstract

We discuss a notion of "allowable sequence of permutations" and show a few combinatorial results and geometric applications.

Cite as

Andrei Asinowski. Suballowable sequences of permutations. In Combinatorial and Algorithmic Foundations of Pattern and Association Discovery. Dagstuhl Seminar Proceedings, Volume 6201, pp. 1-2, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2006)

Copy BibTex To Clipboard

@InProceedings{asinowski:DagSemProc.06201.9,
  author =	{Asinowski, Andrei},
  title =	{{Suballowable sequences of permutations}},
  booktitle =	{Combinatorial and Algorithmic Foundations of Pattern and Association Discovery},
  pages =	{1--2},
  series =	{Dagstuhl Seminar Proceedings (DagSemProc)},
  ISSN =	{1862-4405},
  year =	{2006},
  volume =	{6201},
  editor =	{Rudolf Ahlswede and Alberto Apostolico and Vladimir I. Levenshtein},
  publisher =	{Schloss Dagstuhl -- Leibniz-Zentrum f{\"u}r Informatik},
  address =	{Dagstuhl, Germany},
  URL =		{https://drops.dagstuhl.de/entities/document/10.4230/DagSemProc.06201.9},
  URN =		{urn:nbn:de:0030-drops-7833},
  doi =		{10.4230/DagSemProc.06201.9},
  annote =	{Keywords: Aequences of permutations}
}

Document

DOI: 10.4230/DagSemProc.06201.10

Subwords in reverse-complement order

Authors: Péter L. Erdös, Péter Ligeti, Péter Sziklai, and David C. Torney

Published in: Dagstuhl Seminar Proceedings, Volume 6201, Combinatorial and Algorithmic Foundations of Pattern and Association Discovery (2006)

Abstract

We examine finite words over an alphabet $Gamma={a,bar{a};b,bar{b}}$ of pairs of letters, where each word $w_1w_2...w_t$ is identical with its {it reverse complement} $bar{w_t}...bar{w_2}bar{w_1}$ (where $bar{bbar{a}}=a,bar{bar{b}}=b$). We seek the smallest $k$ such that every word of length $n,$ composed from $Gamma$, is uniquely determined by the set of its subwords of length up to $k$. Our almost sharp result ($ksim 2n/3$) is an analogue of a classical result for ``normal'' words. This classical problem originally was posed by M.P. Sch"utzenberger and I. Simon, and gained a considerable interest for several researchers, foremost by Vladimir Levenshtein. Our problem has its roots in bioinformatics.

Cite as

Péter L. Erdös, Péter Ligeti, Péter Sziklai, and David C. Torney. Subwords in reverse-complement order. In Combinatorial and Algorithmic Foundations of Pattern and Association Discovery. Dagstuhl Seminar Proceedings, Volume 6201, pp. 1-8, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2006)

Copy BibTex To Clipboard

@InProceedings{erdos_et_al:DagSemProc.06201.10,
  author =	{Erd\"{o}s, P\'{e}ter L. and Ligeti, P\'{e}ter and Sziklai, P\'{e}ter and Torney, David C.},
  title =	{{Subwords in reverse-complement order}},
  booktitle =	{Combinatorial and Algorithmic Foundations of Pattern and Association Discovery},
  pages =	{1--8},
  series =	{Dagstuhl Seminar Proceedings (DagSemProc)},
  ISSN =	{1862-4405},
  year =	{2006},
  volume =	{6201},
  editor =	{Rudolf Ahlswede and Alberto Apostolico and Vladimir I. Levenshtein},
  publisher =	{Schloss Dagstuhl -- Leibniz-Zentrum f{\"u}r Informatik},
  address =	{Dagstuhl, Germany},
  URL =		{https://drops.dagstuhl.de/entities/document/10.4230/DagSemProc.06201.10},
  URN =		{urn:nbn:de:0030-drops-7856},
  doi =		{10.4230/DagSemProc.06201.10},
  annote =	{Keywords: Reverse complement order, Reconstruction of words, Microarray experiments}
}

Document

DOI: 10.4230/DagSemProc.06201.11

Vertex reconstruction in Cayley graphs

Authors: Elena Konstantinova

Published in: Dagstuhl Seminar Proceedings, Volume 6201, Combinatorial and Algorithmic Foundations of Pattern and Association Discovery (2006)

Abstract

In this report paper we collect recent results on the vertex reconstruction in Cayley graphs $Cay(G,S)$. The problem is stated as the problem of reconstructing a vertex from the minimum number of its $r$-neighbors that are vertices at distance at most $r$ from the unknown vertex. The combinatorial properties of Cayley graphs on the symmetric group $Sn$ and the signed permutation group $Bn$ with respect to this problem are presented. The sets of generators of $S$ are specified by applications in coding theory, computer science, molecular biology and physics.

Cite as

Elena Konstantinova. Vertex reconstruction in Cayley graphs. In Combinatorial and Algorithmic Foundations of Pattern and Association Discovery. Dagstuhl Seminar Proceedings, Volume 6201, pp. 1-20, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2006)

Copy BibTex To Clipboard

@InProceedings{konstantinova:DagSemProc.06201.11,
  author =	{Konstantinova, Elena},
  title =	{{Vertex reconstruction in Cayley graphs}},
  booktitle =	{Combinatorial and Algorithmic Foundations of Pattern and Association Discovery},
  pages =	{1--20},
  series =	{Dagstuhl Seminar Proceedings (DagSemProc)},
  ISSN =	{1862-4405},
  year =	{2006},
  volume =	{6201},
  editor =	{Rudolf Ahlswede and Alberto Apostolico and Vladimir I. Levenshtein},
  publisher =	{Schloss Dagstuhl -- Leibniz-Zentrum f{\"u}r Informatik},
  address =	{Dagstuhl, Germany},
  URL =		{https://drops.dagstuhl.de/entities/document/10.4230/DagSemProc.06201.11},
  URN =		{urn:nbn:de:0030-drops-7869},
  doi =		{10.4230/DagSemProc.06201.11},
  annote =	{Keywords: Reconstruction problems, Cayley graphs, the symmetric group, the signed permutation group, sorting by reversals, pancake problem}
}

12 Search Results for "Levenshtein, Vladimir I."

A*PA2: Up to 19× Faster Exact Global Alignment

Abstract

Cite as

Local Minimax Learning of Approximately Polynomial Functions

Abstract

Cite as

Solving Classical String Problems an Compressed Texts

Abstract

Cite as

06201 Abstracts Collection – Combinatorial and Algorithmic Foundations of Pattern and Association Discovery

Abstract

Cite as

06201 Executive Summary – Combinatorial and Algorithmic Foundations of Pattern and Association Discovery

Abstract

Cite as

Non--binary error correcting codes with noiseless feedback, localized errors, or both

Abstract

Cite as

On the Monotonicity of the String Correction Factor for Words with Mismatches

Abstract

Cite as

Sequence prediction for non-stationary processes

Abstract

Cite as

Some Results for Identification for Sources and its Extension to Liar Models

Abstract

Cite as

Suballowable sequences of permutations

Abstract

Cite as

Subwords in reverse-complement order

Abstract

Cite as

Vertex reconstruction in Cayley graphs

Abstract

Cite as

Thanks for your feedback!

Could not send message