License:
Creative Commons Attribution 4.0 International license (CC BY 4.0)
When quoting this document, please refer to the following
DOI: 10.4230/DagSemProc.06201.5
URN: urn:nbn:de:0030-drops-7899
URL: https://drops.dagstuhl.de/opus/volltexte/2006/789/
Apostolico, Alberto; Pizzi, Cinzia
On the Monotonicity of the String Correction Factor for Words with Mismatches
Abstract
The string correction factor is the term by
which the probability of a word $w$ needs to be multiplied in order
to account for character changes or ``errors'' occurring in at most
$k$ arbitrary positions in that word. The behavior of this factor,
as a function of $k$ and of the word length, has implications for the
number of candidates that need to be considered and weighted when
looking for subwords of a sequence that present unusually recurrent
replicas within some bounded number of mismatches. Specifically, it
is seen that over intervals of mono- or bi-tonicity for the
correction factor, only some of the candidates need be considered.
This mitigates the computation and leads to tables of
over-represented words that are more compact to represent and
inspect. In recent work, expectation and score monotonicity have been
established for a number of cases of interest, under {\em i.i.d.}
probabilistic assumptions. The present paper reviews the cases of
bi-tonic behavior for the correction factor, concentrating on the
instance in which the question is still open.
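
As an illustration of the quantity described in the abstract, the following is a minimal sketch, not taken from the paper. It assumes i.i.d. symbols with known probabilities and takes the correction factor to be the ratio between the probability of observing the word with at most $k$ mismatches and the probability of the word itself. Under that assumption the ratio is the sum of the elementary symmetric polynomials of degree 0 to $k$ in the per-position odds $(1 - p_i)/p_i$. The function name `correction_factor` and its parameters are hypothetical.

```python
def correction_factor(word, probs, k):
    """Ratio P(word with at most k mismatches) / P(word), assuming i.i.d. symbols.

    word  -- the string w
    probs -- dict mapping each symbol to its probability (assumed to sum to 1)
    k     -- maximum number of mismatching positions allowed
    """
    # Odds of a mismatch at each position: (1 - p) / p for the symbol of w there.
    ratios = [(1.0 - probs[c]) / probs[c] for c in word]

    # elem[j] accumulates the j-th elementary symmetric polynomial of the ratios,
    # i.e. the total weight of all ways to choose exactly j mismatching positions.
    elem = [1.0] + [0.0] * k
    for r in ratios:
        # Iterate downwards so each position is used at most once per term.
        for j in range(k, 0, -1):
            elem[j] += r * elem[j - 1]

    # Summing over 0..k mismatches gives the "at most k" factor.
    return sum(elem)


uniform = {"A": 0.25, "C": 0.25, "G": 0.25, "T": 0.25}
print(correction_factor("ACGT", uniform, 1))  # 13.0
```

With a uniform four-letter alphabet each position has odds 3, so for a word of length 4 and $k = 1$ the factor is $1 + 4 \cdot 3 = 13$. The dynamic program runs in time proportional to the word length times $k$, which avoids enumerating subsets of positions.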
BibTeX - Entry
@InProceedings{apostolico_et_al:DagSemProc.06201.5,
author = {Apostolico, Alberto and Pizzi, Cinzia},
title = {{On the Monotonicity of the String Correction Factor for Words with Mismatches}},
booktitle = {Combinatorial and Algorithmic Foundations of Pattern and Association Discovery},
pages = {1--9},
series = {Dagstuhl Seminar Proceedings (DagSemProc)},
ISSN = {1862-4405},
year = {2006},
volume = {6201},
editor = {Rudolf Ahlswede and Alberto Apostolico and Vladimir I. Levenshtein},
publisher = {Schloss Dagstuhl -- Leibniz-Zentrum f{\"u}r Informatik},
address = {Dagstuhl, Germany},
URL = {https://drops.dagstuhl.de/opus/volltexte/2006/789},
URN = {urn:nbn:de:0030-drops-7899},
doi = {10.4230/DagSemProc.06201.5},
annote = {Keywords: Pattern discovery, Motif, Over-represented word, Monotone score, Correction Factor}
}
Keywords: Pattern discovery, Motif, Over-represented word, Monotone score, Correction Factor
Collection: 06201 - Combinatorial and Algorithmic Foundations of Pattern and Association Discovery
Issue Date: 2006
Date of publication: 07.11.2006