When quoting this document, please refer to the following:
URN: urn:nbn:de:0030-drops-7899
URL: http://drops.dagstuhl.de/opus/volltexte/2006/789/

Apostolico, Alberto ; Pizzi, Cinzia

On the Monotonicity of the String Correction Factor for Words with Mismatches


Abstract

The string correction factor is the term by which the probability of a word $w$ needs to be multiplied in order to account for character changes or "errors" occurring in at most $k$ arbitrary positions in that word. The behavior of this factor, as a function of $k$ and of the word length, has implications on the number of candidates that need to be considered and weighted when looking for subwords of a sequence that present unusually recurrent replicas within some bounded number of mismatches. Specifically, it is seen that over intervals of mono- or bi-tonicity for the correction factor, only some of the candidates need be considered. This mitigates the computation and leads to tables of over-represented words that are more compact to represent and inspect. In recent work, expectation and score monotonicity has been established for a number of cases of interest, under i.i.d. probabilistic assumptions. The present paper reviews the cases of bi-tonic behavior for the correction factor, concentrating on the instance in which the question is still open.
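
As an illustration of the definition in the abstract, the following is a minimal sketch of one plausible reading of the correction factor under the i.i.d. assumption. It is not taken from the paper. It assumes that a word with mismatches at a set S of positions contributes the probability of $w$ multiplied by the product of (1 - p)/p over the positions in S, where p is the probability of the original character at that position, so the factor is the sum of these products over all position sets of size at most $k$. The function name `correction_factor` and the `probs` table are hypothetical.

```python
def correction_factor(word, k, probs):
    """Factor multiplying P(word) to allow mismatches in at most k positions.

    Assumes i.i.d. characters: probs[c] is the probability of character c,
    and a mismatch at a position has probability 1 - probs[c].
    """
    # Per-position ratio: probability of a mismatch over probability of a match.
    ratios = [(1.0 - probs[c]) / probs[c] for c in word]

    # e[j] accumulates the elementary symmetric polynomial of degree j
    # in the ratios, i.e. the total contribution of exactly j mismatches.
    e = [1.0] + [0.0] * k
    for r in ratios:
        # Go downward so each position is used at most once per term.
        for j in range(k, 0, -1):
            e[j] += r * e[j - 1]

    # At most k mismatches: sum the contributions for 0, 1, ..., k.
    return sum(e)


# Uniform DNA alphabet, used here only as an assumed example distribution.
uniform_dna = {"A": 0.25, "C": 0.25, "G": 0.25, "T": 0.25}
factors = [correction_factor("ACGT", k, uniform_dna) for k in range(5)]
```

Under these assumptions the factor is non-decreasing in $k$ for a fixed word, and for $k$ equal to the word length it multiplies the probability of $w$ up to 1, since every word of that length is then an admissible replica. How the factor behaves as the word length grows is the monotonicity question the paper addresses.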

BibTeX - Entry

@InProceedings{apostolico_et_al:DSP:2006:789,
  author =	{Alberto Apostolico and Cinzia Pizzi},
  title =	{On the Monotonicity of the String Correction Factor for Words with Mismatches},
  booktitle =	{Combinatorial and Algorithmic Foundations of Pattern and Association Discovery},
  year =	{2006},
  editor =	{Rudolf Ahlswede and Alberto Apostolico and Vladimir I. Levenshtein},
  number =	{06201},
  series =	{Dagstuhl Seminar Proceedings},
  ISSN =	{1862-4405},
  publisher =	{Internationales Begegnungs- und Forschungszentrum f{\"u}r Informatik (IBFI), Schloss Dagstuhl, Germany},
  address =	{Dagstuhl, Germany},
  URL =		{http://drops.dagstuhl.de/opus/volltexte/2006/789},
  annote =	{Keywords: Pattern discovery, Motif, Over-represented word, Monotone score, Correction Factor}
}

Keywords: Pattern discovery, Motif, Over-represented word, Monotone score, Correction Factor
Seminar: 06201 - Combinatorial and Algorithmic Foundations of Pattern and Association Discovery
Issue date: 2006
Date of publication: 07.11.2006

