Day, Joel D. ;
Fleischmann, Pamela ;
Kosche, Maria ;
Koß, Tore ;
Manea, Florin ;
Siemer, Stefan
The Edit Distance to kSubsequence Universality
Abstract
A word u is a subsequence of another word w if u can be obtained from w by deleting some of its letters. In the early 1970s, Imre Simon defined the relation ∼_k (called now SimonCongruence) as follows: two words having exactly the same set of subsequences of length at most k are ∼_kcongruent. This relation was central in defining and analysing piecewise testable languages, but has found many applications in areas such as algorithmic learning theory, databases theory, or computational linguistics. Recently, it was shown that testing whether two words are ∼_kcongruent can be done in optimal linear time. Thus, it is a natural next step to ask, for two words w and u which are not ∼_kequivalent, what is the minimal number of edit operations that we need to perform on w in order to obtain a word which is ∼_kequivalent to u.
In this paper, we consider this problem in a setting which seems interesting: when u is a ksubsequence universal word. A word u with alph(u) = Σ is called ksubsequence universal if the set of subsequences of length k of u contains all possible words of length k over Σ. As such, our results are a series of efficient algorithms computing the edit distance from w to the language of ksubsequence universal words.
BibTeX  Entry
@InProceedings{day_et_al:LIPIcs.STACS.2021.25,
author = {Day, Joel D. and Fleischmann, Pamela and Kosche, Maria and Ko{\ss}, Tore and Manea, Florin and Siemer, Stefan},
title = {{The Edit Distance to kSubsequence Universality}},
booktitle = {38th International Symposium on Theoretical Aspects of Computer Science (STACS 2021)},
pages = {25:125:19},
series = {Leibniz International Proceedings in Informatics (LIPIcs)},
ISBN = {9783959771801},
ISSN = {18688969},
year = {2021},
volume = {187},
editor = {Bl\"{a}ser, Markus and Monmege, Benjamin},
publisher = {Schloss Dagstuhl  LeibnizZentrum f{\"u}r Informatik},
address = {Dagstuhl, Germany},
URL = {https://drops.dagstuhl.de/opus/volltexte/2021/13670},
URN = {urn:nbn:de:0030drops136705},
doi = {10.4230/LIPIcs.STACS.2021.25},
annote = {Keywords: Subsequence, Scattered factor, Subword, Universality, ksubsequence universality, Edit distance, Efficient algorithms}
}
10.03.2021
Keywords: 

Subsequence, Scattered factor, Subword, Universality, ksubsequence universality, Edit distance, Efficient algorithms 
Seminar: 

38th International Symposium on Theoretical Aspects of Computer Science (STACS 2021)

Issue date: 

2021 
Date of publication: 

10.03.2021 