Abstract
Encoding data structures store enough information to answer the queries they are meant to support but not enough to recover their underlying datasets. In this paper we give the first encoding data structure for the challenging problem of orderpreserving pattern matching. This problem was introduced only a few years ago but has already attracted significant attention because of its applications in data analysis. Two strings are said to be an orderpreserving match if the relative order of their characters is the same: e.g., (4, 1, 3, 2) and (10, 3, 7, 5) are an orderpreserving match. We show how, given a string S[1..n] over an arbitrary alphabet of size sigma and a constant c >=1, we can build an O(n log log n)bit encoding such that later, given a pattern P[1..m] with m >= log^c n, we can return the number of orderpreserving occurrences of P in S in O(m) time. Within the same time bound we can also return the starting position of some orderpreserving match for P in S (if such a match exists). We prove that our space bound is within a constant factor of optimal if log(sigma) = Omega(log log n); our query time is optimal if log(sigma) = Omega(log n). Our space bound contrasts with the Omega(n log n) bits needed in the worst case to store S itself, an index for orderpreserving pattern matching with no restrictions on the pattern length, or an index for standard pattern matching even with restrictions on the pattern length. Moreover, we can build our encoding knowing only how each character compares to O(log^c n) neighbouring characters.
BibTeX  Entry
@InProceedings{gagie_et_al:LIPIcs:2017:7872,
author = {Travis Gagie and Giovanni Manzini and Rossano Venturini},
title = {{An Encoding for OrderPreserving Matching}},
booktitle = {25th Annual European Symposium on Algorithms (ESA 2017)},
pages = {38:138:15},
series = {Leibniz International Proceedings in Informatics (LIPIcs)},
ISBN = {9783959770491},
ISSN = {18688969},
year = {2017},
volume = {87},
editor = {Kirk Pruhs and Christian Sohler},
publisher = {Schloss DagstuhlLeibnizZentrum fuer Informatik},
address = {Dagstuhl, Germany},
URL = {http://drops.dagstuhl.de/opus/volltexte/2017/7872},
URN = {urn:nbn:de:0030drops78726},
doi = {10.4230/LIPIcs.ESA.2017.38},
annote = {Keywords: Compact data structures, encodings, orderpreserving matching}
}
Keywords: 

Compact data structures, encodings, orderpreserving matching 
Collection: 

25th Annual European Symposium on Algorithms (ESA 2017) 
Issue Date: 

2017 
Date of publication: 

01.09.2017 