The optimal sequence compression

Andreev, Alexander E.

doi:10.4230/DagSemProc.06111.19

Abstract

This paper presents the optimal compression for sequences with undefined values. Let we have $(N-m)$ undefined and $m$ defined positions in the boolean sequence $vv V$ of length $N$. The sequence code length can't be less then $m$ in general case, otherwise at least two sequences will have the same code. We present the coding algorithm which generates codes of almost $m$ length, i.e. almost equal to the lower bound. The paper presents the decoding circuit too. The circuit has low complexity which depends from the inverse density of defined values $D(vv V) = frac{N}{m}$. The decoding circuit includes RAM and random logic. It performs sequential decoding. The total RAM size is proportional to the $$logleft(D(vv V) ight) ,$$ the number of random logic cells is proportional to $$log logleft(D(vv V) ight) * left(log log logleft(D(vv V) ight) ight)^2 .$$ So the decoding circuit will be small enough even for the very low density sequences. The decoder complexity doesn't depend of the sequence length at all.

The optimal sequence compression

Author Alexander E. Andreev

File

Document Identifiers

Author Details

Cite AsGet BibTex

Abstract

Keywords

Metrics

Thanks for your feedback!

Could not send message