Document exchange and error correcting codes are two fundamental problems in communication. In the first problem, Alice and Bob each hold a string, and the goal is for Alice to send a short sketch to Bob so that Bob can recover Alice’s string. In the second problem, Alice sends a message with some redundant information to Bob through a channel that can add adversarial errors, and the goal is for Bob to correctly recover the message despite the errors. In both problems, an upper bound is placed on the number of errors (between the two strings in the first problem, or added by the channel in the second), and a major goal is to minimize the size of the sketch or of the redundant information. In this paper we focus on deterministic document exchange protocols and binary error correcting codes.

Both problems have been studied extensively. In the case of Hamming errors (i.e., bit substitutions) and bit erasures, we have explicit constructions with asymptotically optimal parameters. However, other error types are still rather poorly understood. In a recent work [Kuan Cheng et al., 2018], the authors constructed explicit deterministic document exchange protocols and binary error correcting codes for edit errors with almost optimal parameters. Unfortunately, the constructions in [Kuan Cheng et al., 2018] do not work for other common error types such as block transpositions.

In this paper, we generalize the constructions in [Kuan Cheng et al., 2018] to handle a much larger class of errors. These include bursts of insertions and deletions, as well as block transpositions. Specifically, let n be the length of Alice’s string or message. We consider document exchange and error correcting codes where the total number of block insertions, block deletions, and block transpositions is at most k <= alpha n/log n for some constant 0<alpha<1. In addition, the total number of bits inserted and deleted by the first two kinds of operations is at most t <= beta n for some constant 0<beta<1. We construct explicit, deterministic document exchange protocols with sketch size O((k log n + t) log^2 (n/(k log n + t))) and explicit binary error correcting codes with O(k log n log log log n + t) redundant bits. As a comparison, the information-theoretic optimum for both problems is Theta(k log n + t). As far as we know, there were previously no known explicit deterministic document exchange protocols in this setting, and the best known binary code needs Omega(n) redundant bits even to correct just one block transposition [L. J. Schulman and D. Zuckerman, 1999].
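
To make the error model concrete, the following is a minimal Python sketch of the three block operations acting on a bit string. It is an illustration only, not part of the constructions in this paper. The function names (block_insert, block_delete, block_transpose), the position conventions, and the sample string are assumptions made for this example. In particular, a block transposition is modelled here as cutting out a contiguous block and reinserting it elsewhere.

```python
def block_insert(s: str, pos: int, block: str) -> str:
    """Insert a contiguous block of bits at index pos (hypothetical helper)."""
    return s[:pos] + block + s[pos:]


def block_delete(s: str, pos: int, length: int) -> str:
    """Delete the contiguous block s[pos:pos+length] (hypothetical helper)."""
    return s[:pos] + s[pos + length:]


def block_transpose(s: str, pos: int, length: int, dest: int) -> str:
    """Cut the block s[pos:pos+length] and reinsert it at index dest
    of the remaining string (assumed convention for this sketch)."""
    block = s[pos:pos + length]
    rest = s[:pos] + s[pos + length:]
    return rest[:dest] + block + rest[dest:]


# Alice's string x (n = 16) and a corrupted version y obtained by
# one block deletion, one block insertion, and one block transposition.
x = "0000111100110101"
y = block_delete(x, 0, 3)        # deletes 3 bits
y = block_insert(y, 5, "11")     # inserts 2 bits
y = block_transpose(y, 0, 4, 6)  # moves a 4-bit block; no bits added or removed

k = 3        # total number of block operations
t = 3 + 2    # bits inserted plus bits deleted; the transposition does not count
```

In this example only the deletion and the insertion contribute to t, while all three operations contribute to k, matching the way the two budgets are stated above.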