,
Sanjana Dey, Elazar Goldenberg, Mursalin Habib, Bernhard Haeupler, Karthik C. S., Michal Koucký
Creative Commons Attribution 4.0 International license
A function φ: {0,1}^n → {0,1}^N is called an isometric embedding of the n-dimensional Hamming metric space to the N-dimensional edit metric space if, for all x, y ∈ {0,1}^n, the Hamming distance between x and y is equal to the edit distance between φ(x) and φ(y). The rate of such an embedding is defined as the ratio n/N.
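The definition above can be checked by brute force for very small n. The following is a minimal sketch, not a construction from the paper: it assumes that edit distance means Levenshtein distance (insertions, deletions, and substitutions), and the names `hamming`, `edit_distance`, `is_isometric`, and `identity` are hypothetical helpers introduced only for illustration.

```python
from itertools import product


def hamming(x: str, y: str) -> int:
    """Number of positions at which two equal-length strings differ."""
    assert len(x) == len(y)
    return sum(a != b for a, b in zip(x, y))


def edit_distance(s: str, t: str) -> int:
    """Levenshtein distance (assumed here: insertions, deletions, substitutions)."""
    prev = list(range(len(t) + 1))  # distances from the empty prefix of s
    for i, a in enumerate(s, 1):
        cur = [i]
        for j, b in enumerate(t, 1):
            cur.append(min(prev[j] + 1,              # delete a
                           cur[j - 1] + 1,           # insert b
                           prev[j - 1] + (a != b)))  # substitute or match
        prev = cur
    return prev[-1]


def is_isometric(phi, n: int) -> bool:
    """Exhaustively test the isometry condition over all pairs in {0,1}^n."""
    points = ["".join(bits) for bits in product("01", repeat=n)]
    images = {x: phi(x) for x in points}
    # All images must live in the same space {0,1}^N.
    if len({len(v) for v in images.values()}) != 1:
        return False
    return all(hamming(x, y) == edit_distance(images[x], images[y])
               for x in points for y in points)


def identity(x: str) -> str:
    """The trivial rate-1 map, used only as a test input."""
    return x
```

Under these assumptions the identity map passes the check for n = 2 but fails for n = 3: the strings 010 and 101 are at Hamming distance 3, yet one deletion and one insertion turn one into the other. The check enumerates all 4^n pairs, so it is only practical for tiny n.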
It is well known in the literature how to construct isometric embeddings with a rate of Ω(1/log n). However, achieving even near-isometric embeddings with a positive constant rate has remained elusive until now.
In this paper, we present an isometric embedding with a rate of 1/8 by discovering connections to synchronization strings, which were studied in the context of insertion-deletion codes (Haeupler-Shahrasbi [JACM'21]). At a technical level, we introduce a framework for obtaining high-rate isometric embeddings using a novel object called a misaligner. We speculate that, with sufficient computational resources, our framework could potentially yield isometric embeddings with a rate of 1/5.
As an immediate consequence of our constant-rate isometric embedding, we improve the known conditional lower bounds for the closest pair problem and the discrete 1-center problem in the edit metric, as well as the NP-hardness of approximation results for clustering problems and the Steiner tree problem in the edit metric, now with optimal dependency on the dimension. Furthermore, we obtain optimal lower bounds for the gap edit distance problem in the two-player randomized communication complexity model.
We complement our results by showing that no isometric embedding φ:{0,1}^n → {0,1}^N can have rate greater than 15/32 for all positive integers n. En route to proving this upper bound, we uncover fundamental structural properties necessary for every Hamming-to-edit isometric embedding. We also prove similar upper and lower bounds for embeddings over larger alphabets.
Finally, we consider embeddings φ: Σ_in^n → Σ_out^N between different input and output alphabets, where the rate is given by (n log|Σ_in|)/(N log|Σ_out|). In this setting, we show that the rate can be made arbitrarily close to 1.
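As a small arithmetic illustration of the rate formula above, the sketch below evaluates (n log|Σ_in|)/(N log|Σ_out|) directly. The function name `rate` and its parameter names are hypothetical, chosen only for this example; the alphabet sizes default to 2 so that the binary case reduces to n/N.

```python
import math


def rate(n: int, N: int, in_alphabet_size: int = 2, out_alphabet_size: int = 2) -> float:
    """Rate of an embedding from Sigma_in^n to Sigma_out^N.

    The base of the logarithm cancels in the ratio, so natural log is used.
    """
    return (n * math.log(in_alphabet_size)) / (N * math.log(out_alphabet_size))
```

For example, `rate(4, 32)` evaluates to 1/8 in the binary setting, and `rate(10, 10, 2, 4)` evaluates to 1/2, since each output symbol over a 4-letter alphabet carries two bits.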
@InProceedings{bhattacharya_et_al:LIPIcs.ICALP.2026.32,
author = {Bhattacharya, Sudatta and Dey, Sanjana and Goldenberg, Elazar and Habib, Mursalin and Haeupler, Bernhard and Karthik C. S. and Kouck\'{y}, Michal},
title = {{Constant Rate Isometric Embeddings of Hamming Metric into Edit Metric}},
booktitle = {53rd International Colloquium on Automata, Languages, and Programming (ICALP 2026)},
pages = {32:1--32:19},
series = {Leibniz International Proceedings in Informatics (LIPIcs)},
ISBN = {978-3-95977-428-4},
ISSN = {1868-8969},
year = {2026},
volume = {374},
editor = {Bhattacharya, Sayan and Nanongkai, Danupon and Benedikt, Michael and Puppis, Gabriele},
publisher = {Schloss Dagstuhl -- Leibniz-Zentrum f{\"u}r Informatik},
address = {Dagstuhl, Germany},
URL = {https://drops.dagstuhl.de/entities/document/10.4230/LIPIcs.ICALP.2026.32},
URN = {urn:nbn:de:0030-drops-264215},
doi = {10.4230/LIPIcs.ICALP.2026.32},
annote = {Keywords: Edit distance, Hamming distance, metric embeddings, synchronization strings, fine-grained complexity}
}