When Is the Normalized Edit Distance over Non-Uniform Weights a Metric?

Fisman, Dana; Tzarfati, Ilay

doi:10.4230/LIPIcs.CPM.2024.14

Author Details

Dana Fisman

Department of Computer Science, Ben-Gurion University, Beer-Sheva, Israel

Ilay Tzarfati

Department of Computer Science, Ben-Gurion University, Beer-Sheva, Israel

Cite As

Dana Fisman and Ilay Tzarfati. When Is the Normalized Edit Distance over Non-Uniform Weights a Metric? In 35th Annual Symposium on Combinatorial Pattern Matching (CPM 2024). Leibniz International Proceedings in Informatics (LIPIcs), Volume 296, pp. 14:1-14:17, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2024)
https://doi.org/10.4230/LIPIcs.CPM.2024.14

Abstract

The Normalized Edit Distance (ned) [Marzal and Vidal 1993] is known to violate the triangle inequality on contrived weight functions, although in practice it often exhibits triangular behavior. Let d be a weight function on basic edit operations, and let ned_{d} be the resulting normalized edit distance. The question of which criteria d should satisfy for ned_{d} to be a metric is long-standing. It was recently shown that when d is the uniform weight function (all operations cost 1 except for the no-op, which costs 0), ned_{d} is a metric. The question regarding non-uniform weights remained open. In this paper we answer this question by providing a necessary and sufficient condition on d under which ned_{d} is a metric.
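
To make the object of study concrete, the following is a minimal sketch of ned_{d}, assuming the Marzal and Vidal definition: the minimum, over all edit paths from x to y, of the path's total weight divided by its length (no-ops count toward the length). The names `ned` and `uniform_weight`, and the convention of passing `None` for the empty symbol, are illustrative assumptions and are not taken from the paper. The dynamic program below is the straightforward cubic one, not an optimized algorithm.

```python
from math import inf


def uniform_weight(a, b):
    """Uniform weights: every operation costs 1 except a no-op (a == b), which costs 0.

    Assumed convention: None stands for the empty symbol, so (a, None) is a
    deletion and (None, b) is an insertion.
    """
    return 0.0 if a == b else 1.0


def ned(x, y, d=uniform_weight):
    """Normalized edit distance: min over edit paths of (total weight / path length)."""
    m, n = len(x), len(y)
    if m == 0 and n == 0:
        return 0.0  # only the empty path exists; its cost is taken to be 0

    # prev[i][j]: minimum weight of an edit path with exactly k-1 operations
    # that turns x[:i] into y[:j] (inf if no such path exists).
    prev = [[inf] * (n + 1) for _ in range(m + 1)]
    prev[0][0] = 0.0
    best = inf

    # A path uses at least max(m, n) and at most m + n operations.
    for k in range(1, m + n + 1):
        cur = [[inf] * (n + 1) for _ in range(m + 1)]
        for i in range(m + 1):
            for j in range(n + 1):
                c = inf
                if i > 0:  # delete x[i-1]
                    c = min(c, prev[i - 1][j] + d(x[i - 1], None))
                if j > 0:  # insert y[j-1]
                    c = min(c, prev[i][j - 1] + d(None, y[j - 1]))
                if i > 0 and j > 0:  # substitute or no-op
                    c = min(c, prev[i - 1][j - 1] + d(x[i - 1], y[j - 1]))
                cur[i][j] = c
        if cur[m][n] < inf:
            best = min(best, cur[m][n] / k)
        prev = cur
    return best


if __name__ == "__main__":
    print(ned("ab", "a"))  # no-op on 'a', delete 'b': weight 1 over length 2
```

With uniform weights, `ned("ab", "a")` evaluates to 0.5. Passing a different function as `d` gives a non-uniform variant, which is the setting in which the paper's condition on d decides whether the triangle inequality holds.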

Subject Classification

ACM Subject Classification

Theory of computation → Pattern matching
Theory of computation → Formal languages and automata theory

Keywords

Normalized Edit Distance
Non-uniform Weights
Triangle Inequality
Metric

References

C. Baier and J-P. Katoen. Principles of Model Checking. MIT Press, 2008.
Edmund M. Clarke, Orna Grumberg, Daniel Kroening, Doron A. Peled, and Helmut Veith. Model checking, 2nd Edition. MIT Press, 2018. URL: https://mitpress.mit.edu/books/model-checking-second-edition.
Edmund M. Clarke, Thomas A. Henzinger, Helmut Veith, and Roderick Bloem, editors. Handbook of Model Checking. Springer, 2018. URL: https://doi.org/10.1007/978-3-319-10575-8.
Loris D'Antoni and Margus Veanes. The power of symbolic automata and transducers. In Computer Aided Verification - 29th International Conference, CAV 2017, Heidelberg, Germany, July 24-28, 2017, Proceedings, Part I, pages 47-67, 2017.
Colin de la Higuera and Luisa Micó. A contextual normalised edit distance. In Proceedings of the 24th International Conference on Data Engineering Workshops, ICDE 2008, April 7-12, 2008, Cancún, Mexico, pages 354-361. IEEE Computer Society, 2008.
Emmanuel Filiot, Nicolas Mazzocchi, Jean-François Raskin, Sriram Sankaranarayanan, and Ashutosh Trivedi. Weighted transducers for robustness verification. In 31st International Conference on Concurrency Theory, CONCUR 2020, September 1-4, 2020, Vienna, Austria (Virtual Conference), pages 17:1-17:21, 2020.
Dana Fisman, Joshua Grogin, Oded Margalit, and Gera Weiss. The normalized edit distance with uniform operation costs is a metric. In Hideo Bannai and Jan Holub, editors, 33rd Annual Symposium on Combinatorial Pattern Matching, CPM 2022, June 27-29, 2022, Prague, Czech Republic, volume 223 of LIPIcs, pages 17:1-17:17, 2022.
Dana Fisman, Joshua Grogin, and Gera Weiss. A normalized edit distance on infinite words. In 31st EACSL Annual Conference on Computer Science Logic, CSL 2023, February 13-16, 2023, Warsaw, Poland, pages 20:1-20:20, 2023.
R. W. Hamming. Error detecting and error correcting codes. The Bell System Technical Journal, 29(2):147-160, April 1950. URL: https://doi.org/10.1002/j.1538-7305.1950.tb00463.x.
Karen Kukich. Techniques for automatically correcting words in text. ACM Comput. Surv., 24(4):377-439, December 1992. URL: https://doi.org/10.1145/146370.146380.
Vladimir Iosifovich Levenshtein. Binary codes capable of correcting deletions, insertions and reversals. Soviet Physics Doklady, 10(8):707-710, February 1966. Doklady Akademii Nauk SSSR, V163 No4 845-848 1965.
Yujian Li and Bi Liu. A normalized Levenshtein distance metric. IEEE Trans. Pattern Anal. Mach. Intell., 29(6):1091-1095, 2007.
Andrés Marzal and Enrique Vidal. Computation of normalized edit distance and applications. IEEE Trans. Pattern Anal. Mach. Intell., 15(9):926-932, 1993.
Gonzalo Navarro. A guided tour to approximate string matching. ACM Comput. Surv., 33(1):31-88, March 2001. URL: https://doi.org/10.1145/375360.375365.
J. R. Büchi. On a decision method in restricted second order arithmetic. In Int. Congress on Logic, Method, and Philosophy of Science, pages 1-12. Stanford University Press, 1962.
Sanda Zilles. A distance on ℕ. Private communication, 2023.
David Sankoff and Joseph B. Kruskal. Time Warps, String Edits, and Macromolecules: The Theory and Practice of Sequence Comparison. Addison-Wesley, 1983.
Robert A. Wagner and Michael J. Fischer. The string-to-string correction problem. J. ACM, 21(1):168-173, January 1974. URL: https://doi.org/10.1145/321796.321811.
Achim Weigel and Frank Fein. Normalizing the weighted edit distance. In 12th IAPR International Conference on Pattern Recognition, Conference B: Pattern Recognition and Neural Networks, ICPR 1994, Jerusalem, Israel, 9-13 October, 1994, Volume 2, pages 399-402, 1994.
