The Normalized Edit Distance with Uniform Operation Costs Is a Metric

Authors Dana Fisman, Joshua Grogin, Oded Margalit, Gera Weiss



PDF
Thumbnail PDF

File

LIPIcs.CPM.2022.17.pdf
  • Filesize: 0.81 MB
  • 17 pages

Document Identifiers

Author Details

Dana Fisman
  • Dept. of Computer Science, Ben-Gurion University, Beer-Sheva, Israel
Joshua Grogin
  • Dept. of Computer Science, Ben-Gurion University, Beer-Sheva, Israel
Oded Margalit
  • Dept. of Computer Science, Ben-Gurion University, Beer-Sheva, Israel
Gera Weiss
  • Dept. of Computer Science, Ben-Gurion University, Beer-Sheva, Israel

Cite AsGet BibTex

Dana Fisman, Joshua Grogin, Oded Margalit, and Gera Weiss. The Normalized Edit Distance with Uniform Operation Costs Is a Metric. In 33rd Annual Symposium on Combinatorial Pattern Matching (CPM 2022). Leibniz International Proceedings in Informatics (LIPIcs), Volume 223, pp. 17:1-17:17, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2022)
https://doi.org/10.4230/LIPIcs.CPM.2022.17

Abstract

We prove that the normalized edit distance proposed in [Marzal and Vidal 1993] is a metric when the cost of all the edit operations are the same. This closes a long standing gap in the literature where several authors noted that this distance does not satisfy the triangle inequality in the general case, and that it was not known whether it is satisfied in the uniform case - where all the edit costs are equal. We compare this metric to two normalized metrics proposed as alternatives in the literature, when people thought that Marzal’s and Vidal’s distance is not a metric, and identify key properties that explain why the original distance, now known to also be a metric, is better for some applications. Our examination is from a point of view of formal verification, but the properties and their significance are stated in an application agnostic way.

Subject Classification

ACM Subject Classification
  • Theory of computation → Pattern matching
Keywords
  • edit distance
  • normalized distance
  • triangle inequality
  • metric

Metrics

  • Access Statistics
  • Total Accesses (updated on a weekly basis)
    0
    PDF Downloads

References

  1. Abdullah N Arslan and Omer Egecioglu. Efficient algorithms for normalized edit distance. Journal of Discrete Algorithms, 1(1):3-20, 2000. Google Scholar
  2. Colin de la Higuera and Luisa Micó. A contextual normalised edit distance. In Proceedings of the 24th International Conference on Data Engineering Workshops, ICDE 2008, April 7-12, 2008, Cancún, Mexico, pages 354-361. IEEE Computer Society, 2008. Google Scholar
  3. Emmanuel Filiot, Nicolas Mazzocchi, Jean-François Raskin, Sriram Sankaranarayanan, and Ashutosh Trivedi. Weighted transducers for robustness verification. In 31st International Conference on Concurrency Theory, CONCUR 2020, September 1-4, 2020, Vienna, Austria (Virtual Conference), pages 17:1-17:21, 2020. Google Scholar
  4. Dana Fisman, Joshua Grogin, Oded Margalit, and Gera Weiss. The normalized edit distance with uniform operation costs is a metric. CoRR, abs/2201.06115, 2022. URL: http://arxiv.org/abs/2201.06115.
  5. Vladimir Iosifovich Levenshtein. Binary codes capable of correcting deletions, insertions and reversals. Soviet Physics Doklady, 10(8):707-710, February 1966. Doklady Akademii Nauk SSSR, V163 No4 845-848 1965. Google Scholar
  6. Yujian Li and Bi Liu. A normalized levenshtein distance metric. IEEE Trans. Pattern Anal. Mach. Intell., 29(6):1091-1095, 2007. Google Scholar
  7. Christopher C. Little. URL: https://abydos.readthedocs.io/en/latest/abydos.distance.html#abydos.distance.HigueraMico.
  8. Andrés Marzal and Enrique Vidal. Computation of normalized edit distance and applications. IEEE Trans. Pattern Anal. Mach. Intell., 15(9):926-932, 1993. Google Scholar
  9. Enrique Vidal, Andrés Marzal, and Pablo Aibar. Fast computation of normalized edit distances. IEEE Trans. Pattern Anal. Mach. Intell., 17(9):899-902, 1995. URL: https://doi.org/10.1109/34.406656.
Questions / Remarks / Feedback
X

Feedback for Dagstuhl Publishing


Thanks for your feedback!

Feedback submitted

Could not send message

Please try again later or send an E-mail