Abstract
Wheeler nondeterministic finite automata (WNFAs) were introduced in (Gagie et al., TCS 2017) as a powerful generalization of prefix sorting from strings to labeled graphs. WNFAs admit optimal solutions to classic hard problems on labeled graphs and languages such as compression and regular expression matching. The problem of deciding whether a given NFA is Wheeler is known to be NPcomplete (Gibney and Thankachan, ESA 2019). Recently, however, Alanko et al. (Information and Computation 2021) showed how to sidestep this complexity by switching to preorders: letting Q be the set of states and δ the set of transitions, they provided a O(δ⋅Q²)time algorithm computing a totallyordered partition (i.e. equivalence relation) of the WNFA’s states such that (1) equivalent states recognize the same regular language, and (2) the order of (the classes of) nonequivalent states is consistent with any Wheeler order, when one exists. As a result, the output is a preorder of the states as useful for pattern matching as standard Wheeler orders.
Further extensions of this line of work (Cotumaccio et al., SODA 2021 and DCC 2022) generalized these concepts to arbitrary NFAs by introducing colex partial preorders: in general, any NFA admits a partial preorder of its states reflecting the colexicographic order of their accepted strings; the smaller the width of such preorder is, the faster regular expression matching queries can be performed. To date, the fastest algorithm for computing the smallestwidth partial preorder on NFAs runs in O(δ² + Q^{5/2}) time (Cotumaccio, DCC 2022), while on DFAs the same task can be accomplished in O(min(Q²logQ, δ⋅Q)) time (Kim et al., CPM 2023).
In this paper, we provide much more efficient solutions to the colex order computation problem. Our results are achieved by extending a classic algorithm for the relational coarsest partition refinement problem of Paige and Tarjan to work with ordered partitions. More specifically, we provide a O(δlogQ)time algorithm computing a colex total preorder when the input is a Wheeler NFA, and an algorithm with the same time complexity computing the smallestwidth colex partial order of any DFA. In addition, we present implementations of our algorithms and show that they are very efficient also in practice.
BibTeX  Entry
@InProceedings{becker_et_al:LIPIcs.ESA.2023.15,
author = {Becker, Ruben and C\'{a}ceres, Manuel and Cenzato, Davide and Kim, SungHwan and Kodric, Bojana and Olivares, Francisco and Prezza, Nicola},
title = {{Sorting Finite Automata via Partition Refinement}},
booktitle = {31st Annual European Symposium on Algorithms (ESA 2023)},
pages = {15:115:15},
series = {Leibniz International Proceedings in Informatics (LIPIcs)},
ISBN = {9783959772952},
ISSN = {18688969},
year = {2023},
volume = {274},
editor = {G{\o}rtz, Inge Li and FarachColton, Martin and Puglisi, Simon J. and Herman, Grzegorz},
publisher = {Schloss Dagstuhl  LeibnizZentrum f{\"u}r Informatik},
address = {Dagstuhl, Germany},
URL = {https://drops.dagstuhl.de/opus/volltexte/2023/18668},
URN = {urn:nbn:de:0030drops186684},
doi = {10.4230/LIPIcs.ESA.2023.15},
annote = {Keywords: Wheeler automata, prefix sorting, pattern matching, graph compression, sorting, partition refinement}
}
Keywords: 

Wheeler automata, prefix sorting, pattern matching, graph compression, sorting, partition refinement 
Collection: 

31st Annual European Symposium on Algorithms (ESA 2023) 
Issue Date: 

2023 
Date of publication: 

30.08.2023 
Supplementary Material: 

Software (Source Code): https://github.com/regindex/finiteautomatapartitionrefinement 