An Algebraic Approach to Vectorial Programs

Authors Charles Paperman , Sylvain Salvati, Claire Soyez-Martin



PDF
Thumbnail PDF

File

LIPIcs.STACS.2023.51.pdf
  • Filesize: 0.99 MB
  • 23 pages

Document Identifiers

Author Details

Charles Paperman
  • Univ. Lille, CNRS, INRIA, Centrale Lille, UMR 9189 CRIStAL, F-59000 Lille, France
Sylvain Salvati
  • Univ. Lille, CNRS, INRIA, Centrale Lille, UMR 9189 CRIStAL, F-59000 Lille, France
Claire Soyez-Martin
  • Univ. Lille, CNRS, INRIA, Centrale Lille, UMR 9189 CRIStAL, F-59000 Lille, France

Acknowledgements

We would like to thank the anonymous referees for helping to improve the paper a lot, Howard Straubing for proof-reading the main algebraic proofs of this paper, Michaël Hauspie for his advice, his support and his knowledge about SIMD and compilation and Corentin Barloy to have helped proof-read the paper.

Cite As Get BibTex

Charles Paperman, Sylvain Salvati, and Claire Soyez-Martin. An Algebraic Approach to Vectorial Programs. In 40th International Symposium on Theoretical Aspects of Computer Science (STACS 2023). Leibniz International Proceedings in Informatics (LIPIcs), Volume 254, pp. 51:1-51:23, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2023) https://doi.org/10.4230/LIPIcs.STACS.2023.51

Abstract

Vectorial programming, the combination of SIMD instructions with usual processor instructions, is known to speed-up many standard algorithms. Simple regular languages have benefited from this technology. This paper is a first step towards pushing these benefits further. We take advantage of the inner algebraic structure of regular languages and produce high level representations of efficient vectorial programs that recognize certain classes of regular languages.
As a technical ingredient, we establish equivalences between classes of vectorial circuits and logical formalisms, namely unary temporal logic and first order logic. The main result is the construction of compilation procedures that turns syntactic semigroups into vectorial circuits. The circuits we obtain are small in that they improve known upper-bounds on representations of automata within the logical formalisms. The gain is mostly due to a careful sharing of sub-formulas based on algebraic tools.

Subject Classification

ACM Subject Classification
  • Theory of computation → Formal languages and automata theory
Keywords
  • Automata theory
  • Semigroups
  • Vectorisation

Metrics

  • Access Statistics
  • Total Accesses (updated on a weekly basis)
    0
    PDF Downloads

References

  1. Anne Bergeron and Sylvie Hamel. Cascade decompositions are bit-vector algorithms. In International Conference on Implementation and Application of Automata, pages 13-26. Springer, 2001. Google Scholar
  2. Anne Bergeron and Sylvie Hamel. Vector algorithms for approximate string matching. International Journal of Foundations of Computer Science, 13(01):53-65, 2002. Google Scholar
  3. Mikolaj Bojańczyk. Factorization forests. In International Conference on Developments in Language Theory, pages 1-17. Springer, 2009. Google Scholar
  4. Robert D Cameron, Thomas C Shermer, Arrvindh Shriraman, Kenneth S Herdy, Dan Lin, Benjamin R Hull, and Meng Lin. Bitwise data parallelism in regular expression matching. In 2014 23rd International Conference on Parallel Architecture and Compilation Techniques (PACT), pages 139-150. IEEE, 2014. Google Scholar
  5. V. Diekert, P. Gastin, and M. Kufleitner. A survey on small fragments of first-order logic over finite words. Internat. J. Found. Comput. Sci., 19:513-548, 2008. Google Scholar
  6. Michael Farrar. Striped Smith–Waterman speeds database searches six times over other SIMD implementations. Bioinformatics, 23(2):156-161, November 2006. Google Scholar
  7. Nathanaël Fijalkow and Charles Paperman. Monadic second-order logic with arbitrary monadic predicates. ACM Transactions on Computational Logic (TOCL), 18(3):1-17, 2017. Google Scholar
  8. Tobias Grosser, Hongbin Zheng, Raghesh Aloor, Andreas Simbürger, Armin Größlinger, and Louis-Noël Pouchet. Polly-polyhedral optimization in LLVM. In Proceedings of the First International Workshop on Polyhedral Compilation Techniques (IMPACT), volume 2011, page 1, 2011. Google Scholar
  9. Johan Anthony Wilem Kamp. Tense logic and the theory of linear order. University of California, Los Angeles, 1968. Google Scholar
  10. John Keiser and Daniel Lemire. Validating UTF-8 in less than one instruction per byte. Software: Practice and Experience, 51(5):950-964, 2021. Google Scholar
  11. Donald Ervin Knuth. The art of computer programming: Bitwise tricks & techniques. Binary Decision Diagrams, 4, 2009. Google Scholar
  12. M Oguzhan Külekci. Filter based fast matching of long patterns by using SIMD instructions. In Stringology, pages 118-128, 2009. Google Scholar
  13. Leslie Lamport. Multiple byte processing with full-word instructions. Communications of the ACM, 18(8):471-475, 1975. Google Scholar
  14. Geoff Langdale and Daniel Lemire. Parsing gigabytes of JSON per second. The VLDB Journal, 28(6):941-960, 2019. Google Scholar
  15. McNaughton, Robert, Papert, and Seymour A. Counter-Free Automata (MIT research monograph no. 65). The MIT Press, 1971. Google Scholar
  16. Filip Murlak, Charles Paperman, and Michal Pilipczuk. Schema validation via streaming circuits. In Proceedings of the 35th ACM SIGMOD-SIGACT-SIGAI Symposium on Principles of Database Systems, pages 237-249, 2016. Google Scholar
  17. Gene Myers. A fast bit-vector algorithm for approximate string matching based on dynamic programming. Journal of the ACM (JACM), 46(3):395-415, 1999. Google Scholar
  18. Dorit Nuzman, Sergei Dyshel, Erven Rohou, Ira Rosen, Kevin Williams, David Yuste, Albert Cohen, and Ayal Zaks. Vapor SIMD: Auto-vectorize once, run everywhere. In International Symposium on Code Generation and Optimization (CGO 2011), pages 151-160. IEEE, 2011. Google Scholar
  19. Dorit Nuzman and Ayal Zaks. Autovectorization in GCC-two years later. In Proceedings of the 2006 GCC Developers Summit, volume 6, 2006. Google Scholar
  20. Dorit Nuzman and Ayal Zaks. Outer-loop vectorization: revisited for short SIMD architectures. In Proceedings of the 17th international conference on Parallel architectures and compilation techniques, pages 2-11, 2008. Google Scholar
  21. Charles Paperman, Sylvain Salvati, and Claire Soyez-Martin. Addition Lemma, September 2022. URL: https://hal.archives-ouvertes.fr/hal-03787033.
  22. Charles Paperman, Sylvain Salvati, and Claire Soyez-Martin. An algebraic approach to vectorial programs. Complete version of the paper, January 2023. URL: https://hal.archives-ouvertes.fr/hal-03831752v2.
  23. J.E. Pin and European Mathematical Society Publishing House ETH-Zentrum SEW A27. Handbook of Automata Theory: Volume I: Theoretical Foundations; Volume II: Automata in Mathematics and Selected Applications. EMS Press, 2021. Google Scholar
  24. Jean Eric Pin. Varieties of formal languages, volume 184. Springer, 1986. Google Scholar
  25. M. P. Schützenberger. On finite monoids having only trivial subgroups. Information and control, 8:190-194, 1965. Google Scholar
  26. Olivier Serre. Vectorial languages and linear temporal logic. Theoretical computer science, 310(1-3):79-116, 2004. Google Scholar
  27. Howard Straubing. Finite semigroup varieties of the form V*D. Journal of Pure and Applied Algebra, 36:53-94, 1985. Google Scholar
  28. Howard Straubing. Finite automata, formal logic, and circuit complexity. Springer Science & Business Media, 2012. Google Scholar
  29. Pascal Tesson and Denis Thérien. Diamonds are forever: The variety DA. In Semigroups, algorithms, automata and languages, pages 475-499. World Scientific, 2002. Google Scholar
  30. Denis Therien and Pascal Tesson. Logic meets algebra: the case of regular languages. Logical Methods in Computer Science, 3, 2007. Google Scholar
  31. Denis Thérien and Thomas Wilke. Over words, two variables are as powerful as one quantifier alternation. In Proceedings of the Thirtieth Annual ACM Symposium on Theory of Computing, pages 234-240, 1998. Google Scholar
  32. Konrad Trifunovic, Dorit Nuzman, Albert Cohen, Ayal Zaks, and Ira Rosen. Polyhedral-model guided loop-nest auto-vectorization. In 2009 18th International Conference on Parallel Architectures and Compilation Techniques, pages 327-337. IEEE, 2009. Google Scholar
  33. Xiang Wang, Yang Hong, Harry Chang, KyoungSoo Park, Geoff Langdale, Jiayu Hu, and Heqing Zhu. Hyperscan: A fast multi-pattern regex matcher for modern CPUs. In 16th USENIX Symposium on Networked Systems Design and Implementation (NSDI 19), pages 631-648, 2019. Google Scholar
  34. Philipp Weis and Neil Immerman. Structure theorem and strict alternation hierarchy for FO² on words. In International Workshop on Computer Science Logic, pages 343-357. Springer, 2007. Google Scholar
  35. Thomas Wilke. Classifying discrete temporal properties. In Annual symposium on theoretical aspects of computer science, pages 32-46. Springer, 1999. Google Scholar
  36. James Worrell, Rastislav Lenhardt, and Michael Benedikt. Two variable vs. linear temporal logic in model checking and games. Logical Methods in Computer Science, 9, 2013. Google Scholar
Questions / Remarks / Feedback
X

Feedback for Dagstuhl Publishing


Thanks for your feedback!

Feedback submitted

Could not send message

Please try again later or send an E-mail