What You Must Remember When Transforming Datawords

Author M. Praveen

Thumbnail PDF


  • Filesize: 462 kB
  • 14 pages

Document Identifiers

Author Details

M. Praveen
  • Chennai Mathematical Institute, India
  • UMI ReLaX, Indo-French joint research unit


The author thanks C. Aiswarya, Kamal Lodaya, K. Narayan Kumar and anonymous reviewers for suggestions to improve the presentation and pointers to related works.

Cite AsGet BibTex

M. Praveen. What You Must Remember When Transforming Datawords. In 40th IARCS Annual Conference on Foundations of Software Technology and Theoretical Computer Science (FSTTCS 2020). Leibniz International Proceedings in Informatics (LIPIcs), Volume 182, pp. 55:1-55:14, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2020)


Streaming Data String Transducers (SDSTs) were introduced to model a class of imperative and a class of functional programs, manipulating lists of data items. These can be used to write commonly used routines such as insert, delete and reverse. SDSTs can handle data values from a potentially infinite data domain. The model of Streaming String Transducers (SSTs) is the fragment of SDSTs where the infinite data domain is dropped and only finite alphabets are considered. SSTs have been much studied from a language theoretical point of view. We introduce data back into SSTs, just like data was introduced to finite state automata to get register automata. The result is Streaming String Register Transducers (SSRTs), which is a subclass of SDSTs. We use origin semantics for SSRTs and give a machine independent characterization, along the lines of Myhill-Nerode theorem. Machine independent characterizations for similar models are the basis of learning algorithms and enable us to understand fragments of the models. Origin semantics of transducers track which positions of the output originate from which positions of the input. Although a restriction, using origin semantics is well justified and is known to simplify many problems related to transducers. We use origin semantics as a technical building block, in addition to characterizations of deterministic register automata. However, we need to build more on top of these to overcome some challenges unique to SSRTs.

Subject Classification

ACM Subject Classification
  • Theory of computation → Transducers
  • Theory of computation → Automata over infinite objects
  • Streaming String Transducers
  • Data words
  • Machine independent characterization


  • Access Statistics
  • Total Accesses (updated on a weekly basis)
    PDF Downloads


  1. R. Alur and P. Černý. Expressiveness of streaming string transducers. In FSTTCS 2010, volume 8 of LIPIcs, pages 1-12. Schloss Dagstuhl - Leibniz-Zentrum für Informatik, 2010. URL: https://doi.org/10.4230/LIPIcs.FSTTCS.2010.1.
  2. R. Alur and P. Černý. Streaming transducers for algorithmic verification of single-pass list-processing programs. In POPL 2011, POPL, pages 1-12. ACM, 2011. Google Scholar
  3. D. Angluin. Learning regular sets from queries and counterexamples. Inf. Comput., 75(2):87-106, 1987. Google Scholar
  4. M. Benedikt, C. Ley, and G. Puppis. What you must remember when processing data words. In Proceedings of the 4th Alberto Mendelzon International Workshop on Foundations of Data Management, Argentina, volume 619 of CEUR Workshop Proceedings, 2010. Google Scholar
  5. M. Bojańczyk. Transducers with origin information. In ICALP, volume 8573 of LNCS, pages 26-37, Berlin, Heidelberg, 2014. Springer. Google Scholar
  6. M. Bojańczyk and R. Stefański. Single-Use Automata and Transducers for Infinite Alphabets. In ICALP 2020, volume 168 of Leibniz International Proceedings in Informatics (LIPIcs), pages 113:1-113:14, Dagstuhl, Germany, 2020. Schloss Dagstuhl-Leibniz-Zentrum für Informatik. URL: https://doi.org/10.4230/LIPIcs.ICALP.2020.113.
  7. Sougata Bose, Anca Muscholl, Vincent Penelle, and Gabriele Puppis. Origin-equivalence of two-way word transducers is in PSPACE. In Sumit Ganguly and Paritosh K. Pandya, editors, 38th IARCS Annual Conference on Foundations of Software Technology and Theoretical Computer Science, FSTTCS 2018, December 11-13, 2018, Ahmedabad, India, volume 122 of LIPIcs, pages 22:1-22:18. Schloss Dagstuhl - Leibniz-Zentrum für Informatik, 2018. URL: https://doi.org/10.4230/LIPIcs.FSTTCS.2018.22.
  8. S. Cassel, F. Howar, B. Jonsson, M. Merten, and B. Steffen. A succinct canonical register automaton model. Journal of Logical and Algebraic Methods in Programming, 84(1):54-66, 2015. Special Issue: The 23rd Nordic Workshop on Programming Theory (NWPT 2011) Special Issue: Domains X, International workshop on Domain Theory and applications, Swansea, 5-7 September, 2011. Google Scholar
  9. S. Cassel, B. Jonsson, F. Howar, and B. Steffen. A succinct canonical register automaton model for data domains with binary relations. In Automated Technology for Verification and Analysis - 10th International Symposium, 2012, Proceedings, pages 57-71, 2012. URL: https://doi.org/10.1007/978-3-642-33386-6_6.
  10. Y-F Chen, O. Lengál, T. Tan, and Z. Wu. Register automata with linear arithmetic. In 32nd Annual ACM/IEEE Symposium on Logic in Computer Science, LICS 2017, Reykjavik, Iceland, June 20-23, 2017, pages 1-12. IEEE Computer Society, 2017. Google Scholar
  11. M. Chytil and V. Jákl. Serial composition of 2-way finite-state transducers and simple programs on strings. In Automata, Languages and Programming, Fourth Colloquium, University of Turku, Finland, July 18-22, 1977, Proceedings, pages 135-147. Springer Berlin Heidelberg, 1977. Google Scholar
  12. A. Durand-Gasselin and P. Habermehl. Regular transformations of data words through origin information. In B. Jacobs and C. Löding, editors, Foundations of Software Science and Computation Structures, pages 285-300, Berlin, Heidelberg, 2016. Springer Berlin Heidelberg. Google Scholar
  13. Léo Exibard, Emmanuel Filiot, and Pierre-Alain Reynier. Synthesis of data word transducers. In Wan J. Fokkink and Rob van Glabbeek, editors, 30th International Conference on Concurrency Theory, CONCUR 2019, August 27-30, 2019, Amsterdam, the Netherlands, volume 140 of LIPIcs, pages 24:1-24:15. Schloss Dagstuhl - Leibniz-Zentrum für Informatik, 2019. URL: https://doi.org/10.4230/LIPIcs.CONCUR.2019.24.
  14. Léo Exibard, Emmanuel Filiot, and Pierre-Alain Reynier. On computability of data word functions defined by transducers. In Jean Goubault-Larrecq and Barbara König, editors, Foundations of Software Science and Computation Structures - 23rd International Conference, FOSSACS 2020, Held as Part of the European Joint Conferences on Theory and Practice of Software, ETAPS 2020, Dublin, Ireland, April 25-30, 2020, Proceedings, volume 12077 of Lecture Notes in Computer Science, pages 217-236. Springer, 2020. URL: https://doi.org/10.1007/978-3-030-45231-5_12.
  15. N. Francez and M. Kaminski. An algebraic characterization of deterministic regular languages over infinite alphabets. Theoretical Computer Science, 306:155-175, 2003. Google Scholar
  16. F. Howar, M. Isberner, B. Steffen, O. Bauer, and B. Jonsson. Inferring semantic interfaces of data structures. In T. Margaria and B. Steffen, editors, Leveraging Applications of Formal Methods, Verification and Validation. Technologies for Mastering Change, pages 554-571, Berlin, Heidelberg, 2012. Springer Berlin Heidelberg. Google Scholar
  17. F. Howar, B. Steffen, B. Jonsson, and S. Cassel. Inferring canonical register automata. In V. Kuncak and A. Rybalchenko, editors, Verification, Model Checking, and Abstract Interpretation, pages 251-266, Berlin, Heidelberg, 2012. Springer Berlin Heidelberg. Google Scholar
  18. M. Kaminski and N. Francez. Finite-memory automata. Theoretical Computer Science, 134(2):329-363, 1994. Google Scholar
  19. J. Moerman, M. Sammartino, A. Silva, B. Klin, and M. Szynwelski. Learning nominal automata. In Proceedings of the 44th ACM SIGPLAN Symposium on Principles of Programming Languages, POPL 2017, Paris, France, January 18-20, 2017, pages 613-625, 2017. Google Scholar