Leon Freudenthaler, Karl Michael Göschka
Creative Commons Attribution 4.0 International license
Real-time collaborative programming tools synchronize source code as text, propagating keystrokes or text patches to other collaborators. This propagation of unstructured text often leads to syntactically invalid states, because edits are applied by character position rather than by syntactic entity. Our key idea is therefore to propagate only syntactically valid changes. This paper contributes a structure-aware synchronization substrate based on two complementary representations and a propagation algorithm: (i) A Lossless Syntax Tree stores source code in structured form while preserving program trivia such as whitespace and comments. This is necessary because collaborators must be able to reconstruct byte-identical source text from propagated (structural) changes; (ii) A Stable Syntax Tree extends this representation with persistent node identifiers to enable robust structural diffing between successive versions; (iii) Our propagation algorithm derives deterministic structural edit scripts consisting of four operations: insert, delete, move, and update. The algorithm can be used across grammars, because a lightweight per-language specification guides the stable reuse of node identifiers. The main achievement of our approach is to extract, from unstructured text changes, structural edit operations that yield syntactically correct source code changes. We formalize our proposed representations and show how diffing extracts structural edits and how the resulting edit scripts are applied on the collaborator's side. Moving subtrees is particularly complex: our approach minimizes within-parent move noise using a per-parent Longest Increasing Subsequence. We evaluate our approach using two languages (Java and JavaScript), three file sizes (small/medium/large), and five edit scenarios. Across all scenarios we observe byte-identical collaboration, node identity stability, and deterministic edit scripts.
We demonstrate that applying Longest Increasing Subsequence is necessary for canonical minimality under sibling moves. We furthermore demonstrate that tree diffing cost is structure-sensitive: per-node cost increases with sibling fanout rather than depth. The 95th-percentile (p95) end-to-end latencies meet the ≤ 1 second delay budget for small and medium files in both languages. For large files, Java is close to 1 second (p95 ≈ 1.22 seconds), while JavaScript exceeds the 2-second hard cap (p95 ≈ 2.86 seconds). Overall, our approach provides a deterministic, language-portable substrate for structure-aware real-time collaborative programming that separates structural propagation from unstructured keystrokes to preserve code correctness and developer intent.
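The per-parent Longest Increasing Subsequence idea mentioned in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the function names `longest_increasing_subsequence` and `sibling_moves`, and the representation of children as lists of stable node identifiers, are assumptions made for illustration only. The sketch assumes that children present under the same parent in both versions are mapped to their old positions; the children on a longest increasing subsequence of those positions can stay in place, and only the remaining ones need a move operation.

```python
from bisect import bisect_left


def longest_increasing_subsequence(seq):
    """Return the indices (into seq) of one longest strictly increasing subsequence."""
    tails = []      # tails[k]: index in seq ending the best subsequence of length k + 1
    tail_vals = []  # values of seq at those indices, kept sorted for bisect
    prev = [-1] * len(seq)  # back-pointers used to reconstruct the subsequence
    for i, value in enumerate(seq):
        k = bisect_left(tail_vals, value)
        if k == len(tails):
            tails.append(i)
            tail_vals.append(value)
        else:
            tails[k] = i
            tail_vals[k] = value
        prev[i] = tails[k - 1] if k > 0 else -1
    result = []
    i = tails[-1] if tails else -1
    while i != -1:
        result.append(i)
        i = prev[i]
    return result[::-1]


def sibling_moves(old_children, new_children):
    """Hypothetical helper: IDs of surviving children that must be moved.

    old_children / new_children are lists of stable node IDs under one parent.
    Inserted and deleted children are ignored here; only reordering is considered.
    """
    old_pos = {node_id: i for i, node_id in enumerate(old_children)}
    kept = [node_id for node_id in new_children if node_id in old_pos]
    positions = [old_pos[node_id] for node_id in kept]
    stay = {kept[i] for i in longest_increasing_subsequence(positions)}
    return [node_id for node_id in kept if node_id not in stay]


if __name__ == "__main__":
    # Moving the last child to the front requires a single move.
    print(sibling_moves(["a", "b", "c", "d"], ["d", "a", "b", "c"]))
```

Without the subsequence step, a naive position-by-position comparison would report all four children in the example as moved, although relocating a single child is enough.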
@InProceedings{freudenthaler_et_al:LIPIcs.ECOOP.2026.5,
author = {Freudenthaler, Leon and G\"{o}schka, Karl Michael},
title = {{A Stable Lossless Syntax Tree for Real-Time Collaborative Programming}},
booktitle = {40th European Conference on Object-Oriented Programming (ECOOP 2026)},
pages = {5:1--5:29},
series = {Leibniz International Proceedings in Informatics (LIPIcs)},
ISBN = {978-3-95977-423-9},
ISSN = {1868-8969},
year = {2026},
volume = {372},
editor = {Krebbers, Robbert and Silva, Alexandra},
publisher = {Schloss Dagstuhl -- Leibniz-Zentrum f{\"u}r Informatik},
address = {Dagstuhl, Germany},
URL = {https://drops.dagstuhl.de/entities/document/10.4230/LIPIcs.ECOOP.2026.5},
URN = {urn:nbn:de:0030-drops-261017},
doi = {10.4230/LIPIcs.ECOOP.2026.5},
annote = {Keywords: real-time collaborative programming, tree-based operations, structure-aware propagation, synchronous collaboration systems}
}