LIPIcs.OPODIS.2024.36.pdf
- Filesize: 0.84 MB
- 20 pages
In recent years, the explosion of big data and analytics has necessitated distributed storage and processing with several compute nodes (e.g., multiple datacenters). These nodes collaboratively perform parallel computation, where the data is typically partitioned across these nodes to ensure scalability, redundancy and load-balancing. But the nodes may not always be co-located; in many cases, they are part of a larger communication network. Since those nodes only need to communicate among themselves, a key challenge is to design efficient routes catered to that subnetwork. In this work, we initiate the study of distributed sampling and routing problems for subnetworks in any well-connected network. Given any network G = (V, E) with mixing time τ_mix, consider the canonical problem of permutation routing [Ghaffari, Kuhn and Su, PODC 2017] that aims to minimize both congestion and dilation of the routes, where the demands (i.e., set of source-terminal pairs) are such that each node sends or receives number of messages proportional to its degree. We show that the permutation routing problem, when demands are restricted to any subset S ⊆ V (i.e., subnetwork), can be solved in exp(O(√(log|S|))) ⋅ Õ(τ_mix) rounds (where Õ(⋅) hides polylogarithmic factors of |V|). This means that the running time depends subpolynomially on the subnetwork size (i.e., not on the entire network size). The ability to solve permutation routing efficiently immediately implies that a large class of parallel algorithms can be simulated efficiently on the subnetwork. As a prerequisite to constructing efficient routes, we design and analyze distributed branching random walks that distribute tokens started by the nodes in the subnetwork. At a high-level, these algorithms operate by always moving each token according to a (lazy) simple random walk, but also branching a token into multiple tokens at some specified intervals; ultimately, if a node starts a branching walk, with its id in a token, then by the end of execution, several tokens with its id would be randomly distributed among the nodes. As these random walks can be started by many nodes, a crucial challenge is to ensure low-congestion, which is a primary focus of this paper.
Feedback for Dagstuhl Publishing