Jianchang Su, Yifan Zhang, Wei Zhang
Creative Commons Attribution 4.0 International license
Broadcasting and aggregation dominate the communication overhead in distributed systems, from machine learning training to data analytics. Current acceleration approaches require specialized hardware (RDMA) or dedicated resources (DPDK), limiting their deployment in commodity clouds. We present a counter-intuitive alternative: rather than bypassing the kernel, we move operations into it using eBPF. While this imposes severe constraints, including no floating-point arithmetic, limited memory, and stateless execution, we show that these restrictions paradoxically drive innovative protocol designs that yield unexpected benefits. We introduce AggBox, which implements broadcast and aggregation operations entirely within eBPF’s constrained environment. Our key innovations include stateless group acknowledgments for reliability, edge quantization for floating-point aggregation using only integer arithmetic, and tail-call chains that create virtual memory beyond eBPF’s 512-byte stack limit. These designs emerge from and exploit the constraints rather than fighting them. AggBox achieves remarkable performance on commodity hardware: 84.5% reduction in broadcast latency, 43× speedup for MapReduce workloads, and 56.1% faster ML gradient aggregation, all without specialized NICs or dedicated cores. Beyond performance, our work demonstrates that constrained environments can drive fundamental innovation in protocol design, offering insights for future resource-limited and verified systems.
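The idea of edge quantization mentioned in the abstract can be illustrated with a generic fixed-point scheme. The abstract does not describe AggBox's actual encoding, so the following is only a minimal sketch under stated assumptions: the scale factor `SCALE` and the helper names `quantize`, `aggregate_int`, and `dequantize` are hypothetical, and the aggregation step runs in user-space Python rather than in eBPF. Senders convert floats to integers at the edge, the aggregation step uses integer addition only, and the receiver converts the sum back to floats.

```python
# Hypothetical fixed-point scale (2^16); not taken from the paper.
SCALE = 1 << 16


def quantize(values, scale=SCALE):
    """Edge step: convert floats to scaled integers before sending."""
    return [int(round(v * scale)) for v in values]


def aggregate_int(vectors):
    """Stand-in for the in-kernel step: element-wise sum using integers only."""
    total = [0] * len(vectors[0])
    for vec in vectors:
        for i, q in enumerate(vec):
            total[i] += q
    return total


def dequantize(values, scale=SCALE):
    """Edge step: convert the integer sum back to floats at the receiver."""
    return [q / scale for q in values]


# Three workers contribute gradient-like vectors (values chosen to be
# exactly representable at this scale, so no rounding error appears).
workers = [[0.5, -1.25, 2.0], [0.25, 0.75, -1.0], [1.0, 0.5, 0.5]]
summed = dequantize(aggregate_int([quantize(w) for w in workers]))
print(summed)  # [1.75, 0.0, 1.5]
```

In general, values that are not exact multiples of `1 / SCALE` incur a rounding error of at most half a quantization step per contribution, so the choice of scale trades precision against the integer range available for the sum.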
@InProceedings{su_et_al:OASIcs.NINeS.2026.13,
author = {Su, Jianchang and Zhang, Yifan and Zhang, Wei},
title = {{In-Kernel Aggregation and Broadcast Acceleration for Distributed Communication}},
booktitle = {1st New Ideas in Networked Systems (NINeS 2026)},
pages = {13:1--13:23},
series = {Open Access Series in Informatics (OASIcs)},
ISBN = {978-3-95977-414-7},
ISSN = {2190-6807},
year = {2026},
volume = {139},
editor = {Argyraki, Katerina and Panda, Aurojit},
publisher = {Schloss Dagstuhl -- Leibniz-Zentrum f{\"u}r Informatik},
address = {Dagstuhl, Germany},
URL = {https://drops.dagstuhl.de/entities/document/10.4230/OASIcs.NINeS.2026.13},
URN = {urn:nbn:de:0030-drops-255981},
doi = {10.4230/OASIcs.NINeS.2026.13},
annote = {Keywords: eBPF, distributed communication, broadcast, aggregation, in-kernel processing, XDP}
}