Search Results

Documents authored by Zhang, Yifan


Document
In-Kernel Aggregation and Broadcast Acceleration for Distributed Communication

Authors: Jianchang Su, Yifan Zhang, and Wei Zhang

Published in: OASIcs, Volume 139, 1st New Ideas in Networked Systems (NINeS 2026)


Abstract
Broadcasting and aggregation dominate the communication overhead in distributed systems, from machine learning training to data analytics. Current acceleration approaches require specialized hardware (RDMA) or dedicated resources (DPDK), limiting their deployment in commodity clouds. We present a counter-intuitive alternative: rather than bypassing the kernel, we move operations into it using eBPF. While this imposes severe constraints, including no floating-point, limited memory, and stateless execution, we show that these restrictions paradoxically drive innovative protocol designs that yield unexpected benefits. We introduce AggBox, which implements broadcast and aggregation operations entirely within eBPF’s constrained environment. Our key innovations include stateless group acknowledgments for reliability, edge quantization for floating-point aggregation using only integer arithmetic, and tail-call chains that create virtual memory beyond eBPF’s 512-byte stack limit. These designs emerge from and exploit the constraints rather than fighting them. AggBox achieves the following results on commodity hardware: 84.5% reduction in broadcast latency, 43× speedup for MapReduce workloads, and 56.1% faster ML gradient aggregation, all without specialized NICs or dedicated cores. Beyond performance, our work demonstrates that constrained environments can drive fundamental innovation in protocol design, offering insights for future resource-limited and verified systems.
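
The idea of aggregating floating-point values using only integer arithmetic can be illustrated with a small fixed-point sketch. This is not AggBox's implementation, which the abstract does not detail; the scale factor and the function names `quantize`, `aggregate`, and `dequantize` are assumptions made for illustration. The endpoints convert floats to scaled integers, and the aggregation step itself only adds integers.

```python
# Illustrative fixed-point sketch; names and SCALE are assumptions,
# not taken from the paper.
SCALE = 1 << 16  # assumed fixed-point scale (16 fractional bits)


def quantize(values, scale=SCALE):
    """Sender side: convert floats to scaled integers before transmission."""
    return [round(v * scale) for v in values]


def aggregate(int_vectors):
    """Aggregator side: element-wise sum using integer addition only."""
    return [sum(column) for column in zip(*int_vectors)]


def dequantize(int_values, scale=SCALE):
    """Receiver side: convert the summed integers back to floats."""
    return [v / scale for v in int_values]


# Three hypothetical workers, each contributing a two-element vector.
workers = [[0.5, -1.25], [0.25, 2.0], [1.0, 0.75]]
total = dequantize(aggregate([quantize(w) for w in workers]))
```

In this sketch only `aggregate` would need to run in a restricted environment without floating-point support; the precision lost depends on the chosen scale.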

Cite as

Jianchang Su, Yifan Zhang, and Wei Zhang. In-Kernel Aggregation and Broadcast Acceleration for Distributed Communication. In 1st New Ideas in Networked Systems (NINeS 2026). Open Access Series in Informatics (OASIcs), Volume 139, pp. 13:1-13:23, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2026)


@InProceedings{su_et_al:OASIcs.NINeS.2026.13,
  author =	{Su, Jianchang and Zhang, Yifan and Zhang, Wei},
  title =	{{In-Kernel Aggregation and Broadcast Acceleration for Distributed Communication}},
  booktitle =	{1st New Ideas in Networked Systems (NINeS 2026)},
  pages =	{13:1--13:23},
  series =	{Open Access Series in Informatics (OASIcs)},
  ISBN =	{978-3-95977-414-7},
  ISSN =	{2190-6807},
  year =	{2026},
  volume =	{139},
  editor =	{Argyraki, Katerina and Panda, Aurojit},
  publisher =	{Schloss Dagstuhl -- Leibniz-Zentrum f{\"u}r Informatik},
  address =	{Dagstuhl, Germany},
  URL =		{https://drops.dagstuhl.de/entities/document/10.4230/OASIcs.NINeS.2026.13},
  URN =		{urn:nbn:de:0030-drops-255981},
  doi =		{10.4230/OASIcs.NINeS.2026.13},
  annote =	{Keywords: eBPF, distributed communication, broadcast, aggregation, in-kernel processing, XDP}
}
Document
Brief Announcement
Brief Announcement: Reaching Approximate Consensus When Everyone May Crash

Authors: Lewis Tseng, Qinzi Zhang, and Yifan Zhang

Published in: LIPIcs, Volume 179, 34th International Symposium on Distributed Computing (DISC 2020)


Abstract
Fault-tolerant consensus is of great importance in distributed systems. This paper studies the asynchronous approximate consensus problem in the crash-recovery model with fair-loss links. In our model, up to f nodes may crash forever, while the rest may crash intermittently. Each node is equipped with limited-size persistent storage that retains its data across crashes. We present an algorithm that stores only three values in persistent storage: state, phase index, and a counter.

Cite as

Lewis Tseng, Qinzi Zhang, and Yifan Zhang. Brief Announcement: Reaching Approximate Consensus When Everyone May Crash. In 34th International Symposium on Distributed Computing (DISC 2020). Leibniz International Proceedings in Informatics (LIPIcs), Volume 179, pp. 53:1-53:3, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2020)


@InProceedings{tseng_et_al:LIPIcs.DISC.2020.53,
  author =	{Tseng, Lewis and Zhang, Qinzi and Zhang, Yifan},
  title =	{{Brief Announcement: Reaching Approximate Consensus When Everyone May Crash}},
  booktitle =	{34th International Symposium on Distributed Computing (DISC 2020)},
  pages =	{53:1--53:3},
  series =	{Leibniz International Proceedings in Informatics (LIPIcs)},
  ISBN =	{978-3-95977-168-9},
  ISSN =	{1868-8969},
  year =	{2020},
  volume =	{179},
  editor =	{Attiya, Hagit},
  publisher =	{Schloss Dagstuhl -- Leibniz-Zentrum f{\"u}r Informatik},
  address =	{Dagstuhl, Germany},
  URL =		{https://drops.dagstuhl.de/entities/document/10.4230/LIPIcs.DISC.2020.53},
  URN =		{urn:nbn:de:0030-drops-131319},
  doi =		{10.4230/LIPIcs.DISC.2020.53},
  annote =	{Keywords: Approximate Consensus, Fair-loss Channel, Crash-recovery}
}