AKSEL: Fast Byzantine SGD

Authors: Amine Boussetta, El-Mahdi El-Mhamdi, Rachid Guerraoui, Alexandre Maurer, and Sébastien Rouault

Published in: LIPIcs, Volume 184, 24th International Conference on Principles of Distributed Systems (OPODIS 2020)


Abstract
Modern machine learning architectures distinguish servers and workers. Typically, a d-dimensional model is hosted by a server and trained by n workers, using a distributed stochastic gradient descent (SGD) optimization scheme. At each SGD step, the goal is to estimate the gradient of a cost function. The simplest way to do this is to average the gradients estimated by the workers. However, averaging is not resilient to even a single Byzantine failure of a worker. Many alternative gradient aggregation rules (GARs) have recently been proposed to tolerate a maximum number f of Byzantine workers. These GARs differ according to (1) their time complexity, (2) the maximal number of Byzantine workers despite which convergence can still be ensured (breakdown point), and (3) their accuracy, which can be captured by (3.1) their angular error, namely the angle with the true gradient, as well as (3.2) their ability to aggregate full gradients. In particular, many do not aggregate full gradients, as they operate on each dimension separately, which results in a coordinate-wise blended gradient and leads to low accuracy in practical situations where the number s of workers that are actually Byzantine in an execution is small (s ≪ f). We propose Aksel, a new scalable median-based GAR with optimal time complexity (𝒪(nd)), optimal breakdown point (n > 2f) and the lowest upper bound on the expected angular error (𝒪(√d)) among full-gradient approaches. We also study the actual angular error of Aksel when the gradient distribution is normal and show that it only grows in 𝒪(√(d log n)), which is the first upper bound logarithmic in the number of workers n ever proven assuming an optimal breakdown point. We also report on an empirical evaluation of Aksel on various classification tasks, comparing it to alternative GARs against state-of-the-art attacks. Aksel is the only GAR reaching top accuracy when there are actually no or few Byzantine workers, while maintaining a good defense even in the extreme case (s = f). For simplicity of presentation, we consider a scheme with a single server. However, as we explain in the paper, Aksel can also easily be adapted to multi-server architectures that tolerate the Byzantine behavior of a fraction of the servers.
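To make the setting concrete, the minimal Python sketch below (not taken from the paper) contrasts plain averaging, the coordinate-wise median, and a hypothetical median-filtered average on a toy run with one Byzantine worker. The filtering rule is only an illustration in the spirit of a median-based full-gradient GAR; it is not Aksel's exact definition, and all names in the sketch are ours.

import numpy as np

def average(grads):
    # Plain averaging: a single Byzantine gradient with a huge norm can
    # drag the aggregate arbitrarily far from the true gradient.
    return np.mean(grads, axis=0)

def coordinate_wise_median(grads):
    # Robust per dimension, but the result mixes coordinates coming from
    # different workers, so it is generally not a "full" gradient.
    return np.median(grads, axis=0)

def median_filtered_average(grads):
    # Hypothetical illustration only (NOT Aksel's exact rule): score each
    # worker's gradient by its distance to the coordinate-wise median and
    # average a closest majority. Runs in roughly O(nd) time on average.
    grads = np.asarray(grads)                      # shape (n, d)
    n = grads.shape[0]
    med = np.median(grads, axis=0)
    dists = np.linalg.norm(grads - med, axis=1)
    keep = np.argsort(dists)[: n // 2 + 1]         # indices of a closest majority
    return grads[keep].mean(axis=0)                # a full-gradient average

# Toy run: n = 5 workers, d = 3, one worker actually Byzantine (s = 1).
rng = np.random.default_rng(0)
true_grad = np.array([1.0, -2.0, 0.5])
honest = [true_grad + 0.1 * rng.standard_normal(3) for _ in range(4)]
byzantine = [np.array([1e6, 1e6, 1e6])]            # arbitrary attack vector
grads = honest + byzantine

print(average(grads))                  # ruined by the single Byzantine worker
print(coordinate_wise_median(grads))   # robust, but a coordinate-wise blend
print(median_filtered_average(grads))  # a full gradient close to true_grad

On this toy run, averaging is destroyed by the single attacker, while the distance-filtered average returns one of the honest gradients' neighborhood, illustrating why a median-based full-gradient rule can stay accurate when s ≪ f.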

Cite as

Amine Boussetta, El-Mahdi El-Mhamdi, Rachid Guerraoui, Alexandre Maurer, and Sébastien Rouault. AKSEL: Fast Byzantine SGD. In 24th International Conference on Principles of Distributed Systems (OPODIS 2020). Leibniz International Proceedings in Informatics (LIPIcs), Volume 184, pp. 8:1-8:16, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2021)


BibTeX

@InProceedings{boussetta_et_al:LIPIcs.OPODIS.2020.8,
  author =	{Boussetta, Amine and El-Mhamdi, El-Mahdi and Guerraoui, Rachid and Maurer, Alexandre and Rouault, S\'{e}bastien},
  title =	{{AKSEL: Fast Byzantine SGD}},
  booktitle =	{24th International Conference on Principles of Distributed Systems (OPODIS 2020)},
  pages =	{8:1--8:16},
  series =	{Leibniz International Proceedings in Informatics (LIPIcs)},
  ISBN =	{978-3-95977-176-4},
  ISSN =	{1868-8969},
  year =	{2021},
  volume =	{184},
  editor =	{Bramas, Quentin and Oshman, Rotem and Romano, Paolo},
  publisher =	{Schloss Dagstuhl -- Leibniz-Zentrum f{\"u}r Informatik},
  address =	{Dagstuhl, Germany},
  URL =		{https://drops-dev.dagstuhl.de/entities/document/10.4230/LIPIcs.OPODIS.2020.8},
  URN =		{urn:nbn:de:0030-drops-134931},
  doi =		{10.4230/LIPIcs.OPODIS.2020.8},
  annote =	{Keywords: Machine learning, Stochastic gradient descent, Byzantine failures}
}

Classification
  • Computing methodologies → Batch learning
  • Security and privacy → Distributed systems security
  • Theory of computation → Nonconvex optimization

Keywords
  • Byzantine failures
  • Machine learning
  • Stochastic gradient descent

