ByRDiE: Byzantine-resilient distributed coordinate descent for decentralized learning

Zhixiong Yang, Waheed U. Bajwa

Research output: Contribution to journal › Article

4 Scopus citations

Abstract

Distributed machine learning algorithms enable learning of models from datasets that are distributed over a network without gathering the data at a centralized location. While efficient distributed algorithms have been developed under the assumption of faultless networks, failures that can render these algorithms nonfunctional occur frequently in the real world. This paper focuses on the problem of Byzantine failures, which are the hardest to safeguard against in distributed algorithms. While Byzantine fault tolerance has a rich history, existing work does not translate into efficient and practical algorithms for high-dimensional learning in fully distributed (also known as decentralized) settings. In this paper, an algorithm termed Byzantine-resilient distributed coordinate descent is developed and analyzed that enables distributed learning in the presence of Byzantine failures. Theoretical analysis (convex settings) and numerical experiments (convex and nonconvex settings) highlight its usefulness for high-dimensional distributed learning in the presence of Byzantine failures.
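The abstract does not spell out the screening mechanism, but a common Byzantine-resilience primitive for coordinate-wise updates is the trimmed mean: each node discards the b largest and b smallest values it receives from neighbors before averaging. The sketch below illustrates that general idea only; the function names, step-size handling, and the choice of trimmed-mean screening are assumptions for illustration, not the paper's exact algorithm.

```python
import numpy as np

def trimmed_mean(values, b):
    """Drop the b largest and b smallest scalar values, average the rest.

    Assumes len(values) > 2*b so that at least one value survives.
    """
    v = np.sort(np.asarray(values, dtype=float))
    return v[b:len(v) - b].mean()

def coordinate_update(w_local, neighbor_coords, k, grad_k, step, b):
    """One illustrative Byzantine-screened coordinate-descent step.

    w_local         : this node's current iterate (1-D array)
    neighbor_coords : k-th coordinate values received from neighbors
                      (up to b of them may be arbitrary/Byzantine)
    k               : index of the coordinate being updated
    grad_k          : local partial derivative along coordinate k
    step, b         : step size and screening parameter (assumed inputs)
    """
    # Screen the received coordinate values together with the local one,
    # then take a gradient step along coordinate k from the screened value.
    screened = trimmed_mean(neighbor_coords + [w_local[k]], b)
    w_new = w_local.copy()
    w_new[k] = screened - step * grad_k
    return w_new
```

With b = 1, a single neighbor reporting an arbitrarily large value (e.g. 1e6) is discarded by the sort-and-trim step, so the honest values dominate the update.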

Original language: English (US)
Article number: 8759887
Pages (from-to): 611-627
Number of pages: 17
Journal: IEEE Transactions on Signal and Information Processing over Networks
Volume: 5
Issue number: 4
DOIs
State: Published - Dec 2019

All Science Journal Classification (ASJC) codes

  • Signal Processing
  • Information Systems
  • Computer Networks and Communications

Keywords

  • Byzantine failure
  • consensus
  • coordinate descent
  • decentralized learning
  • distributed optimization
  • empirical risk minimization
  • machine learning

