1. Department of Automation, Beijing National Research Center for Information Science and Technology, Tsinghua University, Beijing 100084, China
2. Beijing Sankuai Online Technology Co., Ltd., Beijing 100102, China
SHA Xingyu (shaxy18@mails.tsinghua.edu.cn);
ZHANG Jiaqi (jqzhang61@gmail.com);
YOU Keyou (youky@tsinghua.edu.cn)
Print publication date: 2023-09-25
Online publication date: 2023-08-30
Received: 2023-03-31
Accepted: 2023-04-11
SHA Xingyu, ZHANG Jiaqi, YOU Keyou. Fully asynchronous distributed optimization with linear convergence over directed networks[J]. Acta Scientiarum Naturalium Universitatis Sunyatseni, 2023, 62(5): 1-23. DOI: 10.13471/j.cnki.acta.snus.2023A023.
We study distributed optimization problems over a directed network, where nodes aim to minimize the sum of local objective functions via directed communications with neighbors. Many algorithms are designed to solve such problems under synchronized or randomly activated implementations, which may create deadlocks in practice. In sharp contrast, we propose a fully asynchronous push-pull gradient (APPG) algorithm, where each node updates without waiting for any other node, using possibly delayed information from neighbors. We then construct two novel augmented networks to analyze asynchrony and delays, and quantify the convergence rate from a worst-case point of view. In particular, all nodes of APPG converge to the same optimal solution at a linear rate of $\mathcal{O}(\lambda^k)$ if the local functions have Lipschitz-continuous gradients and their sum satisfies the Polyak-Łojasiewicz condition (convexity is not required), where $\lambda \in (0,1)$ is explicitly given and the virtual counter $k$ increases by one whenever any node updates. Finally, the advantage of APPG over its synchronous counterpart and its linear speedup efficiency are numerically validated on a logistic regression problem.
Keywords: fully asynchronous; distributed optimization; linear convergence; Polyak-Łojasiewicz condition
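Since the abstract only sketches APPG at a high level, the following minimal NumPy sketch illustrates the underlying (synchronous) push-pull gradient-tracking update that APPG extends to the fully asynchronous, delayed setting: decision variables are "pulled" through a row-stochastic matrix and gradient trackers are "pushed" through a column-stochastic matrix. The ring network, step size, and toy quadratic objectives are assumptions for illustration only, not the paper's implementation or experiments.

```python
# Minimal sketch (not the paper's APPG): synchronous push-pull gradient tracking
# over a directed ring. All problem data and parameters here are hypothetical.
import numpy as np

n, d, alpha, iters = 5, 3, 0.1, 200
rng = np.random.default_rng(0)

# Local objectives f_i(x) = 0.5*||A_i x - b_i||^2; their sum satisfies the PL condition.
A = [rng.standard_normal((4, d)) for _ in range(n)]
b = [rng.standard_normal(4) for _ in range(n)]
grad = lambda i, x: A[i].T @ (A[i] @ x - b[i])

# Directed ring: node i receives from node (i-1) mod n.
R = np.zeros((n, n))  # row-stochastic mixing for decision variables ("pull")
C = np.zeros((n, n))  # column-stochastic mixing for gradient trackers ("push")
for i in range(n):
    R[i, i] = R[i, (i - 1) % n] = 0.5
    C[i, i] = C[(i + 1) % n, i] = 0.5

x = np.zeros((n, d))
y = np.array([grad(i, x[i]) for i in range(n)])  # trackers initialized at local gradients

for _ in range(iters):
    x_new = R @ x - alpha * y                    # pull step plus gradient move
    y = C @ y + np.array([grad(i, x_new[i]) - grad(i, x[i]) for i in range(n)])
    x = x_new                                    # push step keeps tracking the average gradient

# All nodes should (approximately) agree on the minimizer of sum_i f_i.
print("consensus error:", np.max(np.abs(x - x.mean(axis=0))))
```

In APPG, each node runs such an update whenever it is activated, using whatever (possibly delayed) neighbor information is available, rather than waiting for a global synchronization round.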