Iteration acceleration for distributed learning systems

Authors: Wang, Junxiong; Wang, Hongzhi*; Zhao, Chenxu; Li, Jianzhong; Gao, Hong
Source: Parallel Computing, 2018, 72: 29-41.
DOI: 10.1016/j.parco.2018.01.001

Abstract

When iterative machine learning algorithms for objective function optimization run in a large-scale distributed environment, they can be blocked when some of the machines fail. In this paper, a hybrid approach is proposed to balance performance and efficiency. In each iteration, the results from failed machines are abandoned. We discuss the relationship between accuracy and abandon rate, which can be formulated as inequalities. It is argued that this process is both effective and efficient. The algorithm is demonstrated in a real distributed environment and shows a significant improvement in speedup.
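
The idea of abandoning results from failed machines in each iteration can be illustrated with a minimal single-process sketch. This is not the authors' algorithm: it assumes a toy one-parameter least-squares objective, simulates failures by dropping each worker's gradient independently with probability `abandon_rate`, and averages whatever gradients remain. All names (`shard_gradient`, `train_with_abandonment`, `make_shards`) and all parameter values are hypothetical.

```python
import random


def shard_gradient(w, shard):
    """Mean gradient of the squared error (w*x - y)^2 over one worker's shard."""
    return sum(2.0 * (w * x - y) * x for x, y in shard) / len(shard)


def train_with_abandonment(shards, abandon_rate, steps=200, lr=0.5, seed=0):
    """Gradient descent that ignores workers whose result is missing in a round.

    Assumption: a failure is simulated as an independent coin flip per worker
    per iteration; a real system would detect failures or timeouts instead.
    """
    rng = random.Random(seed)
    w = 0.0
    for _ in range(steps):
        # Keep only the shards whose (simulated) worker reported back.
        survivors = [s for s in shards if rng.random() >= abandon_rate]
        if not survivors:
            continue  # no result arrived this round; keep the current model
        # Average the surviving gradients instead of waiting for all workers.
        grad = sum(shard_gradient(w, s) for s in survivors) / len(survivors)
        w -= lr * grad
    return w


def make_shards(num_workers=8, per_worker=20, true_w=3.0, noise=0.1, seed=1):
    """Synthetic data y = true_w * x + noise, split evenly across workers."""
    rng = random.Random(seed)
    shards = []
    for _ in range(num_workers):
        xs = [rng.uniform(-1.0, 1.0) for _ in range(per_worker)]
        shards.append([(x, true_w * x + rng.gauss(0.0, noise)) for x in xs])
    return shards


if __name__ == "__main__":
    shards = make_shards()
    for rate in (0.0, 0.3, 0.6):
        w = train_with_abandonment(shards, abandon_rate=rate)
        print(f"abandon_rate={rate:.1f}  estimated w={w:.4f}")
```

Running the script prints the estimate for several abandon rates. On this toy problem the estimates should all stay close to the true value of 3.0, because each shard is an unbiased sample of the same data. The abstract's accuracy bounds would govern how much is lost in less favorable settings.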