Abstract

Because big data keeps growing rapidly and applications demand ever more accurate solutions, big data tasks that must finish within a fixed time take longer and longer to complete on existing parallel computing systems. To keep meeting the fixed completion time, these systems must be scaled accordingly. This paper therefore studies an iso-time scaling method to guide the scaling of parallel computing systems. First, models of big data parallel tasks and parallel computing systems are built, and an algorithm is designed to calculate the completion time of big data parallel tasks. Second, based on the actual situation of most current computing centers, we put forward several reasonable hypotheses, make full use of backup computational nodes, and optimize the cost of scaling parallel computing systems. Then, a vertical scaling algorithm is designed to upgrade computational nodes, and a horizontal scaling algorithm is designed to add computational nodes. Furthermore, the two scaling algorithms are compared in terms of time complexity, degree of parallelism, and system utilization of the scaled parallel computing system. Finally, simulation experiments are conducted. The results show that our method keeps the completion time within the fixed time when the growing data parallel tasks execute on the scaled parallel computing systems, and that it outperforms traditional methods in terms of scaling cost.