Approximation algorithms and heuristics for task scheduling in data-intensive distributed systems

Povoa Marcelo G*; Xavier Eduardo C

doi:10.1111/itor.12527

Abstract

This work addresses task scheduling on large-scale data-intensive computing systems. Good performance requires not only good task schedules but also good data allocation across the system's nodes, since a task can run only once it has access to the data it needs, which is distributed across the system. We present a general formulation of a static problem that combines scheduling and data replication in data-intensive distributed systems. We show that this problem does not admit an approximation algorithm. However, for a restricted version that incorporates some practical constraints, an approximation algorithm can be designed. From a practical perspective, we introduce a novel heuristic based on node clustering. Using computational simulations on an extensive set of instances based on real computer grids, we compare the heuristic with two approaches adapted from the literature. Our heuristic often obtains the best solutions and also runs faster than the other approaches.

Publication date: 2018-09
