Autonomous Orchestration of Distributed Discrete Event Simulations in the Presence of Resource Uncertainty

Sui Zhiquan*; Malensek Matthew; Harvey Neil; Pallickara Shrideep

doi:10.1145/2746345

Abstract

Discrete event simulations model the behavior of complex, real-world systems. Simulating a wide range of events and conditions provides a more nuanced model, but also increases its computational footprint. To manage these processing requirements in a scalable manner, discrete event simulations can be distributed across multiple computing resources. Orchestrating the simulations in a distributed setting involves coping with resource uncertainty. We consider three key aspects of resource uncertainty: resource failures, heterogeneity, and slowdowns. Each of these aspects is managed autonomously, which involves making accurate predictions of future execution times and latencies while also accounting for differences in hardware capabilities and dynamic resource consumption profiles. Further complicating matters, individual tasks within the simulation are stateful and stochastic, requiring inter-task communication and synchronization to produce accurate outcomes. We deal with these challenges through intelligent state collection and migration, active resource monitoring, and empirical evaluation of resource capabilities under changing conditions. To underscore the viability of our solution, we provide benchmarks using a production discrete event simulation that can simultaneously sustain failures, manage resource heterogeneity, and handle slowdowns while being orchestrated by our framework.

Publication date: 2015-10
