Multi-domain job coscheduling for leadership computing systems

Authors: Tang Wei*; Desai Narayan; Vishwanath Venkatram; Buettner Daniel; Lan Zhiling
Source: Journal of Supercomputing, 2013, 63(2): 367-384.
DOI: 10.1007/s11227-012-0741-6

Abstract

Current supercomputing centers usually deploy a large-scale compute system together with an associated data analysis or visualization system. Multiple scenarios have driven the demand that some associated jobs co-execute on different machines. We propose a multi-domain coscheduling mechanism, providing the ability to coordinate execution between jobs on multiple resource management domains without manual intervention. We have evaluated our mechanism based on real job traces from Intrepid and Eureka, the production Blue Gene/P system and a cluster with the largest GPU installation, deployed at Argonne National Laboratory. The experimental results show that coscheduling can be achieved with limited impact on system performance under varying workloads.

  • Publication date: 2013-2