Accelerating Data Analytics on Integrated GPU Platforms via Runtime Specialization

Authors: Farooqui Naila; Roy Indrajit; Chen Yuan*; Talwar Vanish; Barik Rajkishore; Lewis Brian; Shpeisman Tatiana; Schwan Karsten
Source: International Journal of Parallel Programming, 2018, 46(2): 336-375.
DOI: 10.1007/s10766-016-0482-x

Abstract

Integrated GPU systems are a cost-effective and energy-efficient option for accelerating data-intensive applications. While these platforms reduce the overhead of offloading computation to the GPU and offer the potential for fine-grained resource scheduling, several open challenges remain: (1) the distinct execution models inherent in the heterogeneous devices present on such platforms drive the need to dynamically match workload characteristics to the underlying resources, (2) the complex architecture and programming models of such systems require substantial application knowledge to achieve high performance, and (3) as such systems become prevalent, there is a need to extend their utility from running known regular data-parallel applications to the broader set of input-dependent, irregular applications common in enterprise settings. The key contribution of our research is to enable runtime specialization on such integrated GPU platforms by matching application characteristics to the underlying heterogeneous resources for both regular and irregular workloads. Our approach enables profile-driven resource management and optimizations for such platforms, providing high application performance and system throughput. Toward this end, this work proposes two novel schedulers with distinct goals: (a) a device-affinity, contention-aware scheduler that incorporates instrumentation-driven optimizations to improve the throughput of running diverse applications on integrated CPU-GPU servers, and (b) a specialized, affinity-aware work-stealing scheduler that efficiently distributes work across all CPU and GPU cores for the same application, taking into account both application characteristics and architectural differences of the underlying devices.

  • Publication date: 2018-4
