An Efficient GPU Implementation of Inclusion-Based Pointer Analysis

作者:Su, Yu*; Ye, Ding*; Xue, Jingling*; Liao, Xiang Ke*
来源:IEEE Transactions on Parallel and Distributed Systems, 2016, 27(2): 353-366.
DOI:10.1109/TPDS.2015.2397933

摘要

We present an efficient GPU implementation of Andersen's whole-program inclusion-based pointer analysis, a fundamental analysis on which many others are based, including optimising compilers, bug detection and security analyses. Andersen's algorithm makes extensive modifications to the graph that represents the pointer-manipulating statements in a program. These modifications are highly irregular, input-dependent and statically unpredictable, making it much more challenging to balance such graph workloads across a multitude of GPU cores than those dealt with by traditional graph algorithms such as DFS and BFS. To parallelise Andersen's analysis efficiently on GPUs, we introduce an imbalance-aware workload partitioning scheme that divides its workload dynamically among the concurrent warps, initially in a warp-centric manner (during the coarsegrain stage) but later switches to a task-pool-based model when a workload imbalance is detected (during the fine-grain stage). We improve further its performance by using an adaptive group propagation scheme to reduce some redundant traversals. For a set of 14 C benchmarks evaluated, our parallel implementation of Andersen's analysis achieves a significant speedup of 46 percent on average over the state-of-the art on an NVIDIA Tesla K20c GPU.