Iterative Data-parallel Mark&Sweep on a GPU

Veldema Ronald*; Philippsen Michael

doi:10.1145/2076022.1993480

Abstract

Automatic memory management makes programming easier. This also holds for general-purpose GPU computing, where no garbage collectors currently exist. In this paper we present a parallel mark-and-sweep collector that collects GPU memory on the GPU itself, and we tune its performance. Performance is increased by: (1) data-parallel marking and sweeping of regions of memory, (2) marking all elements of large arrays in parallel, and (3) trading off recursion against parallelism to match deeply linked data structures. Point (1) is achieved by coarsely processing all potential objects in a region of memory in parallel. When a large array is detected during (1), it is put aside and a parallel-for is later issued on the GPU to mark its elements. For a data structure that is a large linked list, we dynamically switch to a marking version with less overhead that performs a few recursive steps sequentially (while processing multiple lists in parallel). The collector achieves a speedup of up to a factor of 11 over a sequential collector on the same GPU.
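To make point (1) more concrete, the following is a minimal sketch of iterative, round-based marking followed by a sweep. It is a sequential Python simulation under stated assumptions, not the authors' GPU implementation: the `Obj` class, the list-based `heap`, and the function names `mark_iterative` and `sweep` are hypothetical, and the per-index loops stand in for what would plausibly be one GPU thread per object slot. It does not model the large-array or linked-list optimizations of points (2) and (3).

```python
from dataclasses import dataclass, field


@dataclass
class Obj:
    # Hypothetical object layout: indices of referenced heap slots plus two flags.
    refs: list = field(default_factory=list)
    marked: bool = False
    scanned: bool = False


def mark_iterative(heap, roots):
    """Mark everything reachable from roots in rounds; return the round count."""
    for r in roots:
        heap[r].marked = True
    rounds = 0
    while True:
        # Collect objects that are marked but whose children were not yet visited.
        # Each index is independent, so a GPU version could assign one thread
        # per slot (assumption; simulated sequentially here).
        frontier = [i for i, o in enumerate(heap)
                    if o is not None and o.marked and not o.scanned]
        if not frontier:
            return rounds
        for i in frontier:
            heap[i].scanned = True
            for child in heap[i].refs:
                heap[child].marked = True
        rounds += 1


def sweep(heap):
    """Free unmarked slots and reset flags on survivors; return freed indices."""
    freed = []
    for i, o in enumerate(heap):
        if o is None:
            continue
        if o.marked:
            o.marked = False
            o.scanned = False
        else:
            heap[i] = None
            freed.append(i)
    return freed
```

In this sketch a chain of n objects needs n rounds, one per link, which illustrates why a long linked list is a poor fit for purely round-based marking and why the abstract describes switching to a few sequential recursive steps for such structures.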

Publication date: 2011-11
