A Visual Approach to Investigating Shared and Global Memory Behavior of CUDA Kernels

Rosen Paul

doi:10.1111/cgf.12103

Abstract

We present an approach to investigate the memory behavior of a parallel kernel executing on thousands of threads simultaneously within the CUDA architecture. Our top-down approach allows for quickly identifying any significant differences between the execution of the many blocks and warps. As interesting warps are identified, we allow further investigation of memory behavior by visualizing the shared memory bank conflicts and global memory coalescence, first with an overview of a single warp with many operations and, subsequently, with a detailed view of a single warp and a single operation. We demonstrate the strength of our approach in the context of a parallel matrix transpose kernel and a parallel 1D Haar Wavelet transform kernel.
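
The two quantities the abstract refers to, shared memory bank conflicts and global memory coalescence, can be illustrated with a small model of a single warp issuing a single memory operation. The sketch below is not taken from the paper. It assumes 32 threads per warp, 32 shared memory banks of 4-byte words, and 128-byte global memory segments, and it uses hypothetical helper names (`conflict_degree`, `segments_touched`, `column_access`). Real hardware behavior depends on the compute capability, so treat the numbers as a simplified model only.

```python
from collections import Counter

# Assumptions (not from the paper): a simplified model of one warp.
WARP_SIZE = 32       # threads per warp
NUM_BANKS = 32       # shared memory banks, each one 4-byte word wide
WORD_BYTES = 4       # size of a float
SEGMENT_BYTES = 128  # global memory segment size used for coalescing


def conflict_degree(word_addresses):
    """Worst-case number of distinct words that fall into the same bank.

    1 means conflict-free; n means the access is serialized n ways.
    Threads reading the same word are served by one broadcast, so
    duplicate addresses are counted once.
    """
    per_bank = Counter(addr % NUM_BANKS for addr in set(word_addresses))
    return max(per_bank.values())


def segments_touched(word_addresses):
    """Number of distinct 128-byte segments one warp-wide access touches."""
    return len({(addr * WORD_BYTES) // SEGMENT_BYTES for addr in word_addresses})


def column_access(row_pitch, column=0):
    """Word addresses when lane i reads element [i][column] of a 2D array."""
    return [lane * row_pitch + column for lane in range(WARP_SIZE)]


def row_access(row_pitch, row=0):
    """Word addresses when lane i reads element [row][i] of a 2D array."""
    return [row * row_pitch + lane for lane in range(WARP_SIZE)]


# Shared memory: reading a column of a 32x32 tile, as a transpose might do.
unpadded_conflicts = conflict_degree(column_access(row_pitch=32))
padded_conflicts = conflict_degree(column_access(row_pitch=33))

# Global memory: reading a row versus a column of a 1024-wide matrix.
row_segments = segments_touched(row_access(row_pitch=1024))
column_segments = segments_touched(column_access(row_pitch=1024))

print("shared, pitch 32:", unpadded_conflicts, "way conflict")
print("shared, pitch 33:", padded_conflicts, "way conflict")
print("global, row read:", row_segments, "segment(s)")
print("global, column read:", column_segments, "segment(s)")
```

In this model, padding each tile row by one word spreads a column read across all banks, while a column read from global memory touches one segment per thread instead of one segment per warp. These are the kinds of per-warp, per-operation patterns that a detailed view of the sort described above would make visible.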

Publication date: 2013-06
