摘要

GPU utilizes the wide cache-line (128B) on-chip cache to provide high bandwidth and efficient memory accesses for applications with regularly-organized data structures. However, emerging applications exhibit a lot of irregular control flows and memory access patterns. Irregular memory accesses generate many fine-grain memory accesses to L1 data cache. This mismatching between fine-grain data accesses and the coarse-grain cache design makes the on-chip memory space more constrained and as a result, the frequency of cache line replacement increases and Ll data cache is utilized inefficiently. Fine-grain cache management is proposed to provide efficient cache management to improve the efficiency of data array utilization. Unlike other static fine-grain cache managements, we propose a dynamic multi-grain cache management, called DyCache, to resolve the inefficient use of L1 data cache. Through monitoring the memory access pattern of applications, DyCache can dynamically alter the cache management granularity in order to improve the performance of GPU for applications with irregular memory accesses while not impact the performance for regular applications. Our experiment demonstrates that DyCache can achieve a 40% geometric mean improvement on IPC for applications with irregular memory accesses against the baseline cache (128B), while for applications with regular memory accesses, DyCache does not degrade the performance.