Abstract

By rearranging data, data layout optimizations improve the utilization of a cache line between two of its successive refills, thereby reducing the total number of cache line refills and improving program performance. In this paper, we show that for structure data layout optimizations to be effective, two parameters, intra-instance affinity and inter-instance affinity, must be considered together so that cache line utilization is modeled more accurately. We also propose a lightweight approach to measuring intra-instance and inter-instance affinity that avoids complex memory trace analyses. A prototype, called ASLOP, has been implemented in the Open64 compiler and evaluated on benchmarks from the SPEC CPU 2000, SPEC CPU 2006, and Olden benchmark suites that make extensive use of structure types. On the two platforms we tested, our approach achieves up to 48.1% performance improvement over the original programs, and 11.9% over programs optimized using maximal reshaping, an existing approach known to produce close to the best results.