Abstract

To achieve high performance on multicore systems, shared-memory parallel languages must implement atomic operations efficiently. The commonly used and studied paradigms for atomicity are fine-grained locking, which is both difficult to program and error-prone; optimistic software transactions, which incur substantial overhead to detect and recover from atomicity violations; and compiler generation of locks from programmer-specified atomic sections, which leads to serialization whenever imprecise pointer analysis suggests the mere possibility of a conflicting operation. This paper presents a new strategy for compiler-generated locking that uses data structure knowledge to enable more precise alias and lock generation analyses and to reduce unnecessary serialization. Implementing and evaluating these ideas in the Java language shows that the new strategy achieves eight-thread speedups of 0.83 to 5.9 for the five STAMP benchmarks studied, outperforming software transactions on all but one benchmark and nearly matching programmer-specified fine-grained locks on all but one benchmark. The results also indicate that compiler knowledge of data structures improves the effectiveness of compiler analysis, boosting eight-thread performance by up to 300%. Further, the new analysis allows for software support of strong atomicity with less than 1% overhead for two benchmarks and less than 20% for three others. The strategy also nearly matches the performance of programmer-specified fine-grained locks for the SPECjbb2000 benchmark, which has traditionally not been amenable to static analyses.
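The contrast the abstract draws between conservative and data-structure-aware lock generation can be illustrated with a minimal hand-written Java sketch. This is not the paper's compiler or its output: the class `AtomicSectionSketch`, the bucketed counter array, and the methods `incrementCoarse` and `incrementFine` are hypothetical names chosen for illustration. The sketch assumes an atomic section that updates a single bucket, and shows by hand the two lowerings a compiler might choose: one global lock when aliasing is unknown, or one lock per bucket when the structure is known to partition the data.

```java
import java.util.concurrent.locks.ReentrantLock;

// Hypothetical illustration only; not the implementation described in the paper.
public class AtomicSectionSketch {
    static final int BUCKETS = 16;
    static final int THREADS = 8;
    static final int OPS_PER_THREAD = 1000;

    // Conservative lowering: imprecise alias information forces one global lock.
    static final ReentrantLock globalLock = new ReentrantLock();

    // Data-structure-aware lowering (assumed): one lock per bucket.
    static final ReentrantLock[] bucketLocks = new ReentrantLock[BUCKETS];
    static {
        for (int i = 0; i < BUCKETS; i++) {
            bucketLocks[i] = new ReentrantLock();
        }
    }

    // Atomic section guarded by the global lock: every update serializes.
    static void incrementCoarse(int[] counts, int key) {
        globalLock.lock();
        try {
            counts[key % BUCKETS]++;
        } finally {
            globalLock.unlock();
        }
    }

    // Same atomic section guarded per bucket: updates to different buckets
    // can proceed in parallel.
    static void incrementFine(int[] counts, int key) {
        ReentrantLock lock = bucketLocks[key % BUCKETS];
        lock.lock();
        try {
            counts[key % BUCKETS]++;
        } finally {
            lock.unlock();
        }
    }

    // Runs THREADS workers and returns the sum of all bucket counts.
    static int run(boolean fine) {
        int[] counts = new int[BUCKETS];
        Thread[] workers = new Thread[THREADS];
        for (int t = 0; t < THREADS; t++) {
            final int id = t;
            workers[t] = new Thread(() -> {
                for (int i = 0; i < OPS_PER_THREAD; i++) {
                    int key = id * OPS_PER_THREAD + i; // non-negative keys
                    if (fine) {
                        incrementFine(counts, key);
                    } else {
                        incrementCoarse(counts, key);
                    }
                }
            });
            workers[t].start();
        }
        try {
            for (Thread w : workers) {
                w.join(); // join makes the workers' writes visible here
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new RuntimeException(e);
        }
        int total = 0;
        for (int c : counts) {
            total += c;
        }
        return total;
    }

    public static void main(String[] args) {
        System.out.println("coarse total = " + run(false));
        System.out.println("fine total   = " + run(true));
    }
}
```

Both variants preserve atomicity, so each run totals 8000 increments; they differ only in how much they serialize. The sketch makes no claim about relative speed, which depends on contention and hardware.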

  • Publication date: 2010-5