Abstract

A key challenge in programming a chip multiprocessor (CMP) is how to evaluate the performance of the various possible program-task-to-core mapping choices during the initial programming phase, when the executable program has yet to be developed. In this paper, we put forward a thread-level modeling methodology to meet this challenge. The idea is to model thread-level activities only and to overlook instruction-level and microarchitectural details, except those that have a significant impact on thread-level performance. Moreover, since thread-level modeling is much coarser than instruction-level modeling, analysis at this level is significantly faster than at the instruction level. These features make the methodology particularly amenable to fast performance evaluation of a large number of program-task-to-core mapping choices during the initial programming phase. Building on this methodology, we develop an analytic modeling technique based on queuing theory and a fast simulation tool, both of which allow fast performance prediction for CMPs. Case studies based on a large number of code samples available in the IXP1200/2400 workbenches demonstrate that the maximal sustainable line rates estimated using our simulation tool and queuing network models are consistently within 6 and 8 percent of cycle-accurate simulation results, respectively.
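To give a flavor of what a thread-level queuing estimate can look like, the following is a minimal sketch using textbook exact mean value analysis (MVA) for a closed, single-class queuing network. It is not the model developed in the paper. The function name `mva_throughput`, the two-station layout (one core, one memory unit), and the cycle counts are all hypothetical and chosen only for illustration.

```python
from typing import Sequence


def mva_throughput(demands: Sequence[float], n_threads: int,
                   think_time: float = 0.0) -> float:
    """Exact single-class MVA for a closed queuing network.

    demands    -- service demand per packet at each queueing station (cycles)
    n_threads  -- number of threads circulating in the network
    think_time -- pure delay per packet with no queueing (cycles)
    Returns the throughput in packets per cycle.
    """
    if n_threads < 1:
        raise ValueError("n_threads must be at least 1")
    queue_len = [0.0] * len(demands)  # mean queue lengths with 0 threads
    throughput = 0.0
    for n in range(1, n_threads + 1):
        # An arriving thread sees the queue left by a network with n - 1 threads.
        residence = [d * (1.0 + q) for d, q in zip(demands, queue_len)]
        throughput = n / (think_time + sum(residence))
        # Little's law gives the new mean queue length at each station.
        queue_len = [throughput * r for r in residence]
    return throughput


if __name__ == "__main__":
    # Hypothetical demands: 200 cycles of compute and 120 cycles of
    # memory access per packet (assumed numbers, not from the paper).
    demands = [200.0, 120.0]
    for threads in (1, 2, 4, 8):
        x = mva_throughput(demands, threads)
        print(f"{threads} threads: {x:.5f} packets/cycle")
```

Under these assumptions, throughput rises as threads are added and saturates at the reciprocal of the largest service demand. This is the kind of bound a thread-level model can expose without simulating individual instructions.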

  • Publication date: 2014-2