Abstract
Providing high-level tools for parallel programming while sustaining a high level of performance is a challenge that techniques such as Domain Specific Embedded Languages (DSELs) try to address. In previous work, we investigated the design of one such DSEL, NT, which provides a MATLAB-like syntax for parallel numerical computations inside a C++ library. In this paper, we show how NT has been redesigned for shared-memory systems in an extensible and portable way. The new NT design relies on a tiered Parallel Skeleton system built on asynchronous task management and automatic compile-time taskification of user-level code. We describe how this system can work with various shared-memory runtimes, and we evaluate the design using two benchmarks that implement linear algebra algorithms.
- Publication date: 2016-6
- Affiliation: INRIA