Abstract

Streaming SIMD Extensions (SSE) and Advanced Vector Extensions (AVX) are additional processor instruction sets available in contemporary personal computers, designed for vectorized floating-point calculations. Unfortunately, to benefit from these instructions one cannot rely on the automatic options of high-level language compilers. Instead, handwritten assembly language or inserted intrinsic function calls are necessary. Using this approach, an accelerated C++ code is devised for solving (quasi-)block-tridiagonal linear algebraic equation systems by means of an extended Thomas algorithm. Speedups reaching 3.5 and 3 (relative to C++ without SSE/AVX) are demonstrated for single and double precision calculations, respectively.
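To illustrate what inserting intrinsic function calls can look like in C++, the following is a minimal sketch and not the code from the paper. The helper name `axpy` and the macro `SKETCH_HAVE_SSE` are hypothetical, and the kernel (y += a·x in single precision) is only assumed to resemble the elementary vector updates that occur inside block operations. It uses standard SSE intrinsics from `<xmmintrin.h>` and falls back to a scalar loop where SSE is not available.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Assumption: SSE availability is detected through common compiler macros.
#if defined(__SSE__) || defined(_M_X64)
#include <xmmintrin.h>  // SSE intrinsics: 4 single-precision lanes per register
#define SKETCH_HAVE_SSE 1
#endif

// Hypothetical helper (not from the paper): computes y[i] += a * x[i].
void axpy(float a, const std::vector<float>& x, std::vector<float>& y) {
    assert(x.size() == y.size());
    const std::size_t n = x.size();
    std::size_t i = 0;
#ifdef SKETCH_HAVE_SSE
    const __m128 va = _mm_set1_ps(a);             // broadcast a to all 4 lanes
    for (; i + 4 <= n; i += 4) {
        __m128 vx = _mm_loadu_ps(&x[i]);          // unaligned load of 4 floats
        __m128 vy = _mm_loadu_ps(&y[i]);
        vy = _mm_add_ps(vy, _mm_mul_ps(va, vx));  // vy += a * vx, lane by lane
        _mm_storeu_ps(&y[i], vy);                 // unaligned store of 4 floats
    }
#endif
    // Scalar loop: handles the remaining tail, or everything without SSE.
    for (; i < n; ++i) {
        y[i] += a * x[i];
    }
}
```

Calling `axpy(2.0f, x, y)` on two equally sized vectors updates `y` in place. The unaligned load and store variants are chosen here so that the sketch works with ordinary `std::vector` storage; aligned data and the AVX counterparts (8 single-precision lanes) would be natural refinements.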

  • Publication date: 2016-12