A Hybrid Architecture With Low Latency Interfaces Enabling Dynamic Cache Management

Gemieux, Michel; Li, Meng*; Savaria, Yvon; David, Jean-Pierre; Zhu, Guchuan

doi:10.1109/ACCESS.2018.2876597

Abstract

The dominant technologies in the high-performance computing (HPC) market, such as GPUs and multicore systems, focus mainly on processing power, while much less attention has been paid to communication delays inside hybrid architectures. To fill this gap, this paper presents an experimental study on Intel's Broadwell Xeon multicore processor with an integrated Arria 10 FPGA to characterize the communication delays between the CPUs and the FPGA, using both the low-latency cache-coherent interface and the two PCIe links offered by this platform. The results show that FPGA cache access latency can be as low as 25 cycles at 400 MHz and that the platform can reach a bandwidth of over 20 GB/s by aggregating the three available links. Furthermore, an FPGA-based cache management mechanism is proposed and implemented. This scheme takes advantage of QPI cache coherency and queuing theory to achieve low-latency, efficient memory management. A case study on a Merkle tree hash function shows that a hardware accelerator can achieve a fivefold data access acceleration in the worst-case scenario. In addition, design recommendations are given for using the CPU-FPGA platform to implement fine-grained memory management schemes.
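
For a sense of scale, the reported best-case cache access latency can be converted from clock cycles to time. This is plain arithmetic on the two figures quoted in the abstract (25 cycles, 400 MHz); the symbols t, N, and f are introduced here for illustration only and do not come from the paper.

\[
t = \frac{N}{f} = \frac{25}{400 \times 10^{6}\ \text{Hz}} = 62.5\ \text{ns}
\]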

Publication date: 2018
Affiliation: Zhejiang Sci-Tech University

Last updated: 2021-09-17 18:21
