Abstract

As many emerging applications (e.g., deep learning and data mining) use FPGAs for acceleration, the design of highly optimized, application-specific soft processors on FPGAs has attracted considerable attention. The cache is an important component of a soft processor and is built from Block RAMs (BRAMs) in the FPGA. SRAM-based BRAMs suffer from high static power consumption and a large area penalty, which prevents the implementation of large caches with high associativity. STT-RAM-based BRAMs may be a good solution to these issues. However, neither existing cache designs that use SRAM-based BRAMs for soft processors nor SRAM/STT-RAM hybrid cache designs for conventional processors are suitable for a cache built from STT-RAM-based BRAMs. In this paper, we propose a BRAM allocation method that can effectively implement highly set-associative caches while reducing the impact of the long delay and high power consumption of write operations in STT-RAM. Using our framework, we show that the optimal size of an STT-RAM-based BRAM for a soft-processor cache is 1 KB with a 64-bit I/O width, and that the proposed cache structure reduces power and area by 55.3% and 76.9% on average, respectively, and reduces runtime by up to 15.6%. Supporting diverse cache sizes and associativities enables application-specific optimization of the cache. In addition, we show that a hybrid cache combining SRAM and STT-RAM is not recommended for soft processors.

  • Publication date: 2018-5-25