Abstract

This paper presents an asynchronous instruction cache memory designed for average-case rather than worst-case performance. Although the proposed instruction cache is based on a fixed delay model, it achieves high throughput by employing a new memory segmentation technique that divides the cache memory cell arrays into multiple memory segments. Conventional bit-line memory segmentation divides the whole memory system into multiple segments of equal size. In contrast, we propose a new bit-line segmentation technique in which the cache memory consists of multiple segments that all have the same delay bound. We use resistor-capacitor (R-C) modeling of the bit-line delay for the content addressable memory-random access memory (CAM-RAM) structure of the cache to estimate the total bit-line delay. We then choose the number of segments to trade off throughput against the complexity of the cache system. We synthesized a 128 KB cache memory with 1 to 16 segments using the Hynix 0.35-µm CMOS process. In simulation, our implementation with dividing factors of 4 and 16 can reduce the average cache access time to 28% and 35%, respectively, compared to the non-segmented counterpart system. It also reduces the average cache access time by 11% and 17%, respectively, compared to a bit-line segmented cache with the same number of equal-size segments.
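As a rough illustration of why bit-line segmentation shortens access time under an R-C model, the sketch below uses the textbook Elmore delay of a uniform RC ladder. It is not the paper's model: the function names, the per-cell resistance and capacitance, and the assumption that a segment's delay depends only on its own cell count (ignoring the global bit-line, switches, and sense amplifiers) are all hypothetical simplifications.

```python
def elmore_delay(n_cells: int, r_cell: float, c_cell: float) -> float:
    """Elmore delay at the far end of a uniform RC ladder of n_cells sections.

    Assumption: each cell adds series resistance r_cell and shunt
    capacitance c_cell to the bit-line. The i-th capacitor (1-based) sees
    an upstream resistance of i * r_cell, so the delay is
    r_cell * c_cell * (1 + 2 + ... + n_cells).
    """
    if n_cells < 0:
        raise ValueError("n_cells must be non-negative")
    return r_cell * c_cell * n_cells * (n_cells + 1) / 2


def equal_size_segment_delay(total_cells: int, n_segments: int,
                             r_cell: float, c_cell: float) -> float:
    """Local bit-line delay of one segment when the line is split evenly.

    Hypothetical simplification: only the segment's own cells load the line.
    """
    if n_segments <= 0 or total_cells % n_segments != 0:
        raise ValueError("total_cells must be divisible by n_segments")
    return elmore_delay(total_cells // n_segments, r_cell, c_cell)


if __name__ == "__main__":
    # Illustrative values only; not taken from the paper.
    total_cells, r_cell, c_cell = 256, 1.0, 1.0
    for k in (1, 2, 4, 8, 16):
        delay = equal_size_segment_delay(total_cells, k, r_cell, c_cell)
        print(f"{k:2d} segments: per-segment delay = {delay}")
```

Because the ladder delay grows roughly quadratically with the number of cells, splitting a line into k equal segments cuts the per-segment delay by about a factor of k squared in this simplified model. The equal-delay segmentation proposed in the abstract would pick segment sizes differently, and this sketch does not attempt to reproduce it.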

  • Publication date: 2014-5