Abstract

We present a reconfigurable SIMT multi-core processor with a shared memory for mobile ray tracing. The processor addresses two problems of SIMT architectures: branch divergence among concurrently executed threads and contention in the shared memory. Performance degradation due to branch divergence is reduced by dividing a wide SIMT datapath into several narrow SIMT cores that execute independent threads asynchronously. The shared-memory contention caused by the multiple SIMT cores is alleviated by a new time-division multiplexing (TDM) scheme based on multi-phase clocks: the SIMT cores are synchronized with multi-phase clocks so that they send their requests to the shared memory sequentially rather than concurrently, which hides arbitration delays. The processor achieves the same datapath utilization as 4-wide SIMT, which has been widely used by CPU-based ray tracers, while occupying only 68% of the area of the 4-wide SIMT. As a result, performance normalized to area improves by 26% over previous work, with negligible overheads (2.6% in area and 1% in power consumption). The chip was fabricated in 90 nm CMOS technology and contains 2.3 M logic gates and 19.3 KB of SRAM. It consumes 221 mW at 100 MHz with Vdd = 1.2 V.
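The multi-phase-clock TDM idea can be illustrated with a small timing model. This is only a sketch under stated assumptions, not the paper's implementation: the abstract does not give the number of SIMT cores or the phase spacing, so the core count of 4, the evenly spaced phases, and the function names `phase_offsets`, `request_times`, and `has_collision` are all hypothetical. Only the 10 ns period (100 MHz) comes from the abstract.

```python
from fractions import Fraction


def phase_offsets(num_cores, period_ns):
    """Clock-edge offset of each core within one clock period.

    Assumption: phases are evenly spaced, i.e. core k is delayed by
    k/num_cores of a period. The abstract does not state the spacing.
    """
    return [Fraction(period_ns) * k / num_cores for k in range(num_cores)]


def request_times(num_cores, period_ns, cycles):
    """Sorted (time_ns, core) pairs, one memory request per core per cycle."""
    offsets = phase_offsets(num_cores, period_ns)
    return sorted(
        (cycle * Fraction(period_ns) + offsets[core], core)
        for cycle in range(cycles)
        for core in range(num_cores)
    )


def has_collision(schedule):
    """True if two requests reach the shared memory at the same instant."""
    times = [t for t, _ in schedule]
    return len(times) != len(set(times))


if __name__ == "__main__":
    # 100 MHz core clock (from the abstract) gives a 10 ns period.
    # The core count of 4 is an assumption made for illustration.
    schedule = request_times(num_cores=4, period_ns=10, cycles=2)
    for time_ns, core in schedule:
        print(f"t = {float(time_ns):5.1f} ns  core {core}")
    print("collision:", has_collision(schedule))
```

In this model every core owns a fixed sub-slot of each clock period, so requests arrive at the shared memory one at a time and no run-time arbiter has to choose between them, which is the sense in which a fixed phase assignment can hide arbitration delay.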

  • Publication date: 2013-4
