Abstract

This paper proposes hyperscalar, a multi-core architecture whose scalar cores can be dynamically united into a larger superscalar processor to accelerate a single thread. To accomplish this, we propose virtual shared register files (VSRFs), which present the instructions of a thread running on different cores with a logically uniform set of register files. We also propose an instruction analyzer that detects dependences and tags newly fetched instructions with the dependence information. Using these tags, instructions in the united cores issue requests to obtain their remote operands via the VSRF, so that dependences among instructions in different cores can be resolved. Moreover, extended instructions are defined that let programmers grow or shrink the number of united cores to match the instruction-level parallelism (ILP) available in different applications. This reconfigurability lets hyperscalar cover a wide spectrum of workloads, providing high single-thread performance when thread-level parallelism (TLP) is low and high throughput when TLP is high. Simulation results show that the two-, four-, and eight-core-united configurations of an eight-core hyperscalar chip multiprocessor achieve 93%, 80%, and 76% of the performance of monolithic two-, four-, and eight-issue out-of-order superscalar processors, respectively, with lower area costs and better support for software diversity.
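
As an informal illustration of the dependence tagging described above, the following minimal sketch marks each source operand of a newly fetched instruction as local, remote, or already committed. It is not the hardware design of the paper: the round-robin assignment of instructions to cores, the tag strings, and the names `Instr` and `analyze` are all assumptions made for this example.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional, Tuple


@dataclass
class Instr:
    dest: Optional[str]                 # destination register, or None
    srcs: Tuple[str, ...]               # source registers
    core: int = -1                      # core the instruction is steered to
    tags: Dict[str, str] = field(default_factory=dict)  # per-source tag


def analyze(instrs: List[Instr], num_cores: int) -> List[Instr]:
    """Tag every source operand with where its value will come from."""
    last_writer: Dict[str, int] = {}    # register -> core of latest producer
    for i, ins in enumerate(instrs):
        ins.core = i % num_cores        # ASSUMPTION: round-robin steering
        for src in ins.srcs:
            producer = last_writer.get(src)
            if producer is None:
                ins.tags[src] = "committed"          # no in-flight producer
            elif producer == ins.core:
                ins.tags[src] = "local"              # produced on the same core
            else:
                ins.tags[src] = f"remote:{producer}"  # would be requested via the VSRF
        if ins.dest is not None:
            last_writer[ins.dest] = ins.core
    return instrs


# Hypothetical four-instruction thread spread over two united cores.
program = [
    Instr("r1", ()),
    Instr("r2", ("r1",)),
    Instr("r3", ("r1", "r2")),
    Instr("r4", ("r3", "r0")),
]
analyze(program, num_cores=2)
for ins in program:
    print(ins.core, ins.dest, ins.tags)
```

In this sketch, an operand tagged `remote:<core>` corresponds to a value that an instruction would have to request from another united core, whereas `local` and `committed` operands need no cross-core request.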