Decimal SRT Square Root: Algorithm and Architecture

作者:Kaivani Amir; Ko Seok Bum*
来源:Circuits, Systems, and Signal Processing, 2013, 32(5): 2137-2150.
DOI:10.1007/s00034-013-9586-3

摘要

Given the popularity of decimal arithmetic, hardware implementation of decimal operations has been a hot topic of research in recent decades. Besides the four basic operations, the square root can be implemented as an instruction directly in the hardware, which improves the performance of the decimal floating-point unit in the processors. Hardware implementation of decimal square rooters is usually done using either functional or digit-recurrence algorithms. Functional algorithms, entailing multiplication per iteration, seem inadequate to use for decimal square roots, given the high cost of decimal multipliers. On the other hand, digit-recurrence square root algorithms, particularly SRT (this method is named after its creators, Sweeney, Robertson, and Tocher) algorithms, are simple and well suited for decimal arithmetic. This paper, with the intention of reducing the latency of the decimal square root operation while maintaining a reasonable cost, proposes an SRT algorithm and the corresponding hardware architecture to compute the decimal square root. The proposed fixed-point square root design requires n+3 cycles to compute an n-digit root; the synthesis results show an area cost of about 31K NAND2 and a cycle time of 40 FO4. These results reveal the 14 % speed advantage of the proposed decimal square root architecture over the fastest previous work (which uses a functional algorithm) with about a quarter of the area.

  • 出版日期2013-10