Abstract

This paper proposes a VLSI architecture for an optimized Speeded-Up Robust Features (SURF) algorithm. The SURF algorithm, which is widely used in computer vision applications, locates interest points (IPoints) and extracts feature descriptors based on the surrounding gradients. In the proposed approach, the SURF algorithm is modified to make it more suitable for hardware implementation, with little loss of accuracy compared to OpenSURF, an open-source software implementation based on OpenCV. The resource cost and throughput gain are balanced. Word Length Reduction (WLR) is adopted to compress the integral image and reduce the memory it occupies. The orientations and feature descriptors of the IPoints are calculated in a more efficient way, which significantly reduces memory accesses. Moreover, the operations are pipelined both within and among the proposed hardware modules. The architecture has been validated on an FPGA (Xilinx Virtex-4 XC4VLX80), where it detects IPoints and extracts SURF feature descriptors from VGA (640 × 480) video input at 64 fps with a 96 MHz working frequency while dissipating 1.278 W. This throughput is more than double that reported in the recent literature.
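
The abstract does not say how WLR is realized, so the following is only a minimal sketch of one common way to shorten integral-image words, not the architecture's actual method. It assumes each entry is stored modulo 2^bits and that every box sum actually queried fits in that many bits, in which case the wrapped arithmetic still returns the exact sum. The function names `integral_image` and `box_sum` and the `bits` parameter are hypothetical.

```python
def integral_image(img, bits=None):
    """Build an integral image with one zero row and column of padding.

    If `bits` is given, every entry is stored modulo 2**bits, which models
    a reduced word length in memory (an assumption of this sketch).
    """
    h, w = len(img), len(img[0])
    mask = (1 << bits) - 1 if bits is not None else None
    ii = [[0] * (w + 1) for _ in range(h + 1)]
    for y in range(h):
        row_sum = 0
        for x in range(w):
            row_sum += img[y][x]
            value = ii[y][x + 1] + row_sum
            ii[y + 1][x + 1] = value & mask if mask is not None else value
    return ii


def box_sum(ii, x0, y0, x1, y1, bits=None):
    """Sum of pixels with x0 <= x < x1 and y0 <= y < y1 (four lookups).

    With `bits` set, the result is reduced modulo 2**bits; it equals the
    true sum whenever the true sum is smaller than 2**bits.
    """
    s = ii[y1][x1] - ii[y0][x1] - ii[y1][x0] + ii[y0][x0]
    return s & ((1 << bits) - 1) if bits is not None else s


# A 32 x 32 all-white 8-bit image: the full integral image overflows
# 16 bits, but a 9 x 9 box sum does not.
image = [[255] * 32 for _ in range(32)]
full = integral_image(image)
reduced = integral_image(image, bits=16)
exact = box_sum(full, 4, 4, 13, 13)
wrapped = box_sum(reduced, 4, 4, 13, 13, bits=16)
```

Here `wrapped` matches `exact` even though many entries of `reduced` have wrapped around, because the queried box sum is below 2^16. The word length that is safe in practice depends on the largest box filter used, which the abstract does not state.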

Full text