Accelerating Multiresolution Gabor Feature Extraction for Real Time Vision Applications

Authors: Cho Yong Cheol Peter*; Chandramoorthy Nandhini; Irick Kevin M; Narayanan Vijaykrishnan
Source: Journal of Signal Processing Systems for Signal Image and Video Technology, 2014, 76(2): 149-168.
DOI: 10.1007/s11265-014-0873-4

Abstract

Multiresolution Gabor filter banks are used for feature extraction in a variety of applications, as Gabor filters have been shown to be exceptional feature extractors with a close correspondence to the simple cells in the primary visual cortex (V1) of the brain. Yet applying the Gabor filter is a computationally intensive task. Most applications that use the Gabor feature space require real-time results; however, the large quantity of computations involved has hindered systems from achieving real-time performance. The natural solution for such compute-intensive tasks is parallelization. FPGAs have emerged as attractive platforms for compute-intensive signal processing applications due to their massively parallel computation resources as well as their low power consumption and affordability. We present a configurable architecture for Gabor feature extraction on FPGA that enhances the resource utilization of the FPGA hardware fabric while maintaining a streaming data flow to yield exceptional performance. The increased resource utilization resulting from configurability, optimizations, and resource sharing allows for higher levels of parallelism to achieve real-time feature extraction of high-resolution images. Two architectures are introduced. The first is an architecture for multiresolution feature extraction with extensive resource sharing for enhanced resource utilization. The second is an architecture for many-orientation applications that uses a coarse-to-fine-grain method to enhance resource utilization by reducing the number of filters applied at different orientations.
Our results show that our multiresolution implementation achieves real-time performance on 2048 x 1526 images and exhibits a 6x speedup over a GPU implementation, while delivering an energy efficiency of 0.4 fps/W compared to 0.036 fps/W for the GPU. The implementation for many-orientation applications using the coarse-to-fine-grain method exhibits resource saving of at most for O number of orientations and higher, compared to a fully parallel architecture, and a 25x speedup compared to a GPU implementation for 16 orientations.
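
For readers unfamiliar with the computation being accelerated, the following is a minimal software sketch of multiresolution Gabor feature extraction in Python with NumPy. It is not the FPGA architecture described in the abstract, only an illustration of why the workload is heavy: every pixel at every scale is filtered once per orientation. The function names (`gabor_kernel`, `filter2d`, `downsample`, `gabor_features`), the kernel size, and all filter parameters (`wavelength`, `sigma`, `gamma`) are assumptions chosen for illustration and do not come from the paper.

```python
import numpy as np

def gabor_kernel(size, wavelength, theta, sigma, gamma=0.5, psi=0.0):
    """Real-valued Gabor kernel: a Gaussian envelope times a cosine carrier.
    `size` must be odd; all parameter values are illustrative assumptions."""
    half = size // 2
    y, x = np.mgrid[-half:half + 1, -half:half + 1]
    # Rotate coordinates so the carrier oscillates along direction theta.
    x_r = x * np.cos(theta) + y * np.sin(theta)
    y_r = -x * np.sin(theta) + y * np.cos(theta)
    envelope = np.exp(-(x_r ** 2 + (gamma * y_r) ** 2) / (2.0 * sigma ** 2))
    carrier = np.cos(2.0 * np.pi * x_r / wavelength + psi)
    return envelope * carrier

def filter2d(image, kernel):
    """Same-size 2D correlation with edge padding (direct, not FFT-based)."""
    half = kernel.shape[0] // 2
    padded = np.pad(image, half, mode="edge")
    windows = np.lib.stride_tricks.sliding_window_view(padded, kernel.shape)
    return np.einsum("ijkl,kl->ij", windows, kernel)

def downsample(image):
    """Halve the resolution by averaging non-overlapping 2x2 blocks."""
    h, w = image.shape
    h2, w2 = h - h % 2, w - w % 2
    return image[:h2, :w2].reshape(h2 // 2, 2, w2 // 2, 2).mean(axis=(1, 3))

def gabor_features(image, n_scales=3, n_orientations=4, size=7):
    """Apply a bank of oriented Gabor filters at each level of an image pyramid.
    Returns a dict mapping (scale, orientation_index) to a response map."""
    features = {}
    level = image.astype(float)
    for s in range(n_scales):
        for k in range(n_orientations):
            theta = k * np.pi / n_orientations
            kernel = gabor_kernel(size, wavelength=4.0, theta=theta, sigma=2.0)
            features[(s, k)] = filter2d(level, kernel)
        level = downsample(level)
    return features
```

Calling `gabor_features(image, n_scales=3, n_orientations=4)` on a 2D grayscale array yields 12 response maps, one per scale and orientation pair. The filters are independent of one another, which is the property that makes the workload a natural fit for parallel hardware.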

  • Publication date: 2014-8