Accelerating data mining workloads: current approaches and future challenges in system architecture design

Authors: Choudhary Alok N*; Honbo Daniel; Kumar Prabhat; Ozisikyilmaz Berkin; Misra Sanchit; Memik Gokhan
Source: Wiley Interdisciplinary Reviews-Data Mining and Knowledge Discovery, 2011, 1(1): 41-54.
DOI: 10.1002/widm.9

Abstract

Conventional systems based on general-purpose processors cannot keep pace with the exponential increase in the generation and collection of data. It is therefore important to explore alternative architectures that can provide the computational capabilities required to analyze ever-growing datasets. Programmable graphics processing units (GPUs) offer computational capabilities that surpass even high-end multi-core central processing units (CPUs), making them well suited for floating-point- or integer-intensive, data-parallel operations. Field-programmable gate arrays (FPGAs), which can be reconfigured to implement an arbitrary circuit, provide the capability to specify a customized datapath for any task. The multiple granularities of parallelism offered by FPGA architectures, as well as their high internal bandwidth, make them suitable for low-complexity parallel computations. GPUs and FPGAs can serve as coprocessors for data mining applications, allowing the CPU to offload computationally intensive tasks for faster processing. Experiments have shown that heterogeneous architectures employing GPUs or FPGAs can result in significant application speedups over homogeneous CPU-based systems, while increasing performance per watt.