A 3.77TOPS/W Convolutional Neural Network Processor With Priority-Driven Kernel Optimization

Authors: Yue, Jinshan; Liu, Yongpan*; Yuan, Zhe; Wang, Zhibo; Guo, Qingwei; Li, Jinyang; Yang, Chengmo; Yang, Huazhong
Source: IEEE TRANSACTIONS ON CIRCUITS AND SYSTEMS II-EXPRESS BRIEFS, 2019, 66(2): 277-281.
DOI: 10.1109/TCSII.2018.2846698

Abstract

Convolutional neural networks (CNNs) have become very popular in image classification tasks. With the increasing demand for intelligent classification on battery-powered devices, energy-efficient ASICs for CNNs are badly needed. Previous CNN ASIC processors support operations of different kernel sizes, but they sacrifice efficiency to provide this flexibility. In fact, convolution operations with one particular kernel size dominate in many real-world CNNs. This brief proposes a kernel-optimized architecture for 3 × 3 kernels (KOP3), which are the dominant operations in mainstream image classification CNNs. Although KOP3 targets 3 × 3 kernel operations, it also provides programmability to support arbitrary kernel sizes. KOP3 achieves an average energy efficiency of 3.77 TOPS/W, which is 4.01× better than the best state-of-the-art CNN ASIC processor.
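The abstract does not say how KOP3 maps other kernel sizes onto hardware tuned for 3 × 3 kernels. As a purely illustrative sketch, and not the mechanism used in the paper, the Python below shows one generic way a 3 × 3 primitive can cover an arbitrary kernel: split the kernel into zero-padded 3 × 3 tiles and accumulate the partial results. The function names (`conv2d_valid`, `conv2d_via_3x3`, `pixel`) are hypothetical and chosen only for this example.

```python
def conv2d_valid(image, kernel):
    """Direct 'valid' 2-D convolution in the CNN sense (no kernel flip)."""
    kh, kw = len(kernel), len(kernel[0])
    oh, ow = len(image) - kh + 1, len(image[0]) - kw + 1
    return [[sum(image[r + i][c + j] * kernel[i][j]
                 for i in range(kh) for j in range(kw))
             for c in range(ow)]
            for r in range(oh)]


def pixel(image, r, c):
    """Return image[r][c], or 0 outside the image (implicit zero padding)."""
    if 0 <= r < len(image) and 0 <= c < len(image[0]):
        return image[r][c]
    return 0


def conv2d_via_3x3(image, kernel):
    """Same result as conv2d_valid, computed only from 3x3 multiply-accumulates.

    Assumption for illustration: the kernel is cut into 3x3 tiles, tiles that
    run past the kernel edge are zero-padded, and each tile's partial sum is
    accumulated into the output at the matching offset.
    """
    kh, kw = len(kernel), len(kernel[0])
    oh, ow = len(image) - kh + 1, len(image[0]) - kw + 1
    out = [[0] * ow for _ in range(oh)]
    for ti in range(0, kh, 3):          # tile origin, row
        for tj in range(0, kw, 3):      # tile origin, column
            tile = [[kernel[ti + i][tj + j] if ti + i < kh and tj + j < kw else 0
                     for j in range(3)]
                    for i in range(3)]
            for r in range(oh):
                for c in range(ow):
                    out[r][c] += sum(pixel(image, r + ti + i, c + tj + j) * tile[i][j]
                                     for i in range(3) for j in range(3))
    return out


if __name__ == "__main__":
    image = [[(7 * r + 3 * c) % 5 for c in range(8)] for r in range(8)]
    kernel = [[(r + 2 * c) % 3 - 1 for c in range(5)] for r in range(5)]
    print(conv2d_via_3x3(image, kernel) == conv2d_valid(image, kernel))
```

In this sketch a 5 × 5 kernel needs four 3 × 3 passes, so 36 multiply-accumulates per output instead of 25. That overhead illustrates why flexibility tends to cost efficiency when the datapath is fixed to one kernel size; it says nothing about the overhead of the actual KOP3 design.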