
This paper proposes a general framework for 3D convolutional neural networks. The framework defines five kinds of layers: convolutional, max-pooling, dropout, Gabor, and optical flow layers. General rules for designing 3D convolutional neural networks are discussed. Four specific networks are designed for facial expression recognition, and their decisions are fused into an ensemble. On the Extended Cohn-Kanade dataset, the single networks and the ensemble network achieve accuracies of 92.31% and 96.15%, respectively. On the FEEDTUM dataset, the ensemble network obtains an accuracy of 61.11%. A reusable open-source project called 4DCNN is released, which makes it convenient to implement 3D convolutional neural networks for specific tasks.
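As an illustration of how the decisions of several networks can be fused, the following minimal sketch averages the per-class probabilities of each network and picks the class with the highest mean. The abstract does not state which fusion rule is used, so plain averaging is an assumption here, and the function names and the example scores are hypothetical.

```python
from typing import List, Sequence


def fuse_decisions(network_probs: Sequence[Sequence[float]]) -> List[float]:
    """Average per-class probabilities across networks.

    Assumption: simple averaging; the actual fusion rule may differ.
    """
    if not network_probs:
        raise ValueError("need at least one network output")
    num_classes = len(network_probs[0])
    if any(len(p) != num_classes for p in network_probs):
        raise ValueError("all networks must score the same classes")
    n = len(network_probs)
    return [sum(p[c] for p in network_probs) / n for c in range(num_classes)]


def predict(network_probs: Sequence[Sequence[float]]) -> int:
    """Return the index of the class with the highest fused probability."""
    fused = fuse_decisions(network_probs)
    return max(range(len(fused)), key=fused.__getitem__)


# Hypothetical outputs of four networks over three expression classes.
example = [
    [0.60, 0.30, 0.10],
    [0.20, 0.50, 0.30],
    [0.50, 0.40, 0.10],
    [0.40, 0.35, 0.25],
]

fused = fuse_decisions(example)
label = predict(example)
```

In this example the second network alone would choose class 1, but the fused decision follows the other three networks and selects class 0, which is the kind of error correction an ensemble is intended to provide.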