A novel softplus linear unit for deep convolutional neural networks
Applied Intelligence, 2018, 48(7): 1707-1720.
Current improvements in the performance of deep neural networks are partly due to the proposition of rectified linear units. A ReLU activation function outputs zero for negative component, inducing the death of some neurons and a bias shift of the outputs, which causes oscillations and impedes learning. According to the theory that "zero mean activations improve learning ability", a softplus linear unit (SLU) is proposed as an adaptive activation function that can speed up learning and improve performance in deep convolutional neural networks. Firstly, for the reduction of the bias shift, negative inputs are processed using the softplus function, and a general form of the SLU function is proposed. Secondly, the parameters of the positive component are fixed to control vanishing gradients. Thirdly, the rules for updating the parameters of the negative component are established to meet back- propagation requirements. Finally, we designed deep auto-encoder networks and conducted several experiments with them on the MNIST dataset for unsupervised learning. For supervised learning, we designed deep convolutional neural networks and conducted several experiments with them on the CIFAR-10 dataset. The experiments have shown faster convergence and better performance for image classification of SLU-based networks compared with rectified activation functions.
Deep learning; Deep convolutional neural network; Softplus function; Rectified linear unit