Abstract

In multiclass classification, a convolutional neural network (CNN) is generally coupled with the cross-entropy (CE) loss, which only penalizes the predicted probability corresponding to the ground-truth class and ignores the interclass relationships. We argue that CNNs can be improved by using a better loss function. On the other hand, the Wasserstein distance (WD) is a well-known metric for measuring the distance between two distributions. Directly solving the WD problem requires a prohibitively large amount of computation time, whereas the cheaper iterative algorithms suffer from a variety of shortcomings such as numerical instability and difficulty in parameter selection. In this paper, we address these issues by giving an analytical solution to the WD problem: for the first time, we find that for two distributions defined over a hierarchically organized data space, the WD has a closed-form solution, which we call the "hierarchical WD (HWD)." We use this theory to construct novel loss functions that overcome the shortcomings of the CE loss. To this end, multi-CNN information fusion, which provides the basis for building the category hierarchy, is carried out first. The semantic relationships among classes are then modeled as a binary tree. Finally, a CNN coupled with an HWD-based loss, i.e., the hierarchical Wasserstein CNN (HW-CNN), is trained to learn deep features. In this way, prior knowledge about the interclass relationships is embedded into the HW-CNN, and information from several CNNs guides the training of each individual HW-CNN. We conducted extensive experiments on two publicly available remote sensing data sets and achieved state-of-the-art performance on scene classification tasks.
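To make the idea concrete, the sketch below illustrates a tree-structured Wasserstein (W1) loss between a network's softmax output and a one-hot target, using the standard closed form for W1 under a tree metric: the sum over tree edges of the edge weight times the absolute difference of probability mass in the subtree below that edge. The class names, the toy four-class binary hierarchy, the uniform edge weights, and the PyTorch wrapper are assumptions for illustration; the paper's exact HWD formulation and its tree construction from multi-CNN fusion may differ.

```python
# Minimal sketch of a tree-Wasserstein (W1) loss over a binary class hierarchy,
# assuming the closed form: W1(p, q) = sum_e w_e * |p(subtree_e) - q(subtree_e)|.
import torch

class TreeWassersteinLoss(torch.nn.Module):
    def __init__(self, subtree_mask, edge_weights):
        # subtree_mask: (num_edges, num_classes) 0/1 matrix; row e marks the
        #   leaf classes contained in the subtree hanging below edge e.
        # edge_weights: (num_edges,) nonnegative edge lengths of the tree.
        super().__init__()
        self.register_buffer("subtree_mask", subtree_mask.float())
        self.register_buffer("edge_weights", edge_weights.float())

    def forward(self, logits, target_probs):
        # logits: (batch, num_classes); target_probs: (batch, num_classes),
        # e.g. one-hot ground-truth distributions.
        pred = torch.softmax(logits, dim=-1)
        # Probability mass in each subtree for prediction and target: (batch, num_edges).
        pred_mass = pred @ self.subtree_mask.t()
        tgt_mass = target_probs @ self.subtree_mask.t()
        # Closed-form W1 on the tree: weighted L1 gap of subtree masses.
        return (self.edge_weights * (pred_mass - tgt_mass).abs()).sum(dim=-1).mean()

# Toy hierarchy over 4 classes: root -> {A = {0, 1}, B = {2, 3}} -> leaves.
subtree_mask = torch.tensor([
    [1, 1, 0, 0],  # edge above internal node A
    [0, 0, 1, 1],  # edge above internal node B
    [1, 0, 0, 0], [0, 1, 0, 0], [0, 0, 1, 0], [0, 0, 0, 1],  # leaf edges
])
loss_fn = TreeWassersteinLoss(subtree_mask, torch.ones(6))
logits = torch.randn(8, 4, requires_grad=True)
targets = torch.eye(4)[torch.randint(0, 4, (8,))]
loss = loss_fn(logits, targets)
loss.backward()  # differentiable, so it can replace CE as a training loss
```

Because the loss is a fixed linear-plus-absolute-value function of the class probabilities, it is both cheap to evaluate and differentiable, and misclassifications into semantically distant subtrees incur a larger penalty than confusions between sibling classes.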

  • Publication date: May 2019