Understanding Mixup Training Methods

Authors: Liang, Daojun; Yang, Feng*; Zhang, Tian; Yang, Peter
Source: IEEE Access, 2018, 6: 58774-58783.
DOI: 10.1109/ACCESS.2018.2872698

Abstract

Mixup is a neural network training method that generates new samples by linearly interpolating multiple samples and their labels. The mixup training method has better generalization ability than the traditional empirical risk minimization (ERM) method, but a more intuitive understanding of why mixup performs better is still lacking. In this paper, several different sample mixing methods are used to test how neural networks learn and infer from mixed samples, illustrating how mixup works both as a data augmentation method and as a regularizer of neural networks. A weighted noise perturbation method is then designed to visualize the loss functions of the mixup and ERM training methods, in order to analyze the properties of their high-dimensional decision surfaces. Finally, by analyzing the mixing of samples and their labels, a spatial mixup approach is proposed that achieves state-of-the-art performance on the CIFAR and ImageNet data sets. This method also gives generative adversarial nets a more stable training process and more diverse sample generation ability.
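As background for the interpolation the abstract describes, the following is a minimal NumPy sketch of the standard mixup formulation (pairing each sample with a randomly chosen partner in the same batch and drawing the coefficient λ from a Beta(α, α) distribution). The function name mixup_batch and its interface are illustrative, not the paper's code.

```python
import numpy as np

def mixup_batch(x, y, alpha=1.0, rng=None):
    """Return a convex combination of a batch with a shuffled copy of itself.

    x : (batch, ...) float array of inputs
    y : (batch, num_classes) float array of one-hot or soft labels
    """
    rng = rng if rng is not None else np.random.default_rng()
    lam = rng.beta(alpha, alpha)            # mixing coefficient in [0, 1]
    perm = rng.permutation(len(x))          # random pairing within the batch
    x_mixed = lam * x + (1.0 - lam) * x[perm]
    y_mixed = lam * y + (1.0 - lam) * y[perm]
    return x_mixed, y_mixed
```

Training on (x_mixed, y_mixed) instead of the raw pairs (x, y) is what distinguishes mixup from ERM, which minimizes the loss on the original samples only.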
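The weighted noise perturbation method is only described at a high level in the abstract. One plausible reading, sketched below in PyTorch, is to draw a random direction in weight space, weight the noise for each parameter tensor so that its norm matches that tensor's norm, and record the loss at increasing perturbation scales. The per-tensor weighting rule and the function loss_along_direction are assumptions made for illustration; the paper's exact scheme may differ.

```python
import torch

@torch.no_grad()
def loss_along_direction(model, loss_fn, x, y, scales):
    """Evaluate the loss at theta + s * d for one weighted random direction d."""
    base = [p.detach().clone() for p in model.parameters()]
    direction = []
    for p in base:
        d = torch.randn_like(p)
        d *= p.norm() / (d.norm() + 1e-12)   # weight the noise by the parameter's norm
        direction.append(d)
    losses = []
    for s in scales:
        for p, p0, d in zip(model.parameters(), base, direction):
            p.copy_(p0 + s * d)              # perturb the weights in place
        losses.append(loss_fn(model(x), y).item())
    for p, p0 in zip(model.parameters(), base):
        p.copy_(p0)                          # restore the trained weights
    return losses
```

Comparing the resulting curves for a mixup-trained and an ERM-trained model is one way to probe whether mixup settles in flatter regions of the loss surface, which is the kind of analysis the abstract refers to.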