An adaptive mechanism to achieve learning rate dynamically

Authors: Zhang, Jinjing; Hu, Fei; Li, Li*; Xu, Xiaofei; Yang, Zhanbo; Chen, Yanbin
Source: Neural Computing & Applications, 2019, 31(10): 6685-6698.
DOI: 10.1007/s00521-018-3495-0

Abstract

Gradient descent is prevalent for large-scale optimization problems in machine learning; in particular, it now plays a major role in computing and correcting the connection strengths of neural networks in deep learning. However, many gradient-based optimization methods involve sensitive hyper-parameters that require extensive tuning. In this paper, we present a novel adaptive mechanism called the adaptive exponential decay rate (AEDR). AEDR uses an adaptive exponential decay rate rather than a fixed, preconfigured one, allowing us to eliminate one otherwise sensitive hyper-parameter that would require tuning. AEDR computes the exponential decay rate adaptively from the moving averages of both gradients and squared gradients over time. The mechanism is then applied to Adadelta and Adam, reducing the number of hyper-parameters to be tuned in each to a single one. We use long short-term memory networks and LeNet to demonstrate how the learning rate adapts dynamically. We show promising results compared with other state-of-the-art methods on four data sets: IMDB (movie reviews), SemEval-2016 (sentiment analysis in Twitter), CIFAR-10, and Pascal VOC-2012.
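
To make the idea concrete, below is a minimal sketch of an Adam-style update in which the exponential decay rate is derived from the running averages of gradients and squared gradients rather than being fixed in advance. This is only one plausible reading of the AEDR mechanism described in the abstract; the specific formula for the adaptive decay rate (the clipped ratio of the squared first moment to the second moment) and the function names are assumptions for illustration, not the paper's exact rule.

```python
import numpy as np

def adam_with_adaptive_decay(grad_fn, theta, lr=1e-3, eps=1e-8, steps=1000):
    """Adam-style optimizer with an adaptively computed decay rate.

    grad_fn: callable returning the gradient at the current parameters.
    theta:   initial parameter vector (numpy array).
    Note: the adaptive beta below is a hypothetical instantiation of the
    AEDR idea, not the formula from the paper.
    """
    m = np.zeros_like(theta)  # moving average of gradients
    v = np.zeros_like(theta)  # moving average of squared gradients
    for _ in range(steps):
        g = grad_fn(theta)
        # Hypothetical adaptive decay rate: the ratio m^2 / v lies near 1 when
        # gradients are consistent and shrinks when they fluctuate, so the
        # moving averages "forget" faster in noisy regions.
        beta = np.clip(np.mean(m * m) / (np.mean(v) + eps), 0.0, 1.0 - eps)
        m = beta * m + (1.0 - beta) * g
        v = beta * v + (1.0 - beta) * g * g
        theta = theta - lr * m / (np.sqrt(v) + eps)
    return theta

# Usage example: minimize a simple quadratic f(x) = ||x||^2.
if __name__ == "__main__":
    x0 = np.array([3.0, -2.0])
    x_opt = adam_with_adaptive_decay(lambda x: 2.0 * x, x0, lr=0.05, steps=500)
    print(x_opt)  # should approach [0, 0]
```

With a fixed learning rate as the single remaining hyper-parameter, the decay rate in this sketch is recomputed at every step from statistics the optimizer already maintains, which is the essence of what the abstract claims for AEDR applied to Adadelta and Adam.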