Efficient on-line learning with diagonal approximation of loss function Hessian
Abstract: The subject of this paper is stochastic optimization as a tool for on-line learning. New ingredients are introduced into Nesterov's Accelerated Gradient that increase the efficiency of this algorithm and automatically determine its parameters that are otherwise tuned manually: the step-size and the momentum decay factor. To this end, a diagonal approximation of the Hessian of the loss function is estimated. In the experimental study the approach is applied to various types of neural networks, including deep ones.
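The abstract does not spell out how the diagonal Hessian estimate enters the update, so the sketch below is one plausible reading, not the authors' method: a secant (finite-difference) approximation of the Hessian diagonal supplies per-coordinate step-sizes 1/h_i inside a standard Nesterov loop. The momentum factor `mu` is kept fixed here, although the paper also derives it automatically; the function name `nag_diag_hessian`, the probe step, and the secant rule are illustrative assumptions.

```python
import numpy as np

def nag_diag_hessian(grad_fn, theta, n_steps=100, mu=0.9, eps=1e-8):
    """Nesterov's Accelerated Gradient with per-coordinate step-sizes
    1/h_i taken from a diagonal Hessian estimate h (illustrative sketch)."""
    theta = np.asarray(theta, dtype=float).copy()
    # One finite-difference probe to initialize the curvature estimate.
    g0 = grad_fn(theta)
    delta = 1e-4 * (1.0 + np.abs(theta))
    h = np.maximum(np.abs((grad_fn(theta + delta) - g0) / delta), eps)
    v = np.zeros_like(theta)
    x_prev, g_prev = theta.copy(), g0
    for _ in range(n_steps):
        x = theta + mu * v                  # Nesterov look-ahead point
        g = grad_fn(x)                      # (stochastic) gradient at look-ahead
        dx = x - x_prev
        mask = np.abs(dx) > eps
        # Secant estimate of the Hessian diagonal: h_i ~ dg_i / dx_i,
        # kept positive so 1/h_i is a valid step-size.
        h[mask] = np.maximum(np.abs((g - g_prev)[mask] / dx[mask]), eps)
        x_prev, g_prev = x, g
        v = mu * v - g / h                  # Newton-like step per coordinate
        theta = theta + v
    return theta

# Example: ill-conditioned quadratic 0.5 * sum(a_i * theta_i**2).
a = np.array([1.0, 10.0, 100.0])
theta = nag_diag_hessian(lambda t: a * t, np.ones(3), n_steps=20)
print(theta)  # close to the minimizer [0, 0, 0]
```

On a quadratic the secant rule recovers the diagonal curvature exactly, so the update reduces to a per-coordinate Newton step; with stochastic gradients the raw secant is noisy and would in practice be smoothed (e.g. with an exponential moving average), a detail the abstract leaves open.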
| Publication size in sheets | 0.5 |
| Book | 2019 International Joint Conference on Neural Networks (IJCNN), 2019, Institute of Electrical and Electronics Engineers, ISBN 978-1-7281-2009-6, 250 p. |
| Keywords in English | on-line learning, accelerated gradient, parameter autotuning, deep learning |
| Score | 140.0 (16-12-2019, ChapterFromConference) |
| Publication indicators | 0 |
| Citation count* | 1 (2020-02-08) |
* The presented citation count is obtained through Internet information analysis and is close to the number calculated by the Publish or Perish system.