
  • ISSN 1006-3080
  • CN 31-1691/TQ

Convergence of Gradient Method with Momentum

PENG Xianlun, XIE Gang

Citation: PENG Xianlun, XIE Gang. Convergence of Gradient Method with Momentum[J]. Journal of East China University of Science and Technology. doi: 10.14135/j.cnki.1006-3080.20200326001


doi: 10.14135/j.cnki.1006-3080.20200326001
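Details
    About the author:

    PENG Xianlun (1995-), male, from Jiangxi, master's student; main research area: computational mathematics. E-mail: 13122330812@163.com

    Corresponding author:

    XIE Gang, E-mail: rpi1004@ecust.edu.cn

  • CLC number: O29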


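  • Abstract: This paper gives a theoretical analysis of the backpropagation algorithm with a momentum term for three-layer feedforward neural networks. The learning rate is set to a constant, and the momentum coefficient is set to an adaptive variable in order to accelerate and stabilize the training of the network parameters. Convergence results for this model are obtained and proved in detail. Compared with existing results, the conclusions of this paper are more general.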
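
As a rough illustration of the kind of scheme the abstract describes (constant learning rate, adaptive momentum coefficient), the sketch below applies gradient descent with momentum to a small quadratic instead of a neural network. The toy objective, the names `eta`, `mu`, and `tau`, and the adaptive rule itself (a coefficient proportional to the ratio of the gradient norm to the previous step norm, and zero when there is no previous step) are assumptions made for illustration; the abstract does not state the paper's exact rule.

```python
import math

def grad(w):
    # Gradient of the assumed toy objective f(w) = 0.5 * (w[0]**2 + 10 * w[1]**2).
    return [w[0], 10.0 * w[1]]

def norm(v):
    # Euclidean norm of a list of floats.
    return math.sqrt(sum(x * x for x in v))

def momentum_descent(w0, eta=0.05, mu=0.02, steps=2000):
    # eta: constant learning rate; mu: assumed scaling constant for the momentum term.
    w = list(w0)
    prev_step = [0.0] * len(w)
    for _ in range(steps):
        g = grad(w)
        step_norm = norm(prev_step)
        # Assumed adaptive momentum coefficient: mu * ||g|| / ||previous step||,
        # set to zero when the previous step vanishes.
        tau = mu * norm(g) / step_norm if step_norm > 0.0 else 0.0
        # Update: negative gradient scaled by eta, plus the momentum term.
        step = [-eta * gi + tau * si for gi, si in zip(g, prev_step)]
        w = [wi + si for wi, si in zip(w, step)]
        prev_step = step
    return w
```

With this assumed rule the momentum term has norm `mu * ||g||`, so choosing `mu < eta` keeps every step a descent direction. Calling `momentum_descent([1.0, 1.0])` should drive the iterate toward the minimizer at the origin.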

     

Publication history
  • Received: 2020-03-26
  • Published online: 2020-12-16
