    Citation: PENG Xianlun, XIE Gang. Convergence of Gradient Descent Algorithm with Momentum[J]. Journal of East China University of Science and Technology, 2021, 47(6): 779-786. DOI: 10.14135/j.cnki.1006-3080.20200326001

    Convergence of Gradient Descent Algorithm with Momentum

      Abstract: Neural networks are now widely used and have achieved notable success in many fields, yet comparatively little theoretical analysis of them is available. This paper analyzes the convergence of the back-propagation algorithm with momentum for three-layer feed-forward neural networks. In our model, the learning rate is set to a constant, while the momentum coefficient is an adaptive variable chosen to accelerate and stabilize the training of the network parameters. The corresponding convergence results and detailed proofs are given. Compared with existing results, ours are more general.

       
