Multivariate Time Series Prediction Based on Clockwork Triggered Long Short Term Memory
-
Abstract: Existing multivariate time series prediction methods struggle to capture abrupt short-term changes within long series, which causes predicted trends to lag and prediction errors to grow. This paper proposes a multivariate time series prediction model based on a clockwork triggered long short term memory (CWTLSTM) network, which improves prediction accuracy by strengthening the capture of short-term information. CWTLSTM divides the neurons in the network into groups and assigns each group its own activation frequency; a group is activated only when the time step is an integer multiple of that group's period. Depending on whether a group's period equals 1, the network is divided into a backbone network chain and a short-term input enhancement chain. When the short-term input enhancement chain is activated at a time step close to the output position, it passes the result computed from the input to the backbone network chain in one direction only, which increases the weight of the input at that step. The model can therefore respond quickly to data fluctuations caused by abrupt short-term changes while still storing long-term information. CWTLSTM was evaluated on an air pollution dataset and a cement grate cooler dataset and compared with LSTM, XGBoost, and CWRNN models. The results show that the proposed model performs well both in reducing prediction error and in judging future trends. The experiments also analyze the sensitivity of the model to the period allocation strategy, which to some extent confirms the role of CWTLSTM in short-term information enhancement.
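The grouping rule in the abstract (a group fires only when the time step is an integer multiple of its period) can be illustrated with a small scheduling sketch. This is not the authors' implementation: the period list `PERIODS`, the helper names, and the choice that period-1 groups form the backbone chain are assumptions made for illustration, and the one-way transfer from the enhancement chain to the backbone is not modelled here.

```python
from typing import Dict, List


def active_groups(t: int, periods: List[int]) -> List[int]:
    """Return indices of the neuron groups whose period divides time step t."""
    return [g for g, period in enumerate(periods) if t % period == 0]


def split_chains(periods: List[int]) -> Dict[str, List[int]]:
    """Assumption: period 1 -> backbone chain, longer periods -> enhancement chain."""
    return {
        "backbone": [g for g, p in enumerate(periods) if p == 1],
        "enhancement": [g for g, p in enumerate(periods) if p > 1],
    }


# Hypothetical period allocation, chosen only for illustration.
PERIODS = [1, 2, 4]

# Which groups are active at each of the time steps 1..8.
schedule = {t: active_groups(t, PERIODS) for t in range(1, 9)}
chains = split_chains(PERIODS)

for t, groups in schedule.items():
    print(f"t={t}: active groups {groups}")
print(chains)
```

With this allocation, group 0 is active at every step, while groups 1 and 2 only join at steps that are multiples of 2 and 4 respectively.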
-
Table 1. Comparison of prediction results of various models on the air pollution dataset

| Model | RMSE/($\mathrm{mg\cdot m^{-3}}$), Timesteps=10 | MAE/($\mathrm{mg\cdot m^{-3}}$), Timesteps=10 | MAPE/%, Timesteps=10 | $R^{2}$, Timesteps=10 | RMSE/($\mathrm{mg\cdot m^{-3}}$), Timesteps=48 | MAE/($\mathrm{mg\cdot m^{-3}}$), Timesteps=48 | MAPE/%, Timesteps=48 | $R^{2}$, Timesteps=48 |
|---|---|---|---|---|---|---|---|---|
| VARMAX | 0.572 | 0.377 | 0.250 | 0.841 | 0.587 | 0.391 | 0.241 | 0.833 |
| SVR | 0.511 | 0.342 | 0.215 | 0.873 | 0.515 | 0.345 | 0.196 | 0.872 |
| XGBoost | 0.507 | 0.350 | 0.209 | 0.875 | 0.575 | 0.359 | 0.192 | 0.840 |
| CWRNN | 0.485 | 0.343 | 0.228 | 0.886 | 0.498 | 0.381 | 0.223 | 0.880 |
| LSTM | 0.483 | 0.330 | 0.215 | 0.887 | 0.465 | 0.322 | 0.221 | 0.895 |
| CWTLSTM | 0.478 | 0.322 | 0.205 | 0.889 | 0.425 | 0.302 | 0.216 | 0.912 |
Table 2. Comparison of prediction results of secondary air temperature

| Model | RMSE/℃, Timesteps=20 | MAE/℃, Timesteps=20 | MAPE/%, Timesteps=20 | $R^{2}$, Timesteps=20 | RMSE/℃, Timesteps=50 | MAE/℃, Timesteps=50 | MAPE/%, Timesteps=50 | $R^{2}$, Timesteps=50 |
|---|---|---|---|---|---|---|---|---|
| VARMAX | 4.24 | 3.39 | 0.312 | 0.942 | 3.87 | 3.06 | 0.282 | 0.952 |
| SVR | 3.81 | 3.16 | 0.291 | 0.953 | 4.37 | 3.52 | 0.325 | 0.938 |
| XGBoost | 4.25 | 3.21 | 0.297 | 0.941 | 4.24 | 3.22 | 0.296 | 0.942 |
| LSTM | 2.50 | 1.93 | 0.176 | 0.973 | 2.72 | 2.29 | 0.211 | 0.976 |
| CWRNN | 2.32 | 1.89 | 0.174 | 0.982 | 2.30 | 1.80 | 0.164 | 0.983 |
| CWTLSTM | 2.23 | 1.71 | 0.156 | 0.983 | 1.67 | 1.33 | 0.122 | 0.991 |

-
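Tables 1 and 2 report RMSE, MAE, MAPE, and $R^{2}$. The excerpt does not give the exact formulas the authors used, so the sketch below assumes the standard textbook definitions; the function names and the sample values are illustrative only and are not data from the paper.

```python
import math
from typing import Sequence


def rmse(y_true: Sequence[float], y_pred: Sequence[float]) -> float:
    """Root mean squared error."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(y_true, y_pred)) / len(y_true))


def mae(y_true: Sequence[float], y_pred: Sequence[float]) -> float:
    """Mean absolute error."""
    return sum(abs(a - b) for a, b in zip(y_true, y_pred)) / len(y_true)


def mape(y_true: Sequence[float], y_pred: Sequence[float]) -> float:
    """Mean absolute percentage error, in percent (true values must be non-zero)."""
    return 100.0 * sum(abs((a - b) / a) for a, b in zip(y_true, y_pred)) / len(y_true)


def r2(y_true: Sequence[float], y_pred: Sequence[float]) -> float:
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_true = sum(y_true) / len(y_true)
    ss_res = sum((a - b) ** 2 for a, b in zip(y_true, y_pred))
    ss_tot = sum((a - mean_true) ** 2 for a in y_true)
    return 1.0 - ss_res / ss_tot


# Made-up values, used only to exercise the functions.
y_true = [2.0, 4.0, 6.0, 8.0]
y_pred = [2.5, 3.5, 6.5, 7.5]

print(rmse(y_true, y_pred), mae(y_true, y_pred))
print(mape(y_true, y_pred), r2(y_true, y_pred))
```

Lower RMSE, MAE, and MAPE and a higher $R^{2}$ indicate a better fit, which is how the rows of the two tables can be compared.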