Sparse D-vine Copula-Based Modeling Approach and Its Application in Process Monitoring
Abstract: Process monitoring is crucial to ensuring the safety and quality of industrial production. A sparse D-vine Copula-based (SDVC) process monitoring method is proposed to handle the nonlinear and non-Gaussian characteristics of high-dimensional data in industrial processes. First, traditional Vine Copula structure optimization tends to let estimation errors accumulate through the Vine structure, and its computational burden grows sharply as the data dimensionality increases. To address this, the prior probability of the bivariate Copulas is modified so that the bivariate Copulas in higher-level trees are more likely to be optimized to the independent state, which yields a sparse higher-level tree structure. Second, the method for determining the node order of the Vine structure is improved: nodes are expanded sequentially according to the sum of correlations among nodes, which makes the ordering better suited to the horizontal structure of D-vine modeling. Finally, the high density region (HDR) and density quantile theory are introduced to determine the control boundary and to construct a generalized local probability (GLP) index applicable to arbitrary distributions, enabling real-time monitoring of industrial processes. The superior performance of the proposed method is verified on the Tennessee-Eastman (TE) process and an acetic acid dehydration industrial process.

Keywords:
- process monitoring
- dependence modeling
- nonlinear and non-Gaussian
- sparse D-vine Copula
- high density region
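The HDR and density quantile step mentioned in the abstract can be illustrated with a minimal sketch. It assumes the common density-quantile estimate of an HDR boundary: evaluate the model density at the normal-operation training samples and take the (1 - CL) empirical quantile as the control limit. The function names and the stand-in density (an independent standard bivariate normal) are hypothetical and chosen only for illustration; the paper would evaluate the fitted SDVC density instead, and the sketch does not reproduce the GLP index itself.

```python
import math
import random


def density_quantile_limit(train_densities, confidence_level):
    """Estimate the HDR boundary as the (1 - CL) empirical quantile of the
    density values evaluated at the normal-operation training samples."""
    ordered = sorted(train_densities)
    alpha = 1.0 - confidence_level
    # Position of the empirical alpha-quantile, clamped to a valid index.
    k = int(math.floor(alpha * len(ordered)))
    k = min(max(k, 0), len(ordered) - 1)
    return ordered[k]


def outside_hdr(density_value, limit):
    """A sample lies outside the HDR when its density is below the limit."""
    return density_value < limit


def stand_in_density(x, y):
    """Hypothetical stand-in for the fitted model density (assumption):
    an independent standard bivariate normal, not a vine copula."""
    return math.exp(-0.5 * (x * x + y * y)) / (2.0 * math.pi)


random.seed(0)
train = [(random.gauss(0, 1), random.gauss(0, 1)) for _ in range(2000)]
train_densities = [stand_in_density(x, y) for x, y in train]
limit = density_quantile_limit(train_densities, confidence_level=0.99)

# By construction, about 1% of the training samples fall outside the HDR.
flagged_in_training = sum(outside_hdr(d, limit) for d in train_densities)
```

Because the limit is a quantile of density values rather than of the variables themselves, the same construction works for skewed or multimodal densities, which is the property the abstract relies on for non-Gaussian data.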
Table 1. CPU time of D-vine and SDVC methods

| Method | Offline modeling time/s | Online monitoring time/s |
|---|---|---|
| D-vine | 278.700 | 0.104 |
| SDVC | 214.380 | 0.087 |

Table 2. Fault detection rates (%) for 21 faults of TE process (CL=0.99)

| Fault no. | T² (PCA) | T² (KPCA) | SPE (PCA) | SPE (KPCA) | GLP (D-vine) | GLP (SDVC) |
|---|---|---|---|---|---|---|
| 1 | 99.13 | 99.75 | 99.75 | 99.75 | 99.75 | 99.63 |
| 2 | 96.75 | 98.13 | 98.50 | 98.25 | 98.38 | 98.50 |
| 3 | 0.50 | 4.38 | 2.50 | 5.00 | 2.00 | 7.88 |
| 4 | 0.63 | 2.00 | 1.50 | 2.25 | 0.88 | 5.75 |
| 5 | 23.38 | 27.00 | 13.75 | 27.00 | 24.38 | 29.50 |
| 6 | 100.00 | 100.00 | 100.00 | 100.00 | 100.00 | 100.00 |
| 7 | 37.13 | 42.38 | 22.13 | 42.63 | 39.38 | 44.63 |
| 8 | 96.25 | 97.38 | 94.75 | 97.75 | 97.75 | 98.13 |
| 9 | 2.13 | 3.38 | 2.13 | 4.88 | 1.50 | 7.25 |
| 10 | 32.25 | 45.00 | 19.50 | 60.00 | 75.75 | 77.63 |
| 11 | 8.50 | 34.50 | 44.25 | 40.88 | 37.38 | 44.88 |
| 12 | 98.38 | 99.50 | 83.88 | 99.13 | 98.38 | 99.25 |
| 13 | 93.75 | 94.75 | 94.63 | 94.63 | 94.75 | 95.00 |
| 14 | 85.88 | 99.88 | 100.00 | 99.88 | 99.88 | 100.00 |
| 15 | 1.63 | 9.13 | 3.00 | 7.13 | 3.00 | 15.88 |
| 16 | 18.38 | 32.38 | 9.00 | 35.38 | 31.00 | 38.25 |
| 17 | 77.88 | 95.38 | 95.38 | 94.63 | 96.13 | 94.50 |
| 18 | 89.25 | 89.88 | 90.00 | 89.88 | 89.88 | 90.00 |
| 19 | 11.38 | 4.13 | 6.63 | 6.63 | 23.00 | 22.13 |
| 20 | 25.13 | 45.00 | 37.75 | 50.63 | 77.50 | 78.75 |
| 21 | 42.00 | 44.63 | 43.00 | 49.75 | 47.88 | 50.25 |

Table 3. FAR and FDR of the acetic acid dehydration process (CL=0.98)

| Method | FDR/% (T²) | FDR/% (SPE) | FDR/% (GLP) | FAR/% (T²) | FAR/% (SPE) | FAR/% (GLP) |
|---|---|---|---|---|---|---|
| PCA | 100 | 100 | - | 4.00 | 1.00 | - |
| KPCA | 100 | 100 | - | 5.50 | 1.50 | - |
| D-Vine | - | - | 100 | - | - | 2.00 |
| SDVC | - | - | 100 | - | - | 0.50 |
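For readers unfamiliar with the two metrics reported in the tables, the following minimal sketch shows how a fault detection rate (FDR) and a false alarm rate (FAR) are typically computed from per-sample alarm flags. It assumes the usual definitions (FDR: share of faulty samples that raise an alarm; FAR: share of normal samples that raise an alarm). The function name and the eight-sample toy data are hypothetical and are not taken from the paper's experiments.

```python
def detection_and_false_alarm_rates(alarms, is_faulty):
    """Return (FDR, FAR) in percent.

    alarms    : sequence of booleans, True if the monitoring index
                exceeded its control limit for that sample.
    is_faulty : sequence of booleans, True if the sample was recorded
                while the fault was active.
    """
    if len(alarms) != len(is_faulty):
        raise ValueError("alarms and is_faulty must have the same length")

    # Split the alarm flags by the true state of each sample.
    faulty_flags = [a for a, f in zip(alarms, is_faulty) if f]
    normal_flags = [a for a, f in zip(alarms, is_faulty) if not f]

    # A rate is undefined (NaN) when its sample group is empty.
    fdr = 100.0 * sum(faulty_flags) / len(faulty_flags) if faulty_flags else float("nan")
    far = 100.0 * sum(normal_flags) / len(normal_flags) if normal_flags else float("nan")
    return fdr, far


# Toy data (hypothetical): 4 normal samples followed by 4 faulty samples.
is_faulty = [False, False, False, False, True, True, True, True]
alarms = [False, True, False, False, True, True, False, True]

fdr, far = detection_and_false_alarm_rates(alarms, is_faulty)
```

In this toy case three of the four faulty samples and one of the four normal samples are flagged, so the FDR is 75% and the FAR is 25%.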