Abstract:
Although a distributed control system (DCS) can collect a large volume of real-time operating data from a process-industry production site, only a small fraction of these data can be used as auxiliary-variable samples for a soft sensor. Most dominant variables, which reflect the quality indicators, have long had to be measured by manual analysis or by on-line quality instruments. This not only makes it difficult to collect a training sample set for a soft sensor model, but also leaves most of the data collected by the DCS unused, which limits the accuracy of the learned model. In this paper, a maximum entropy method is used to estimate the joint probability distribution of the soft sensor variables, and a Bayesian maximum a posteriori method combined with cluster analysis is applied to estimate the missing values of samples that lack manual analysis results. Simulation results show that the proposed method can effectively estimate the missing part of the samples, so that the number of usable samples is increased and the accuracy of model training is improved.
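The estimation idea summarized above can be illustrated with a minimal sketch. It is not the paper's implementation, and it rests on several assumptions that the abstract does not state: the maximum entropy constraints are taken to be the first and second moments, in which case the maximum entropy joint density is Gaussian; the clustering is a plain k-means on the auxiliary variables; and the maximum a posteriori estimate of the missing dominant variable is the mode of the Gaussian conditional density within the sample's cluster. All function names (`farthest_point_kmeans`, `fit_gaussian`, `conditional_map`, `estimate_missing`) are hypothetical.

```python
import numpy as np


def assign(x, centers):
    """Return the index of the nearest center for every row of x."""
    d2 = ((x[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
    return np.argmin(d2, axis=1)


def farthest_point_kmeans(x, k, iters=20):
    """Plain k-means with a deterministic farthest-point initialisation."""
    centers = [x[0]]
    for _ in range(1, k):
        d2 = np.min([((x - c) ** 2).sum(axis=1) for c in centers], axis=0)
        centers.append(x[np.argmax(d2)])
    centers = np.array(centers, dtype=float)
    for _ in range(iters):
        labels = assign(x, centers)
        for j in range(k):
            if np.any(labels == j):  # keep the old center if a cluster empties
                centers[j] = x[labels == j].mean(axis=0)
    return centers


def fit_gaussian(z, ridge=1e-6):
    """Assumption: with mean and covariance constraints, the maximum entropy
    density is Gaussian, so fitting it reduces to estimating both moments."""
    mu = z.mean(axis=0)
    cov = np.cov(z, rowvar=False) + ridge * np.eye(z.shape[1])
    return mu, cov


def conditional_map(x, mu, cov):
    """Mode of p(y | x) under a joint Gaussian over [x, y] (y is the last
    coordinate). For a Gaussian the mode equals the conditional mean."""
    d = x.shape[1]
    w = np.linalg.solve(cov[:d, :d], cov[:d, d])
    return mu[d] + (x - mu[:d]) @ w


def estimate_missing(x_lab, y_lab, x_unl, k=2):
    """Estimate the dominant variable for samples without analysis values."""
    # Cluster on auxiliary variables, which exist for every sample.
    centers = farthest_point_kmeans(np.vstack([x_lab, x_unl]), k)
    lab_c, unl_c = assign(x_lab, centers), assign(x_unl, centers)
    z_lab = np.column_stack([x_lab, y_lab])
    global_fit = fit_gaussian(z_lab)
    y_hat = np.empty(len(x_unl))
    for j in range(k):
        members = z_lab[lab_c == j]
        # Fall back to the global fit if the cluster has too few labeled samples.
        enough = len(members) > z_lab.shape[1]
        mu, cov = fit_gaussian(members) if enough else global_fit
        mask = unl_c == j
        if np.any(mask):
            y_hat[mask] = conditional_map(x_unl[mask], mu, cov)
    return y_hat
```

Here `x_lab` and `y_lab` stand for the auxiliary and dominant variables of samples with manual analysis values, and `x_unl` for the auxiliary variables of samples without them. The estimated values can then be appended to the training set. Other moment constraints or a different clustering method would change the form of the estimator.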