Fume Hood Window State Recognition Method Based on Few-Shot Deep Learning
Abstract: In chemical laboratories, staff often forget to close the fume hood window before they leave, which creates potential safety hazards and wastes energy. It is therefore necessary to develop a methodology for the safety management of fume hood windows. To the best of the authors' knowledge, related work on window status recognition relies mainly on various electronic control systems, which are not suitable for fume hood windows. Exploiting the non-contact nature and easy extensibility of computer vision, this paper proposes a novel safety management method for fume hood windows. First, the surveillance videos are preprocessed and the fume hood window regions are extracted via motion features and geometric priors, which effectively reduces the influence of irrelevant areas on window status recognition. Because no suitable public dataset exists and the number of fume hood windows in a laboratory is limited, this paper constructs a new dataset containing 400 window images and proposes a few-shot learning method for recognizing the status of fume hood windows. Compared with traditional few-shot learning datasets, fume hood window images have higher resolution, which makes it difficult to extract effective features. To overcome this challenge, this paper applies dilated convolution to enlarge the receptive field and replaces the traditional convolution layer with an inception layer using multi-scale dilation rates.
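The gain from dilation can be checked with simple arithmetic: a k-tap convolution with dilation d covers k + (k-1)(d-1) input positions, so stacking layers with increasing rates grows the receptive field much faster than plain convolutions. The sketch below is a generic illustration; the kernel size of 3 and the rate schedule 1, 2, 3 are assumptions for demonstration, not the paper's exact architecture.

```python
def effective_kernel(k: int, d: int) -> int:
    """Effective extent of a k-tap convolution kernel with dilation rate d."""
    return k + (k - 1) * (d - 1)

def receptive_field(layers) -> int:
    """Receptive field of stacked conv layers given (kernel, dilation, stride) triples."""
    rf, jump = 1, 1
    for k, d, s in layers:
        rf += (effective_kernel(k, d) - 1) * jump
        jump *= s
    return rf

# Three stacked 3-tap convolutions, stride 1:
plain = receptive_field([(3, 1, 1)] * 3)                      # dilation 1 everywhere
dilated = receptive_field([(3, 1, 1), (3, 2, 1), (3, 3, 1)])  # rates 1, 2, 3
print(plain, dilated)  # 7 13
```

With the same parameter count, the dilated stack nearly doubles the receptive field, which is why it suits the comparatively high-resolution window images.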
To avoid invalid detection of the window status while staff are in the laboratory, we use the moving foreground region extracted by a Gaussian mixture model as the prior region for YOLOv3 (You Only Look Once, version 3) object detection, which greatly reduces erroneous recognitions. In the experiments, the proposed method is compared with traditional machine learning algorithms and a CNN (convolutional neural network). LBP (local binary patterns), PCA (principal component analysis), ColorHist (color histogram) and HOG (histogram of oriented gradients) are selected as the features for the machine learning methods, covering texture, dimensionality reduction, color and shape respectively. The experimental results show that the proposed method achieves 99.29% accuracy under normal illumination, 17.20% higher than the best traditional combination (HOG with random forest) and 10.95% higher than the CNN. Under illumination changes, the accuracy is 95.74%, only slightly lower than under normal illumination.
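At inference time, the prototypical-network classification that ProtoNet and DProtoNet rely on reduces to nearest-mean matching in embedding space: each class prototype is the mean of its support embeddings, and a query is assigned to the class with the closest prototype. A minimal sketch follows; the 2-D vectors and the "open"/"closed" labels are illustrative stand-ins, not the network's learned features.

```python
import math

def prototype(support):
    """Class prototype: the component-wise mean of the support embeddings."""
    n = len(support)
    return [sum(v[i] for v in support) / n for i in range(len(support[0]))]

def classify(query, prototypes):
    """Assign the query to the class whose prototype is nearest in Euclidean distance."""
    def dist(a, b):
        return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))
    return min(prototypes, key=lambda label: dist(query, prototypes[label]))

# Toy embeddings for a 2-way, 3-shot episode.
protos = {
    "open":   prototype([[0.1, 0.0], [0.0, 0.2], [0.2, 0.1]]),
    "closed": prototype([[1.0, 1.1], [0.9, 1.0], [1.1, 0.9]]),
}
print(classify([0.95, 1.05], protos))  # closed
```

Because only the prototypes need to be stored, adding a new window state requires a handful of labeled support images rather than retraining, which is the practical appeal of the few-shot setup here.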
Table 1. Accuracy comparison of different methods
Algorithm                   Accuracy/%
LBP + SVM                   51.30
LBP + Random forest         69.90
PCA + SVM                   57.10
PCA + Random forest         64.76
ColorHist + SVM             75.94
ColorHist + Random forest   47.56
HOG + SVM                   57.10
HOG + Random forest         82.12
CNN                         88.34
ProtoNet                    97.32
DProtoNet                   99.29
Table 2. Accuracy comparison under illumination changes
Algorithm                   Accuracy/%
LBP + SVM                   50.94
LBP + Random forest         60.69
PCA + SVM                   56.29
PCA + Random forest         50.77
ColorHist + SVM             50.21
ColorHist + Random forest   52.32
HOG + SVM                   60.19
HOG + Random forest         72.56
CNN                         77.25
ProtoNet                    94.43
DProtoNet                   95.74
Table 3. Accuracy under different dilation rate combinations
Dilation rates    Accuracy/%
1, 2              98.25
1, 2, 3           99.29
1, 2, 3, 4        98.90
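To make the difference between the rate combinations in Table 3 concrete, a toy 1-D dilated convolution shows how each branch samples the input at a different spacing before the branch outputs are combined inception-style. The signal, the summing kernel, and the 1-D setting are illustrative assumptions; the actual network operates on 2-D feature maps.

```python
def dilated_conv1d(x, w, d):
    """Valid 1-D convolution of signal x with kernel w at dilation rate d."""
    k = len(w)
    span = (k - 1) * d  # input extent covered by one kernel placement
    return [sum(w[j] * x[i + j * d] for j in range(k))
            for i in range(len(x) - span)]

x = [1, 2, 3, 4, 5, 6, 7]
w = [1, 1, 1]  # simple summing kernel
branches = {d: dilated_conv1d(x, w, d) for d in (1, 2, 3)}
print(branches[1])  # [6, 9, 12, 15, 18]
print(branches[3])  # [12]  (one placement spans the whole signal)
```

Each added rate widens the context one branch sees, but the ablation suggests diminishing returns: rate 4 adds little for windows at this image scale, so the combination 1, 2, 3 is the best trade-off.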