
  • ISSN 1006-3080
  • CN 31-1691/TQ

Deep Learning Based Key Object Detection and Extraction for Driving Scene

ZHANG Xueqin, WEI Yifan

Citation: ZHANG Xueqin, WEI Yifan. Deep Learning Based Key Object Detection and Extraction for Driving Scene[J]. Journal of East China University of Science and Technology, 2019, 45(6): 980-988. doi: 10.14135/j.cnki.1006-3080.20181023002

doi: 10.14135/j.cnki.1006-3080.20181023002
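Funding: National Natural Science Foundation of China (31671006)
Details
    Author biography:

    ZHANG Xueqin (born 1972), female, associate professor, Ph.D.; her main research area is pattern recognition. E-mail: zxq@ecust.edu.cn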

  • CLC number: TP391.4

Deep Learning Based Key Object Detection and Extraction for Driving Scene

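  • Abstract: Object detection, which comprises object recognition and bounding-box localization, is one of the key technologies in visual perception for autonomous driving. This paper uses the single shot multibox detector (SSD), built on the Visual Geometry Group network (VGGNet), to detect, semantically label, and box key objects in the driving environment. For specific driving scenes, an improved algorithm, SSD_ARS, is proposed. By optimizing the gradient update algorithm, the learning rate decay strategy, and the default (prior) box generation strategy, the method raises the mean detection accuracy and markedly improves the detection accuracy of small-object classes. Detection experiments on 9 classes of key objects in real driving scenes verify the effectiveness of the algorithm, and the results show that the detection speed meets the requirement of real-time detection.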
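As a rough illustration of what a default (prior) box generation strategy controls, the sketch below follows the generic formulation of the original SSD detector: each feature-map cell gets one box per aspect ratio, with width `scale * sqrt(ar)` and height `scale / sqrt(ar)`, plus one extra square box. It is not the SSD_ARS strategy itself, whose details are not given on this page; the function name `default_boxes`, the feature-map size, the scales, and the aspect ratios are all illustrative assumptions.

```python
import math
from itertools import product


def default_boxes(fmap_size, scale, next_scale, aspect_ratios):
    """Return (cx, cy, w, h) default boxes, normalized to [0, 1], for one square feature map."""
    boxes = []
    for row, col in product(range(fmap_size), repeat=2):
        # Box centers sit in the middle of each feature-map cell.
        cx = (col + 0.5) / fmap_size
        cy = (row + 0.5) / fmap_size
        for ar in aspect_ratios:
            # Width grows and height shrinks with sqrt(ar), so the area stays scale**2.
            boxes.append((cx, cy, scale * math.sqrt(ar), scale / math.sqrt(ar)))
        # Extra square box at the geometric mean of this scale and the next one.
        extra = math.sqrt(scale * next_scale)
        boxes.append((cx, cy, extra, extra))
    return boxes


# Assumed, illustrative settings: a 3x3 feature map and three aspect ratios.
boxes = default_boxes(fmap_size=3, scale=0.2, next_scale=0.37, aspect_ratios=(1.0, 2.0, 0.5))
print(len(boxes))  # 3 * 3 cells * 4 boxes per cell = 36
```

Changing the set of aspect ratios changes how many boxes each cell proposes and how well elongated or small objects are covered, which is presumably the lever a box generation strategy adjusts.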

     

  • Figure 1. Network structure of VGG16

    Figure 2. Network structure of SSD

    Figure 3. Curve of stepped learning rate reduction

    Figure 4. Model training

    Figure 5. Model test

    Figure 6. Detection of video frames in real driving scene
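The stepped learning rate reduction named in the Figure 3 caption can be sketched as a staircase schedule: the rate stays constant for a fixed number of iterations and is then multiplied by a decay factor. This is a minimal sketch, not the paper's schedule. The initial rate 0.001 is the best-performing value listed in Table 3, while the decay factor `gamma`, the interval `step_size`, and the function name are assumptions chosen only for illustration.

```python
def stepped_learning_rate(iteration, base_lr=0.001, gamma=0.1, step_size=20000):
    """Learning rate after `iteration` updates under a step (staircase) schedule."""
    # The rate is constant inside a step and drops by a factor of gamma at each boundary.
    return base_lr * gamma ** (iteration // step_size)


# gamma and step_size are assumed values, not taken from the paper.
for it in (0, 19999, 20000, 40000):
    print(it, stepped_learning_rate(it))
```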

    Table 1. Description of training set and testing set

    No.   Training set        Testing set
          Objects   Images    Objects   Images
    O1    3169      579       745       140
    O2    401       213       90        51
    O3    270       195       39        25
    O4    372       206       92        49
    O5    460       174       138       45
    O6    670       276       125       62
    O7    454       236       110       53
    O8    648       322       149       73
    O9    243       165       30        20

    Table 2. Detection accuracy of the algorithms

    Algorithm  AP                                                             mAP
               O1     O2     O3     O4     O5     O6     O7     O8     O9
    SSD        0.835  0.801  0.758  0.743  0.729  0.670  0.741  0.762  0.733  0.752
    SSD_M      0.890  0.865  0.834  0.815  0.803  0.734  0.833  0.837  0.802  0.824
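The mAP column in Table 2 appears to be the unweighted mean of the nine per-class AP values. The short check below assumes that definition (the function name is illustrative) and recomputes both rows from the tabulated AP values.

```python
def mean_average_precision(per_class_ap):
    """mAP as the unweighted mean of per-class AP values."""
    return sum(per_class_ap) / len(per_class_ap)


# Per-class AP for O1..O9, copied from Table 2.
ssd = [0.835, 0.801, 0.758, 0.743, 0.729, 0.670, 0.741, 0.762, 0.733]
ssd_m = [0.890, 0.865, 0.834, 0.815, 0.803, 0.734, 0.833, 0.837, 0.802]

print(round(mean_average_precision(ssd), 3))    # 0.752
print(round(mean_average_precision(ssd_m), 3))  # 0.824
```

Under this reading, SSD_M improves mAP over SSD by roughly 0.07.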

    Table 3. Detection accuracy and training time with different initial learning rates

    ls      AP                                                             mAP    t/h
            O1     O2     O3     O4     O5     O6     O7     O8     O9
    0.0006  0.883  0.868  0.745  0.806  0.781  0.704  0.826  0.807  0.790  0.801  21
    0.0008  0.891  0.859  0.817  0.813  0.801  0.726  0.824  0.812  0.775  0.813  33
    0.0010  0.890  0.865  0.834  0.815  0.803  0.734  0.833  0.837  0.802  0.824  40
    0.0012  0.887  0.853  0.822  0.799  0.794  0.733  0.815  0.827  0.779  0.812  44

    Table 4. Detection accuracy with different default box generation strategies at different aspect ratios

    Strategy  AP                                                             mAP
              O1     O2     O3     O4     O5     O6     O7     O8     O9
    S1        0.884  0.862  0.832  0.803  0.796  0.731  0.834  0.833  0.804  0.819
    S2        0.910  0.884  0.858  0.834  0.823  0.792  0.851  0.864  0.836  0.850
    S3        0.902  0.876  0.847  0.837  0.811  0.793  0.848  0.866  0.831  0.846
    S4        0.913  0.897  0.853  0.841  0.830  0.801  0.850  0.871  0.837  0.855

    Table 5. Detection accuracy of objects at different distances

    Object  AP                                                             mAP
            O1     O2     O3     O4     O5     O6     O7     O8     O9
    C       0.974  0.953  0.928  0.885  0.804  0.844  0.910  0.897  0.902  0.900
    G       0.883  0.875  0.826  0.816  0.655  0.807  0.896  0.893  0.872  0.836
    F       0.784  0.755  0.721  0.733  0.642  0.603  0.637  0.741  0.788  0.712

    Table 6. Detection accuracy of video frames in real driving scenes

    Video frame  AP (classes present in the frame; O1-O9 column positions not preserved)  mAP    Frame rate
    V1           0.881  0.714  0.673  0.848                                               0.779  20.03
    V2           0.873  0.712  0.851  0.857  0.862                                        0.831  20.05
    V3           0.890  0.852  0.834  0.801  0.857  0.725                                 0.827  20.07
Publication history
  • Received: 2018-10-23
  • Published online: 2019-07-18
  • Issue date: 2019-12-01
