面向手语识别的视频关键帧提取和优化算法

周舟; 韩芳; 王直杰

doi:10.14135/j.cnki.1006-3080.20191201002

Video Key Frame Extraction and Optimization Algorithm for Sign Language Recognition

Abstract

Computer-vision-based sign language recognition can greatly facilitate bilingual teaching in schools for the deaf, and one of its main difficulties is extracting key frames from video. Based on the characteristics of key frames in sign language video and the signing habits of signers, this paper proposes a video key frame extraction and optimization algorithm for sign language recognition. First, a convolutional autoencoder extracts deep features from the video frames, and these features are grouped by K-means clustering; within each cluster, a sharpness criterion selects the clearest frame as an initial key frame. Then, a point density method refines the initial key frames in a second pass, and the resulting key frames are used for sign language recognition. Experimental results show that the proposed algorithm removes a large number of redundant frames and improves both the accuracy and the efficiency of sign language recognition.
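
As a rough illustration of the first stage described in the abstract, the sketch below clusters per-frame feature vectors with K-means and keeps the sharpest frame of each cluster. It assumes the feature vectors have already been computed (the convolutional autoencoder is not reproduced here). The gradient-energy sharpness score, the function names, and the parameters are illustrative assumptions rather than the authors' implementation. The point density refinement is omitted because the abstract does not specify it.

```python
import random


def sq_dist(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def kmeans(features, k, iters=20, seed=0):
    """Plain Lloyd's K-means; returns one cluster label per feature vector."""
    rng = random.Random(seed)
    # Initialize centers from k distinct frames (assumed initialization scheme).
    centers = [list(f) for f in rng.sample(features, k)]
    labels = [0] * len(features)
    for _ in range(iters):
        # Assignment step: nearest center for every frame feature.
        labels = [min(range(k), key=lambda c: sq_dist(f, centers[c]))
                  for f in features]
        # Update step: move each center to the mean of its members.
        for c in range(k):
            members = [f for f, lab in zip(features, labels) if lab == c]
            if members:  # keep the old center if a cluster is empty
                centers[c] = [sum(col) / len(members) for col in zip(*members)]
    return labels


def sharpness(frame):
    """Assumed sharpness score: mean squared horizontal plus vertical
    gradient of a grayscale frame given as a list of pixel rows."""
    h, w = len(frame), len(frame[0])
    total = 0.0
    for y in range(h - 1):
        for x in range(w - 1):
            dx = frame[y][x + 1] - frame[y][x]
            dy = frame[y + 1][x] - frame[y][x]
            total += dx * dx + dy * dy
    return total / ((h - 1) * (w - 1))


def initial_key_frames(frames, features, k):
    """Return sorted indices of the sharpest frame in each non-empty cluster."""
    labels = kmeans(features, k)
    scores = [sharpness(f) for f in frames]
    best = {}
    for idx, lab in enumerate(labels):
        # Strict comparison keeps the earliest frame when scores tie.
        if lab not in best or scores[idx] > scores[best[lab]]:
            best[lab] = idx
    return sorted(best.values())
```

Here `frames` is a list of grayscale images (each at least 2x2 pixels) and `features` holds one vector per frame in the same order. Choosing `k` is left to the caller, since the abstract does not state how the number of clusters is set.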
