Abstract:
Zero-shot object detection (ZSD), which aims to identify and localize object classes that do not appear in the training data, has become a new challenge in computer vision. Despite the rapid development of ZSD methods, most existing approaches rely on strict mapping-transfer strategies to recognize unseen-class objects. These models ignore the semantic information of unseen classes, which leads to misclassification: during testing, the detection results are biased toward seen classes. Moreover, because different categories share similar attributes, the distribution of the mapped features is relatively chaotic. To address these problems, this paper proposes a zero-shot object detection framework based on a top-down attention mechanism. Specifically, a prior knowledge extraction module is constructed to generate, for each proposal, prior knowledge relevant to the final detection task. A top-down attention module then fuses the mapped features with this prior knowledge, which provides task orientation for the detection process and guides the model to attend to potential unseen-class features during training, preventing them from being simply classified as background and thereby mitigating the domain-shift problem. In addition, a new contrastive constraint is designed to improve the discriminability and clustering of the mapped features. Finally, we conduct extensive experiments on the standard MSCOCO dataset, which show that the proposed method achieves significant improvements on both ZSD and generalized ZSD (GZSD) tasks.
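As a rough illustration of the two mechanisms summarized above, the following is a minimal sketch, not the authors' implementation: all shapes, the residual fusion form, and function names are assumptions. It shows (a) a top-down attention step in which per-proposal prior knowledge produces channel weights that modulate the mapped features, and (b) a supervised contrastive constraint that pulls same-class mapped features together and pushes different-class ones apart.

```python
import numpy as np

def softmax(x, axis=-1):
    """Numerically stable softmax."""
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def top_down_fuse(mapped_feats, prior, W):
    """Hypothetical top-down attention fusion.

    mapped_feats: (N, D) visual features mapped into the semantic space.
    prior:        (N, D) task-related prior knowledge per proposal.
    W:            (D, D) learned projection producing channel attention.
    """
    attn = softmax(prior @ W, axis=-1)         # (N, D) channel weights from the prior
    return mapped_feats * attn + mapped_feats  # residual fusion (an assumption)

def contrastive_loss(feats, labels, tau=0.1):
    """Simple supervised contrastive constraint on mapped features.

    Encourages features with the same label to be more similar than
    features with different labels (InfoNCE-style formulation).
    """
    f = feats / np.linalg.norm(feats, axis=1, keepdims=True)
    sim = f @ f.T / tau
    n = len(labels)
    loss = 0.0
    for i in range(n):
        pos = [j for j in range(n) if j != i and labels[j] == labels[i]]
        if not pos:
            continue
        denom = np.exp(sim[i][np.arange(n) != i]).sum()
        loss += -np.mean([np.log(np.exp(sim[i, j]) / denom) for j in pos])
    return loss / n
```

Under this sketch, a labeling whose same-class features are genuinely close yields a lower contrastive loss than a mismatched labeling of the same features, which is the clustering behavior the constraint is meant to induce.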