Abstract:
Fine-grained image classification is a challenging task in computer vision that aims to distinguish images belonging to the same basic category but to different subcategories. The task is difficult for two reasons. First, real-world images of the same class vary in environment, illumination, and pose, which leads to high intra-class variance. Second, some classes within the same basic category look very similar to one another, which leads to low inter-class variance. Traditional fine-grained image classification approaches divide the input image into many overlapping patches. These patches may contain a large amount of redundant information, and extracting local region descriptors from them is computationally expensive. Recently, convolutional neural networks (CNNs) have been widely used in fine-grained image classification and have proven effective on large image collections. However, the high-quality annotations that CNN-based methods require are usually difficult to obtain. Although bounding box regression can reduce the dependence on annotations, the resulting boxes may still contain background or noisy regions. Hence, this paper proposes a selective convolutional descriptor with mean-only maximum a posteriori adaptation via a Gaussian mixture model (SCD-MGMM), which effectively addresses both high intra-class variance and low inter-class variance. First, convolutional features are selected from the entire image using the SCD approach, so that the features extracted from a pretrained VGG-16 model are more robust to noise. As a result, the proposed SCD-MGMM can automatically locate the main object without any supervision or additional training on fine-grained images. In addition, the proposed framework uses mean-only maximum a posteriori adaptation based on a GMM (MGMM) to overcome the GMM's need for a large amount of observation data. Finally, this paper adopts a fast linear scoring technique to compute the log-likelihood. Quantitative experimental results show that the proposed method achieves better fine-grained classification performance.
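To make the mean-only adaptation step concrete, the following is a minimal sketch of mean-only MAP adaptation for a diagonal-covariance GMM. It assumes the standard relevance-factor formulation, in which each adapted mean is a count-weighted blend of the prior mean and the data mean; the abstract does not state the exact update rule, so the function names, the `relevance` parameter, and its default value of 16 are illustrative assumptions rather than the authors' implementation.

```python
import math


def log_gauss_diag(x, mean, var):
    """Log density of a diagonal-covariance Gaussian evaluated at x."""
    return -0.5 * sum(
        math.log(2.0 * math.pi * v) + (xi - m) ** 2 / v
        for xi, m, v in zip(x, mean, var)
    )


def responsibilities(x, weights, means, variances):
    """Posterior probability of each mixture component given x."""
    logs = [
        math.log(w) + log_gauss_diag(x, m, v)
        for w, m, v in zip(weights, means, variances)
    ]
    top = max(logs)  # shift by the max for numerical stability
    exps = [math.exp(value - top) for value in logs]
    total = sum(exps)
    return [e / total for e in exps]


def map_adapt_means(data, weights, means, variances, relevance=16.0):
    """Adapt only the means; weights and variances are left unchanged.

    `relevance` is an assumed hyperparameter controlling how strongly
    the prior means resist being moved by the new data.
    """
    n_comp, dim = len(means), len(means[0])
    counts = [0.0] * n_comp                       # soft counts n_k
    sums = [[0.0] * dim for _ in range(n_comp)]   # weighted sums of x
    for x in data:
        gammas = responsibilities(x, weights, means, variances)
        for k, g in enumerate(gammas):
            counts[k] += g
            for d in range(dim):
                sums[k][d] += g * x[d]
    # With alpha_k = n_k / (n_k + r) and E_k = sums_k / n_k, the update
    # alpha_k * E_k + (1 - alpha_k) * mu_k simplifies to the expression
    # below, which also stays well defined when n_k is zero.
    return [
        [
            (sums[k][d] + relevance * means[k][d]) / (counts[k] + relevance)
            for d in range(dim)
        ]
        for k in range(n_comp)
    ]
```

Because only the means move, a component that sees few observations stays close to its prior mean, which is why this form of adaptation can work with far less data than fitting a full GMM from scratch.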