Abstract:
At present,the models used in music auto-tagging are mostly hand-engineered,so the choice of the optimal feature is always difficult.We propose an unsupervised feature learning algorithm,which can automatically learn the underlying structure of feature without prior knowledge.The algorithm is achieved in three stages.The preprocessing stage extracts the chroma-frequency spectrogram,and reduces the dimensionality via PCA whitening.The second stage applies the denoising autoencoder to the reduced feature in an unsupervised manner,and aggregates a new feature vector by max-pooling function and averaging.The last stage maps the feature vector to song labels by pre-trained multilayer perceptron (MLP) in a supervised manner.The result based on the Magnatagatune and GTZAN datasets shows that our algorithm improves the accuracy of music auto-tagging to some degree.