Abstract:
Nanopore sensing is a highly sensitive single-molecule detection technology that obtains information about a single molecule by recording the change in ionic current produced as the molecule traverses the nanopore. However, because different molecules are captured by the nanopore at different rates, the collected dataset is imbalanced, which reduces the accuracy of molecule identification. Based on the blockage events of encoded DNA molecules, this paper constructs a model based on a Deep Convolutional Generative Adversarial Network (DCGAN) to expand the minority-class samples and thereby balance the nanopore dataset. QuipuNet is then trained and evaluated on the dataset before and after balancing. Simulation results show that the average classification accuracy of the trained QuipuNet for some “100”-encoded molecules improves by 14% when the DCGAN-balanced dataset is used, and that its average recognition accuracy is higher than that obtained with other dataset-expansion methods. These results verify that expanding the encoded DNA molecule data with DCGAN to balance the dataset effectively improves the accuracy with which the trained model recognizes real signals.