Abstract:
This paper studies SMS classification based on an improved Bayesian method. For Chinese SMS, document frequency (DF) is adopted for feature selection, and a self-built corpus is used to test the classifier. The results show that the improved classifier increases the pass rate of normal SMS. Moreover, by retraining on a new training dataset, a personalized classifier can be obtained that adapts to changes in short messages and meets the user's requirements. The proposed classifier completes message filtering in combination with a blacklist and whitelist filtering mechanism, so that the error rate for normal SMS is reduced.
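The pipeline summarized above (list filtering, DF-based feature selection, Bayesian classification) can be illustrated with a minimal sketch. It assumes messages are already segmented into tokens and uses a plain multinomial naive Bayes with Laplace smoothing, not the improved method of this paper, whose details are not given in the abstract. All names (`select_features_by_df`, `NaiveBayesSms`, `filter_sms`, `min_df`) and the toy data are hypothetical.

```python
import math
from collections import Counter


def select_features_by_df(docs, min_df=2):
    """Keep terms that occur in at least min_df documents (DF threshold is an assumption)."""
    df = Counter()
    for tokens in docs:
        df.update(set(tokens))  # count each term once per document
    return {term for term, count in df.items() if count >= min_df}


class NaiveBayesSms:
    """Plain multinomial naive Bayes with Laplace smoothing (illustrative only)."""

    def fit(self, docs, labels, vocab):
        self.vocab = vocab
        self.classes = sorted(set(labels))
        self.doc_counts = Counter(labels)
        self.term_counts = {c: Counter() for c in self.classes}
        for tokens, label in zip(docs, labels):
            self.term_counts[label].update(t for t in tokens if t in vocab)
        return self

    def predict(self, tokens):
        total_docs = sum(self.doc_counts.values())
        best_label, best_score = None, -math.inf
        for c in self.classes:
            total_terms = sum(self.term_counts[c].values())
            score = math.log(self.doc_counts[c] / total_docs)  # log prior
            for t in tokens:
                if t in self.vocab:  # terms outside the selected features are ignored
                    p = (self.term_counts[c][t] + 1) / (total_terms + len(self.vocab))
                    score += math.log(p)
            if score > best_score:
                best_label, best_score = c, score
        return best_label


def filter_sms(sender, tokens, model, whitelist, blacklist):
    """Check the whitelist and blacklist first, then fall back to the classifier."""
    if sender in whitelist:
        return "normal"
    if sender in blacklist:
        return "spam"
    return model.predict(tokens)


# Toy, pre-segmented corpus (hypothetical data, not the paper's corpus).
docs = [
    ["免费", "中奖", "点击"],
    ["免费", "优惠", "点击"],
    ["明天", "开会", "时间"],
    ["晚上", "吃饭", "时间"],
]
labels = ["spam", "spam", "normal", "normal"]

vocab = select_features_by_df(docs, min_df=2)
model = NaiveBayesSms().fit(docs, labels, vocab)
```

In this sketch, a personalized classifier would correspond to calling `fit` again on a user's own labeled messages. Checking the whitelist before the classifier means a trusted sender is never filtered, which is one way the error rate for normal SMS could be lowered.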