    MENG Yifan, CHEN Ning, LI Hongkai. Chinese Dialect Identification Based on Local and Global Feature Fusion and Multi-Level Feature Aggregation[J]. Journal of East China University of Science and Technology, 2024, 50(6): 898-904. DOI: 10.14135/j.cnki.1006-3080.20231011003

    Chinese Dialect Identification Based on Local and Global Feature Fusion and Multi-Level Feature Aggregation

    • Chinese has a wide variety of dialects, and compared with dialects of other languages they show small inter-class differences but large intra-class differences, which makes Chinese dialect identification challenging. Because the differences between Chinese dialects may appear in both local (short-term) and global (long-term) speech characteristics, as well as at different hierarchical levels of speech, this paper proposes a Chinese dialect identification model that combines local and global speech feature extraction with multi-level feature aggregation. Specifically, the model first extracts local speech features with Res2Block, then extracts global speech features with Conformer, and finally aggregates multi-level features by cascading the outputs of multiple Conformers. Experiments in both unseen-domain and seen-domain settings show that the proposed model achieves higher recognition accuracy than the baseline model.
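The data flow described in the abstract (local extraction, then stacked global extraction, then aggregation of every level's output) can be sketched roughly as follows. This is only an illustration under stated assumptions, not the authors' model: `local_block` and `global_block` are hypothetical toy stand-ins for Res2Block and Conformer, the number of levels is arbitrary, and reading "cascading" as per-frame concatenation of each level's output is itself an assumption.

```python
from typing import List

Frame = List[float]      # one feature vector per time step
Sequence = List[Frame]   # frames ordered in time


def local_block(x: Sequence) -> Sequence:
    """Toy stand-in for Res2Block: average each frame with its immediate neighbours."""
    out = []
    for t in range(len(x)):
        window = x[max(0, t - 1): t + 2]  # short-term context only
        out.append([sum(col) / len(window) for col in zip(*window)])
    return out


def global_block(x: Sequence) -> Sequence:
    """Toy stand-in for a Conformer: mix each frame with the utterance-wide mean."""
    mean = [sum(col) / len(x) for col in zip(*x)]  # long-term context
    return [[0.5 * v + 0.5 * m for v, m in zip(frame, mean)] for frame in x]


def aggregate_levels(x: Sequence, num_levels: int = 3) -> Sequence:
    """Apply the local block, then a stack of global blocks, and concatenate
    the output of every global block frame by frame (assumed aggregation)."""
    h = local_block(x)
    levels = []
    for _ in range(num_levels):
        h = global_block(h)
        levels.append(h)  # keep each level, not just the last one
    return [[v for level in levels for v in level[t]] for t in range(len(x))]
```

With three frames of dimension 2 and three levels, each output frame has dimension 6, because every level contributes its own copy of the frame's features rather than only the deepest level being kept.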