Abstract:
Relation extraction is a key task in text mining whose goal is to extract triples consisting of entities and the semantic relations between them. Overlapping relations remain a difficult problem in this field, and hierarchical cascade tagging strategies have shown remarkable performance in addressing it. However, in such models BERT (Bidirectional Encoder Representations from Transformers) serves only as the feature input to the subject tagging module, and no further feature mining is performed on the identified subjects, so relevant features are lost. To address this shortcoming, this paper proposes a dynamic hierarchical cascade tagging model for Chinese overlapping relation extraction (RWG-LSA). First, a dynamic character-word fusion feature learning model (RWG) is constructed from RoBERTa-wwm-ext (A Robustly Optimized BERT Pretraining Approach with whole word masking, extended), WoBERT (Word BERT), and a gating mechanism, which avoids both the missing-feature problem in the subject tagging module and the inability to compute in parallel. Second, dynamic-weight local self-attention (LSA) is introduced to learn subject-level semantic information autonomously. Finally, by effectively integrating the global and local features of the input text, the RWG-LSA model extracts entity pairs and their relations. Experiments on the Chinese SKE dataset show that the model is highly effective on overlapping relations, achieving an F1 score of 82.44%.
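The two mechanisms named in the abstract can be sketched in a few lines. The snippet below is a minimal NumPy illustration, not the authors' implementation: it assumes the common formulation of a sigmoid gate that mixes character-level (RoBERTa-wwm-ext) and word-level (WoBERT) features per position, h = g * h_char + (1 - g) * h_word, and a self-attention variant restricted to a fixed local window; all dimensions, variable names, and the windowing scheme are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def softmax(x, axis=-1):
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def gated_fusion(h_char, h_word, W, b):
    """Gate g in (0, 1) decides, per position and per dimension, how much
    of the character-level vs. word-level feature to keep (assumed form):
        h = g * h_char + (1 - g) * h_word
    """
    g = sigmoid(np.concatenate([h_char, h_word], axis=-1) @ W + b)
    return g * h_char + (1 - g) * h_word

def local_self_attention(h, window):
    """Scaled dot-product self-attention where each token attends only to
    neighbours within `window` positions (illustrative hard mask)."""
    n, d = h.shape
    scores = h @ h.T / np.sqrt(d)
    idx = np.arange(n)
    mask = np.abs(idx[:, None] - idx[None, :]) > window
    scores[mask] = -1e9  # mask out tokens outside the local window
    return softmax(scores, axis=-1) @ h

# Toy setup: a sequence of 6 tokens with hidden size 4.
n, d = 6, 4
h_char = rng.standard_normal((n, d))   # stands in for RoBERTa-wwm-ext output
h_word = rng.standard_normal((n, d))   # stands in for WoBERT output
W = rng.standard_normal((2 * d, d)) * 0.1
b = np.zeros(d)

fused = gated_fusion(h_char, h_word, W, b)
out = local_self_attention(fused, window=2)
print(fused.shape, out.shape)  # (6, 4) (6, 4)
```

Because the gate is a sigmoid, each fused value is a convex combination of the two encoder outputs, so neither representation can be fully discarded; the local window keeps the attention focused on subject-level context rather than the whole sequence.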