
    Citation: Zhang Li, Zhang Huanhuan, Yuan Yubo. Dynamic Hierarchical Cascade Tagging Model for Chinese Overlapping Relation Extraction[J]. Journal of East China University of Science and Technology (Natural Science Edition). DOI: 10.14135/j.cnki.1006-3080.202302210


    Dynamic Hierarchical Cascade Tagging Model for Chinese Overlapping Relation Extraction

    • Abstract: A dynamic hierarchical cascade tagging model for Chinese overlapping relation extraction (RWG-LSA) is constructed. First, a dynamic character-word fusion feature learning model (RWG) is built on pretrained language models and a gating mechanism, which effectively avoids the missing-feature problem of the subject tagging module and its inability to compute in parallel. Second, a dynamic-weight local self-attention (LSA) is introduced to autonomously learn subject-level semantic features. Finally, by effectively fusing the global features of the input sequence with the subject-level local features, the RWG-LSA model extracts entity pairs and relations from text. Experiments on the SKE Chinese dataset show that the model is significantly effective for overlapping relation extraction, reaching an F1 score of 82.44%.

      Abstract: Relation extraction is a key task in text data mining, whose technical goal is to mine triples consisting of entities and the semantic relations between them. Overlapping relations are a major difficulty in current relation extraction, and hierarchical cascade tagging has shown remarkable performance in addressing this problem. However, existing methods that follow this strategy suffer from missing features, because on the one hand only BERT (Bidirectional Encoder Representations from Transformers) is used as the feature input of the subject tagging module, and on the other hand no further feature mining is performed on the identified subjects. In response, this paper proposes a dynamic hierarchical cascade tagging model for overlapping relation extraction. Firstly, a dynamic character-word fusion feature learning model (RWG) is constructed based on RoBERTa-wwm-ext (A Robustly Optimized BERT Pre-training Approach-whole word masking-extended), WoBERT (Word BERT), and a gating mechanism, which effectively avoids the missing-feature problem of the subject tagging module and its inability to compute in parallel. Secondly, a dynamic-weight local self-attention (LSA) is introduced to autonomously learn subject-level semantic information. Finally, based on an effective fusion of the global features of the input text with the subject-level local features, the RWG-LSA model extracts entity pairs and relations from the text. Experiments on the SKE Chinese dataset show that the model is significantly effective in extracting overlapping relations, with the F1 score reaching 82.44%.
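    Below is a minimal PyTorch sketch, intended only as an illustration of two ideas named in the abstract and not as the authors' released implementation: a gated fusion of character-level and word-level encoder outputs (standing in for RoBERTa-wwm-ext and WoBERT), followed by the first stage of a cascade tagger that marks subject start and end positions. All class names, shapes, and hyperparameters here are assumptions; the dynamic-weight local self-attention and the relation-object tagging stage are omitted.

    import torch
    import torch.nn as nn

    class GatedCharWordFusion(nn.Module):
        # Fuses character-level and word-level hidden states with a learned gate
        # (illustrative stand-in for the RWG character-word fusion idea).
        def __init__(self, hidden_size):
            super().__init__()
            self.gate = nn.Linear(2 * hidden_size, hidden_size)

        def forward(self, h_char, h_word):
            # h_char, h_word: (batch, seq_len, hidden) encoder outputs, assumed
            # to be aligned to the same token positions.
            g = torch.sigmoid(self.gate(torch.cat([h_char, h_word], dim=-1)))
            return g * h_char + (1.0 - g) * h_word

    class SubjectTagger(nn.Module):
        # First stage of a cascade tagger: per-position start/end probabilities
        # for subject spans.
        def __init__(self, hidden_size):
            super().__init__()
            self.start = nn.Linear(hidden_size, 1)
            self.end = nn.Linear(hidden_size, 1)

        def forward(self, h):
            return torch.sigmoid(self.start(h)), torch.sigmoid(self.end(h))

    # Minimal usage with random tensors standing in for the two encoders' outputs.
    batch, seq_len, hidden = 2, 16, 768
    fusion = GatedCharWordFusion(hidden)
    tagger = SubjectTagger(hidden)
    h = fusion(torch.randn(batch, seq_len, hidden), torch.randn(batch, seq_len, hidden))
    p_start, p_end = tagger(h)  # each of shape (batch, seq_len, 1)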

