
    A Chinese Medical Entity Relation Extraction Method Driven by Label Fusion

      Abstract: Medical entity relation extraction is a key step in advancing medical informatization; it aims to extract structured triplets from medical text. To address the insufficient use of entity type labels and relation labels in existing methods, a label fusion-driven framework for Chinese medical entity relation extraction is proposed. First, the entity relation extraction task is split into four bidirectional named entity recognition tasks, and the labels of each task are replaced with a fusion of the head and tail entity type labels and the relation label. Second, a triplet construction strategy is designed to make the fullest use of the triplets extracted in both directions. Finally, a bidirectional triplet filtering model screens the candidate triplets. The results show that the method improves the F1 score by 3.01% over GPLinker. The method also performs well in complex medical-domain scenarios involving overlapping relations, multiple triplets, and cross-sentence triplets.
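
      The abstract does not say how the labels are fused or how the two extraction directions are combined, so the following is only a minimal Python sketch under stated assumptions. The separator, the names fuse_label, split_label, merge_candidates and filter_candidates, the order-preserving union, and the score threshold are all hypothetical illustrations and are not taken from the paper. The score callable merely stands in for the filtering model, whose design the abstract does not describe.

```python
from typing import Callable, Iterable, List, NamedTuple, Tuple


class Triplet(NamedTuple):
    """A (head entity, relation, tail entity) candidate."""
    head: str
    relation: str
    tail: str


SEP = "|"  # hypothetical separator; the paper's label format is not given


def fuse_label(head_type: str, relation: str, tail_type: str) -> str:
    """Assumed fusion: join head type, relation, and tail type into one NER label."""
    return SEP.join((head_type, relation, tail_type))


def split_label(label: str) -> Tuple[str, str, str]:
    """Recover the three parts of a fused label produced by fuse_label."""
    head_type, relation, tail_type = label.split(SEP)
    return head_type, relation, tail_type


def merge_candidates(forward: Iterable[Triplet],
                     backward: Iterable[Triplet]) -> List[Triplet]:
    """Assumed construction strategy: order-preserving union of both directions."""
    return list(dict.fromkeys([*forward, *backward]))


def filter_candidates(candidates: Iterable[Triplet],
                      score: Callable[[Triplet], float],
                      threshold: float = 0.5) -> List[Triplet]:
    """Keep candidates whose score meets the (hypothetical) threshold."""
    return [t for t in candidates if score(t) >= threshold]
```

      For example, under these assumptions fuse_label("disease", "drug_therapy", "drug") yields the single label "disease|drug_therapy|drug", so one tagging pass carries both entity types and the relation. Taking the union in merge_candidates is one simple way to keep every triplet found in either direction before filtering.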

       
