
    Graph Neural Network-Based Ensemble Methods for Single-Step Retrosynthesis

    • Abstract: Retrosynthetic analysis works backward from a target product molecule to commercially available precursor molecules. To address the limited generalization ability and prediction accuracy of existing single-step retrosynthesis methods, this paper proposes an ensemble prediction model that combines graph neural networks with molecular fingerprints, which also addresses the limited scalability and interpretability of single models. The method constructs multidimensional molecular representations and dynamically selects the optimal prediction strategy to improve the accuracy and stability of retrosynthesis prediction. The model extracts molecular structural features with a Message Passing Neural Network (MPNN), a global self-attention mechanism, and the AttentiveFP network, and introduces Extended Connectivity Fingerprints (ECFP) to jointly model local structural information and global molecular features, yielding two fusion architectures: MPNN+FPS and AttentiveFP+FPS. Extensive experiments on the USPTO-50K and natural product datasets validate the effectiveness of the method. The proposed ensemble method shows significant advantages in both prediction accuracy and generalization ability, providing a new technical route for the retrosynthetic analysis of complex molecules and natural products.

       
