    Complete molecular generation based on attention-equivariant geometric diffusion models

    • Abstract: For molecular generation tasks, this paper proposes an attention-equivariant geometric diffusion model (AEGDM), which applies a diffusion framework that mixes discrete and continuous diffusion to generate atomic coordinates, atom types, and chemical bonds in full. Within AEGDM, an attention-equivariant graph neural network (AEGNN) and a residual-boosted neural network (RBNN) are proposed, to improve the model's representational capacity and to accelerate training, respectively. Experiments on the QM9 dataset show that the method achieves significant improvements in the uniqueness and novelty of generated molecules, reaching 98.2% uniqueness and 70.9% novelty. These gains indicate that AEGDM has strong representational power for exploring chemical space and can effectively advance the discovery of innovative candidate molecules.
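The hybrid framework described above noises the continuous part of a molecule (coordinates) and the discrete part (atom types) with different kernels. A minimal sketch of one forward noising step follows; the Gaussian/uniform-resampling kernels, the shared `beta` schedule value, and the zero-center-of-mass step are illustrative assumptions, not the paper's exact formulation:

```python
import numpy as np

rng = np.random.default_rng(0)

def forward_noise_step(coords, atom_types, num_types, beta):
    """One hybrid forward-diffusion step on a molecular graph (sketch)."""
    # Continuous branch (DDPM-style): shrink coordinates toward zero
    # and add isotropic Gaussian noise.
    noisy_coords = (np.sqrt(1.0 - beta) * coords
                    + np.sqrt(beta) * rng.standard_normal(coords.shape))
    # Remove the center of mass so the process stays translation-invariant.
    noisy_coords = noisy_coords - noisy_coords.mean(axis=0, keepdims=True)
    # Discrete branch: with probability beta, resample each atom type
    # uniformly over the type vocabulary (a uniform-transition kernel).
    resample = rng.random(atom_types.shape) < beta
    random_types = rng.integers(0, num_types, size=atom_types.shape)
    noisy_types = np.where(resample, random_types, atom_types)
    return noisy_coords, noisy_types
```

The reverse process would then ask the network (AEGNN here) to denoise both branches jointly at each step.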

       

      Abstract: This paper proposes an attention-equivariant geometric diffusion model (AEGDM, comprising AEGNN and RBNN) for molecular generation tasks. AEGNN leverages multi-head self-attention to jointly update edge features, atomic coordinates, and node features within the molecular graph. Under strict rotational and translational equivariance, it progressively reconstructs atom types, 3D structures, and chemical bonds through a reverse diffusion process. RBNN further enhances generative performance by co-training two models, both constructed by stacking identical basic modules (termed AEM in this paper). The Knowledge Generator (KG) employs a deeper stack to capture complex nonlinear relationships, generating high-precision initial results that closely fit the ground truth. In contrast, the Residual Refiner (RR) uses a shallower structure, focusing on fitting the residual between KG's output and the ground truth. This design reduces computational overhead while strengthening the model's correction capability. The two models are connected in a cascade: KG's output serves as input to RR, and the final prediction is obtained by adding RR's residual correction to KG's initial output. Experiments on the QM9 dataset demonstrate significant improvements in the uniqueness and novelty of generated molecules, indicating that AEGDM can effectively explore the chemical space and facilitate the discovery of candidate molecules with innovative structures. Furthermore, the RBNN mechanism accelerates experimental iteration and enhances model performance, providing crucial technical support for iterative optimization of molecular generation models.
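The KG→RR cascade in the abstract (final prediction = KG's initial output + RR's residual correction) can be sketched as below. Here `kg` and `rr` are hypothetical stand-ins for the deeper and shallower AEM stacks; their concrete forms are illustrative, not the paper's architecture:

```python
def rbnn_predict(kg, rr, x):
    """Cascaded residual-boosted prediction: KG's output feeds RR, and the
    final result adds RR's residual correction to KG's initial estimate."""
    initial = kg(x)            # deep stack: high-precision first pass
    correction = rr(initial)   # shallow stack: fits only the remaining error
    return initial + correction

# Hypothetical stand-ins: a "deep" generator and a "shallow" refiner
# trained to correct the generator's systematic bias.
kg = lambda x: 2.0 * x
rr = lambda y: 0.5 - 0.1 * y
```

Because RR only has to model the (typically small) residual, it can stay shallow, which is what reduces the cascade's overall computational overhead.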

       
