Abstract:
Heterogeneous graphs capture complex correlations in graph data and are therefore of great practical significance. Traditional graph neural networks (GNNs) are usually limited to preset tasks and rely on explicitly labeled data and a fixed training mechanism, which leaves them inflexible when dealing with open-ended tasks. Existing GNN-LLM research mostly focuses on homogeneous text-attributed graphs and does not consider node heterogeneity. To address the misalignment of heterogeneous feature representation spaces and the problem of open-domain task generalization, a multi-hop graph reasoning mechanism that combines heterogeneity-aware graph learning with contrastive learning is proposed. The model decouples and reconstructs heterogeneous subgraphs based on meta-path symmetry, and fuses topological embeddings with semantic representations through a differentiated attention mechanism and a hierarchical feature aggregation algorithm. To address modal alignment, a progressive, staged optimization strategy is used to train the graph query converter, and contrastive learning is used to bridge the gap between modalities. Fine-grained feature associations are established through self-supervised graph-text matching, and a language modeling objective is integrated so that the model generates effective answers to questions. Experimental results show that the model both adapts to predefined tasks and generalizes to open scenarios, and that it exhibits high-quality reasoning on unseen questions in heterogeneous network question answering tasks.
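
The contrastive alignment step mentioned above can be illustrated with a minimal sketch. It assumes a symmetric InfoNCE-style loss over paired graph-side and text-side embeddings, which is a common choice for bridging two modalities; the abstract does not state the exact loss, so the function name `contrastive_loss`, the `temperature` parameter, and the toy vectors are hypothetical and serve only as an illustration.

```python
import math


def l2_normalize(v):
    """Scale a non-zero vector to unit length."""
    norm = math.sqrt(sum(x * x for x in v))
    return [x / norm for x in v]


def log_softmax_row(row):
    """Numerically stable log-softmax over one row of scores."""
    m = max(row)
    log_z = m + math.log(sum(math.exp(x - m) for x in row))
    return [x - log_z for x in row]


def contrastive_loss(graph_embs, text_embs, temperature=0.07):
    """Symmetric InfoNCE loss (assumed form, not taken from the paper).

    graph_embs[i] and text_embs[i] are treated as a positive pair;
    every other combination in the batch is treated as a negative.
    """
    g = [l2_normalize(v) for v in graph_embs]
    t = [l2_normalize(v) for v in text_embs]
    n = len(g)

    # Cosine similarity between every graph and text embedding, scaled by temperature.
    sim = [[sum(a * b for a, b in zip(gi, tj)) / temperature for tj in t] for gi in g]

    # Graph-to-text direction: each graph embedding should pick out its own text.
    g2t = -sum(log_softmax_row(sim[i])[i] for i in range(n)) / n

    # Text-to-graph direction: same computation on the transposed matrix.
    sim_t = [list(col) for col in zip(*sim)]
    t2g = -sum(log_softmax_row(sim_t[i])[i] for i in range(n)) / n

    return 0.5 * (g2t + t2g)


# Toy batch: two graph-side embeddings and two text-side embeddings.
graph_side = [[1.0, 0.0], [0.0, 1.0]]
aligned_text = [[1.0, 0.0], [0.0, 1.0]]
swapped_text = [[0.0, 1.0], [1.0, 0.0]]

loss_aligned = contrastive_loss(graph_side, aligned_text)
loss_swapped = contrastive_loss(graph_side, swapped_text)
```

In this toy batch the loss is close to zero when each graph embedding matches its paired text embedding and becomes large when the pairs are swapped, which is the property that pulls the two representation spaces together during training.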