A 3D Scene Reconstruction Method Based on EEG Signals

邓霞; 陈屾; 王慧锋; 顾震; 颜秉勇; 周家乐

doi:10.14135/j.cnki.1006-3080.20260408001

Brain Signal-Based 3D Scene Reconstruction Method

Abstract

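Abstract (translated from Chinese): EEG-driven 3D reconstruction is an important research direction in the field of brain-computer interfaces. EEG signals have a low signal-to-noise ratio and limited spatial resolution, which makes it difficult for them to directly support the reconstruction of complex 3D structures. To address this problem, this paper proposes Mind2Matter, a framework for 3D scene reconstruction based on EEG signals. The framework first uses an EEG encoder to extract spatiotemporal EEG features and combines it with a large language model to perform semantic decoding. Then, conditioned on the generated text, it introduces a layout-constrained 3D Gaussian generation model to achieve cross-modal reconstruction from EEG signals to 3D scenes. Experimental results show that the method achieves a ROUGE-1 F-score of 34.21%, a BLEU-1 score of 34.33%, and a BERTScore F-score of 37.19% on the EEG-to-text generation task. On the 3D reconstruction task, CLIP Similarity reaches 0.701, and CD and EMD are 4.66 and 10.93, respectively, outperforming the compared methods. The results verify the feasibility of EEG-driven 3D reconstruction.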

Abstract: Three-dimensional (3D) reconstruction driven by brain signals is an important research direction in brain-computer interfaces (BCIs). Existing studies rely mainly on functional magnetic resonance imaging (fMRI), which offers high spatial resolution but is costly and has poor real-time performance. By comparison, electroencephalography (EEG) is low-cost, non-invasive, and has high temporal resolution, which makes it better suited to practical interactive applications. However, the low signal-to-noise ratio and limited spatial resolution of EEG make it difficult to reconstruct complex 3D scenes directly from neural signals. To address this issue, this paper proposes Mind2Matter, an EEG-driven 3D scene reconstruction framework built as a two-stage pipeline. In the first stage, a hierarchical EEG encoder extracts discriminative spatiotemporal features from EEG signals, and a large language model (LLM) generates semantic text descriptions through cross-modal semantic alignment between EEG features and visual embeddings. In the second stage, the generated text descriptions condition a layout-constrained 3D Gaussian Splatting framework that reconstructs semantically consistent 3D scenes. Experiments on the EEG-Image dataset show that the method outperforms the compared methods in both semantic understanding and geometric reconstruction. Specifically, it obtains a ROUGE-1 F-score of 34.21%, a BLEU-4 score of 7.62%, and a BERTScore F-score of 37.19% in the EEG-to-text generation task. In the 3D reconstruction task, it achieves a CLIP Similarity of 0.701, with a Chamfer Distance (CD) of 4.66 and an Earth Mover’s Distance (EMD) of 10.93. The results verify the feasibility of EEG-driven 3D reconstruction and provide a new solution for brain-computer interaction and virtual reality applications.
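For readers unfamiliar with two of the metrics reported above, the following is a minimal sketch of how a ROUGE-1 F-score and a Chamfer Distance are commonly computed. It is not the authors' evaluation code. The function names `rouge1_f` and `chamfer_distance` are hypothetical, the whitespace tokenization is an assumption, and the Chamfer variant shown (mean squared nearest-neighbour distance in each direction) is only one of several conventions. The abstract does not state which variant, normalization, or scaling lies behind the reported values.

```python
from collections import Counter
from math import dist


def rouge1_f(candidate: str, reference: str) -> float:
    """Unigram-overlap F1 between two whitespace-tokenized strings.

    Assumption: lowercase + whitespace tokenization; real ROUGE toolkits
    may apply stemming or other normalization.
    """
    cand = Counter(candidate.lower().split())
    ref = Counter(reference.lower().split())
    overlap = sum((cand & ref).values())  # clipped unigram matches
    if overlap == 0:
        return 0.0
    precision = overlap / sum(cand.values())
    recall = overlap / sum(ref.values())
    return 2 * precision * recall / (precision + recall)


def chamfer_distance(a, b) -> float:
    """Symmetric Chamfer distance between two non-empty point lists.

    Assumption: mean squared nearest-neighbour distance in each direction;
    other conventions use unsquared distances or sums instead of means.
    Brute force, O(len(a) * len(b)), so only suitable for small clouds.
    """
    def one_way(src, dst):
        return sum(min(dist(p, q) ** 2 for q in dst) for p in src) / len(src)

    return one_way(a, b) + one_way(b, a)


if __name__ == "__main__":
    # Illustrative inputs only; these are not data from the paper.
    print(rouge1_f("a chair near a table", "a chair and a table"))
    print(chamfer_distance([(0, 0, 0), (1, 0, 0)], [(0, 0, 0), (1, 0, 1)]))
```

Under this definition, two identical point clouds have a Chamfer Distance of 0, and larger values indicate a greater geometric mismatch, which is why lower CD and EMD values are read as better reconstructions.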
