A Text-Driven Continuous Terrain Generation Method for Virtual Scenes Based on Large Language Models
Abstract
With the rapid advancement of 3D technologies such as game development, virtual reality (VR), and digital twins, demand is growing for the automatic generation of high-quality, large-scale continuous virtual terrain scenes. Conventional manual interactive workflows are cumbersome and require considerable professional expertise. Procedural generation methods improve efficiency, but they still rely on graphical interfaces and the tuning of specialized parameters, which makes it difficult to capture users' semantic intentions flexibly. This paper proposes a text-driven method for automatically generating continuous virtual terrain based on large language models (LLMs), in which natural language directly drives the generation pipeline to construct continuous virtual terrain scenes efficiently. The method adopts a multi-stage task decomposition strategy that splits the terrain generation pipeline into seven steps. By integrating prompt engineering with procedural generation techniques, it establishes an end-to-end mapping from text to large-scale continuous 3D terrain. To address the limitations of LLMs in complex tasks, such as weak multi-hop reasoning, output randomness, and memory loss, the method introduces cross-stage key information transmission, error retry, and historical log learning mechanisms to improve the reliability and consistency of the system. A prototype system developed on the Unity engine verifies the effectiveness of the proposed method. Tests on four typical terrain themes show that the Full Step System (FSS) achieves an average error rate of only 12.0%, approximately 19–47 percentage points lower than that of the comparison systems that lack the core mechanisms.
In terms of efficiency, generating a single preview terrain tile (a 1024 × 1024-vertex mesh) takes about 42.14 s on average, with controllable resource consumption. Further comparative experiments show that the virtual terrain generated by the proposed method meets the expected requirements for generation efficiency, visual consistency, and scalability. The proposed method markedly improves generation efficiency and usability, offering a feasible way for non-professional users to rapidly construct high-quality, large-scale 3D terrain from natural language.
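
As an illustration only, the reliability mechanisms summarized above (cross-stage key information transmission, error retry, and a history log) can be sketched in a few lines of Python. This is a minimal sketch under stated assumptions, not the prototype's implementation: the function and stage names (`run_pipeline`, `parse_theme`, `make_flaky_height_stage`) are hypothetical, only two of the seven steps are mimicked, and the LLM calls are replaced by deterministic stand-ins.

```python
from typing import Callable, Dict, List, Tuple

# A stage reads the shared context and the failure log, and returns new key information.
Stage = Callable[[Dict[str, str], List[str]], Dict[str, str]]


def run_pipeline(
    stages: List[Tuple[str, Stage]], prompt: str, max_retries: int = 3
) -> Tuple[Dict[str, str], List[str]]:
    """Run stages in order, retrying a failed stage and logging each failure."""
    context: Dict[str, str] = {"prompt": prompt}  # key information carried across stages
    log: List[str] = []                           # failure history visible to later attempts
    for name, stage in stages:
        for attempt in range(1, max_retries + 1):
            try:
                context.update(stage(context, log))
                break  # stage succeeded, move on to the next one
            except ValueError as err:
                log.append(f"{name} attempt {attempt}: {err}")
        else:
            # Reached only when every attempt raised.
            raise RuntimeError(f"stage {name!r} failed after {max_retries} attempts")
    return context, log


def parse_theme(context: Dict[str, str], log: List[str]) -> Dict[str, str]:
    # Hypothetical stand-in for an LLM call that extracts the terrain theme.
    theme = "desert" if "desert" in context["prompt"].lower() else "plains"
    return {"theme": theme}


def make_flaky_height_stage() -> Stage:
    # Hypothetical stand-in for an LLM whose first answer is malformed.
    calls = {"n": 0}

    def choose_height(context: Dict[str, str], log: List[str]) -> Dict[str, str]:
        calls["n"] += 1
        if calls["n"] == 1:
            raise ValueError("malformed output")
        # Uses the theme passed along from the earlier stage.
        return {"max_height": "120" if context["theme"] == "desert" else "40"}

    return choose_height


if __name__ == "__main__":
    stages = [("theme", parse_theme), ("height", make_flaky_height_stage())]
    context, log = run_pipeline(stages, "A wide desert with rolling dunes")
    print(context)
    print(log)
```

In this sketch the second stage fails once, the failure is recorded in the log, and the retry succeeds using the theme produced by the first stage. A real system would presumably feed the log back into the prompt of the retried call; that step is omitted here.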