ECVA | European Computer Vision Association

DriveDreamer: Towards Real-world-driven World Models for Autonomous Driving

Xiaofeng Wang*, Zheng Zhu, Guan Huang, Chen Xinze, Jiagang Zhu, Jiwen Lu

Abstract

World models, especially in autonomous driving, are drawing extensive attention due to their capacity for comprehending driving environments. An established world model holds immense potential for generating high-quality driving videos and driving policies for safe maneuvering. However, a critical limitation of related research is its predominant focus on gaming environments or simulated settings, which lack the representation of real-world driving scenarios. Therefore, we introduce DriveDreamer, a pioneering world model entirely derived from real-world driving scenarios. Because modeling the world in intricate driving scenes entails an overwhelming search space, we propose harnessing the powerful diffusion model to construct a comprehensive representation of the complex environment. Furthermore, we introduce a two-stage training pipeline. In the first stage, DriveDreamer acquires a deep understanding of structured traffic constraints, while the second stage equips it with the ability to anticipate future states. Extensive experiments verify that DriveDreamer empowers both driving video generation and action prediction, faithfully capturing real-world traffic constraints. In addition, videos generated by DriveDreamer significantly enhance the training of driving perception methods.
