ECVA | European Computer Vision Association

LN3Diff: Scalable Latent Neural Fields Diffusion for Speedy 3D Generation

Yushi Lan, Fangzhou Hong, Shuai Yang, Shangchen Zhou, Xuyi Meng, Bo Dai, Xingang Pan, Chen Change Loy*

Abstract

"The field of neural rendering has witnessed significant progress with advancements in generative models and differentiable rendering techniques. Though 2D diffusion has achieved success, a unified 3D diffusion pipeline remains unsettled. This paper introduces a novel framework called LN3Diff to address this gap and enable fast, high-quality, and generic conditional 3D generation. Our approach harnesses a 3D-aware architecture and variational autoencoder (VAE) to encode the input image(s) into a structured, compact, and 3D latent space. The latent is decoded by a transformer-based decoder into a high-capacity 3D neural field. Through training a diffusion model on this 3D-aware latent space, our method achieves superior performance on Objaverse, ShapeNet and FFHQ for conditional 3D generation. Moreover, it surpasses existing 3D diffusion methods in terms of inference speed, requiring no per-instance optimization. Video demos can be found on our project webpage: https://nirvanalan.github.io/projects/ln3diff."
