ECVA | European Computer Vision Association

High-Quality Mesh Blendshape Generation from Face Videos via Neural Inverse Rendering

Xin Ming, Jiawei Li, Jingwang Ling, Libo Zhang, Feng Xu*

Abstract

Mesh-based facial blendshapes have been widely used in animation pipelines, while recent advancements in neural geometry and appearance representations have enabled high-quality inverse rendering. Building upon these observations, we introduce a novel technique that reconstructs mesh-based blendshape rigs from single or sparse multi-view videos, leveraging state-of-the-art neural inverse rendering. We begin by constructing a deformation representation that parameterizes vertex displacements into differential coordinates with tetrahedral connections, allowing for high-quality vertex deformation on high-resolution meshes. By constructing a set of semantic regulations in this representation, we achieve joint optimization of blendshapes and expression coefficients. Furthermore, to enable a user-friendly multi-view setup with unsynchronized cameras, we use a neural regressor to model time-varying motion parameters. Experiments demonstrate that, with the flexible input of single or sparse multi-view videos, we reconstruct personalized high-fidelity blendshapes. These blendshapes are both geometrically and semantically accurate, and they are compatible with industrial animation pipelines. Code and data are available at https://github.com/grignarder/high-quality-blendshape-generation.
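
For readers unfamiliar with the term, a mesh blendshape rig in its standard linear form expresses a face mesh as the neutral mesh plus a coefficient-weighted sum of per-vertex offsets, one offset set per expression. The sketch below illustrates only that generic formulation. It is not the paper's method or code (the authors' implementation is in the repository linked above), and the function name `blend_mesh` and the toy two-vertex data are hypothetical.

```python
from typing import List, Sequence, Tuple

Vec3 = Tuple[float, float, float]


def blend_mesh(
    neutral: Sequence[Vec3],
    deltas: Sequence[Sequence[Vec3]],
    coeffs: Sequence[float],
) -> List[Vec3]:
    """Evaluate a generic linear blendshape model (illustrative only).

    neutral: vertex positions of the neutral face.
    deltas:  one list of per-vertex offsets for each blendshape.
    coeffs:  one expression coefficient for each blendshape.
    """
    if len(deltas) != len(coeffs):
        raise ValueError("need exactly one coefficient per blendshape")
    for shape in deltas:
        if len(shape) != len(neutral):
            raise ValueError("each blendshape must cover every vertex")

    result: List[Vec3] = []
    for i, (x, y, z) in enumerate(neutral):
        # Add each blendshape's offset for vertex i, scaled by its coefficient.
        for shape, w in zip(deltas, coeffs):
            dx, dy, dz = shape[i]
            x += w * dx
            y += w * dy
            z += w * dz
        result.append((x, y, z))
    return result


# Toy data (hypothetical): two vertices, two blendshapes.
neutral = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0)]
deltas = [
    [(0.0, 1.0, 0.0), (0.0, 0.0, 0.0)],  # shape 0 moves vertex 0 along y
    [(0.0, 0.0, 0.0), (0.0, 0.0, 2.0)],  # shape 1 moves vertex 1 along z
]
posed = blend_mesh(neutral, deltas, [0.5, 0.25])
```

In this generic model, setting all coefficients to zero returns the neutral mesh, and each coefficient independently controls how strongly its expression is applied. The joint optimization described in the abstract recovers both the per-vertex offsets and the per-frame coefficients from video, which this sketch does not attempt.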
