ECVA | European Computer Vision Association

CaesarNeRF: Calibrated Semantic Representation for Few-Shot Generalizable Neural Rendering

Haidong Zhu, Tianyu Ding*, Tianyi Chen, Ilya Zharkov, Ram Nevatia, Luming Liang ;

Abstract

"Generalizability and few-shot learning are key challenges in Neural Radiance Fields (NeRF), often due to the lack of a holistic understanding in pixel-level rendering. We introduce CaesarNeRF, an end-to-end approach that leverages scene-level CAlibratEd SemAntic Representation along with pixel-level representations to advance few-shot, generalizable neural rendering, facilitating a holistic understanding without compromising high-quality details. CaesarNeRF explicitly models pose differences of reference views to combine scene-level semantic representations, providing a calibrated holistic understanding. This calibration process aligns various viewpoints with precise location and is further enhanced by sequential refinement to capture varying details. Extensive experiments on public datasets, including LLFF, Shiny, mip-NeRF 360, and MVImgNet, show that CaesarNeRF delivers state-of-the-art performance across varying numbers of reference views, † proving effective even with a single reference image. ∗ Equal contribution. Corresponding author. This work was done when Haidong Zhu was an intern at Microsoft."

Related Material

[pdf] [supplementary material] [DOI]