POCA: Post-training Quantization with Temporal Alignment for Codec Avatars
Jian Meng*, Yuecheng Li*, Leo (Chenghui) Li, Syed Shakib Sarwar, Dilin Wang, Jae-sun Seo*
Abstract
"Real-time decoding generates high-quality assets for rendering photorealistic Codec Avatars for immersive social telepresence with AR/VR. However, high-quality avatar decoding incurs expensive computation and memory consumption, which necessitates the design of a decoder compression algorithm (e.g., quantization). Although quantization has been widely studied, the quantization of avatar decoders is an urgent yet under-explored need. Furthermore, the requirement of fast “User-Avatar” deployment prioritizes the post-training quantization (PTQ) over the time-consuming quantization-aware training (QAT). As the first work in this area, we reveal the sensitivity of the avatar decoding quality under low precision. In particular, the state-of-the-art (SoTA) QAT and PTQ algorithms introduce massive amounts of temporal noise to the rendered avatars, even with the well-established 8-bit precision. To resolve these issues, a novel PTQ algorithm is proposed for quantizing the avatar decoder with low-precision weights and activation (8-bit and 6-bit), without introducing temporal noise to the rendered avatar. Furthermore, the proposed method only needs 10% of the activations of each layer to calibrate quantization parameters without any distribution manipulations or extensive boundary search. The proposed method is evaluated on various face avatars with different facial characteristics. The proposed method compresses the decoder model by 5.3× while recovering the quality on par with the full precision baseline. In addition to the avatar rendering tasks, POCA is also applicable to image resolution enhancement tasks, achieving new SoTA image quality. https://mengjian0502.github.io/poca. github.io/"
Related Material
[pdf]
[supplementary material]
[DOI]