High-Fidelity Synthesis with Disentangled Representation
Wonkwang Lee, Donggyun Kim, Seunghoon Hong, Honglak Lee
Abstract
Learning disentangled representations of data without supervision is an important step towards improving the interpretability of generative models. Despite recent advances in disentangled representation learning, existing approaches often suffer from a trade-off between representation learning and generation performance (i.e., improving generation quality sacrifices disentanglement performance). We propose the Information-Distillation Generative Adversarial Network (ID-GAN), a simple yet generic framework that can easily incorporate existing state-of-the-art models for both disentanglement learning and high-fidelity synthesis. Our method learns a disentangled representation using VAE-based models and distills the learned representation, together with an additional nuisance variable, to a separate GAN-based generator for high-fidelity synthesis. To ensure that both generative models are aligned to render the same generative factors, we additionally constrain the GAN generator to maximize the mutual information between the learned latent code and its output. Despite its simplicity, we show that the proposed method is surprisingly effective, achieving image generation quality comparable to the state of the art while using the disentangled representation. We also show that the proposed decomposition leads to an efficient and stable model design, and we demonstrate it on a high-resolution image synthesis task (1024x1024 pixels) for the first time using disentangled representations.
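The decomposition described in the abstract can be sketched in a few lines: a disentangled code c (which would come from a pretrained, frozen VAE encoder) is concatenated with a nuisance variable s to form the GAN generator's input, and the generator is additionally trained with a variational lower bound on the mutual information between c and the generated output, estimated via a recognition network Q(c|x). The sketch below is a hypothetical toy illustration, not the authors' implementation: `generate`, `recognize`, and `mi_lower_bound` are stand-in names, and the "generator" and "recognition network" are fixed analytic functions rather than learned networks.

```python
import math
import random

def generate(c, s):
    """Toy GAN generator: its input is the disentangled code c
    (assumed distilled from a frozen VAE encoder) concatenated with
    the nuisance variable s, i.e. z = [c ; s]."""
    z = c + s  # list concatenation stands in for vector concat
    # A deterministic stand-in for the generated "image".
    return [math.tanh(v) for v in z]

def recognize(x, dim_c):
    """Toy recognition network Q(c|x): predicts a Gaussian mean for
    the code from the output (here it inverts the toy generator)."""
    return [math.atanh(max(min(v, 0.999), -0.999)) for v in x[:dim_c]]

def mi_lower_bound(c, c_hat, sigma=1.0):
    """Variational lower bound on I(c; G(c, s)) up to a constant:
    E[log Q(c | G(c, s))] for a Gaussian Q with fixed variance."""
    return sum(
        -0.5 * ((ci - mi) / sigma) ** 2
        - 0.5 * math.log(2 * math.pi * sigma ** 2)
        for ci, mi in zip(c, c_hat)
    )

random.seed(0)
c = [random.uniform(-0.5, 0.5) for _ in range(4)]  # disentangled factors
s = [random.gauss(0, 1) for _ in range(8)]         # nuisance variable
x = generate(c, s)
c_hat = recognize(x, dim_c=len(c))
# In training, the generator's objective would be the usual GAN loss
# plus lambda * this bound, tying the output back to the code c.
bound = mi_lower_bound(c, c_hat)
```

Because the toy recognition network inverts the toy generator exactly, the bound here reaches its maximum (the negative Gaussian entropy term); in the actual method both networks are learned, and maximizing this bound is what keeps the GAN generator aligned with the factors captured by the VAE.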