Authors: Lv, Junliang; Jiang, Haiyong; Xiao, Jun
Editors: Bittner, Jiří; Waldner, Manuela
Title: Unsupervised Learning of Disentangled 3D Representation from a Single Image
Date: 2021-04-09
Year: 2021
ISBN: 978-3-03868-134-2
ISSN: 1017-4656
DOI: https://doi.org/10.2312/egp.20211030
Handle: https://diglib.eg.org:443/handle/10.2312/egp20211030
Pages: 11-12
Keywords: Computing methodologies; Image representations; Reconstruction; Mesh models

Abstract: Learning the 3D representation of a single image is challenging given the ambiguity, occlusion, and perspective projection of the depicted object. Previous works either rely on image annotations or 3D supervision to learn meaningful factors of an object, or employ a StyleGAN-like framework for image synthesis. While the former depends on tedious annotation and even dense geometry ground truth, the latter usually cannot guarantee shape consistency between images rendered from different views. In this paper, we combine the advantages of both frameworks and propose an image disentanglement method based on a 3D representation. Results show that our method enables unsupervised 3D representation learning while preserving consistency between images.
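The abstract gives no architectural detail, so the following is only a rough illustration of the class of methods it describes: a minimal PyTorch sketch of an unsupervised autoencoder that disentangles a single image into depth, albedo, and lighting factors and reconstructs it through a differentiable Lambertian shading step, so a plain reconstruction loss supervises all factors without annotations. Every module, shape, and the shading model here are assumptions for illustration, not the paper's method; viewpoint handling and the StyleGAN-like synthesis branch are omitted.

import torch
import torch.nn as nn
import torch.nn.functional as F

class Disentangler(nn.Module):
    def __init__(self):
        super().__init__()
        # Shared conv encoder; one head per hypothetical latent factor.
        self.enc = nn.Sequential(
            nn.Conv2d(3, 32, 4, 2, 1), nn.ReLU(),
            nn.Conv2d(32, 64, 4, 2, 1), nn.ReLU(),
        )
        self.depth_head = nn.Sequential(
            nn.ConvTranspose2d(64, 32, 4, 2, 1), nn.ReLU(),
            nn.ConvTranspose2d(32, 1, 4, 2, 1), nn.Sigmoid(),   # depth in (0, 1)
        )
        self.albedo_head = nn.Sequential(
            nn.ConvTranspose2d(64, 32, 4, 2, 1), nn.ReLU(),
            nn.ConvTranspose2d(32, 3, 4, 2, 1), nn.Sigmoid(),   # RGB albedo
        )
        self.light_head = nn.Sequential(                          # [ambient, light xyz]
            nn.AdaptiveAvgPool2d(1), nn.Flatten(), nn.Linear(64, 4),
        )

    def forward(self, img):
        h = self.enc(img)
        return self.depth_head(h), self.albedo_head(h), self.light_head(h)

def normals_from_depth(depth):
    # Finite-difference surface normals from the predicted depth map.
    dzdx = F.pad(depth[..., :, 1:] - depth[..., :, :-1], (0, 1, 0, 0))
    dzdy = F.pad(depth[..., 1:, :] - depth[..., :-1, :], (0, 0, 0, 1))
    n = torch.cat([-dzdx, -dzdy, torch.ones_like(depth)], dim=1)
    return F.normalize(n, dim=1)

def shade(albedo, depth, light):
    # Lambertian shading: albedo * (ambient + diffuse * max(0, n . l)).
    ambient = light[:, :1].sigmoid().view(-1, 1, 1, 1)
    l_dir = F.normalize(light[:, 1:], dim=1).view(-1, 3, 1, 1)
    diffuse = (normals_from_depth(depth) * l_dir).sum(dim=1, keepdim=True).clamp(min=0)
    return albedo * (ambient + (1 - ambient) * diffuse)

# One training step: the reconstruction loss ties all factors to the input.
model = Disentangler()
img = torch.rand(2, 3, 64, 64)
depth, albedo, light = model(img)
recon = shade(albedo, depth, light)
loss = F.l1_loss(recon, img)
loss.backward()

Because the image is reconstructed through an explicit 3D-aware step (depth to normals to shading) rather than a free-form generator, the recovered shape factor is shared across renderings, which is one plausible way such a method could preserve consistency between views.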