Learning to Generate and Manipulate 3D Radiance Field by a Hierarchical Diffusion Framework with CLIP Latent

dc.contributor.author: Wang, Jiaxu (en_US)
dc.contributor.author: Zhang, Ziyi (en_US)
dc.contributor.author: Xu, Renjing (en_US)
dc.contributor.editor: Chaine, Raphaëlle (en_US)
dc.contributor.editor: Deng, Zhigang (en_US)
dc.contributor.editor: Kim, Min H. (en_US)
dc.date.accessioned: 2023-10-09T07:34:01Z
dc.date.available: 2023-10-09T07:34:01Z
dc.date.issued: 2023
dc.description.abstract: 3D-aware generative adversarial networks (GANs) are widely adopted for generating and editing neural radiance fields (NeRF). However, these methods still suffer from GAN-related issues, including degraded diversity and training instability. Moreover, 3D-aware GANs treat the NeRF pipeline as a regularizer and do not operate directly on 3D assets, leading to imperfect 3D consistency. In addition, independent changes in disentangled editing cannot be ensured, because the generators share some shallow hidden features. To address these challenges, we propose the first purely diffusion-based three-stage framework for generative and editing tasks, with a series of well-designed loss functions that can directly handle 3D models. We also present a generalizable neural point field as our 3D representation, which explicitly disentangles geometry and appearance in feature space and simplifies the dataset preparation pipeline for 3D data conversion. Assisted by this representation, our diffusion model can separately manipulate shape and appearance in a hierarchical manner using image or text prompts provided by the CLIP encoder. Moreover, it can generate new samples by adding a simple generative head. Experiments show that our approach outperforms state-of-the-art work on the generative tasks of direct 3D representation generation and novel image synthesis, and completely disentangles the manipulation of shape and appearance with correct semantic correspondence in the editing tasks. (en_US)
dc.description.number: 7
dc.description.sectionheaders: Neural Rendering
dc.description.seriesinformation: Computer Graphics Forum
dc.description.volume: 42
dc.identifier.doi: 10.1111/cgf.14930
dc.identifier.issn: 1467-8659
dc.identifier.pages: 13 pages
dc.identifier.uri: https://doi.org/10.1111/cgf.14930
dc.identifier.uri: https://diglib.eg.org:443/handle/10.1111/cgf14930
dc.publisher: The Eurographics Association and John Wiley & Sons Ltd. (en_US)
dc.subject: CCS Concepts: Computing methodologies -> Shape modeling; Image manipulation
dc.subject: Computing methodologies
dc.subject: Shape modeling
dc.subject: Image manipulation
dc.title: Learning to Generate and Manipulate 3D Radiance Field by a Hierarchical Diffusion Framework with CLIP Latent (en_US)
Files
Original bundle
Name: v42i7_02_14930.pdf
Size: 3.27 MB
Format: Adobe Portable Document Format