A Controllable Appearance Representation for Flexible Transfer and Editing

Authors: Jimenez-Navarro, Santiago; Guerrero-Viu, Julia; Masia, Belen; Wang, Beibei; Wilkie, Alexander
Date: 2025-06-20 (issued 2025)
ISBN: 978-3-03868-292-9
ISSN: 1727-3463
DOI: https://doi.org/10.2312/sr.20251187
Handle: https://diglib.eg.org/handle/10.2312/sr20251187
Pages: 13
License: Attribution 4.0 International License
CCS Concepts: Computing methodologies -> Appearance and texture representations
Keywords: Latent representations; material appearance; self-supervised learning

Abstract: We present a method that computes an interpretable representation of material appearance within a highly compact, disentangled latent space. This representation is learned in a self-supervised fashion using a VAE-based model. We train our model with a carefully designed unlabeled dataset, avoiding possible biases induced by human-generated labels. Our model demonstrates strong disentanglement and interpretability by effectively encoding material appearance and illumination, despite the absence of explicit supervision. To showcase the capabilities of such a representation, we leverage it for two proof-of-concept applications: image-based appearance transfer and editing. Our representation is used to condition a diffusion pipeline that transfers the appearance of one or more images onto a target geometry, and allows the user to further edit the resulting appearance. This approach offers fine-grained control over the generated results: thanks to the well-structured compact latent space, users can intuitively manipulate attributes such as hue or glossiness in image space to achieve the desired final appearance.
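The record above describes a VAE-based model with a compact, disentangled latent space whose individual dimensions can be edited (e.g. hue or glossiness), but it gives no architectural details. The following is a minimal illustrative sketch of that general idea only, not the authors' method: the class name, layer sizes, latent dimensionality (8), and the choice of which latent axis to edit are all hypothetical.

# Minimal sketch (assumptions throughout): a small convolutional VAE that
# maps a 64x64 RGB image to a compact latent code, plus an example of
# editing a single latent dimension and decoding the result.
import torch
import torch.nn as nn

class CompactVAE(nn.Module):
    def __init__(self, latent_dim: int = 8):
        super().__init__()
        # Encoder: 64x64 RGB image -> flattened feature vector
        self.encoder = nn.Sequential(
            nn.Conv2d(3, 32, 4, stride=2, padding=1), nn.ReLU(),    # 32x32
            nn.Conv2d(32, 64, 4, stride=2, padding=1), nn.ReLU(),   # 16x16
            nn.Conv2d(64, 128, 4, stride=2, padding=1), nn.ReLU(),  # 8x8
            nn.Flatten(),
        )
        self.fc_mu = nn.Linear(128 * 8 * 8, latent_dim)
        self.fc_logvar = nn.Linear(128 * 8 * 8, latent_dim)
        # Decoder: compact latent code -> reconstructed image
        self.fc_dec = nn.Linear(latent_dim, 128 * 8 * 8)
        self.decoder = nn.Sequential(
            nn.ConvTranspose2d(128, 64, 4, stride=2, padding=1), nn.ReLU(),
            nn.ConvTranspose2d(64, 32, 4, stride=2, padding=1), nn.ReLU(),
            nn.ConvTranspose2d(32, 3, 4, stride=2, padding=1), nn.Sigmoid(),
        )

    def encode(self, x):
        h = self.encoder(x)
        return self.fc_mu(h), self.fc_logvar(h)

    def decode(self, z):
        h = self.fc_dec(z).view(-1, 128, 8, 8)
        return self.decoder(h)

    def forward(self, x):
        mu, logvar = self.encode(x)
        std = torch.exp(0.5 * logvar)
        z = mu + std * torch.randn_like(std)  # reparameterization trick
        return self.decode(z), mu, logvar

if __name__ == "__main__":
    vae = CompactVAE(latent_dim=8)
    img = torch.rand(1, 3, 64, 64)   # placeholder input image
    mu, _ = vae.encode(img)
    edited = mu.clone()
    edited[0, 3] += 1.0              # nudge one (hypothetical) attribute axis
    out = vae.decode(edited)
    print(out.shape)                 # torch.Size([1, 3, 64, 64])

In this sketch an edit is a simple offset along one latent dimension; the disentanglement and interpretability claimed in the abstract are what would make such per-dimension edits meaningful, and the conditioning of a diffusion pipeline on the latent code is not shown here.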