Search Results
Item Learning Dynamic 3D Geometry and Texture for Video Face Swapping (The Eurographics Association and John Wiley & Sons Ltd., 2022) Otto, Christopher; Naruniec, Jacek; Helminger, Leonhard; Etterlin, Thomas; Mignone, Graziana; Chandran, Prashanth; Zoss, Gaspard; Schroers, Christopher; Gross, Markus; Gotardo, Paulo; Bradley, Derek; Weber, Romann; Umetani, Nobuyuki; Wojtan, Chris; Vouga, Etienne
Face swapping is the process of applying a source actor's appearance to a target actor's performance in a video. This is a challenging visual effect that has seen increasing demand in film and television production. Recent work has shown that data-driven methods based on deep learning can produce compelling effects at production quality in a fraction of the time required for a traditional 3D pipeline. However, the dominant approach operates only on 2D imagery without reference to the underlying facial geometry or texture, resulting in poor generalization under novel viewpoints and little artistic control. Methods that do incorporate geometry rely on pre-learned facial priors that do not adapt well to particular geometric features of the source and target faces. We approach the problem of face swapping from the perspective of learning simultaneous convolutional facial autoencoders for the source and target identities, using a shared encoder network with identity-specific decoders. The key novelty in our approach is that each decoder first lifts the latent code into a 3D representation, comprising a dynamic face texture and a deformable 3D face shape, before projecting this 3D face back onto the input image using a differentiable renderer. The coupled autoencoders are trained only on videos of the source and target identities, without requiring 3D supervision.
By leveraging the learned 3D geometry and texture, our method achieves face swapping with higher quality than when using off-the-shelf monocular 3D face reconstruction, and overall lower FID score than state-of-the-art 2D methods. Furthermore, our 3D representation allows for efficient artistic control over the result, which can be hard to achieve with existing 2D approaches.

Item Deep Reconstruction of 3D Smoke Densities from Artist Sketches (The Eurographics Association and John Wiley & Sons Ltd., 2022) Kim, Byungsoo; Huang, Xingchang; Wuelfroth, Laura; Tang, Jingwei; Cordonnier, Guillaume; Gross, Markus; Solenthaler, Barbara; Chaine, Raphaëlle; Kim, Min H.
Creative processes of artists often start with hand-drawn sketches illustrating an object. Pre-visualizing these keyframes is especially challenging when applied to volumetric materials such as smoke. The authored 3D density volumes must capture realistic flow details and turbulent structures, which is highly non-trivial and remains a manual and time-consuming process. We therefore present a method to compute a 3D smoke density field directly from 2D artist sketches, bridging the gap between early-stage prototyping of smoke keyframes and pre-visualization. From the sketch inputs, we compute an initial volume estimate and optimize the density iteratively with an updater CNN. Our differentiable sketcher is embedded into the end-to-end training, which results in robust reconstructions. Our training data set and sketch augmentation strategy are designed to enable general applicability. We evaluate the method on synthetic inputs and sketches from artists depicting both realistic smoke volumes and highly non-physical smoke shapes.
The high computational performance and robustness of our method at test time allow interactive authoring sessions of volumetric density fields for rapid prototyping of ideas by novice users.

Item Facial Animation with Disentangled Identity and Motion using Transformers (The Eurographics Association and John Wiley & Sons Ltd., 2022) Chandran, Prashanth; Zoss, Gaspard; Gross, Markus; Gotardo, Paulo; Bradley, Derek; Dominik L. Michels; Soeren Pirk
We propose a 3D+time framework for modeling dynamic sequences of 3D facial shapes, representing realistic non-rigid motion during a performance. Our work extends neural 3D morphable models by learning a motion manifold using a transformer architecture. More specifically, we derive a novel transformer-based autoencoder that can model and synthesize 3D geometry sequences of arbitrary length. This transformer naturally determines frame-to-frame correlations required to represent the motion manifold, via the internal self-attention mechanism. Furthermore, our method disentangles the constant facial identity from the time-varying facial expressions in a performance, using two separate codes to represent neutral identity and the performance itself within separate latent subspaces. Thus, the model represents identity-agnostic performances that can be paired with an arbitrary new identity code and fed through our new identity-modulated performance decoder; the result is a sequence of 3D meshes for the performance with the desired identity and temporal length.
We demonstrate how our disentangled motion model has natural applications in performance synthesis, performance retargeting, key-frame interpolation and completion of missing data, performance denoising and retiming, and other potential applications that include full 3D body modeling.

Item Differentiable Simulation for Outcome-Driven Orthognathic Surgery Planning (The Eurographics Association and John Wiley & Sons Ltd., 2022) Dorda, Daniel; Peter, Daniel; Borer, Dominik; Huber, Niko Benjamin; Sailer, Irena; Gross, Markus; Solenthaler, Barbara; Thomaszewski, Bernhard; Dominik L. Michels; Soeren Pirk
Algorithms at the intersection of computer graphics and medicine have recently gained renewed attention. Of particular interest are methods for virtual surgery planning (VSP), where treatment parameters must be carefully chosen to achieve a desired treatment outcome. FEM simulators can verify the treatment parameters by comparing a predicted outcome to the desired one. However, estimating the optimal parameters amounts to solving a challenging inverse problem. In current clinical practice it is solved manually by surgeons, who rely on their experience and intuition to iteratively refine the parameters, verifying them with simulated predictions. We prototype a differentiable FEM simulator and explore how it can enhance and simplify treatment planning, which is ultimately necessary to integrate simulation-based VSP tools into a clinical workflow. Specifically, we define a parametric treatment model based on surgeon input, and with analytically derived simulation gradients we optimise it against an objective defined on the visible facial 3D surface. By using sensitivity analysis, we can easily explore the solution-space with first-order approximations, which allow the surgeon to interactively visualise the effect of parameter variations on a given treatment plan.
The objective function allows landmarks to be freely chosen, accommodating the multiple methodologies in clinical planning. We show that even with a very sparse set of guiding landmarks, our simulator robustly converges to a feasible post-treatment shape.

Item Shape Transformers: Topology-Independent 3D Shape Models Using Transformers (The Eurographics Association and John Wiley & Sons Ltd., 2022) Chandran, Prashanth; Zoss, Gaspard; Gross, Markus; Gotardo, Paulo; Bradley, Derek; Chaine, Raphaëlle; Kim, Min H.
Parametric 3D shape models are heavily utilized in computer graphics and vision applications to provide priors on the observed variability of an object's geometry (e.g., for faces). Original models were linear and operated on the entire shape at once. They were later enhanced to provide localized control on different shape parts separately. In deep shape models, nonlinearity was introduced via a sequence of fully-connected layers and activation functions, and locality was introduced in recent models that use mesh convolution networks. As common limitations, these models often dictate, in one way or another, the allowed extent of spatial correlations and also require that a fixed mesh topology be specified ahead of time. To overcome these limitations, we present Shape Transformers, a new nonlinear parametric 3D shape model based on transformer architectures. A key benefit of this new model comes from using the transformer's self-attention mechanism to automatically learn nonlinear spatial correlations for a class of 3D shapes. This is in contrast to global models that correlate everything and local models that dictate the correlation extent. Our transformer 3D shape autoencoder is a better alternative to mesh convolution models, which require specially-crafted convolution, and down/up-sampling operators that can be difficult to design.
Our model is also topologically independent: it can be trained once and then evaluated on any mesh topology, unlike most previous methods. We demonstrate the application of our model to different datasets, including 3D faces, 3D hand shapes and full human bodies. Our experiments demonstrate the strong potential of our Shape Transformer model in several applications in computer graphics and vision.

Item Automatic Feature Selection for Denoising Volumetric Renderings (The Eurographics Association and John Wiley & Sons Ltd., 2022) Zhang, Xianyao; Ott, Melvin; Manzi, Marco; Gross, Markus; Papas, Marios; Ghosh, Abhijeet; Wei, Li-Yi
We propose a method for constructing feature sets that significantly improve the quality of neural denoisers for Monte Carlo renderings with volumetric content. Starting from a large set of hand-crafted features, we propose a feature selection process to identify significantly pruned near-optimal subsets. While a naive approach would require training and testing a separate denoiser for every possible feature combination, our selection process requires training of only a single probe denoiser for the selection task. Moreover, our approximate solution has an asymptotic complexity that is quadratic in the number of features compared to the exponential complexity of the naive approach, while also producing near-optimal solutions. We demonstrate the usefulness of our approach on various state-of-the-art denoising methods for volumetric content. We observe improvements in denoising quality when using our automatically selected feature sets over the hand-crafted sets proposed by the original methods.
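
The contrast between exponential and quadratic cost in the last abstract can be illustrated with a small sketch. It uses generic greedy forward selection, which is not necessarily the authors' procedure. The feature names, the `UTILITY` table, and `toy_score` are hypothetical stand-ins for whatever a trained probe denoiser would report.

```python
from typing import Callable, FrozenSet, List, Sequence, Tuple


def exhaustive_count(n: int) -> int:
    """Non-empty subsets of n features: what a naive search would have to score."""
    return 2 ** n - 1


def greedy_select(
    features: Sequence[str],
    score: Callable[[FrozenSet[str]], float],
    k: int,
) -> Tuple[List[str], int]:
    """Generic greedy forward selection; returns (chosen features, score calls)."""
    selected: List[str] = []
    remaining = list(features)
    calls = 0
    while remaining and len(selected) < k:
        best, best_score = remaining[0], float("-inf")
        # Try adding each remaining feature to the current selection.
        for f in remaining:
            s = score(frozenset(selected + [f]))
            calls += 1
            if s > best_score:
                best, best_score = f, s
        selected.append(best)
        remaining.remove(best)
    return selected, calls


# Hypothetical per-feature utilities; a real score would come from a denoiser.
UTILITY = {"albedo": 0.5, "depth": 0.3, "normal": 0.2, "density": 0.9}


def toy_score(subset: FrozenSet[str]) -> float:
    """Assumed additive score, used only to make the sketch runnable."""
    return sum(UTILITY[f] for f in subset)


if __name__ == "__main__":
    chosen, calls = greedy_select(list(UTILITY), toy_score, k=2)
    print(chosen, calls, exhaustive_count(len(UTILITY)))
```

With four features, choosing two greedily takes 4 + 3 = 7 score calls, and ranking all four takes 10, that is n(n+1)/2, whereas an exhaustive search would score 15 subsets. The gap widens quickly as the feature count grows.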