Search Results
Now showing 1 - 3 of 3
Item
MoNeRF: Deformable Neural Rendering for Talking Heads via Latent Motion Navigation (Eurographics - The European Association for Computer Graphics and John Wiley & Sons Ltd., 2024)
Li, X.; Ding, Y.; Li, R.; Tang, Z.; Li, K.
Novel view synthesis for talking heads is challenging because of the complex and diverse motion transformations involved. Conventional methods often rely on structure priors, such as facial templates, to warp observed images into a canonical space that is convenient for rendering. Such priors introduce a trade-off: while they aid synthesis, they also increase model complexity and limit generalizability to other deformable scenes. Departing from this paradigm, we introduce MoNeRF, a motion-conditioned neural radiance field that models talking heads through latent motion navigation. At the core of MoNeRF is a compact set of latent codes representing orthogonal motion directions; intricate scene motion is captured efficiently by linearly combining these codes. Beyond synthesis, MoNeRF supports motion control through latent code adjustments, view transfer based on reference videos, and extension to human bodies without structural modifications. Quantitative and qualitative experiments demonstrate that MoNeRF outperforms state-of-the-art methods in talking head synthesis. We will release the source code upon publication.
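As a rough illustration of the latent motion navigation described above, here is a minimal PyTorch sketch (not the authors' code) of a radiance-field MLP conditioned on a motion code formed by linearly combining a small bank of learned latent directions; the class name, dimensions, and the handling of orthogonality are assumptions made for illustration.

import torch
import torch.nn as nn

class MotionConditionedNeRF(nn.Module):
    def __init__(self, num_dirs=8, code_dim=32, pos_dim=63, hidden=256):
        super().__init__()
        # Compact bank of latent motion directions; in practice an extra
        # regularizer would be needed to keep them orthogonal (assumed here).
        self.motion_dirs = nn.Parameter(torch.randn(num_dirs, code_dim))
        self.mlp = nn.Sequential(
            nn.Linear(pos_dim + code_dim, hidden), nn.ReLU(),
            nn.Linear(hidden, hidden), nn.ReLU(),
            nn.Linear(hidden, 4),  # RGB + density
        )

    def forward(self, x_encoded, coeffs):
        # coeffs: per-frame coefficients (num_dirs,) that navigate the latent
        # motion space; the motion code is their linear combination.
        motion_code = coeffs @ self.motion_dirs              # (code_dim,)
        motion_code = motion_code.expand(x_encoded.shape[0], -1)
        return self.mlp(torch.cat([x_encoded, motion_code], dim=-1))

# Hypothetical usage: adjusting one coefficient moves along one motion direction.
model = MotionConditionedNeRF()
pts = torch.randn(1024, 63)              # positionally encoded sample points
coeffs = torch.zeros(8); coeffs[2] = 1.5
rgb_and_density = model(pts, coeffs)     # (1024, 4)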
Item
Deep-Learning-Based Facial Retargeting Using Local Patches (Eurographics - The European Association for Computer Graphics and John Wiley & Sons Ltd., 2024)
Choi, Yeonsoo; Lee, Inyup; Cha, Sihun; Kim, Seonghyeon; Jung, Sunjin; Noh, Junyong
In the era of digital animation, the quest to produce lifelike facial animation for virtual characters has led to the development of various retargeting methods. While retargeting facial motion between models of similar shapes has been very successful, challenges arise when the target is a stylized or exaggerated 3D character that deviates significantly from human facial structure. In this scenario, the target character's facial structure and possible range of motion must be considered so that the semantics of the original facial motions are preserved after retargeting. To achieve this, we propose a local patch-based retargeting method that transfers facial animation captured in a source performance video to a target stylized 3D character. Our method consists of three modules. The Automatic Patch Extraction Module extracts local patches from each source video frame. These patches are processed by the Reenactment Module to generate correspondingly re-enacted target local patches. The Weight Estimation Module then calculates the animation parameters for the target character at every frame, producing a complete facial animation sequence. Extensive experiments demonstrate that our method successfully transfers the semantic meaning of source facial expressions to stylized characters with considerable variation in facial feature proportions.

Item
Conditional Font Generation With Content Pre-Train and Style Filter (Eurographics - The European Association for Computer Graphics and John Wiley & Sons Ltd., 2024)
Hong, Yang; Li, Yinfei; Qiao, Xiaojun; Zhang, Junsong
Automatic font generation aims to streamline the design process by creating new fonts from minimal style references, greatly reducing the manual labour and cost of traditional font design. Image-to-image translation has been the dominant approach, transforming font images from a source style to a target style using a few reference images. However, this framework struggles to fully decouple content from style, particularly under significant style shifts. It nevertheless remains prevalent because conditional generative models face two main challenges: (1) they cannot handle unseen characters, and (2) they struggle to provide content representations as precise as the source font. Our approach tackles these issues by leveraging recent advances in Chinese character representation research to pre-train a robust content representation model. This model not only handles unseen characters but also generalizes to non-existent ones, a capability absent in traditional image-to-image translation. We further propose a Transformer-based Style Filter that accurately captures stylistic features from reference images and handles any combination of them, making automated font generation more practical. Additionally, we combine a content loss with the commonly used pixel- and perceptual-level losses to refine the generated results from a comprehensive perspective. Extensive experiments validate the effectiveness of our method, particularly its ability to handle unseen characters, and demonstrate significant performance gains over existing state-of-the-art methods.
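As a rough illustration of how a Transformer-based style filter can accept an arbitrary combination of reference images, here is a minimal PyTorch sketch (an assumption for illustration, not the paper's module) in which a learned query attends over per-reference style features, so the number of references can vary freely.

import torch
import torch.nn as nn

class StyleFilter(nn.Module):
    def __init__(self, dim=256, heads=8):
        super().__init__()
        # A single learned query aggregates however many reference features
        # are supplied; attention weights decide each reference's contribution.
        self.style_query = nn.Parameter(torch.randn(1, 1, dim))
        self.attn = nn.MultiheadAttention(dim, heads, batch_first=True)

    def forward(self, ref_feats):
        # ref_feats: (batch, num_refs, dim) features of the style references;
        # num_refs may differ between calls.
        q = self.style_query.expand(ref_feats.shape[0], -1, -1)
        style, _ = self.attn(q, ref_feats, ref_feats)
        return style.squeeze(1)  # (batch, dim) aggregated style code

# Hypothetical usage with three reference glyphs encoded to 256-d features.
filt = StyleFilter()
refs = torch.randn(2, 3, 256)
style_code = filt(refs)          # (2, 256)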