Search Results
Now showing 1 - 10 of 43
Item G-Style: Stylized Gaussian Splatting (The Eurographics Association and John Wiley & Sons Ltd., 2024)
Kovács, Áron Samuel; Hermosilla, Pedro; Raidou, Renata Georgia; Chen, Renjie; Ritschel, Tobias; Whiting, Emily
We introduce G-Style, a novel algorithm designed to transfer the style of an image onto a 3D scene represented using Gaussian Splatting. Gaussian Splatting is a powerful 3D representation for novel view synthesis: compared to other approaches based on Neural Radiance Fields, it provides fast scene renderings and user control over the scene. Recent pre-prints have demonstrated that the style of Gaussian Splatting scenes can be modified using an image exemplar. However, since the scene geometry remains fixed during the stylization process, current solutions fall short of producing satisfactory results. Our algorithm aims to address these limitations by following a three-step process: In a pre-processing step, we remove undesirable Gaussians with large projection areas or highly elongated shapes. Subsequently, we combine several losses carefully designed to preserve different scales of the style in the image, while maintaining as much as possible the integrity of the original scene content. During the stylization process and following the original design of Gaussian Splatting, we split Gaussians where additional detail is necessary within our scene by tracking the gradient of the stylized color. Our experiments demonstrate that G-Style generates high-quality stylizations within just a few minutes, outperforming existing methods both qualitatively and quantitatively.

Item Neural SSS: Lightweight Object Appearance Representation (The Eurographics Association and John Wiley & Sons Ltd., 2024)
Tg, Thomson; Tran, Duc Minh; Jensen, Henrik W.; Ramamoorthi, Ravi; Frisvad, Jeppe Revall; Garces, Elena; Haines, Eric
We present a method for capturing the BSSRDF (bidirectional scattering-surface reflectance distribution function) of arbitrary geometry with a neural network. We demonstrate how a compact neural network can represent the full 8-dimensional light transport within an object, including heterogeneous scattering. We develop an efficient rendering method using importance sampling that is able to render complex translucent objects under arbitrary lighting. Our method can also leverage the common planar half-space assumption, which allows it to represent one BSSRDF model that can be used across a variety of geometries. Our results demonstrate that we can render heterogeneous translucent objects under arbitrary lighting and obtain results that match the reference rendered using volumetric path tracing.
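The Neural SSS abstract above describes replacing an 8-dimensional BSSRDF lookup with a compact network. As a minimal sketch of that idea in Python/PyTorch (the input parameterization, layer sizes, and Softplus output are illustrative assumptions, not the paper's architecture):

import torch
import torch.nn as nn

class NeuralBSSRDF(nn.Module):
    # Compact MLP standing in for an 8-D BSSRDF query:
    # (entry point, entry direction, exit point, exit direction) -> RGB transport.
    def __init__(self, hidden=64):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(8, hidden), nn.ReLU(),
            nn.Linear(hidden, hidden), nn.ReLU(),
            nn.Linear(hidden, 3), nn.Softplus(),  # transport values are non-negative
        )

    def forward(self, x):  # x: (N, 8) batched queries
        return self.net(x)

model = NeuralBSSRDF()
queries = torch.rand(1024, 8)   # e.g. 2-D entry point, 2-D entry dir, 2-D exit point, 2-D exit dir
rgb = model(queries)            # (1024, 3) estimated subsurface transport

A renderer would train such a network against reference measurements and importance-sample it at shading time; the paper's actual sampling scheme is not reproduced here.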
Item Robust Diffusion-based Motion In-betweening (The Eurographics Association and John Wiley & Sons Ltd., 2024)
Qin, Jia; Yan, Peng; An, Bo; Chen, Renjie; Ritschel, Tobias; Whiting, Emily
The emergence of learning-based motion in-betweening techniques offers animators a more efficient way to animate characters. However, existing non-generative methods either struggle to support long transition generation or produce results that lack diversity. Meanwhile, diffusion models have shown promising results in synthesizing diverse and high-quality motions driven by text and keyframes. However, in these methods, keyframes often serve as a guide rather than a strict constraint and can sometimes be ignored when keyframes are sparse. To address these issues, we propose a lightweight yet effective diffusion-based motion in-betweening framework that generates animations conforming to keyframe constraints. We incorporate keyframe constraints into the training phase to enhance robustness in handling various constraint densities. Moreover, we employ relative positional encoding to improve the model's generalization on long-range in-betweening tasks. This approach enables the model to learn from short animations while generating realistic in-betweening motions spanning thousands of frames. We conduct extensive experiments to validate our framework using the newly proposed metrics K-FID, K-Diversity, and K-Error, designed to evaluate generative in-betweening methods. Results demonstrate that our method outperforms existing diffusion-based methods across various lengths and keyframe densities. We also show that our method can be applied to text-driven motion synthesis, offering fine-grained control over the generated results.

Item Real-time Neural Rendering of Dynamic Light Fields (The Eurographics Association and John Wiley & Sons Ltd., 2024)
Coomans, Arno; Dominici, Edoardo Alberto; Döring, Christian; Mueller, Joerg H.; Hladky, Jozef; Steinberger, Markus; Bermano, Amit H.; Kalogerakis, Evangelos
Synthesising high-quality views of dynamic scenes via path tracing is prohibitively expensive. Although caching offline-quality global illumination in neural networks alleviates this issue, existing neural view synthesis methods are limited to mainly static scenes, have low inference performance, or do not integrate well with existing rendering paradigms. We propose a novel neural method that is able to capture a dynamic light field, renders at real-time frame rates at 1920x1080 resolution, and integrates seamlessly with Monte Carlo ray tracing frameworks. We demonstrate how spatial, temporal, and a novel surface-space encoding are each effective at capturing different kinds of spatio-temporal signals. Together with a compact fully-fused neural network and architectural improvements, we achieve a twenty-fold increase in network inference speed compared to related methods at equal or better quality. Our approach is suitable for providing offline-quality real-time rendering in a variety of scenarios, such as free-viewpoint video, interactive multi-view rendering, or streaming rendering. Finally, our work can be integrated into other rendering paradigms, e.g., providing a dynamic background for interactive scenarios where the foreground is rendered with traditional methods.
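As a rough illustration of the kind of spatio-temporal input encoding the dynamic light field abstract describes (Python/PyTorch; the frequency counts, layer sizes, and the absence of the paper's surface-space encoding and fully-fused kernels are all assumptions for the sketch):

import torch
import torch.nn as nn

def freq_encode(x, n_freqs=4):
    # sin/cos frequency encoding, one octave per frequency band
    feats = [x]
    for i in range(n_freqs):
        feats += [torch.sin((2.0 ** i) * x), torch.cos((2.0 ** i) * x)]
    return torch.cat(feats, dim=-1)

pos = torch.rand(1024, 3)   # query position (spatial signal)
t = torch.rand(1024, 1)     # time within the animation (temporal signal)
feat = torch.cat([freq_encode(pos), freq_encode(t)], dim=-1)

mlp = nn.Sequential(nn.Linear(feat.shape[-1], 64), nn.ReLU(),
                    nn.Linear(64, 64), nn.ReLU(),
                    nn.Linear(64, 3))   # outgoing radiance (RGB)
radiance = mlp(feat)

The point of the sketch is only that differently encoded inputs are concatenated before a small network; the real system additionally conditions on surface-space coordinates and is optimized for GPU inference.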
Item SENS: Part-Aware Sketch-based Implicit Neural Shape Modeling (The Eurographics Association and John Wiley & Sons Ltd., 2024)
Binninger, Alexandre; Hertz, Amir; Sorkine-Hornung, Olga; Cohen-Or, Daniel; Giryes, Raja; Bermano, Amit H.; Kalogerakis, Evangelos
We present SENS, a novel method for generating and editing 3D models from hand-drawn sketches, including those of abstract nature. Our method allows users to quickly and easily sketch a shape, and then maps the sketch into the latent space of a part-aware neural implicit shape architecture. SENS analyzes the sketch and encodes its parts into a ViT patch encoding, subsequently feeding them into a transformer decoder that converts them to shape embeddings suitable for editing 3D neural implicit shapes. SENS provides intuitive sketch-based generation and editing, and also succeeds in capturing the intent of the user's sketch to generate a variety of novel and expressive 3D shapes, even from abstract and imprecise sketches. Additionally, SENS supports refinement via part reconstruction, allowing for nuanced adjustments and artifact removal. It also offers part-based modeling capabilities, enabling the combination of features from multiple sketches to create more complex and customized 3D shapes. We demonstrate the effectiveness of our model compared to the state-of-the-art using objective metric evaluation criteria and a user study, both indicating strong performance on sketches with a medium level of abstraction. Furthermore, we showcase our method's intuitive sketch-based shape editing capabilities, and validate it through a usability study.

Item Learning-based Self-Collision Avoidance in Retargeting using Body Part-specific Signed Distance Fields (The Eurographics Association, 2024)
Lee, Junwoo; Kim, Hoimin; Kwon, Taesoo; Chen, Renjie; Ritschel, Tobias; Whiting, Emily
Motion retargeting is a technique for applying the motion of one character to a new character. Differences in shapes and proportions between characters can cause self-collisions during the retargeting process. To address this issue, we propose a new collision resolution strategy comprising three key components: a collision detection module, a self-collision resolution model, and a training strategy for the collision resolution model. The collision detection module generates collision information based on changes in posture. The self-collision resolution model, which is based on a neural network, uses this collision information to resolve self-collisions. The proposed training strategy enhances the performance of the self-collision resolution model. Compared to previous studies, our self-collision resolution process demonstrates superior performance in terms of accuracy and generalization. Our model reduces the average penetration depth across the entire body by 56%, which is 28% better than the previous studies. Additionally, the minimum distance from the end-effectors to the skin averaged 2.65 cm, which is more than 0.8 cm smaller than in the previous studies. Furthermore, it takes an average of 7.9 ms to solve one frame, enabling online real-time self-collision resolution.
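To make the role of body part-specific signed distance fields in collision detection concrete, here is a minimal, hypothetical sketch in Python/NumPy (the sphere SDFs and part names are stand-ins for illustration, not the paper's representation):

import numpy as np

def sphere_sdf(points, center, radius):
    # signed distance to a sphere: negative inside, positive outside
    return np.linalg.norm(points - center, axis=-1) - radius

# toy "body parts" approximated by spheres (illustrative only)
part_sdfs = {
    "upper_arm": lambda p: sphere_sdf(p, np.array([0.0, 1.4, 0.1]), 0.12),
    "torso":     lambda p: sphere_sdf(p, np.array([0.0, 1.2, 0.0]), 0.25),
}

def detect_self_collisions(skin_vertices, owner_part):
    # Flag skin vertices that penetrate another body part's SDF.
    collisions = []
    for name, sdf in part_sdfs.items():
        if name == owner_part:
            continue  # ignore the part the vertices belong to
        d = sdf(skin_vertices)
        hit = d < 0.0
        if np.any(hit):
            collisions.append((name, skin_vertices[hit], d[hit]))  # penetrating vertices and depths
    return collisions

hand_vertices = np.random.rand(200, 3)
print(detect_self_collisions(hand_vertices, owner_part="upper_arm"))

In the paper's pipeline the collision information gathered this way feeds a learned resolution model; the sketch only shows the detection query itself.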
Item GazeMoDiff: Gaze-guided Diffusion Model for Stochastic Human Motion Prediction (The Eurographics Association, 2024)
Yan, Haodong; Hu, Zhiming; Schmitt, Syn; Bulling, Andreas; Chen, Renjie; Ritschel, Tobias; Whiting, Emily
Human motion prediction is important for many virtual and augmented reality (VR/AR) applications such as collision avoidance and realistic avatar generation. Existing methods have synthesised body motion only from observed past motion, despite the fact that human eye gaze is known to correlate strongly with body movements and is readily available in recent VR/AR headsets. We present GazeMoDiff, a novel gaze-guided denoising diffusion model to generate stochastic human motions. Our method first uses a gaze encoder and a motion encoder to extract the gaze and motion features respectively, then employs a graph attention network to fuse these features, and finally injects the gaze-motion features into a noise prediction network via a cross-attention mechanism to progressively generate multiple reasonable human motions in the future. Extensive experiments on the MoGaze and GIMO datasets demonstrate that our method outperforms the state-of-the-art methods by a large margin in terms of multi-modal final displacement error (17.3% on MoGaze and 13.3% on GIMO). We further conducted a human study (N=21) and validated that the motions generated by our method were perceived as both more precise and more realistic than those of prior methods. Taken together, these results reveal the significant information content available in eye gaze for stochastic human motion prediction as well as the effectiveness of our method in exploiting this information.

Item Neural Volumetric Level of Detail for Path Tracing (The Eurographics Association, 2024)
Stadter, Linda; Hofmann, Nikolai; Stamminger, Marc; Linsen, Lars; Thies, Justus
We introduce a neural level of detail pipeline for use in a GPU path tracer based on a sparse volumetric representation derived from neural radiance fields. We pre-compute lighting and occlusion to train a neural radiance field which faithfully captures appearance and shading via image-based optimization. By converting the resulting neural network into an efficiently rendered representation, we eliminate costly evaluations at runtime and keep performance competitive. When applying our representation to certain areas of the scene, we trade a slight bias from gradient-based optimization and lossy volumetric conversion for highly anti-aliased results at low sample counts. This enables virtually noise-free and temporally stable results at low computational cost and without any additional post-processing, such as denoising. We demonstrate the applicability of our method to both individual objects and a challenging outdoor scene composed of highly detailed foliage.

Item Practical Method to Estimate Fabric Mechanics from Metadata (The Eurographics Association and John Wiley & Sons Ltd., 2024)
Dominguez-Elvira, Henar; Nicás, Alicia; Cirio, Gabriel; Rodríguez, Alejandro; Garces, Elena; Bermano, Amit H.; Kalogerakis, Evangelos
Estimating fabric mechanical properties is crucial to create realistic digital twins. Existing methods typically require testing physical fabric samples with expensive devices or cumbersome capture setups. In this work, we propose a method to estimate fabric mechanics just from known manufacturer metadata such as the fabric family, the density, the composition, and the thickness. Further, to alleviate the need to know the fabric family, which might be ambiguous or unknown for non-specialists, we propose an end-to-end neural method that works with planar images of the textile as input. We evaluate our methods using extensive tests that include the industry-standard Cusick drape test and demonstrate that both of them produce drapes that strongly correlate with the ground truth estimates provided by lab equipment. Our method is the first to propose such a simple capture method for mechanical properties, outperforming other methods that require testing the fabric in specific setups.
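The metadata-to-mechanics idea above lends itself to a small regression sketch (Python/PyTorch; the input features, the one-hot family encoding, and the output parameters such as stretch, bend, and shear stiffness are illustrative assumptions, not the authors' trained model):

import torch
import torch.nn as nn

FAMILIES = ["knit", "woven", "denim", "silk"]   # hypothetical fabric families

def encode_metadata(family, density_gsm, cotton_frac, thickness_mm):
    one_hot = [1.0 if family == f else 0.0 for f in FAMILIES]
    scalars = [density_gsm / 500.0, cotton_frac, thickness_mm]   # crude normalization
    return torch.tensor(one_hot + scalars, dtype=torch.float32)

# regressor from metadata features to positive mechanical parameters
regressor = nn.Sequential(
    nn.Linear(len(FAMILIES) + 3, 32), nn.ReLU(),
    nn.Linear(32, 3), nn.Softplus(),
)

x = encode_metadata("woven", density_gsm=210.0, cotton_frac=0.8, thickness_mm=0.4)
stretch, bend, shear = regressor(x)
print(stretch.item(), bend.item(), shear.item())

A model like this would be fit against lab measurements (e.g., Cusick drape results) so that the predicted stiffness values reproduce observed drapes.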
Item Audio-Driven Speech Animation with Text-Guided Expression (The Eurographics Association, 2024)
Jung, Sunjin; Chun, Sewhan; Noh, Junyong; Chen, Renjie; Ritschel, Tobias; Whiting, Emily
We introduce a novel method for generating expressive speech animations of a 3D face, driven by both audio and text descriptions. Many previous approaches focused on generating facial expressions using pre-defined emotion categories. In contrast, our method is capable of generating facial expressions from text descriptions unseen during training, without limitations to specific emotion classes. Our system employs a two-stage approach. In the first stage, an auto-encoder is trained to disentangle content and expression features from facial animations. In the second stage, two transformer-based networks predict the content and expression features from audio and text inputs, respectively. These features are then passed to the decoder of the pre-trained auto-encoder, yielding the final expressive speech animation. By accommodating diverse forms of natural language, such as emotion words or detailed facial expression descriptions, our method offers an intuitive and versatile way to generate expressive speech animations. Extensive quantitative and qualitative evaluations, including a user study, demonstrate that our method can produce natural expressive speech animations that correspond to the input audio and text descriptions.
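A structural sketch of the two-stage composition described above, in Python/PyTorch: all dimensions are toy values, and the GRU and linear modules are stand-ins for the paper's transformer predictors and pre-trained auto-encoder decoder, not the actual networks.

import torch
import torch.nn as nn

AUDIO_DIM, TEXT_DIM, FEAT_DIM, VERT_DIM = 80, 512, 128, 5023 * 3  # assumed sizes

content_net = nn.GRU(AUDIO_DIM, FEAT_DIM, batch_first=True)   # audio -> per-frame content features
expression_net = nn.Linear(TEXT_DIM, FEAT_DIM)                 # text embedding -> expression feature
decoder = nn.Linear(2 * FEAT_DIM, VERT_DIM)                    # (content, expression) -> face vertex offsets

audio = torch.rand(1, 120, AUDIO_DIM)       # 120 frames of audio features
text_embedding = torch.rand(1, TEXT_DIM)    # e.g. from a sentence encoder

content, _ = content_net(audio)                                   # (1, 120, FEAT_DIM)
expression = expression_net(text_embedding)                       # (1, FEAT_DIM)
expression = expression.unsqueeze(1).expand(-1, content.shape[1], -1)
frames = decoder(torch.cat([content, expression], dim=-1))        # (1, 120, VERT_DIM)

The sketch only mirrors how a per-frame content stream and a single expression feature are combined and decoded into animation frames.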