44-Issue 2

Permanent URI for this collection

https://diglib.eg.org/handle/10.2312/3607133

Learning Image Fractals Using Chaotic Differentiable Point Splatting
(The Eurographics Association and John Wiley & Sons Ltd., 2025) Djeacoumar, Adarsh; Mujkanovic, Felix; Seidel, Hans-Peter; Leimkühler, Thomas; Bousseau, Adrien; Day, Angela
Fractal geometry, defined by self-similar patterns across scales, is crucial for understanding natural structures. This work addresses the fractal inverse problem, which involves extracting fractal codes from images to explain these patterns and synthesize them at arbitrary finer scales. We introduce a novel algorithm that optimizes Iterated Function System parameters using a custom fractal generator combined with differentiable point splatting. By integrating both stochastic and gradient-based optimization techniques, our approach effectively navigates the complex energy landscapes typical of fractal inversion, ensuring robust performance and the ability to escape local minima. We demonstrate the method's effectiveness through comparisons with various fractal inversion techniques, highlighting its ability to recover high-quality fractal codes and perform extensive zoom-ins to reveal intricate patterns from just a single image.
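For readers unfamiliar with Iterated Function Systems (IFS), the sketch below shows the classic chaos game, which samples the attractor of a fixed IFS by repeatedly applying randomly chosen affine maps. It is not the authors' method: the Sierpinski maps, the function name `chaos_game`, and the iteration counts are illustrative assumptions, and the paper's differentiable point splatting and optimization are not reproduced here.

```python
import random

# Each map is (a, b, c, d, e, f) for the affine transform
#   x' = a*x + b*y + e,   y' = c*x + d*y + f.
# These three contractions generate the Sierpinski triangle with vertices
# (0, 0), (1, 0), (0.5, 1). They are an illustrative choice, not a fractal
# code taken from the paper.
SIERPINSKI = [
    (0.5, 0.0, 0.0, 0.5, 0.0, 0.0),
    (0.5, 0.0, 0.0, 0.5, 0.5, 0.0),
    (0.5, 0.0, 0.0, 0.5, 0.25, 0.5),
]


def chaos_game(maps, n_points, burn_in=20, seed=0):
    """Sample n_points from the attractor of an IFS by random iteration."""
    rng = random.Random(seed)
    x, y = 0.0, 0.0
    points = []
    for i in range(n_points + burn_in):
        a, b, c, d, e, f = rng.choice(maps)
        x, y = a * x + b * y + e, c * x + d * y + f
        # Skip the first few iterates so that samples from an arbitrary
        # starting point have time to settle onto the attractor.
        if i >= burn_in:
            points.append((x, y))
    return points


points = chaos_game(SIERPINSKI, 10_000)
```

The inverse problem described in the abstract runs in the opposite direction: given an image of the attractor, recover map coefficients such as those in `SIERPINSKI`.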
Multi-Modal Instrument Performances (MMIP): A Musical Database
(The Eurographics Association and John Wiley & Sons Ltd., 2025) Kyriakou, Theodoros; Aristidou, Andreas; Charalambous, Panayiotis; Bousseau, Adrien; Day, Angela
Musical instrument performances are multimodal creative art forms that integrate audiovisual elements, resulting from musicians' interactions with instruments through body movements, finger actions, and facial expressions. Digitizing such performances for archiving, streaming, analysis, or synthesis requires capturing every element that shapes the overall experience, which is crucial for preserving the performance's essence. In this work, following current trends in large-scale dataset development for deep learning analysis and generative models, we introduce the Multi-Modal Instrument Performances (MMIP) database (https://mmip.cs.ucy.ac.cy). This is the first dataset to incorporate synchronized high-quality 3D motion capture data for the body, fingers, facial expressions, and instruments, along with audio, multi-angle videos, and MIDI data. The database currently includes 3.5 hours of performances featuring three instruments: guitar, piano, and drums. Additionally, we discuss the challenges of acquiring these multi-modal data, detailing our approach to data collection, signal synchronization, annotation, and metadata management. Our data formats align with industry standards for ease of use, and we have developed an open-access online repository that offers a user-friendly environment for data exploration, supporting data organization, search capabilities, and custom visualization tools. Notable features include a MIDI-to-instrument animation project for visualizing the instruments and a script for playing back FBX files with synchronized audio in a web environment.
Learning Metric Fields for Fast Low-Distortion Mesh Parameterizations
(The Eurographics Association and John Wiley & Sons Ltd., 2025) Fargion, Guy; Weber, Ofir; Bousseau, Adrien; Day, Angela
We present a fast and robust method for computing an injective parameterization with low isometric distortion for disk-like triangular meshes. Harmonic function-based methods, with their rich mathematical foundation, are widely used. Harmonic maps are particularly valuable for ensuring injectivity under certain boundary conditions. In addition, they offer computational efficiency by forming a linear subspace [FW22]. However, this restricted subspace often leads to significant isometric distortion, especially for highly curved surfaces. Conversely, methods that operate in the full space of piecewise linear maps [SPSH∗17] achieve lower isometric distortion, but at a higher computational cost. Aigerman et al. [AGK∗22] pioneered a parameterization method that uses deep neural networks to predict the Jacobians of the map at mesh triangles, and integrates them into an explicit map by solving a Poisson equation. However, this approach often results in significant Poisson reconstruction errors due to the inability to ensure the integrability of the predicted neural Jacobian field, leading to unbounded distortion and lack of local injectivity. We propose a hybrid method that combines the speed and robustness of harmonic maps with the generality of deep neural networks to produce injective maps with low isometric distortion much faster than state-of-the-art methods. The core concept is simple but powerful. Instead of learning Jacobian fields, we learn metric tensor fields over the input mesh, resulting in a customized Laplacian matrix that defines a harmonic map in a modified metric [WGS23]. Our approach ensures injectivity, offers great computational efficiency, and produces significantly lower isometric distortion compared to straightforward harmonic maps.
"Wild West" of Evaluating Speech-Driven 3D Facial Animation Synthesis: A Benchmark Study
(The Eurographics Association and John Wiley & Sons Ltd., 2025) Haque, Kazi Injamamul; Pavlou, Alkiviadis; Yumak, Zerrin; Bousseau, Adrien; Day, Angela
Research on audio-driven 3D facial animation has accelerated rapidly, with numerous papers published in a short span of time. This surge in research has garnered significant attention from both academia and industry owing to its potential applications in digital humans. Various approaches, both deterministic and non-deterministic, have been explored based on foundational advancements in deep learning algorithms. However, there remains no consensus among researchers on standardized methods for evaluating these techniques. Additionally, rather than converging on a common set of datasets and objective metrics suited for specific methods, recent works exhibit considerable variation in experimental setups. This inconsistency complicates the research landscape, making it difficult to establish a streamlined evaluation process and rendering many cross-paper comparisons challenging. Moreover, the common practice of A/B testing in perceptual studies focuses on only two common metrics and is not sufficient for non-deterministic and emotion-enabled approaches. The lack of correlation between subjective and objective metrics points to a need for critical analysis in this space. In this study, we address these issues by benchmarking state-of-the-art deterministic and non-deterministic models, utilizing a consistent experimental setup across a carefully curated set of objective metrics and datasets. We also conduct a perceptual user study to assess whether subjective perceptual metrics align with the objective metrics. Our findings indicate that model rankings do not necessarily generalize across datasets, and subjective metric ratings are not always consistent with their corresponding objective metrics. The supplementary video, edited code scripts for training on different datasets, and documentation related to this benchmark study are made publicly available at https://galib360.github.io/face-benchmark-project/.
4-LEGS: 4D Language Embedded Gaussian Splatting
(The Eurographics Association and John Wiley & Sons Ltd., 2025) Fiebelman, Gal; Cohen, Tamir; Morgenstern, Ayellet; Hedman, Peter; Averbuch-Elor, Hadar; Bousseau, Adrien; Day, Angela
The emergence of neural representations has revolutionized our means for digitally viewing a wide range of 3D scenes, enabling the synthesis of photorealistic images rendered from novel views. Recently, several techniques have been proposed for connecting these low-level representations with the high-level semantic understanding embodied within the scene. These methods elevate the rich semantic understanding from 2D imagery to 3D representations, distilling high-dimensional spatial features onto 3D space. In our work, we are interested in connecting language with a dynamic model of the world. We show how to lift spatio-temporal features to a 4D representation based on 3D Gaussian Splatting. This enables an interactive interface where the user can spatio-temporally localize events in the video from text prompts. We demonstrate our system on public 3D video datasets of people and animals performing various actions.
Corotational Hinge-based Thin Plates/Shells
(The Eurographics Association and John Wiley & Sons Ltd., 2025) Liang, Qixin; Bousseau, Adrien; Day, Angela
We present six thin plate/shell models, derived from three distinct types of curvature operators formulated within the corotational frame, for simulating both rest-flat and rest-curved triangular meshes. Each curvature operator yields a curvature expression corresponding to both a plate model and a shell model. The corotational edge-based hinge model uses an edge-based stencil to compute directional curvature, while the corotational FVM hinge model utilizes a triangle-centered stencil, applying the finite volume method (FVM) to superpose directional curvatures across edges, yielding a generalized curvature. The corotational smoothed hinge model also employs a triangle-centered stencil but transforms directional curvatures into a generalized curvature based on a quadratic surface fit. All models assume small strain and small curvature, leading to constant bending energy Hessians, which benefit implicit integrators. Through quantitative benchmarks and qualitative elastodynamic simulations with large time steps, we demonstrate the accuracy, efficiency, and stability of these models. Our contributions enhance the thin plate/shell library for use in both computer graphics and engineering applications.
EUROGRAPHICS 2025: CGF 44-2 Frontmatter
(The Eurographics Association and John Wiley & Sons Ltd., 2025) Dai, Angela; Bousseau, Adrien
Approximating Procedural Models of 3D Shapes with Neural Networks
(The Eurographics Association and John Wiley & Sons Ltd., 2025) Hossain, Ishtiaque; Shen, I-Chao; Kaick, Oliver van; Bousseau, Adrien; Day, Angela
Procedural modeling is a popular technique for 3D content creation and offers a number of advantages over alternative techniques for modeling 3D shapes. However, given a procedural model, predicting the procedural parameters of existing data provided in different modalities can be challenging. This is because the data may be in a different representation than the one generated by the procedural model, and procedural models are usually not invertible, nor are they differentiable. In this paper, we address these limitations and introduce an invertible and differentiable representation for procedural models. We approximate parameterized procedures with a neural network architecture NNProc that learns both the forward and inverse mapping of the procedural model by aligning the latent spaces of shape parameters and shapes. The network is trained in a manner that is agnostic to the inner workings of the procedural model, implying that models implemented in different languages or systems can be used. We demonstrate how the proposed representation can be used for both forward and inverse procedural modeling. Moreover, we show how NNProc can be used in conjunction with optimization for applications such as shape reconstruction from an image or a 3D Gaussian Splatting.
Cloth Animation with Time-dependent Persistent Wrinkles
(The Eurographics Association and John Wiley & Sons Ltd., 2025) Gong, Deshan; Yang, Yin; Shao, Tianjia; Wang, He; Bousseau, Adrien; Day, Angela
Persistent wrinkles are often observed on crumpled garments, e.g., the wrinkles around the knees after sitting for a while. Such wrinkles recover easily if the cloth is not deformed for long, and otherwise persist. Since they are vital to the visual realism of cloth animation, we aim to simulate realistic-looking persistent wrinkles. To this end, we present a physics-inspired fine-grained wrinkle model. Different from existing methods, we recognize the importance of the interplay between internal friction and plasticity during wrinkle formation. Furthermore, we model their time dependence for persistent wrinkles. Our model is capable of simulating not only realistic wrinkle patterns, but also their time-dependent changes according to how long the deformation is maintained. Through extensive experiments, we show that our model is effective in simulating realistic spatially and temporally varying wrinkles, versatile in simulating different materials, and capable of generating more fine-grained wrinkles than the state of the art.
CEDRL: Simulating Diverse Crowds with Example-Driven Deep Reinforcement Learning
(The Eurographics Association and John Wiley & Sons Ltd., 2025) Panayiotou, Andreas; Aristidou, Andreas; Charalambous, Panayiotis; Bousseau, Adrien; Day, Angela
The level of realism in virtual crowds is strongly affected by the presence of diverse crowd behaviors. In real life, we can observe various scenarios, ranging from pedestrians moving on a shopping street, people talking in static groups, or wandering around in a public park. Most of the existing systems optimize for specific behaviors such as goal-seeking and collision avoidance, neglecting to consider other complex behaviors that are usually challenging to capture or define. Departing from the conventional use of Supervised Learning, which requires vast amounts of labeled data and often lacks controllability, we introduce Crowds using Example-driven Deep Reinforcement Learning (CEDRL), a framework that simultaneously leverages multiple crowd datasets to model a broad spectrum of human behaviors. This approach enables agents to adaptively learn and exhibit diverse behaviors, enhancing their ability to generalize decisions across unseen states. The model can be applied to populate novel virtual environments while providing real-time controllability over the agents' behaviors. We achieve this through the design of a reward function aligned with real-world observations and by employing curriculum learning that gradually diminishes the agents' observation space. A complexity characterization metric defines each agent's high-level crowd behavior, linking it to the agent's state and serving as an input to the policy network. Additionally, a parametric reward function, influenced by the type of crowd task, facilitates the learning of a diverse and abstract behavior "skill" set. We evaluate our model on both training and unseen real-world data, comparing against other simulators, showing its ability to generalize across scenarios and accurately reflect the observed complexity of behaviors. We also examine our system's controllability by adjusting the complexity weight, discovering that higher values lead to more complex behaviors such as wandering, static interactions, and group dynamics like joining or leaving. Finally, we demonstrate our model's capabilities in novel synthetic scenarios.
ASMR: Adaptive Skeleton-Mesh Rigging and Skinning via 2D Generative Prior
(The Eurographics Association and John Wiley & Sons Ltd., 2025) Hong, Seokhyeon; Choi, Soojin; Kim, Chaelin; Cha, Sihun; Noh, Junyong; Bousseau, Adrien; Day, Angela
Despite the growing accessibility of skeletal motion data, integrating it for animating character meshes remains challenging due to diverse configurations of both skeletons and meshes. Specifically, the body scale and bone lengths of the skeleton should be adjusted in accordance with the size and proportions of the mesh, ensuring that all joints are accurately positioned within the character mesh. Furthermore, defining skinning weights is complicated by variations in skeletal configurations, such as the number of joints and their hierarchy, as well as differences in mesh configurations, including their connectivity and shapes. While existing approaches have made efforts to automate this process, they hardly address the variations in both skeletal and mesh configurations. In this paper, we present a novel method for the automatic rigging and skinning of character meshes using skeletal motion data, accommodating arbitrary configurations of both meshes and skeletons. The proposed method predicts the optimal skeleton aligned with the size and proportion of the mesh as well as defines skinning weights for various mesh-skeleton configurations, without requiring explicit supervision tailored to each of them. By incorporating Diffusion 3D Features (Diff3F) as semantic descriptors of character meshes, our method achieves robust generalization across different configurations. To assess the performance of our method in comparison to existing approaches, we conducted comprehensive evaluations encompassing both quantitative and qualitative analyses, specifically examining the predicted skeletons, skinning weights, and deformation quality.
A Multimodal Personality Prediction Framework based on Adaptive Graph Transformer Network and Multi-task Learning
(The Eurographics Association and John Wiley & Sons Ltd., 2025) Wang, Rongquan; Zhao, Xile; Xu, Xianyu; Hao, Yang; Bousseau, Adrien; Day, Angela
Multimodal personality analysis aims to accurately detect personality traits by incorporating related multimodal information. However, existing methods focus on unimodal features while overlooking the bimodal association features crucial for this interdisciplinary task. Therefore, we propose a multimodal personality prediction framework based on an adaptive graph transformer network and multi-task learning. Firstly, we utilize pre-trained models to learn specific representations from different modalities. Here, we employ pre-trained multimodal models' encoders as the backbones of the modality-specific extraction methods to mine unimodal features. Specifically, we introduce a novel adaptive graph transformer network to mine personality-related bimodal association features. This network effectively learns higher-order temporal dependencies based on relational graphs and emphasizes more significant features. Furthermore, we utilize a multimodal channel attention residual fusion module to obtain the fused features, and we propose a multimodal and unimodal joint learning regression head to learn and predict scores for personality traits. We design a multi-task loss function to enhance the robustness and accuracy of personality prediction. Experimental results on two benchmark datasets demonstrate the effectiveness of our framework, which outperforms the state-of-the-art methods. The code is available at https://github.com/RongquanWang/PPF-AGTNMTL.
VortexTransformer: End-to-End Objective Vortex Detection in 2D Unsteady Flow Using Transformers
(The Eurographics Association and John Wiley & Sons Ltd., 2025) Zhang, Xingdi; Rautek, Peter; Hadwiger, Markus; Bousseau, Adrien; Day, Angela
Vortex structures play a pivotal role in understanding complex fluid dynamics, yet defining them rigorously remains challenging. One hard criterion is that a vortex detector must be objective, i.e., it needs to be indifferent to reference frame transformations. We propose VortexTransformer, a novel deep learning approach using point transformer architectures to directly extract vortex structures from pathlines. Unlike traditional methods that rely on grid-based velocity fields in the Eulerian frame, our approach operates entirely on a Lagrangian representation of the flow field (i.e., pathlines), enabling objective identification of both strong and weak vortex structures. To train VortexTransformer, we generate a large synthetic dataset using parametric flow models to simulate diverse vortex configurations, ensuring a robust ground truth. We compare our method against CNN and UNet architectures, applying the trained models to real-world flow datasets. VortexTransformer is an end-to-end detector, which means that reference frame transformations as well as vortex detection are handled implicitly by the network, demonstrating the ability to extract vortex boundaries without the need for parameters such as arbitrary thresholds, or an explicit definition of a vortex. Our method offers a new approach to determining objective vortex labels by using the objective pairwise distances of material points for vortex detection and is adaptable to various flow conditions.
FlairGPT: Repurposing LLMs for Interior Designs
(The Eurographics Association and John Wiley & Sons Ltd., 2025) Littlefair, Gabrielle; Dutt, Niladri Shekhar; Mitra, Niloy J.; Bousseau, Adrien; Day, Angela
Interior design involves the careful selection and arrangement of objects to create an aesthetically pleasing, functional, and harmonized space that aligns with the client's design brief. This task is particularly challenging, as a successful design must not only incorporate all the necessary objects in a cohesive style, but also ensure they are arranged in a way that maximizes accessibility, while adhering to a variety of affordability and usage considerations. Data-driven solutions have been proposed, but these are typically room- or domain-specific and lack explainability in the design considerations used in producing the final layout. In this paper, we investigate if large language models (LLMs) can be directly utilized for interior design. While we find that LLMs are not yet capable of generating complete layouts, they can be effectively leveraged in a structured manner, inspired by the workflow of interior designers. By systematically probing LLMs, we can reliably generate a list of objects along with relevant constraints that guide their placement. We translate this information into a design layout graph, which is then solved using an off-the-shelf constrained optimization setup to generate the final layouts. We benchmark our algorithm in various design configurations against existing LLM-based methods and human designs, and evaluate the results using a variety of quantitative and qualitative metrics along with user studies. In summary, we demonstrate that LLMs, when used in a structured manner, can effectively generate diverse high-quality layouts, making them a viable solution for creating large-scale virtual scenes. Code is available via the project webpage.
SOBB: Skewed Oriented Bounding Boxes for Ray Tracing
(The Eurographics Association and John Wiley & Sons Ltd., 2025) Kácerik, Martin; Bittner, Jirí; Bousseau, Adrien; Day, Angela
We propose skewed oriented bounding boxes (SOBB) as a novel bounding primitive for accelerating the calculation of ray-scene intersections. SOBBs have the same memory footprint as the well-known oriented bounding boxes (OBB) and can be used with a similar ray intersection algorithm. We propose an efficient algorithm for constructing a BVH with SOBBs, using a transformation from a standard BVH built for axis-aligned bounding boxes (AABB). We use discrete orientation polytopes as a temporary bounding representation to find tightly fitting SOBBs. Additionally, we propose a compression scheme for SOBBs that makes their memory requirements comparable to those of AABBs. For secondary rays, the SOBB BVH provides a ray tracing speedup of 1.0-11.0x over the AABB BVH and it is 1.1x faster than the OBB BVH on average. The transformation of AABB BVH to SOBB BVH is, on average, 2.6x faster than the ditetrahedron-based AABB BVH to OBB BVH transformation.
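As background for the bounding primitives compared in this abstract, here is a minimal sketch of the standard slab test for ray/AABB intersection. It is the textbook AABB test, not the SOBB intersection routine from the paper; the function name `ray_hits_aabb` and its argument layout are assumptions made for illustration.

```python
import math


def ray_hits_aabb(origin, direction, box_min, box_max, t_max=math.inf):
    """Slab test: True if origin + t*direction hits the box for some 0 <= t <= t_max."""
    t_near, t_far = 0.0, t_max
    for o, d, lo, hi in zip(origin, direction, box_min, box_max):
        if d == 0.0:
            # The ray is parallel to this pair of planes: it can only hit
            # the box if the origin already lies between them.
            if o < lo or o > hi:
                return False
            continue
        # Parameter values at which the ray crosses the two planes of this slab.
        t0, t1 = (lo - o) / d, (hi - o) / d
        if t0 > t1:
            t0, t1 = t1, t0
        # Intersect this slab's parameter interval with the running interval.
        t_near = max(t_near, t0)
        t_far = min(t_far, t1)
        if t_near > t_far:
            return False
    return True


unit_min, unit_max = (0.0, 0.0, 0.0), (1.0, 1.0, 1.0)
hit = ray_hits_aabb((-1.0, 0.5, 0.5), (1.0, 0.0, 0.0), unit_min, unit_max)
miss = ray_hits_aabb((-1.0, 0.5, 0.5), (-1.0, 0.0, 0.0), unit_min, unit_max)
```

An oriented box can reuse the same test by first transforming the ray into the box's local frame, which is the usual way OBB intersection is implemented.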
Material Transforms from Disentangled NeRF Representations
(The Eurographics Association and John Wiley & Sons Ltd., 2025) Lopes, Ivan; Lalonde, Jean-François; Charette, Raoul de; Bousseau, Adrien; Day, Angela
In this paper, we first propose a novel method for transferring material transformations across different scenes. Building on disentangled Neural Radiance Field (NeRF) representations, our approach learns to map Bidirectional Reflectance Distribution Functions (BRDF) from pairs of scenes observed in varying conditions, such as dry and wet. The learned transformations can then be applied to unseen scenes with similar materials, therefore effectively rendering the transformation learned with an arbitrary level of intensity. Extensive experiments on synthetic scenes and real-world objects validate the effectiveness of our approach, showing that it can learn various transformations such as wetness, painting, coating, etc. Our results highlight not only the versatility of our method but also its potential for practical applications in computer graphics. We publish our method implementation, along with our synthetic/real datasets on https://github.com/astra-vision/BRDFTransform
Image Vectorization via Gradient Reconstruction
(The Eurographics Association and John Wiley & Sons Ltd., 2025) Chakraborty, Souymodip; Batra, Vineet; Phogat, Ankit; Jain, Vishwas; Ranawat, Jaswant Singh; Dhingra, Sumit; Wampler, Kevin; Fisher, Matthew; Lukác, Michal; Bousseau, Adrien; Day, Angela
We present a fully automated technique that segments raster images into smooth shaded regions and reconstructs them using an optimal mix of solid fills, linear gradients, and radial gradients. Our method leverages a novel discontinuity-aware segmentation strategy and gradient reconstruction algorithm to accurately capture intricate shading details and produce compact Bézier curve representations. Extensive evaluations on both designer-created art and generative images demonstrate that our approach achieves high visual fidelity with minimal geometric complexity and fast processing times. This work offers a robust and versatile solution for converting detailed raster images into scalable vector graphics, addressing the evolving needs of modern design workflows.
NoPe-NeRF++: Local-to-Global Optimization of NeRF with No Pose Prior
(The Eurographics Association and John Wiley & Sons Ltd., 2025) Shi, Dongbo; Cao, Shen; Wu, Bojian; Guo, Jinhui; Fan, Lubin; Chen, Renjie; Liu, Ligang; Ye, Jieping; Bousseau, Adrien; Day, Angela
In this paper, we introduce NoPe-NeRF++, a novel local-to-global optimization algorithm for training Neural Radiance Fields (NeRF) without requiring pose priors. Existing methods, particularly NoPe-NeRF, which focus solely on the local relationships within images, often struggle to recover accurate camera poses in complex scenarios. To overcome these challenges, our approach begins with a relative pose initialization with explicit feature matching, followed by a local joint optimization to enhance the pose estimation for training a more robust NeRF representation. This method significantly improves the quality of initial poses. Additionally, we introduce a global optimization phase that incorporates geometric consistency constraints through bundle adjustment, which integrates feature trajectories to further refine poses and collectively boost the quality of NeRF. Notably, our method is the first work that seamlessly combines the local and global cues with NeRF, and outperforms state-of-the-art methods in both pose estimation accuracy and novel view synthesis. Extensive evaluations on benchmark datasets demonstrate our superior performance and robustness, even in challenging scenes, thus validating our design choices.
Inverse Simulation of Radiative Thermal Transport
(The Eurographics Association and John Wiley & Sons Ltd., 2025) Freude, Christian; Lipp, Lukas; Zezulka, Matthias; Rist, Florian; Wimmer, Michael; Hahn, David; Bousseau, Adrien; Day, Angela
The early phase of urban planning and architectural design has a great impact on the thermal loads and characteristics of constructed buildings. It is, therefore, important to efficiently simulate thermal effects early on and rectify possible problems. In this paper, we present an inverse simulation of radiative heat transport and a differentiable photon-tracing approach. Our method utilizes GPU-accelerated ray tracing to speed up both the forward and adjoint simulation. Moreover, we incorporate matrix compression to further increase the efficiency of our thermal solver and support larger scenes. In addition to our differentiable photon-tracing approach, we introduce a novel approximate edge sampling scheme that re-uses primary samples instead of relying on explicit edge samples or auxiliary rays to resolve visibility discontinuities. Our inverse simulation system enables designers to not only predict the temperature distribution, but also automatically optimize the design to improve thermal comfort and avoid problematic configurations. We showcase our approach using several examples in which we optimize the placement of buildings or their facade geometry. Our approach can be used to optimize arbitrary geometric parameterizations and supports steady-state, as well as transient simulations.
HPRO: Direct Visibility of Point Clouds for Optimization
(The Eurographics Association and John Wiley & Sons Ltd., 2025) Katz, Sagi; Tal, Ayellet; Bousseau, Adrien; Day, Angela
Given a point cloud, which is assumed to be a sampling of a continuous surface, and a viewpoint, which points are visible from that viewpoint? Since points do not occlude each other, the real question is which points would be visible if the surface they were sampled from were known. While an existing approximation method addresses this problem, it is unsuitable for use in optimization processes or learning models due to its lack of differentiability. To overcome this limitation, the paper introduces a novel differentiable approximation method. It is based on identifying the extreme points of a point set in a differentiable manner. This approach can be effectively integrated into optimization algorithms or used as a layer in neural networks, allowing for the computation and utilization of visible points in various tasks, such as optimal viewpoint selection. The paper also provides theoretical proofs of the operator's correctness in the limit, further validating its effectiveness. The code is available at https://github.com/sagikatz/HPRO
