Search Results

Now showing 1 - 10 of 14
  • Item
    A Survey on Reinforcement Learning Methods in Character Animation
    (The Eurographics Association and John Wiley & Sons Ltd., 2022) Kwiatkowski, Ariel; Alvarado, Eduardo; Kalogeiton, Vicky; Liu, C. Karen; Pettré, Julien; Panne, Michiel van de; Cani, Marie-Paule; Meneveaux, Daniel; Patanè, Giuseppe
    Reinforcement Learning is an area of Machine Learning focused on how agents can be trained to make sequential decisions and achieve a particular goal within an arbitrary environment. While learning, they repeatedly take actions based on their observations of the environment and receive rewards that define the objective. This experience is then used to progressively improve the policy controlling the agent's behavior, typically represented by a neural network. The trained policy can then be reused for similar problems, which makes this approach promising for animating autonomous yet reactive characters in simulators, video games, or virtual reality environments. This paper surveys modern Deep Reinforcement Learning methods and discusses their possible applications in Character Animation, from skeletal control of a single, physically-based character to navigation controllers for individual agents and virtual crowds. It also describes the practical side of training DRL systems, comparing the different frameworks available for building such agents.
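    The survey above revolves around the basic reinforcement-learning loop: the agent observes the environment, acts, receives a reward, and uses that experience to improve its policy. A minimal, self-contained sketch of that loop on a toy corridor task follows; the environment, the tabular softmax policy, and the REINFORCE update are illustrative assumptions, not code from the survey.

```python
# Minimal RL loop sketch (toy assumptions, not from the survey): observe,
# act, collect rewards, then improve the policy from the episode return.
import numpy as np

class Corridor:
    """Toy environment: start at position 0, reach position 4; actions are left/right."""
    def reset(self):
        self.pos = 0
        return self.pos

    def step(self, action):                      # action: 0 = left, 1 = right
        self.pos = max(0, min(4, self.pos + (1 if action == 1 else -1)))
        done = self.pos == 4
        reward = 1.0 if done else -0.01          # small step penalty, bonus at the goal
        return self.pos, reward, done

def softmax(x):
    z = np.exp(x - x.max())
    return z / z.sum()

env = Corridor()
theta = np.zeros((5, 2))                         # policy parameters: one logit pair per state
for episode in range(500):
    states, actions, rewards = [], [], []
    s, done = env.reset(), False
    while not done:                              # interaction loop: observe -> act -> reward
        a = np.random.choice(2, p=softmax(theta[s]))
        s_next, r, done = env.step(a)
        states.append(s); actions.append(a); rewards.append(r)
        s = s_next
    G = sum(rewards)                             # episode return (undiscounted for brevity)
    for s, a in zip(states, actions):            # REINFORCE: scale log-prob gradients by the return
        grad = -softmax(theta[s]); grad[a] += 1.0
        theta[s] += 0.1 * G * grad
print("P(move right) per state:", [round(float(softmax(theta[s])[1]), 2) for s in range(5)])
```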
  • Item
    Film Directing for Computer Games and Animation
    (The Eurographics Association and John Wiley & Sons Ltd., 2021) Ronfard, Rémi; Bühler, Katja and Rushmeier, Holly
    Over the last forty years, researchers in computer graphics have proposed a large variety of theoretical models and computer implementations of a virtual film director, capable of creating movies from minimal input such as a screenplay or storyboard. The underlying film directing techniques are also in high demand to assist and automate the generation of movies in computer games and animation. The goal of this survey is to characterize the spectrum of applications that require film directing, to present a historical and up-to-date summary of research in algorithmic film directing, and to identify promising avenues and hot topics for future research.
  • Item
    A Comprehensive Review of Data-Driven Co-Speech Gesture Generation
    (The Eurographics Association and John Wiley & Sons Ltd., 2023) Nyatsanga, Simbarashe; Kucherenko, Taras; Ahuja, Chaitanya; Henter, Gustav Eje; Neff, Michael; Bousseau, Adrien; Theobalt, Christian
    Gestures that accompany speech are an essential part of natural and efficient embodied human communication. The automatic generation of such co-speech gestures is a long-standing problem in computer animation and is considered an enabling technology for creating believable characters in film, games, and virtual social spaces, as well as for interaction with social robots. The problem is made challenging by the idiosyncratic and non-periodic nature of human co-speech gesture motion, and by the great diversity of communicative functions that gestures encompass. The field of gesture generation has seen surging interest in the last few years, owing to the emergence of more and larger datasets of human gesture motion, combined with strides in deep-learning-based generative models that benefit from the growing availability of data. This review article summarizes co-speech gesture generation research, with a particular focus on deep generative models. First, we articulate the theory describing human gesticulation and how it complements speech. Next, we briefly discuss rule-based and classical statistical gesture synthesis, before delving into deep learning approaches. We employ the choice of input modalities as an organizing principle, examining systems that generate gestures from audio, text and non-linguistic input. Concurrent with the exposition of deep learning approaches, we chronicle the evolution of the related training data sets in terms of size, diversity, motion quality, and collection method (e.g., optical motion capture or pose estimation from video). Finally, we identify key research challenges in gesture generation, including data availability and quality; producing human-like motion; grounding the gesture in the co-occurring speech in interaction with other speakers, and in the environment; performing gesture evaluation; and integration of gesture synthesis into applications. We highlight recent approaches to tackling the various key challenges, as well as the limitations of these approaches, and point toward areas of future development.
  • Item
    Splash in a Flash: Sharpness-aware Minimization for Efficient Liquid Splash Simulation
    (The Eurographics Association, 2022) Jetly, Vishrut; Ibayashi, Hikaru; Nakano, Aiichiro; Sauvage, Basile; Hasic-Telalovic, Jasminka
    We present sharpness-aware minimization (SAM) for fluid dynamics, which can efficiently learn the plausible dynamics of liquid splashes. Due to its ability to find robust, generalizing solutions, SAM efficiently converges to a parameter set that predicts plausible dynamics of elusive liquid splashes. Our training scheme requires six times fewer epochs to converge and four times less wall-clock time. Our results show that the sharpness of the loss function is closely connected to the plausibility of the fluid dynamics, and they suggest further applicability of SAM to machine-learning-based fluid simulation.
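    Sharpness-aware minimization, which this paper applies to splash simulation, first perturbs the weights toward the locally sharpest nearby point and then descends using the gradient computed there. The sketch below shows that two-step update on a toy regression problem; the data, learning rate, and radius rho are assumptions for illustration, not the paper's fluid-simulation setup.

```python
# SAM update sketch on a toy least-squares problem (assumed data and
# hyper-parameters; the paper trains a splash-prediction network instead).
import numpy as np

rng = np.random.default_rng(0)
X = rng.normal(size=(64, 3))
y = X @ np.array([1.5, -2.0, 0.5]) + 0.1 * rng.normal(size=64)

def loss_and_grad(w):
    err = X @ w - y
    return 0.5 * np.mean(err ** 2), X.T @ err / len(y)

w, lr, rho = np.zeros(3), 0.1, 0.05
for _ in range(200):
    _, g = loss_and_grad(w)                        # 1) gradient at the current weights
    eps = rho * g / (np.linalg.norm(g) + 1e-12)    # 2) step of radius rho toward the sharpest point
    _, g_sharp = loss_and_grad(w + eps)            # 3) gradient at the perturbed weights
    w -= lr * g_sharp                              # 4) descend using the "sharpness-aware" gradient
print("final loss:", loss_and_grad(w)[0], "weights:", np.round(w, 2))
```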
  • Item
    Stroke based Painterly Inbetweening
    (The Eurographics Association, 2022) Barroso, Nicolas; Fondevilla, Amélie; Vanderhaeghe, David; Sauvage, Basile; Hasic-Telalovic, Jasminka
    Creating a 2D animation with visible strokes is a tedious and time-consuming task for an artist. Computer-aided animation usually focuses on cartoon-stylized rendering, or is built from an automatic process such as 3D animation stylization, losing the painterly look and feel of hand-made animation. We propose to simplify the creation of stroke-based animations: from a set of key frames, our method automatically generates intermediate frames to depict the animation. Each intermediate frame looks as if it could have been drawn by an artist, using the same high-level stroke-based representation as the key frames, and in succession the frames display the subtle temporal incoherence usually found in hand-made animations.
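    A rough sketch of the inbetweening idea described above: interpolate matched stroke control points between two key frames and perturb each generated frame slightly, so that successive frames keep a hint of hand-drawn temporal incoherence. The stroke representation, the one-to-one point matching, and the jitter model are illustrative assumptions, not the authors' method.

```python
# Stroke inbetweening sketch (assumed representation: a stroke is an array
# of matched 2D control points in both key frames).
import numpy as np

def inbetween(stroke_a, stroke_b, n_frames, jitter=0.5, seed=0):
    """Generate n_frames intermediate strokes between two key-frame strokes."""
    rng = np.random.default_rng(seed)
    frames = []
    for i in range(1, n_frames + 1):
        t = i / (n_frames + 1)                               # interpolation parameter in (0, 1)
        pts = (1 - t) * stroke_a + t * stroke_b              # linear blend of control points
        pts = pts + rng.normal(scale=jitter, size=pts.shape) # small wobble per frame
        frames.append(pts)
    return frames

# Example: a stroke translating and bending between two key frames.
key_a = np.stack([np.linspace(0, 100, 8), np.zeros(8)], axis=1)
key_b = np.stack([np.linspace(20, 120, 8), 10 * np.sin(np.linspace(0, np.pi, 8))], axis=1)
for f, pts in enumerate(inbetween(key_a, key_b, n_frames=3), start=1):
    print(f"frame {f}, first points:", np.round(pts[:2], 1))
```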
  • Item
    Neural Motion Compression with Frequency-adaptive Fourier Feature Network
    (The Eurographics Association, 2022) Tojo, Kenji; Chen, Yifei; Umetani, Nobuyuki; Pelechano, Nuria; Vanderhaeghe, David
    We present a neural-network-based compression method to alleviate the storage cost of motion capture data. Human motions, such as locomotion, often consist of periodic movements. We leverage this periodicity by applying Fourier features to a multilayer perceptron network. Our novel algorithm finds a set of Fourier feature frequencies based on the discrete cosine transform (DCT) of the motion. During training, we incrementally added the dominant frequency of the DCT to the current set of Fourier feature frequencies until a given quality threshold was satisfied. We conducted an experiment on the CMU motion dataset, and the results suggest that our method achieves a high overall compression ratio while maintaining motion quality.
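    The frequency-adaptive procedure can be sketched compactly: rank the DCT bins of a motion signal by magnitude and keep adding the corresponding frequencies to a Fourier-feature basis until a reconstruction-error threshold is met. In the toy version below, a linear least-squares fit stands in for the paper's multilayer perceptron, and the test signal and threshold are assumptions.

```python
# Frequency-adaptive Fourier feature selection sketch (toy signal and
# threshold assumed; a linear fit replaces the paper's MLP).
import numpy as np

T = 256
t = np.arange(T) / T
signal = np.cos(2 * np.pi * 3 * t) + 0.3 * np.cos(2 * np.pi * 11 * t)   # toy joint trajectory

# DCT-II coefficients, built explicitly from the cosine basis; bin k covers k/2 cycles.
n = np.arange(T)
dct = np.array([np.sum(signal * np.cos(np.pi * k * (n + 0.5) / T)) for k in range(T)])
ranked = np.argsort(-np.abs(dct[1:])) + 1          # bins ordered by dominance (skip DC)

def fourier_features(times, freqs):
    cols = [np.ones_like(times)]
    for f in freqs:
        cols += [np.cos(2 * np.pi * f * times), np.sin(2 * np.pi * f * times)]
    return np.stack(cols, axis=1)

freqs, threshold = [], 1e-3
for k in ranked:                                   # incrementally add dominant frequencies
    freqs.append(float(k) / 2)                     # DCT bin k corresponds to k/2 cycles
    Phi = fourier_features(t, freqs)
    coef, *_ = np.linalg.lstsq(Phi, signal, rcond=None)
    err = float(np.mean((Phi @ coef - signal) ** 2))
    if err < threshold:                            # stop once the quality threshold is met
        break
print("selected frequencies:", freqs, "reconstruction MSE:", err)
```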
  • Item
    Controllable Caustic Animation Using Vector Fields
    (The Eurographics Association, 2020) Rojo, Irene Baeza; Gross, Markus; Günther, Tobias; Wilkie, Alexander and Banterle, Francesco
    In movie production, lighting is commonly used to redirect attention or to set the mood in a scene. The detailed editing of complex lighting phenomena, however, is as tedious as it is important, especially with dynamic lights or when light is a relevant story element. In this paper, we propose a new method to create caustic animations, which are controllable through constraints drawn by the user. Our method blends caustics into a specified target image by treating photons as particles that move in a divergence-free fluid, an irrotational vector field, or a linear combination of the two. Once described as a flow, additional user constraints are easily added, e.g., to direct the flow, create boundaries, or add synthetic turbulence, which offers new ways to redirect and control light. The corresponding vector field is computed by fitting a stream function and a scalar potential per time step, for which constraints are described in a quadratic energy that we minimize as a linear least squares problem. Finally, the photons are placed back into the scene at their new positions and rendered with progressive photon mapping.
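    The vector field driving the photons here is a blend of a divergence-free component (the 2D curl of a stream function) and an irrotational component (the gradient of a scalar potential). The sketch below builds such a blended field from hand-picked analytic psi and phi and advects a few photon particles with explicit Euler steps; the analytic fields and step sizes are assumptions, whereas the paper fits psi and phi to user constraints with a linear least-squares solve per time step.

```python
# Divergence-free + irrotational velocity blend and particle advection
# (analytic psi/phi chosen for illustration only).
import numpy as np

def velocity(p, alpha=1.0, beta=0.3, h=1e-4):
    """Blend the curl of a stream function psi with the gradient of a potential phi."""
    x, y = p[..., 0], p[..., 1]
    psi = lambda x, y: np.sin(x) * np.cos(y)       # stream function (toy choice)
    phi = lambda x, y: 0.5 * (x ** 2 + y ** 2)     # scalar potential (toy choice)
    # 2D curl of psi: (d psi/dy, -d psi/dx) -> divergence-free component
    div_free = np.stack([(psi(x, y + h) - psi(x, y - h)) / (2 * h),
                         -(psi(x + h, y) - psi(x - h, y)) / (2 * h)], axis=-1)
    # gradient of phi -> irrotational component
    irrot = np.stack([(phi(x + h, y) - phi(x - h, y)) / (2 * h),
                      (phi(x, y + h) - phi(x, y - h)) / (2 * h)], axis=-1)
    return alpha * div_free + beta * irrot

# Advect a handful of "photons" with explicit Euler steps.
rng = np.random.default_rng(1)
photons = rng.uniform(-1, 1, size=(5, 2))
dt = 0.05
for _ in range(10):
    photons = photons + dt * velocity(photons)
print("photon positions after advection:\n", np.round(photons, 3))
```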
  • Item
    Safeguarding our Dance Cultural Heritage
    (The Eurographics Association, 2022) Aristidou, Andreas; Chalmers, Alan; Chrysanthou, Yiorgos; Loscos, Celine; Multon, Franck; Parkins, J. E.; Sarupuri, Bhuvan; Stavrakis, Efstathios; Hahmann, Stefanie; Patow, Gustavo A.
    Folk dancing is a key aspect of intangible cultural heritage that often reflects the socio-cultural and political influences prevailing in different periods and nations; each dance produces a meaning, a story with the help of music, costumes, and dance moves. It has been transmitted from generation to generation, and to different countries, mainly due to movements of people carrying and disseminating their civilization. However, folk dancing, amongst other intangible heritage, is at high risk of disappearing due to wars, the movement of populations, economic crises, and modernization, but most importantly because these fragile creations have been modified over time through the process of collective recreation and/or changes in the way of life. In this tutorial, we show how the European project SCHEDAR exploited emerging technologies to digitize, analyze, and holistically document our intangible heritage creations, which is a critical necessity for the preservation and continuity of our identity as Europeans.
  • Item
    DragPoser: Motion Reconstruction from Variable Sparse Tracking Signals via Latent Space Optimization
    (The Eurographics Association and John Wiley & Sons Ltd., 2025) Ponton, Jose Luis; Pujol, Eduard; Aristidou, Andreas; Andujar, Carlos; Pelechano, Nuria; Bousseau, Adrien; Day, Angela
    High-quality motion reconstruction that follows the user's movements can be achieved by high-end mocap systems with many sensors. However, obtaining such animation quality with fewer input devices is gaining popularity as it brings mocap closer to the general public. The main challenges include the loss of end-effector accuracy in learning-based approaches, or the lack of naturalness and smoothness in IK-based solutions. In addition, such systems are often finely tuned to a specific number of trackers and are highly sensitive to missing data, e.g., in scenarios where a sensor is occluded or malfunctions. In response to these challenges, we introduce DragPoser, a novel deep-learning-based motion reconstruction system that accurately represents hard and dynamic constraints, attaining high end-effector position accuracy in real time. This is achieved through a pose optimization process within a structured latent space. Our system requires only one-time training on a large human motion dataset; constraints can then be dynamically defined as losses, while the pose is iteratively refined by computing the gradients of these losses within the latent space. To further enhance our approach, we incorporate a Temporal Predictor network, which employs a Transformer architecture to directly encode temporality within the latent space. This network ensures the pose optimization is confined to the manifold of valid poses and also leverages past pose data to predict temporally coherent poses. Results demonstrate that DragPoser surpasses both IK-based and the latest data-driven methods in achieving precise end-effector positioning, while producing natural poses and temporally coherent motion. In addition, our system showcases robustness against on-the-fly constraint modifications and exhibits adaptability to various input configurations and changes. The complete source code, trained model, animation databases, and supplementary material used in this paper can be found at https://upc-virvig.github.io/DragPoser
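    The core optimization in DragPoser, as described above, expresses constraints as losses on the decoded pose and descends their gradient in latent space. The sketch below mimics that loop with a randomly initialized linear stand-in for the decoder and a single end-effector position target; the decoder, selection matrix, step size, and target are all illustrative assumptions rather than the released model.

```python
# Latent-space pose optimization sketch: decode, measure constraint loss,
# back-propagate to the latent code (here with an assumed linear "decoder").
import numpy as np

rng = np.random.default_rng(0)
latent_dim, pose_dim = 16, 60                      # e.g. 20 joints x 3 values (assumed sizes)
W = rng.normal(scale=0.1, size=(pose_dim, latent_dim))
b = rng.normal(scale=0.1, size=pose_dim)
decode = lambda z: W @ z + b                       # stand-in for a trained decoder

S = np.zeros((3, pose_dim))
S[0, 0] = S[1, 1] = S[2, 2] = 1.0                  # select one end-effector's x, y, z (assumed layout)
target = np.array([0.3, 1.2, -0.4])                # desired end-effector position

z = np.zeros(latent_dim)                           # start from a neutral latent code
for _ in range(500):
    pose = decode(z)
    residual = S @ pose - target                   # constraint expressed as a loss residual
    grad_z = W.T @ (S.T @ residual)                # gradient of 0.5*||residual||^2 w.r.t. z
    z -= 1.0 * grad_z                              # refine the pose by moving in latent space
print("end-effector:", np.round(S @ decode(z), 3), "target:", target)
```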
  • Item
    Personalized Visual Dubbing through Virtual Dubber and Full Head Reenactment
    (The Eurographics Association, 2025) Jeon, Bobae; Paquette, Eric; Mudur, Sudhir; Popa, Tiberiu; Ceylan, Duygu; Li, Tzu-Mao
    Visual dubbing aims to modify facial expressions to "lip-sync" a new audio track. While person-generic talking head generation methods achieve expressive lip synchronization across arbitrary identities, they usually lack person-specific details and fail to generate high-quality results. Conversely, person-specific methods require extensive training. Our method combines the strengths of both approaches by incorporating a virtual dubber, a person-generic talking head, as an intermediate representation. We then employ an autoencoder-based person-specific identity-swapping network to transfer the actor's identity, enabling full-head reenactment that includes hair, face, ears, and neck. This eliminates artifacts while ensuring temporal consistency. Our quantitative and qualitative evaluations demonstrate that our method achieves a superior balance between lip-sync accuracy and realistic facial reenactment.