COMAND: Controllable Action-aware Manifold for 3D Motion Synthesis

Authors: Habibie, Ikhsanul; Elgharib, Mohamed; Luvizon, Diogo; Thambiraja, Balamurugan; Nyatsanga, Simbarashe; Thies, Justus; Neff, Michael; Theobalt, Christian
Editors: Linsen, Lars; Thies, Justus
Date: 2024-09-09
ISBN: 978-3-03868-247-9
DOI: https://doi.org/10.2312/vmv.20241209
URI: https://diglib.eg.org/handle/10.2312/vmv20241209
Pages: 8

Abstract: We present COMAND, a novel method for controllable multi-action 3D motion synthesis that does not require action-labeled data. Our method can generate a lifelike motion sequence containing consecutive non-locomotive actions such as kicking, jumping, or squatting, without manual blending, enabling intuitive control of 3D human animation based on the desired motion types at specified time windows. At the core of our method is a motion manifold based on a periodic parameterization of a motion latent space that allows for unsupervised action clustering of 3D motion, thus enabling action-to-motion synthesis without explicitly training the model on action-labeled datasets. This learned motion manifold has semantic and periodic properties that benefit 3D motion synthesis from both action labels and free-form text input, resulting in a state-of-the-art multi-modal and multi-action 3D motion generation framework. In our user study, more than 83% of participants rated COMAND as more natural, and more than 96% rated it as better matching the target action sequence, compared to existing methods.

License: Attribution 4.0 International License
CCS Concepts: Computing methodologies → Motion capture