Audio-aided Character Control for Inertial Measurement Tracking

Authors: Jang, Hojun; Bae, Jinseok; Kim, Young Min; Ceylan, Duygu; Li, Tzu-Mao

Published: 2025-05-09
ISBN: 978-3-03868-268-4
ISSN: 1017-4656
DOI: https://doi.org/10.2312/egs.20251045
Handle: https://diglib.eg.org/handle/10.2312/egs20251045
Pages: 4

Abstract: Physics-based character control generates realistic motion dynamics by leveraging kinematic priors from large-scale data within a simulation engine. The simulated motion respects physical plausibility, while dynamic cues such as contacts and forces guide compelling human-scene interaction. However, audio cues, which can capture physical contacts in a cost-effective way, have been less explored for animating human motion. In this work, we demonstrate that audio inputs can improve the accuracy of footstep prediction and better capture human locomotion dynamics. Experiments validate that audio-aided control from sparse observations (e.g., an IMU sensor on a VR headset) improves the prediction accuracy of contact dynamics and motion tracking, offering a practical auxiliary signal for robotics, gaming, and virtual environments.

License: Attribution 4.0 International License

CCS Concepts: Computing methodologies → Simulation by animation; Physical simulation; Motion processing