Title: Hierarchical Planning and Control for Box Loco-Manipulation
Authors: Xie, Zhaoming; Tseng, Jonathan; Starke, Sebastian; van de Panne, Michiel; Liu, C. Karen
Editors: Wang, Huamin; Ye, Yuting; Zordan, Victor
Date: 2023-10-16
Year: 2023
ISSN: 2577-6193
DOI: https://doi.org/10.1145/3606931
URI: https://diglib.eg.org:443/handle/10.1145/3606931

Abstract: Humans perform everyday tasks using a combination of locomotion and manipulation skills. Building a system that can handle both skills is essential to creating virtual humans. We present a physically-simulated human capable of solving box rearrangement tasks, which requires a combination of both skills. We propose a hierarchical control architecture, where each level solves the task at a different level of abstraction, and the result is a physics-based simulated virtual human capable of rearranging boxes in a cluttered environment. The control architecture integrates a planner, diffusion models, and physics-based motion imitation of sparse motion clips using deep reinforcement learning. Boxes can vary in size, weight, shape, and placement height. Code and trained control policies are provided.

CCS Concepts: Computing methodologies -> Physical simulation; Reinforcement learning
Keywords: character animation, loco-manipulation