LexiCrowd: A Learning Paradigm towards Text to Behaviour Parameters for Crowds

Lemonari, Marilena; Andreou, Nefeli; Pelechano, Nuria; Charalambous, Panayiotis; Chrysanthou, Yiorgos; Pelechano, Nuria; Pettré, Julien

2024-04-30
ISBN 978-3-03868-241-7
https://doi.org/10.2312/cl.20241049
https://diglib.eg.org/handle/10.2312/cl20241049

Creating believable virtual crowds that can be controlled by high-level prompts is essential for creators who must trade off authoring freedom against simulation quality. The flexibility and familiarity of natural language, in particular, motivate the use of text to guide the generation process. Capturing the essence of textually described crowd movements in the form of meaningful and usable parameters is challenging due to the lack of paired ground truth data and the inherent ambiguity between the two modalities. In this work, we leverage a pre-trained Large Language Model (LLM) to create pseudo-pairs of text and behaviour labels. We train a variational auto-encoder (VAE) on the synthetic dataset, constraining the latent space into interpretable behaviour parameters by incorporating a latent label loss. To showcase our model's capabilities, we deploy a survey where humans provide textual descriptions of real crowd datasets. We demonstrate that our model is able to parameterise unseen sentences and produce novel behaviours, capturing the essence of the given sentence; our behaviour space is compatible with simulator parameters, enabling the generation of plausible crowds (text-to-crowds). We also conduct feasibility experiments that show the potential of the output text embeddings for generating full sentences from a behaviour profile.

Attribution 4.0 International License

CCS Concepts: Computing methodologies → Neural networks; Natural language processing; Computer graphics

9 pages
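
The abstract mentions a VAE whose latent space is constrained by a latent label loss. The record does not give the actual formulation, so the following is only a minimal sketch of one common way such an objective can be written: a reconstruction term, a KL term, and a squared-error term pulling the latent mean towards the behaviour labels. The function name, the weights `beta` and `lam`, the squared-error form of each term, and the assumption that the label vector has the same dimension as the latent space are all illustrative choices, not details taken from the paper.

```python
import numpy as np

def vae_latent_label_loss(x, x_recon, mu, log_var, labels, beta=1.0, lam=1.0):
    """Hypothetical VAE objective with an extra latent label term.

    x, x_recon : (batch, d) input embeddings and their reconstructions
    mu, log_var: (batch, k) parameters of the approximate posterior
    labels     : (batch, k) behaviour labels, assumed to match the latent size
    beta, lam  : assumed weights for the KL and label terms
    """
    # Reconstruction error (squared error, averaged over the batch).
    recon = np.mean(np.sum((x - x_recon) ** 2, axis=1))
    # KL divergence between N(mu, exp(log_var)) and a standard normal prior.
    kl = np.mean(-0.5 * np.sum(1.0 + log_var - mu ** 2 - np.exp(log_var), axis=1))
    # Latent label term: pull the latent mean towards the behaviour labels.
    label = np.mean(np.sum((mu - labels) ** 2, axis=1))
    total = recon + beta * kl + lam * label
    return total, {"recon": recon, "kl": kl, "label": label}
```

In a sketch like this, a larger `lam` would tie each latent dimension more tightly to its behaviour label, at some cost in reconstruction quality; how the paper balances these terms is not stated in the abstract.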