HiLo-Align: A Hierarchical Semantic Alignment Framework for Driving Decision Generation via Virtual-Physical Integration

Authors and editors: Duan, Zehao; Huang, Chengyan; Wang, Lin; Christie, Marc; Han, Ping-Hsuan; Lin, Shih-Syun; Pietroni, Nico; Schneider, Teseo; Tsai, Hsin-Ruey; Wang, Yu-Shuen; Zhang, Eugene
Date: 2025-10-07
Year: 2025
ISBN: 978-3-03868-295-0
DOI: https://doi.org/10.2312/pg.20251263
URL: https://diglib.eg.org/handle/10.2312/pg20251263
Pages: 11 pages
License: Attribution 4.0 International License

Abstract: Autonomous vehicles operating in uncertain urban environments are required to reason over complex multi-agent interactions while adhering to stringent safety requirements. Hierarchical frameworks often use large models for high-level (virtual-layer) planning and deep reinforcement learning for low-level (physical-layer) control. However, semantic and temporal misalignment between layers leads to command errors and delayed response. We propose HiLo-Align, a hybrid hierarchical framework that unifies both layers via a shared semantic space and time scale. By explicitly modeling cross-layer alignment, HiLo-Align improves control coordination and semantic consistency. Experimental results on both simulation and real-world datasets indicate enhanced collision avoidance, generalization, and robustness in high-risk urban environments.

CCS Concepts: Computing methodologies → Machine learning; Reinforcement learning; Semantic networks; Applied computing → Transportation