Coupling Self-Distillation with Test Time Augmentation for effective LiDAR-Based 3D Semantic Segmentation

Authors: Antonarakos, Dimitrios; Zamanakos, Georgios; Papadeas, Ilias; Pratikakis, Ioannis
Editors: Guerrero, Paul; Pratikakis, Ioannis; Veltkamp, Remco
Date: 2025-08-29
Year: 2025
ISBN: 978-3-03868-280-6
ISSN: 1997-0471
DOI: https://doi.org/10.2312/3dor.20251201
URI: https://diglib.eg.org/handle/10.2312/3dor20251201
Pages: 7 pages
License: Attribution 4.0 International License

Abstract

Effective 3D perception is fundamental for spatial awareness and safe navigation in modern autonomous systems, with 3D semantic segmentation of LiDAR point clouds being a critical perception task. Recent progress in 2D vision highlights the potential of non-architectural training and inference strategies to further boost model performance. Inspired by consistency-based learning and self-distillation, this work employs such a training pipeline for robust 3D semantic segmentation in street scene understanding. Specifically, we incorporate a teacher-student knowledge self-distillation framework that integrates Test-Time Augmentation to enhance the quality of the soft labels generated by the teacher model during training and to improve inference performance. We present a comparative study on the effectiveness of the employed framework across both convolutional and attention-enhanced networks. Experimental results on the Street3D benchmark dataset demonstrate that the adopted training framework coupled with attention-enhanced networks compares favorably with the state-of-the-art for 3D semantic segmentation in the context of autonomous driving.

Code is available at https://github.com/DUTH-VCG/Self_Distillation_with_TTA-main
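
The abstract describes a teacher whose soft labels are refined with Test-Time Augmentation before a student is trained on them. The following is a minimal, generic sketch of that idea in plain Python, not the authors' implementation: the teacher is a hypothetical toy function, the three augmentations are illustrative choices, and the loss is an ordinary soft-label cross-entropy assumed here for illustration. All function names are assumptions introduced for this example.

```python
import math


def softmax(logits):
    """Numerically stable softmax over one point's class logits."""
    peak = max(logits)
    exps = [math.exp(z - peak) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]


def tta_soft_labels(teacher, points, augmentations):
    """Average the teacher's per-point class probabilities over augmented views.

    Every augmentation is a per-point transform, so point i of each view
    corresponds to point i of the input and no re-indexing is needed.
    """
    summed = [None] * len(points)
    for augment in augmentations:
        view = [augment(p) for p in points]
        for i, logits in enumerate(teacher(view)):
            probs = softmax(logits)
            if summed[i] is None:
                summed[i] = probs
            else:
                summed[i] = [a + b for a, b in zip(summed[i], probs)]
    return [[v / len(augmentations) for v in row] for row in summed]


def distillation_loss(student_logits, soft_labels, eps=1e-12):
    """Mean cross-entropy between teacher soft labels and student predictions."""
    total = 0.0
    for logits, target in zip(student_logits, soft_labels):
        probs = softmax(logits)
        total -= sum(t * math.log(p + eps) for t, p in zip(target, probs))
    return total / len(student_logits)


def toy_teacher(view):
    # Hypothetical stand-in for a segmentation network: two class logits per point.
    return [[x + z, y - z] for (x, y, z) in view]


def identity(p):
    return p


def flip_x(p):
    # Mirror the point across the y-z plane.
    return (-p[0], p[1], p[2])


def rotate_z_90(p):
    # Rotate the point 90 degrees about the vertical (z) axis.
    return (-p[1], p[0], p[2])


points = [(1.0, 0.0, 0.5), (0.0, 2.0, -1.0)]
soft = tta_soft_labels(toy_teacher, points, [identity, flip_x, rotate_z_90])
# The toy teacher also plays the student here, purely to exercise the loss.
loss = distillation_loss(toy_teacher(points), soft)
```

In a real LiDAR pipeline the augmented views are typically voxelized or subsampled, so predictions would need to be mapped back to the original points before averaging; the per-point transforms above sidestep that step for brevity.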