PARC: A Two-Stage Multi-Modal Framework for Point Cloud Completion

dc.contributor.author: Cai, Yujiao
dc.contributor.author: Su, Yuhao
dc.contributor.editor: Christie, Marc
dc.contributor.editor: Pietroni, Nico
dc.contributor.editor: Wang, Yu-Shuen
dc.date.accessioned: 2025-10-07T05:03:18Z
dc.date.available: 2025-10-07T05:03:18Z
dc.date.issued: 2025
dc.description.abstract: Point cloud completion is vital for accurate 3D reconstruction, yet real-world scans frequently exhibit large structural gaps that compromise recovery. Meanwhile, in 2D vision, VAR (Visual Auto-Regression) has demonstrated that coarse-to-fine "next-scale prediction" can significantly improve generation quality, inference speed, and generalization. Because this coarse-to-fine approach closely aligns with the progressive nature of filling in missing geometry in point clouds, we were inspired to develop PARC (Patch-Aware Coarse-to-Fine Refinement Completion), a two-stage multi-modal framework specifically designed for handling missing structures. In the pre-training stage, PARC leverages complete point clouds together with a Patch-Aware Coarse-to-Fine Refinement (PAR) strategy and a Mixture-of-Experts (MoE) architecture to generate high-quality local fragments, thereby improving geometric structure understanding and feature representation quality. During fine-tuning, the model is adapted to partial scans, further enhancing its resilience to incomplete inputs. To address remaining uncertainties in regions with missing structure, we introduce a dual-branch architecture that incorporates image cues: point cloud and image features are extracted independently and then fused via the MoE with an alignment loss, allowing complementary modalities to guide reconstruction in occluded or missing regions. Experiments on the ShapeNet-ViPC dataset show that PARC achieves highly competitive performance. Code is available at https://github.com/caiyujiaocyj/PARC.
dc.description.number: 7
dc.description.sectionheaders: Creating and Processing Point Clouds
dc.description.seriesinformation: Computer Graphics Forum
dc.description.volume: 44
dc.identifier.doi: 10.1111/cgf.70266
dc.identifier.issn: 1467-8659
dc.identifier.pages: 10 pages
dc.identifier.uri: https://doi.org/10.1111/cgf.70266
dc.identifier.uri: https://diglib.eg.org/handle/10.1111/cgf70266
dc.publisher: The Eurographics Association and John Wiley & Sons Ltd.
dc.subject: CCS Concepts: Computer vision → Reconstruction; Machine learning → Neural networks; Information systems → Multimedia information systems; Multimedia and multimodal retrieval
dc.title: PARC: A Two-Stage Multi-Modal Framework for Point Cloud Completion
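Editorial note: the dual-branch design described in the abstract (independent point-cloud and image encoders, MoE fusion, alignment loss) can be illustrated with a minimal PyTorch sketch. This is not the authors' implementation: the encoder designs, feature dimension (256), number of experts (4), and the cosine-based alignment loss below are all assumptions chosen only to show the overall wiring; the actual PARC decoder and training losses are described in the paper.

# Illustrative sketch only; module names, dimensions, and the alignment loss
# are assumptions, not the published PARC architecture.
import torch
import torch.nn as nn
import torch.nn.functional as F

class PointEncoder(nn.Module):
    """Toy per-point MLP followed by max pooling (PointNet-style global feature)."""
    def __init__(self, dim=256):
        super().__init__()
        self.mlp = nn.Sequential(nn.Linear(3, 128), nn.ReLU(), nn.Linear(128, dim))

    def forward(self, pts):                      # pts: (B, N, 3)
        return self.mlp(pts).max(dim=1).values   # (B, dim)

class ImageEncoder(nn.Module):
    """Toy CNN producing one global feature per input image."""
    def __init__(self, dim=256):
        super().__init__()
        self.conv = nn.Sequential(
            nn.Conv2d(3, 32, 3, stride=2, padding=1), nn.ReLU(),
            nn.Conv2d(32, 64, 3, stride=2, padding=1), nn.ReLU(),
            nn.AdaptiveAvgPool2d(1))
        self.proj = nn.Linear(64, dim)

    def forward(self, img):                      # img: (B, 3, H, W)
        return self.proj(self.conv(img).flatten(1))  # (B, dim)

class MoEFusion(nn.Module):
    """Soft mixture-of-experts over the concatenated point/image features."""
    def __init__(self, dim=256, n_experts=4):
        super().__init__()
        self.gate = nn.Linear(2 * dim, n_experts)
        self.experts = nn.ModuleList([
            nn.Sequential(nn.Linear(2 * dim, dim), nn.ReLU(), nn.Linear(dim, dim))
            for _ in range(n_experts)])

    def forward(self, f_pts, f_img):
        x = torch.cat([f_pts, f_img], dim=-1)                     # (B, 2*dim)
        w = F.softmax(self.gate(x), dim=-1)                       # (B, n_experts)
        outs = torch.stack([e(x) for e in self.experts], dim=1)   # (B, n_experts, dim)
        return (w.unsqueeze(-1) * outs).sum(dim=1)                # (B, dim)

def alignment_loss(f_pts, f_img):
    """Pull the two modality features together (cosine form is an assumption)."""
    return 1.0 - F.cosine_similarity(f_pts, f_img, dim=-1).mean()

if __name__ == "__main__":
    # Dummy data; in a real pipeline the fused feature would condition a completion decoder.
    pts, img = torch.randn(2, 1024, 3), torch.randn(2, 3, 128, 128)
    pe, ie, fuse = PointEncoder(), ImageEncoder(), MoEFusion()
    f_p, f_i = pe(pts), ie(img)
    fused = fuse(f_p, f_i)                       # (2, 256)
    print(fused.shape, alignment_loss(f_p, f_i).item())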
Files
cgf70266.pdf (1.7 MB, Adobe Portable Document Format)