Hybrid Sparse Transformer and Feature Alignment for Efficient Image Completion

Chen, L.; Sun, Hao

dc.contributor.author	Chen, L.	en_US
dc.contributor.author	Sun, Hao	en_US
dc.contributor.editor	Christie, Marc	en_US
dc.contributor.editor	Pietroni, Nico	en_US
dc.contributor.editor	Wang, Yu-Shuen	en_US
dc.date.accessioned	2025-10-07T05:02:43Z
dc.date.available	2025-10-07T05:02:43Z
dc.date.issued	2025
dc.description.abstract	In this paper, we propose an efficient single-stage hybrid architecture for image completion. Existing transformer-based image completion methods often struggle with accurate content restoration, largely due to their ineffective modeling of corrupted channel information and the attention noise introduced by softmax-based mechanisms, which results in blurry textures and distorted structures. Additionally, these methods frequently fail to maintain texture consistency, either relying on imprecise mask sampling or incurring substantial computational costs from complex similarity calculations. To address these limitations, we present two key contributions: a Hybrid Sparse Self-Attention (HSA) module and a Feature Alignment Module (FAM). The HSA module enhances structural recovery by decoupling spatial and channel attention with sparse activation, while the FAM enforces texture consistency by aligning encoder and decoder features via a mask-free, energy-gated mechanism without additional inference cost. On the CelebA-HQ, Places2, and Paris datasets, our method achieves state-of-the-art image completion results as measured by PSNR, SSIM, FID, and LPIPS, with the fastest inference speed among single-stage networks.	en_US
dc.description.number	7
dc.description.sectionheaders	Image Creation & Augmentation
dc.description.seriesinformation	Computer Graphics Forum
dc.description.volume	44
dc.identifier.doi	10.1111/cgf.70255
dc.identifier.issn	1467-8659
dc.identifier.pages	10 pages
dc.identifier.uri	https://doi.org/10.1111/cgf.70255
dc.identifier.uri	https://diglib.eg.org/handle/10.1111/cgf70255
dc.publisher	The Eurographics Association and John Wiley & Sons Ltd.	en_US
dc.subject	CCS Concepts: Computing methodologies → Image processing; Computer vision tasks; Image Completion; Machine learning; Neural networks
dc.subject	Computing methodologies → Image processing
dc.subject	Computer vision tasks
dc.subject	Image Completion
dc.subject	Machine learning
dc.subject	Neural networks
dc.title	Hybrid Sparse Transformer and Feature Alignment for Efficient Image Completion	en_US

Files

Original bundle

Name: cgf70255.pdf
Size: 16.69 MB
Format: Adobe Portable Document Format

Name: paper1288_mm.pdf
Size: 25.89 MB
Format: Adobe Portable Document Format

Collections

44-Issue 7