PF-UCDR: A Local-Aware RGB-Phase Fusion Network with Adaptive Prompts for Universal Cross-Domain Retrieval

Date
2025
Publisher
The Eurographics Association
Abstract
Universal Cross-Domain Retrieval (UCDR) aims to match semantically related images across domains and categories not seen during training. While vision-language pre-trained models offer strong global alignment, we are inspired by the observation that local structures, such as shapes, contours, and textures, often remain stable across domains, and thus propose to model them explicitly at the patch level. We present PF-UCDR, a framework built upon frozen vision-language backbones that performs patch-wise fusion of RGB and phase representations. Central to our design is a Fusing Vision Encoder, which applies masked cross-attention to spatially aligned RGB and phase patches, enabling fine-grained integration of complementary appearance and structural cues. Additionally, we incorporate adaptive visual prompts that condition image encoding based on domain and class context. Local and global fusion modules aggregate these enriched features, and a two-stage training strategy progressively optimizes alignment and retrieval objectives. Experiments on standard UCDR benchmarks demonstrate that PF-UCDR significantly outperforms existing methods, validating the effectiveness of structure-aware local fusion grounded in multimodal pretraining. Our code is publicly available at https://github.com/djzgroup/PF-UCDR.
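
The abstract does not say how the phase representation is computed or how patches are formed. As a minimal sketch only, assuming a Fourier phase-only reconstruction (a common way to isolate structural cues such as contours) and non-overlapping square patches, spatially aligned RGB and phase patches could be produced as follows. The function names `phase_only` and `to_patches`, and the single-channel input, are hypothetical choices for illustration and are not taken from the PF-UCDR code.

```python
import numpy as np


def phase_only(image: np.ndarray) -> np.ndarray:
    """Phase-only reconstruction of a 2D single-channel image (assumed technique)."""
    spectrum = np.fft.fft2(image)
    # Keep the phase and discard the amplitude by forcing every magnitude to 1.
    unit_spectrum = np.exp(1j * np.angle(spectrum))
    # The input is real, so the inverse transform is real up to rounding error.
    return np.fft.ifft2(unit_spectrum).real


def to_patches(image: np.ndarray, patch: int) -> np.ndarray:
    """Split an (H, W) image into non-overlapping (patch, patch) tiles, row-major."""
    h, w = image.shape
    if h % patch or w % patch:
        raise ValueError("image size must be divisible by the patch size")
    tiles = image.reshape(h // patch, patch, w // patch, patch).swapaxes(1, 2)
    return tiles.reshape(-1, patch, patch)


# Hypothetical usage: tile i of the appearance image and tile i of the phase
# image cover the same spatial region, so they can be fused patch by patch.
rng = np.random.default_rng(0)
channel = rng.random((8, 8))
appearance_patches = to_patches(channel, 4)
phase_patches = to_patches(phase_only(channel), 4)
```

Because the amplitude is discarded, the phase image is unchanged when the input is multiplied by a positive constant, which is one reason such a representation is often treated as more stable under appearance changes than raw pixel values.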
Description

CCS Concepts: Computing methodologies → Computer vision tasks; Visual content-based indexing and retrieval; Image representations

        
@inproceedings{10.2312:pg.20251279,
  booktitle = {Pacific Graphics Conference Papers, Posters, and Demos},
  editor = {Christie, Marc and Han, Ping-Hsuan and Lin, Shih-Syun and Pietroni, Nico and Schneider, Teseo and Tsai, Hsin-Ruey and Wang, Yu-Shuen and Zhang, Eugene},
  title = {{PF-UCDR: A Local-Aware RGB-Phase Fusion Network with Adaptive Prompts for Universal Cross-Domain Retrieval}},
  author = {Wu, Yiqi and Hu, Ronglei and Wu, Huachao and He, Fazhi and Zhang, Dejun},
  year = {2025},
  publisher = {The Eurographics Association},
  ISBN = {978-3-03868-295-0},
  DOI = {10.2312/pg.20251279}
}