Search Results
Item UTOPIC: Uncertainty-aware Overlap Prediction Network for Partial Point Cloud Registration (The Eurographics Association and John Wiley & Sons Ltd., 2022)
Chen, Zhilei; Chen, Honghua; Gong, Lina; Yan, Xuefeng; Wang, Jun; Guo, Yanwen; Qin, Jing; Wei, Mingqiang; Umetani, Nobuyuki; Wojtan, Chris; Vouga, Etienne
High-confidence overlap prediction and accurate correspondences are critical for cutting-edge models to align paired point clouds in a partial-to-partial manner. However, there is inherent uncertainty between the overlapping and non-overlapping regions, which has always been neglected and significantly affects registration performance. Beyond the current wisdom, we propose a novel uncertainty-aware overlap prediction network, dubbed UTOPIC, to tackle the ambiguous overlap prediction problem; to our knowledge, this is the first work to explicitly introduce overlap uncertainty to point cloud registration. Moreover, we induce the feature extractor to implicitly perceive shape knowledge through a completion decoder, and present a geometric relation embedding for the Transformer to obtain transformation-invariant, geometry-aware feature representations. With more reliable overlap scores and more precise dense correspondences, UTOPIC achieves stable and accurate registration results, even for inputs with limited overlapping areas.
Extensive quantitative and qualitative experiments on synthetic and real benchmarks demonstrate the superiority of our approach over state-of-the-art methods.

Item TogetherNet: Bridging Image Restoration and Object Detection Together via Dynamic Enhancement Learning (The Eurographics Association and John Wiley & Sons Ltd., 2022)
Wang, Yongzhen; Yan, Xuefeng; Zhang, Kaiwen; Gong, Lina; Xie, Haoran; Wang, Fu Lee; Wei, Mingqiang; Umetani, Nobuyuki; Wojtan, Chris; Vouga, Etienne
Adverse weather conditions such as haze, rain, and snow often impair the quality of captured images, causing detection networks trained on normal images to generalize poorly in these scenarios. In this paper, we raise an intriguing question: can the combination of image restoration and object detection boost the performance of cutting-edge detectors in adverse weather conditions? To answer it, we propose an effective yet unified detection paradigm, called TogetherNet, that bridges these two subtasks via dynamic enhancement learning to discern objects in adverse weather conditions. Unlike existing efforts that intuitively apply image dehazing/deraining as a pre-processing step, TogetherNet considers a multi-task joint learning problem. Following the joint learning scheme, clean features produced by the restoration network can be shared to learn better object detection in the detection network, thus helping TogetherNet enhance its detection capacity in adverse weather conditions. Besides the joint learning architecture, we design a new Dynamic Transformer Feature Enhancement module to improve the feature extraction and representation capabilities of TogetherNet. Extensive experiments on both synthetic and real-world datasets demonstrate that TogetherNet outperforms state-of-the-art detection approaches by a large margin, both quantitatively and qualitatively.
Source code is available at https://github.com/yz-wang/TogetherNet.

Item SO(3)-Pose: SO(3)-Equivariance Learning for 6D Object Pose Estimation (The Eurographics Association and John Wiley & Sons Ltd., 2022)
Pan, Haoran; Zhou, Jun; Liu, Yuanpeng; Lu, Xuequan; Wang, Weiming; Yan, Xuefeng; Wei, Mingqiang; Umetani, Nobuyuki; Wojtan, Chris; Vouga, Etienne
6D pose estimation of rigid objects from RGB-D images is crucial for object grasping and manipulation in robotics. Although the RGB channels and the depth (D) channel are often complementary, providing appearance and geometry information respectively, it is still non-trivial to fully benefit from the two modalities. We start from a simple yet new observation: when an object rotates, its semantic label is invariant to the pose while its keypoint offset direction varies with the pose. To this end, we present SO(3)-Pose, a new representation learning network that explores SO(3)-equivariant and SO(3)-invariant features from the depth channel for pose estimation. The SO(3)-invariant features facilitate learning more distinctive representations for segmenting objects with similar appearance in the RGB channels. The SO(3)-equivariant features communicate with RGB features to deduce the (missed) geometry for detecting keypoints of an object with a reflective surface from the depth channel. Unlike most existing pose estimation methods, SO(3)-Pose not only implements information communication between the RGB and depth channels, but also naturally absorbs SO(3)-equivariant geometry knowledge from depth images, leading to better appearance and geometry representation learning. Comprehensive experiments show that our method achieves state-of-the-art performance on three benchmarks.
Code is available at https://github.com/phaoran9999/SO3-Pose.

Item Semi-MoreGAN: Semi-supervised Generative Adversarial Network for Mixture of Rain Removal (The Eurographics Association and John Wiley & Sons Ltd., 2022)
Shen, Yiyang; Wang, Yongzhen; Wei, Mingqiang; Chen, Honghua; Xie, Haoran; Cheng, Gary; Wang, Fu Lee; Umetani, Nobuyuki; Wojtan, Chris; Vouga, Etienne
Real-world rain is a mixture of rain streaks and rainy haze. However, current efforts formulate rain streak removal and rainy haze removal as separate models, worsening the loss of image details. This paper attempts to solve the mixture of rain removal problem in a single model by estimating the scene depths of images. To this end, we propose a novel SEMI-supervised Mixture Of rain REmoval Generative Adversarial Network (Semi-MoreGAN). Unlike most existing methods, Semi-MoreGAN is a joint learning paradigm for mixture of rain removal and depth estimation, and it effectively integrates image features with depth information for better rain removal. Furthermore, it leverages unpaired real-world rainy and clean images to bridge the gap between synthetic and real-world rain. Extensive experiments show clear improvements of our approach over twenty representative state-of-the-art methods on both synthetic and real-world rainy images. Source code is available at https://github.com/syy-whu/Semi-MoreGAN.

Item Contrastive Semantic-Guided Image Smoothing Network (The Eurographics Association and John Wiley & Sons Ltd., 2022)
Wang, Jie; Wang, Yongzhen; Feng, Yidan; Gong, Lina; Yan, Xuefeng; Xie, Haoran; Wang, Fu Lee; Wei, Mingqiang; Umetani, Nobuyuki; Wojtan, Chris; Vouga, Etienne
Image smoothing is a fundamental low-level vision task that aims to preserve salient structures of an image while removing insignificant details. Deep learning has been explored in image smoothing to deal with the complex entanglement of semantic structures and trivial details.
However, current methods neglect two important facts in smoothing: 1) naive pixel-level regression supervised by a limited number of high-quality smoothing ground truths could lead to domain shift and cause generalization problems on real-world images; 2) texture appearance is closely related to object semantics, so image smoothing requires awareness of semantic differences to apply adaptive smoothing strengths. To address these issues, we propose a novel Contrastive Semantic-Guided Image Smoothing Network (CSGIS-Net) that combines a contrastive prior and a semantic prior to facilitate robust image smoothing. The supervision signal is augmented by leveraging undesired smoothing effects as negative teachers, and by incorporating segmentation tasks to encourage semantic distinctiveness. To realize the proposed network, we also enrich the original VOC dataset with texture enhancement and smoothing labels, yielding VOC-smooth, which is the first to bridge image smoothing and semantic segmentation. Extensive experiments demonstrate that the proposed CSGIS-Net outperforms state-of-the-art algorithms by a large margin. Code and dataset are available at https://github.com/wangjie6866/CSGIS-Net.

Item GlassNet: Label Decoupling‐based Three‐stream Neural Network for Robust Image Glass Detection (© 2022 Eurographics ‐ The European Association for Computer Graphics and John Wiley & Sons Ltd, 2022)
Zheng, Chengyu; Shi, Ding; Yan, Xuefeng; Liang, Dong; Wei, Mingqiang; Yang, Xin; Guo, Yanwen; Xie, Haoran; Hauser, Helwig and Alliez, Pierre
Most existing object detection methods produce poor glass detection results because transparent glass shares the same appearance as the arbitrary objects behind it in an image.
Unlike conventional deep learning‐based approaches that simply use the object boundary as auxiliary supervision, we exploit label decoupling to decompose the original labelled ground‐truth (GT) map into an interior‐diffusion map and a boundary‐diffusion map. The GT map, in collaboration with the two newly generated maps, breaks the imbalanced distribution of the object boundary, leading to improved glass detection quality. We make three key contributions to solving the transparent glass detection problem: (1) We propose a three‐stream neural network (called GlassNet for short) to fully absorb beneficial features in the three maps. (2) We design a multi‐scale interactive dilation module to explore a wider range of contextual information. (3) We develop an attention‐based boundary‐aware feature Mosaic module to integrate multi‐modal information. Extensive experiments on the benchmark dataset exhibit clear improvements of our method over SOTAs, in terms of both overall glass detection accuracy and boundary clearness.

Item SPCNet: Stepwise Point Cloud Completion Network (The Eurographics Association and John Wiley & Sons Ltd., 2022)
Hu, Fei; Chen, Honghua; Lu, Xuequan; Zhu, Zhe; Wang, Jun; Wang, Weiming; Wang, Fu Lee; Wei, Mingqiang; Umetani, Nobuyuki; Wojtan, Chris; Vouga, Etienne
How would you repair a physical object with large missing regions? You might first recover its global yet coarse shape and then increase its local details step by step. We are motivated to imitate this physical repair procedure to address the point cloud completion task. We propose a novel stepwise point cloud completion network (SPCNet) for various 3D models with large missing regions. SPCNet has a hierarchical bottom-up network architecture.
It fulfills shape completion in an iterative manner, which 1) first infers the global feature of the coarse result; 2) then infers the local feature with the aid of the global feature; and 3) finally infers the detailed result with the help of the local feature and the coarse result. Beyond the wisdom of simulating the physical repair, we design a new cycle loss to enhance the generalization and robustness of SPCNet. Extensive experiments clearly show the superiority of SPCNet over state-of-the-art methods on 3D point clouds with large missing regions. Code is available at https://github.com/1127368546/SPCNet.

Item MODNet: Multi-offset Point Cloud Denoising Network Customized for Multi-scale Patches (The Eurographics Association and John Wiley & Sons Ltd., 2022)
Huang, Anyi; Xie, Qian; Wang, Zhoutao; Lu, Dening; Wei, Mingqiang; Wang, Jun; Umetani, Nobuyuki; Wojtan, Chris; Vouga, Etienne
The intricacy of 3D surfaces often causes cutting-edge point cloud denoising (PCD) models to suffer surface degradation, including remnant noise and wrongly removed geometric details. Although using multi-scale patches to encode the geometry of a point has become the common wisdom in PCD, we find that simple aggregation of extracted multi-scale features cannot adaptively utilize the appropriate scale information according to the geometric information around noisy points. This leads to surface degradation, especially for points close to edges and points on complex curved surfaces. We raise an intriguing question: can employing multi-scale geometric perception information to guide the network in utilizing multi-scale information eliminate the severe surface degradation problem? To answer it, we propose a Multi-offset Denoising Network (MODNet) customized for multi-scale patches. First, we extract the low-level features of patches at three scales with patch feature encoders.
Second, a multi-scale perception module is designed to embed multi-scale geometric information for each scale feature and regress multi-scale weights to guide a multi-offset denoising displacement. Third, a multi-offset decoder regresses three scale offsets, which are guided by the multi-scale weights to predict the final displacement by weighting them adaptively. Experiments demonstrate that our method achieves new state-of-the-art performance on both synthetic and real-scanned datasets. Our code is publicly available at https://github.com/hay-001/MODNet.
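
The adaptive weighting step described in the MODNet abstract can be illustrated with a minimal sketch. It only shows how per-scale offsets might be blended into one displacement; it is not the authors' implementation. The function names, the use of a softmax to normalize the scale weights, and the plain-tuple point representation are all assumptions made for illustration, since the abstract does not state how the weights are normalized or applied.

```python
import math


def softmax(scores):
    """Turn raw per-scale scores into positive weights that sum to 1.

    Softmax is an assumed normalization, not taken from the abstract.
    """
    peak = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def fuse_offsets(offsets, scores):
    """Blend per-scale 3D offsets into a single displacement.

    offsets: list of (dx, dy, dz) tuples, one per patch scale (hypothetical input).
    scores:  list of raw scale scores, one per patch scale (hypothetical input).
    """
    if len(offsets) != len(scores):
        raise ValueError("need exactly one score per offset")
    weights = softmax(scores)
    # Weighted sum of the offsets, computed axis by axis.
    return tuple(
        sum(w * off[axis] for w, off in zip(weights, offsets))
        for axis in range(3)
    )


def denoise_point(point, offsets, scores):
    """Move a noisy point by the fused displacement."""
    displacement = fuse_offsets(offsets, scores)
    return tuple(p + d for p, d in zip(point, displacement))
```

With equal scores the fused displacement is the plain average of the three offsets; when one score dominates, the result approaches that scale's offset, which is the behaviour adaptive weighting is meant to provide near edges or on complex curved surfaces.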