PG2025 Conference Papers, Posters, and Demos
Browsing PG2025 Conference Papers, Posters, and Demos by Title
Now showing 1 - 20 of 61
Item: 3D Curve Development with Crossing and Twisting from 2D Drawings (The Eurographics Association, 2025)
Setiadi, Aurick Daniel Franciskus; Lean, Jeng Wen Joshua; Kao, Hao-Che; Hung, Shih-Hsuan; Christie, Marc; Han, Ping-Hsuan; Lin, Shih-Syun; Pietroni, Nico; Schneider, Teseo; Tsai, Hsin-Ruey; Wang, Yu-Shuen; Zhang, Eugene
Designing 3D curves with specified crossings and twistings often requires tedious view adjustments. We present a method that develops 3D curves from 2D drawings with controlled crossings and twistings. We introduce a two-strand 2D diagram that lets users sketch with explicit crossing and twisting assignments. The system extracts feature points from the 2D diagram and uses them as 3D control points. It assigns the heights and over/under relationships of the control points via an optimization and then generates twisted 3D curves using B-splines. An interactive interface links the 2D diagram to the evolving 3D curves, enabling real-time iteration. We validate our method on diverse sketches, compare it with traditional 3D curve construction, and demonstrate its utility for elastic wire art via physics-based animation.
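As a minimal illustrative sketch of the final step described in the entry above (not the authors' system), the following evaluates a clamped cubic B-spline through a handful of hypothetical 3D control points, whose z-values stand in for the optimized over/under heights at crossings:

    import numpy as np
    from scipy.interpolate import BSpline

    # Hypothetical control points extracted from a 2D diagram; the z values
    # stand in for optimized over/under heights at the crossings.
    ctrl = np.array([
        [0.0, 0.0,  0.0],
        [1.0, 0.5,  0.3],   # strand passes "over"  -> positive height
        [2.0, 0.0, -0.3],   # strand passes "under" -> negative height
        [3.0, 0.5,  0.3],
        [4.0, 0.0,  0.0],
    ])

    k = 3                                   # cubic
    n = len(ctrl)
    # Clamped knot vector so the curve starts and ends at the end control points.
    knots = np.concatenate(([0.0] * k, np.linspace(0.0, 1.0, n - k + 1), [1.0] * k))
    curve = BSpline(knots, ctrl, k)

    t = np.linspace(0.0, 1.0, 200)
    points = curve(t)                       # (200, 3) points along the 3D curve

Sampling the spline densely, as here, is also how such curves are typically polygonized for display or physics.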
Item: An Adaptive Particle Fission-Fusion Approach for Dual-Particle SPH Fluid (The Eurographics Association, 2025)
Liu, Shusen; Guo, Yuzhong; Qiao, Ying; He, Xiaowei; Christie, Marc; Han, Ping-Hsuan; Lin, Shih-Syun; Pietroni, Nico; Schneider, Teseo; Tsai, Hsin-Ruey; Wang, Yu-Shuen; Zhang, Eugene
Smoothed Particle Hydrodynamics (SPH) is a classical and popular method for fluid simulation, yet it is inherently susceptible to instabilities under tension or compression, which leads to significant visual artifacts. To overcome this limitation, we propose an adaptive particle fission-fusion approach within the Dual-particle SPH framework. Specifically, in tension-dominant regions (e.g., fluid splashing), the velocity and pressure calculation points are decoupled to enhance tension stability, while in compression-dominant regions (e.g., fluid interiors), the velocity and pressure points are colocated to preserve compression stability. This adaptive configuration, together with modifications to the Dual-particle projection solver, allows for a unified treatment of fluid behavior across different stress regimes. Additionally, owing to the reduced number of virtual particles and an optimized solver initialization, the proposed method achieves significant performance improvements over the original Dual-particle SPH method.

Item: Animating Multi-Vehicle Interactions in Traffic Conflict Zones Using Operational Plans (The Eurographics Association, 2025)
Chang, Feng-Jui; Wong, Sai-Keung; Huang, Bo-Rui; Lin, Wen-Chieh; Christie, Marc; Han, Ping-Hsuan; Lin, Shih-Syun; Pietroni, Nico; Schneider, Teseo; Tsai, Hsin-Ruey; Wang, Yu-Shuen; Zhang, Eugene
This paper introduces an agent-based method for generating animations of intricate vehicle interactions by regulating behaviors in conflict zones on non-signalized road segments. As vehicles move along their paths, they create sweeping regions representing the areas they may occupy. The method assigns operation plans to vehicles, regulating their crossing and yielding strategies within intersecting or merging conflict zones. This approach enables various vehicle interactions, combining basic actions such as acceleration, deceleration, keeping speed, and stopping. Experimental results demonstrate that our method generates plausible interaction behaviors in diverse road structures, including intersections, Y-junctions, and midblocks. This method could be beneficial for applications in traffic scenario planning, self-driving vehicles, driving training, and education.

Item: Animating Vehicles Risk-Aware Interaction with Pedestrians Using Deep Reinforcement Learning (The Eurographics Association, 2025)
Tsai, Hao-Ming; Wong, Sai-Keung; Christie, Marc; Han, Ping-Hsuan; Lin, Shih-Syun; Pietroni, Nico; Schneider, Teseo; Tsai, Hsin-Ruey; Wang, Yu-Shuen; Zhang, Eugene
This paper introduces a deep reinforcement learning-based system for ego vehicle control, enabling interaction with dynamic objects such as pedestrians and animals. These objects display varied crossing behaviors, including sudden stops and directional shifts. The system uses a perception module to identify road structures, key pedestrians, inner wheel difference zones, and object movements. This allows the vehicle to make context-aware decisions, such as yielding, turning, or maintaining speed. The training process includes reward terms for speed, time, time-to-collision, and cornering to refine policy learning. Experiments show that ego vehicles can adjust their behavior, such as decelerating or yielding, to avoid collisions. Ablation studies highlight the importance of specific reward terms and state components. Animation results show that ego vehicles can safely interact with pedestrians or animals that exhibit sudden acceleration, mid-crossing directional changes, and abrupt stops.

Item: Attention-Guided Multi-scale Neural Dual Contouring (The Eurographics Association, 2025)
Wu, Fuli; Hu, Chaoran; Li, Wenxuan; Hao, Pengyi; Christie, Marc; Han, Ping-Hsuan; Lin, Shih-Syun; Pietroni, Nico; Schneider, Teseo; Tsai, Hsin-Ruey; Wang, Yu-Shuen; Zhang, Eugene
Reconstructing high-quality meshes from binary voxel data is a fundamental task in computer graphics. However, existing methods struggle with low information density and strong discreteness, making it difficult to capture complex geometry and long-range boundary features, often leading to jagged surfaces and loss of sharp details. We propose an Attention-Guided Multi-scale Neural Dual Contouring (AGNDC) method to address this challenge. AGNDC refines surface reconstruction through a multi-scale framework, using a hybrid feature extractor that combines global attention and dynamic snake convolution to enhance perception of long-range and high-curvature features. A dynamic feature fusion module aligns multi-scale predictions to improve local detail continuity, while a geometric postprocessing module further refines mesh boundaries and suppresses artifacts. Experiments on the ABC dataset demonstrate the superior performance of AGNDC in both visual and quantitative metrics. It achieves a Chamfer Distance (CD×10⁵) of 9.013 and an F-score of 0.440, significantly reducing jaggedness and improving surface smoothness.
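The AGNDC entry above reports Chamfer Distance and F-score. A small self-contained sketch of how these two reconstruction metrics are commonly computed on point sets sampled from the predicted and ground-truth surfaces (the threshold tau is an assumed value, not the paper's setting):

    import numpy as np
    from scipy.spatial import cKDTree

    def chamfer_and_fscore(pred, gt, tau=0.01):
        """Symmetric Chamfer Distance and F-score between two (N, 3) point sets."""
        d_pg = cKDTree(gt).query(pred)[0]    # pred -> gt nearest-neighbor distances
        d_gp = cKDTree(pred).query(gt)[0]    # gt -> pred nearest-neighbor distances
        chamfer = (d_pg ** 2).mean() + (d_gp ** 2).mean()
        precision = (d_pg < tau).mean()
        recall = (d_gp < tau).mean()
        fscore = 2 * precision * recall / (precision + recall + 1e-9)
        return chamfer, fscore

    pred = np.random.rand(5000, 3)           # stand-ins for surface samples
    gt = np.random.rand(5000, 3)
    print(chamfer_and_fscore(pred, gt))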
Item: Avatar Animations and Audio Fillers for Managing Response Delays (The Eurographics Association, 2025)
Singaravelan, Gopi Krishnan; Lay, Zhi Lynn; Han, Ping-Hsuan; Christie, Marc; Han, Ping-Hsuan; Lin, Shih-Syun; Pietroni, Nico; Schneider, Teseo; Tsai, Hsin-Ruey; Wang, Yu-Shuen; Zhang, Eugene
This study presents techniques for managing response delays in avatars with large language models (LLMs) to enhance user interaction. While existing avatar-based LLMs focus on human-like conversational abilities, they often overlook the impact of response delays on user experience. Our system strategically reframes these delays as opportunities to enhance the perceived humanness of the avatar by incorporating emotion-based animations, a companion pet, and contextually appropriate audio fillers. Through thoughtful audio-visual design and user interface enhancements during waiting periods, the demo showcases how effective delay management can sustain engagement, foster natural interactions, and turn waiting moments into meaningful elements of the conversational experience.

Item: B2F: End-to-End Body-to-Face Motion Generation with Style Reference (The Eurographics Association, 2025)
Jang, Bokyung; Jung, Eunho; Lee, Yoonsang; Christie, Marc; Han, Ping-Hsuan; Lin, Shih-Syun; Pietroni, Nico; Schneider, Teseo; Tsai, Hsin-Ruey; Wang, Yu-Shuen; Zhang, Eugene
Human motion naturally integrates body movements and facial expressions, forming a unified perception. If a virtual character's facial expression does not align well with its body movements, it may weaken the perception of the character as a cohesive whole. Motivated by this, we propose B2F, a model that generates facial motions aligned with body movements. B2F takes a facial style reference as input, generating facial animations that reflect the provided style while maintaining consistency with the associated body motion. To achieve this, B2F learns a disentangled representation of content and style, using alignment and consistency-based objectives. We represent style using discrete latent codes learned via the Gumbel-Softmax trick, enabling diverse expression generation with a structured latent representation. B2F outputs facial motion in the FLAME format, making it compatible with SMPL-X characters, and supports ARKit-style avatars through a dedicated conversion module. Our evaluations show that B2F generates expressive and engaging facial animations that synchronize with body movements and style intent, while mitigating perceptual dissonance from mismatched cues, and generalizing across diverse characters and styles.
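B2F above represents style with discrete latent codes learned via the Gumbel-Softmax trick. A minimal sketch of that general technique, with a hypothetical codebook size and embedding width rather than the paper's configuration:

    import torch
    import torch.nn.functional as F

    logits = torch.randn(8, 32)            # 8 style references, 32-way codebook
    # Soft, differentiable draw for training; hard one-hot with a
    # straight-through gradient when a discrete code is required.
    soft = F.gumbel_softmax(logits, tau=1.0, hard=False)
    hard = F.gumbel_softmax(logits, tau=1.0, hard=True)

    codebook = torch.randn(32, 64)         # hypothetical per-code embeddings
    style = hard @ codebook                # (8, 64) style vectors for a decoder

Lowering the temperature tau sharpens the draw toward one-hot at the cost of noisier gradients.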
Item: Body-Scale-Invariant Motion Embedding for Motion Similarity (The Eurographics Association, 2025)
Du, Xian; Quan, Chuyan; Yu, Ri; Christie, Marc; Han, Ping-Hsuan; Lin, Shih-Syun; Pietroni, Nico; Schneider, Teseo; Tsai, Hsin-Ruey; Wang, Yu-Shuen; Zhang, Eugene
Accurate measurement of motion similarity is crucial for applications in healthcare, rehabilitation, sports analysis, and human-computer interaction. However, existing Human Pose Estimation (HPE) approaches often conflate motion dynamics with anatomical variations, leading to body-scale-dependent similarity assessments. We propose a framework for learning body-scale-invariant motion embeddings directly from RGB videos. Leveraging diverse 3D character animations with varied skeletal proportions, we generate standardized motion data and train the SAME model to capture temporal dynamics independent of body size. Our approach enables robust cross-character motion similarity evaluation. Experimental results show that the method effectively decouples kinematic patterns from structural differences, outperforming scale-sensitive baselines. Key contributions include: (1) a scalable motion data processing pipeline; (2) a learning-based body-scale-invariant embedding method; and (3) validation of motion similarity assessment independent of anatomy.

Item: Breaking the Single-Stage Barrier: Synergistic Data-Model Adaptation at Test-Time for Medical Image Segmentation (The Eurographics Association, 2025)
Zhou, Wenjuan; Chen, Wei; He, Yulin; Wu, Di; Li, Chen; Christie, Marc; Han, Ping-Hsuan; Lin, Shih-Syun; Pietroni, Nico; Schneider, Teseo; Tsai, Hsin-Ruey; Wang, Yu-Shuen; Zhang, Eugene
Domain shift, predominantly caused by variations in medical imaging across different institutions, often leads to a decline in the accuracy of medical image segmentation models. While Test-Time Adaptation (TTA) holds promise to address this issue, existing methods exhibit significant limitations: model adaptation is prone to error accumulation and catastrophic forgetting in continual domain learning, while data adaptation struggles to achieve deep latent alignment due to the inaccessibility of source-domain data. To address these challenges, we propose Synergistic Data-Model Adaptation (SDMA), which innovatively leverages Batch Normalization (BN) layers as a bidirectional bridge to enable a two-stage joint adaptation process. In the data adaptation stage, domain-aware prompts dynamically adjust the BN statistics of incoming test data, achieving low-level distribution alignment in the Fourier space. In the model adaptation stage, we dynamically optimize the BN affine parameters based on strong-weak data augmentation and entropy minimization, enabling adaptation to high-level semantic features. Experiments conducted on five retinal fundus image datasets from various medical institutions demonstrate that our method achieves an average Dice improvement of 1.23% over previous state-of-the-art (SOTA) methods, establishing a new SOTA performance.
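The model-adaptation stage of SDMA above optimizes BN affine parameters under entropy minimization. The sketch below isolates that generic ingredient in the spirit of entropy-minimization TTA methods such as Tent; it is not SDMA's two-stage procedure, and the helper names are illustrative:

    import torch
    import torch.nn as nn

    def configure_bn_tta(model):
        """Freeze all weights except BatchNorm affine parameters; BN layers
        stay in train mode so statistics follow the incoming test batch."""
        params = []
        for m in model.modules():
            if isinstance(m, nn.BatchNorm2d):
                m.train()
                m.requires_grad_(True)
                params += [m.weight, m.bias]
            else:
                for p in m.parameters(recurse=False):
                    p.requires_grad_(False)
        return params

    def entropy_step(model, optimizer, x):
        """One adaptation step: minimize prediction entropy on a test batch."""
        probs = model(x).softmax(dim=1)
        entropy = -(probs * probs.clamp_min(1e-8).log()).sum(dim=1).mean()
        optimizer.zero_grad()
        entropy.backward()
        optimizer.step()
        return entropy.item()

The optimizer would be built over the returned BN parameters only, e.g. torch.optim.Adam(configure_bn_tta(model), lr=1e-3).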
Item: By-Example Synthesis of Vector Textures (The Eurographics Association, 2025)
Palazzolo, Christopher; Kaick, Oliver van; Mould, David; Christie, Marc; Han, Ping-Hsuan; Lin, Shih-Syun; Pietroni, Nico; Schneider, Teseo; Tsai, Hsin-Ruey; Wang, Yu-Shuen; Zhang, Eugene
We propose a new method for synthesizing an arbitrarily sized novel vector texture given a single raster exemplar. In an analysis phase, our method first segments the exemplar to extract primary textons, secondary textons, and a palette of background colors. Then, it clusters the primary textons into categories based on visual similarity, and computes a descriptor to capture each texton's neighborhood and inter-category relationships. In the synthesis phase, our method first constructs a gradient field with a set of control points containing colors from the background palette. Next, it places primary textons based on the descriptors, in order to replicate a texton context similar to the exemplar. The method also places secondary textons to complement the background detail. We compare our method to previous work with a wide range of perceptual-based metrics, and show that we are able to synthesize textures directly in vector format with quality similar to methods based on raster image synthesis.

Item: C2Views: Knowledge-based Colormap Design for Multiple-View Consistency (The Eurographics Association, 2025)
Hou, Yihan; Ye, Yilin; Wang, Liangwei; Qu, Huamin; Zeng, Wei; Christie, Marc; Han, Ping-Hsuan; Lin, Shih-Syun; Pietroni, Nico; Schneider, Teseo; Tsai, Hsin-Ruey; Wang, Yu-Shuen; Zhang, Eugene
Multiple-view (MV) visualization provides a comprehensive and integrated perspective on complex data, establishing itself as an effective method for visual communication and exploratory data analysis. While existing studies have predominantly focused on designing explicit visual linkages and coordinated interactions to facilitate the exploration of MV visualizations, these approaches often demand extra graphical and interactive effort, overlooking the potential of color as an effective channel for encoding data and relationships. Addressing this oversight, we introduce C2Views, a new framework for colormap design that implicitly conveys the relations across views. We begin by structuring the components and their relationships within MVs into a knowledge-based graph specification, wherein colormaps, data, and views are denoted as entities, and the interactions among them are illustrated as relations. Building on this representation, we formulate the design criteria as an optimization problem and employ a genetic algorithm enhanced by Pareto optimality, generating colormaps that balance single-view effectiveness and multiple-view consistency. Our approach is further complemented with an interactive interface for user-guided refinement. We demonstrate the feasibility of C2Views through various colormap design examples for MVs, underscoring its adaptability to diverse data relationships and view layouts. Comparative user studies indicate that our method outperforms the existing approach in facilitating color distinction and enhancing multiple-view consistency, thereby simplifying data exploration processes.

Item: CGS: Continual Gaussian Splatting for Evolving 3D Scene Reconstruction (The Eurographics Association, 2025)
Yang, Shuojin; Chen, Haoxiang; Mu, Taijiang; Christie, Marc; Han, Ping-Hsuan; Lin, Shih-Syun; Pietroni, Nico; Schneider, Teseo; Tsai, Hsin-Ruey; Wang, Yu-Shuen; Zhang, Eugene
3D Gaussian Splatting (3DGS) has gained significant attention for its fast optimization and high-quality rendering capabilities. However, in the context of continual scene reconstruction, optimizing newly observed regions often leads to degradation in previously reconstructed areas due to changes in camera viewpoints. To address this issue, we propose Continual Gaussian Splatting (CGS), an efficient incremental reconstruction method that updates dynamic scenes using only a limited amount of new data while minimizing computational overhead. CGS is composed of three core components. First, we introduce a similarity-based registration algorithm that leverages the strong semantic understanding and translation invariance of pretrained Transformers to identify and align similar regions between new and existing scenes. These regions are then modeled as Gaussian Mixture Models (GMMs) to handle sparsity and outliers in point clouds, ensuring geometric consistency across scenes. Second, we propose Continual Gaussian Optimization (CGO), an importance-aware optimization strategy. By computing the Fisher Information Matrix, we evaluate the significance of each Gaussian point in the old scene and automatically restrict updates to those deemed critical, allowing only non-sensitive components to be adjusted. This ensures the preservation of the original scene while efficiently integrating new content. Finally, to address remaining issues such as geometric inconsistencies, blurring, and ghosting artifacts during optimization, we introduce a series of geometric regularization techniques. These terms guide the optimization toward geometrically coherent 3D structures, ultimately enhancing rendering quality. Extensive experiments demonstrate that CGS effectively mitigates forgetting and significantly improves overall reconstruction fidelity.
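CGS's importance-aware optimization above weighs each Gaussian by Fisher information before restricting updates. A hypothetical sketch of the common diagonal-Fisher recipe (squared gradients accumulated over replayed views); the stand-in loss and the 80% freezing quantile are assumptions, not the paper's choices:

    import torch

    # Hypothetical per-Gaussian parameters of an already-optimized scene.
    means = torch.randn(10000, 3, requires_grad=True)

    # Stand-in for rendering losses on a few replayed views of the old scene;
    # a real system would re-render stored keyframes here.
    def replay_losses(params, n_views=4):
        for _ in range(n_views):
            target = torch.randn_like(params)
            yield ((params - target) ** 2).mean()

    # Diagonal Fisher proxy: accumulated squared gradients of the loss.
    fisher = torch.zeros_like(means)
    for loss in replay_losses(means):
        (grad,) = torch.autograd.grad(loss, means)
        fisher += grad.detach() ** 2

    # Freeze the top 20% most important Gaussians by zeroing their gradients
    # during the optimization of newly observed regions.
    importance = fisher.sum(dim=1)
    frozen = importance > importance.quantile(0.8)
    means.register_hook(lambda g: g.masked_fill(frozen.unsqueeze(1), 0.0))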
Item: ChromaBrain Wall: A Virtual Reality Game Featuring Customized Full-Body Movement for Long-Term Physical and Cognitive Training in Older Adults (The Eurographics Association, 2025)
Wu, Hao; Zhao, Juanjuan; Li, Aoyu; Qiang, Yan; Christie, Marc; Han, Ping-Hsuan; Lin, Shih-Syun; Pietroni, Nico; Schneider, Teseo; Tsai, Hsin-Ruey; Wang, Yu-Shuen; Zhang, Eugene
Aging brings challenges to the daily lives of older adults due to the decline in physical and cognitive functions. Although virtual reality (VR) exercise games can promote the physical and cognitive health of older adults, existing games are not suitable for personalized continuous training for the elderly due to unreasonable cognitive activation patterns, exercise task designs, and difficulty settings. To address this, we developed ChromaBrain Wall, a VR cognitive training exercise game with customized full-body movements, for the long-term exercise and cognitive inhibition training of healthy older adults. We then conducted an 8-month longitudinal user study with 40 older adults aged 65 and above, and the results showed that after the training, the older adults' exercise performance and cognitive inhibition abilities were significantly enhanced, and these benefits lasted for 6 months. Moreover, qualitative feedback indicated that the older adults had a positive attitude towards long-term use of ChromaBrain Wall, which increased their training motivation and compliance. This shows that ChromaBrain Wall has both short-term and long-term effects in enhancing the exercise performance and cognitive inhibition of older adults, providing a new approach to health interventions for the elderly.

Item: CMI-MTL: Cross-Mamba interaction based multi-task learning for medical visual question answering (The Eurographics Association, 2025)
Jin, Qiangguo; Zheng, Xianyao; Cui, Hui; Sun, Changming; Fang, Yuqi; Cong, Cong; Su, Ran; Wei, Leyi; Xuan, Ping; Wang, Junbo; Christie, Marc; Han, Ping-Hsuan; Lin, Shih-Syun; Pietroni, Nico; Schneider, Teseo; Tsai, Hsin-Ruey; Wang, Yu-Shuen; Zhang, Eugene
Medical visual question answering (Med-VQA) is a crucial multimodal task in clinical decision support and telemedicine. Recent self-attention based methods struggle to effectively handle cross-modal semantic alignment between vision and language. Moreover, classification-based methods rely on predefined answer sets; treating the task as simple classification may prevent adaptation to the diversity of free-form answers and overlook their detailed semantic information. To tackle these challenges, we introduce a Cross-Mamba Interaction based Multi-Task Learning (CMI-MTL) framework that learns cross-modal feature representations from images and texts. CMI-MTL comprises three key modules: fine-grained visual-text feature alignment (FVTA), cross-modal interleaved feature representation (CIFR), and free-form answer-enhanced multi-task learning (FFAE). FVTA extracts the most relevant regions in image-text pairs through fine-grained visual-text feature alignment. CIFR captures cross-modal sequential interactions via cross-modal interleaved feature representation. FFAE leverages auxiliary knowledge from open-ended questions through free-form answer-enhanced multi-task learning, improving the model's capability for open-ended Med-VQA. Experimental results show that CMI-MTL outperforms existing state-of-the-art methods on three Med-VQA datasets: VQA-RAD, SLAKE, and OVQA. Furthermore, we conduct additional interpretability experiments to demonstrate its effectiveness. The code is publicly available at https://github.com/BioMedIA-repo/CMI-MTL.
Item: CoSketcher: Collaborative and Iterative Sketch Generation with LLMs under Linguistic and Spatial Control (The Eurographics Association, 2025)
Mei, Liwen; Guan, Manhao; Zheng, Yifan; Zhang, Dongliang; Christie, Marc; Han, Ping-Hsuan; Lin, Shih-Syun; Pietroni, Nico; Schneider, Teseo; Tsai, Hsin-Ruey; Wang, Yu-Shuen; Zhang, Eugene
Sketching serves as both a medium for visualizing ideas and a process for creative iteration. While early neural sketch generation methods rely on category-specific data and lack generalization and iteration capability, recent advances in Large Language Models (LLMs) have opened new possibilities for more flexible and semantically guided sketching. In this work, we present CoSketcher, a controllable and iterative sketch generation system that leverages the prior knowledge and textual reasoning abilities of LLMs to align with the creative iteration process of human sketching. CoSketcher introduces a novel XML-style sketch language that represents stroke-level information in a structured format, enabling the LLM to plan and generate complex sketches under both linguistic and spatial control. The system supports visually appealing sketch construction, including skeleton-contour decomposition for volumetric shapes and layout-aware reasoning for object relationships. Through extensive evaluation, we demonstrate that our method generates expressive sketches across both in-distribution and out-of-distribution categories, while also supporting scene-level composition and controllable iteration. Our method establishes a new paradigm for controllable sketch generation using off-the-shelf LLMs, with broad implications for creative human-AI collaboration.

Item: DiffQN: Differentiable Quasi-Newton Method for Elastodynamics (The Eurographics Association, 2025)
Cai, Youshuai; Li, Chen; Song, Haichuan; Xie, Youchen; Wang, ChangBo; Christie, Marc; Han, Ping-Hsuan; Lin, Shih-Syun; Pietroni, Nico; Schneider, Teseo; Tsai, Hsin-Ruey; Wang, Yu-Shuen; Zhang, Eugene
We propose DiffQN, an efficient differentiable quasi-Newton method for elastodynamics simulation, addressing the challenges of high computational cost and limited material generality in existing differentiable physics frameworks. Our approach employs a per-frame initial Hessian approximation and selectively delays Hessian updates, resulting in improved convergence and faster forward simulation compared to prior methods such as DiffPD. During backpropagation, we further reduce gradient evaluation costs by reusing prefactorized linear system solvers from the forward pass. Unlike previous approaches, our method supports a wide range of hyperelastic materials without restrictions on material energy functions, enabling the simulation of more general physical phenomena. To efficiently handle high-resolution systems with large degrees of freedom, we introduce a subspace optimization strategy that projects both forward simulation and backpropagation into a low-dimensional subspace, significantly improving computational and memory efficiency. Our subspace method can provide effective initial guesses for subsequent full-space optimization. We validate our framework on diverse applications, including system identification, initial state optimization, and facial animation, demonstrating robust performance and achieving speedups of 1.8× to 18.9× over state-of-the-art methods.
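DiffQN above gains speed by delaying Hessian updates and reusing factorizations. A generic sketch of that idea for a smooth objective with a symmetric positive-definite Hessian (grad and hess are user-supplied callables, the refresh interval is arbitrary, and this is not the paper's solver):

    import numpy as np
    from scipy.linalg import cho_factor, cho_solve

    def lazy_newton(x0, grad, hess, iters=20, refresh_every=5, tol=1e-8):
        """Newton-type iteration that refactorizes the Hessian only every few
        steps and reuses the Cholesky factorization in between."""
        x = x0.copy()
        factor = None
        for it in range(iters):
            g = grad(x)
            if np.linalg.norm(g) < tol:
                break
            if factor is None or it % refresh_every == 0:
                factor = cho_factor(hess(x))   # expensive, done rarely
            x = x - cho_solve(factor, g)       # cheap, reuses factorization
        return x

Holding the factorization fixed trades a few extra iterations for far cheaper per-step cost, which is the essence of the delayed-update strategy.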
Item: Distance-Aware Tri-Perspective View for Efficient 3D Perception in Autonomous Driving (The Eurographics Association, 2025)
Tang, Yutao; Zhao, Jigang; Qin, Zhengrui; Qiu, Rui; Zhao, Lingying; Ren, Jie; Chen, Guangxi; Christie, Marc; Han, Ping-Hsuan; Lin, Shih-Syun; Pietroni, Nico; Schneider, Teseo; Tsai, Hsin-Ruey; Wang, Yu-Shuen; Zhang, Eugene
Three-dimensional environmental perception remains a critical bottleneck in autonomous driving, where existing vision-based dense representations face an intractable trade-off between spatial resolution and computational complexity. Current methods, including Bird's Eye View (BEV) and Tri-Perspective View (TPV), apply uniform perception precision across all spatial regions, disregarding the fundamental safety principle that near-field objects demand high-precision detection for collision avoidance while distant objects permit lower initial accuracy. This uniform treatment squanders computational resources and constrains real-time deployment. We introduce Distance-Aware Tri-Perspective View (DA-TPV), a novel framework that allocates computational resources proportional to operational risk. DA-TPV employs a hierarchical dual-plane architecture for each viewing direction: low-resolution planes capture global scene context while high-resolution planes deliver fine-grained perception within safety-critical reaction zones. Through distance-adaptive feature fusion, our method dynamically concentrates processing power where it most directly impacts vehicle safety. Extensive experiments on nuScenes demonstrate that DA-TPV matches or exceeds single high-resolution TPV performance while reducing memory consumption by 26.3% and achieving real-time inference. This work establishes distance-aware perception as a practical paradigm for deploying sophisticated three-dimensional understanding within automotive computational constraints. Code is available at https://github.com/yytang2012/DA-TPVFormer.
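DA-TPV's distance-adaptive fusion above can be pictured as a per-cell blend of a coarse global plane and a fine near-field plane. A minimal sketch under assumed shapes; the linear falloff and the 20-50 m band are illustrative values, not the paper's:

    import torch
    import torch.nn.functional as F

    def fuse_by_distance(feat_low, feat_high, dist, near=20.0, far=50.0):
        """Blend a coarse global plane with a fine near-field plane using a
        weight that decays with distance from the ego vehicle.

        feat_low:  (B, C, H/2, W/2) coarse plane, upsampled to match
        feat_high: (B, C, H, W)     fine plane covering the reaction zone
        dist:      (B, 1, H, W)     per-cell distance to the ego vehicle (m)
        """
        feat_low = F.interpolate(feat_low, size=feat_high.shape[-2:],
                                 mode="bilinear", align_corners=False)
        w = ((far - dist) / (far - near)).clamp(0.0, 1.0)   # 1 near, 0 far
        return w * feat_high + (1.0 - w) * feat_low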
Item: Easy Modeling of Man-Made Shapes in Virtual Reality (The Eurographics Association, 2025)
Tang, Haoyu; Gao, Fancheng; Choo, Kenny Tsu Wei; Bickel, Bernd; Song, Peng; Christie, Marc; Han, Ping-Hsuan; Lin, Shih-Syun; Pietroni, Nico; Schneider, Teseo; Tsai, Hsin-Ruey; Wang, Yu-Shuen; Zhang, Eugene
Virtual Reality (VR) offers a promising platform for modeling man-made shapes by enabling immersive, hands-on interaction with 3D shapes. Existing VR tools require either a complex user interface or post-processing to model fabricable man-made shapes. In this paper, we present a VR tool that enables general users to interactively model man-made shapes for personalized fabrication, simply by using four common hand gestures as the interaction input. This is achieved by proposing an approach that models complex man-made shapes using a small set of geometric operations, and then designing a user interface that intuitively maps four common hand gestures to these operations. In our shape modeling approach, each shape part is modeled as a generalized cylinder with a specific shape type and iteratively assembled in a structure-aware manner to form a fabricable and usable man-made shape. In our user interface, each hand gesture is associated with a specific kind of interaction task and is intelligently utilized for performing the small set of operations to create, edit, and assemble generalized cylinders. A user study demonstrates that our VR tool allows general users to effectively and creatively model a variety of man-made shapes, some of which have been 3D printed to validate their fabricability and usability.

Item: ER-Diff: A Multi-Scale Exposure Residual-Guided Diffusion Model for Image Exposure Correction (The Eurographics Association, 2025)
Chen, TianZhen; Liu, Jie; Ru, Yi; Christie, Marc; Han, Ping-Hsuan; Lin, Shih-Syun; Pietroni, Nico; Schneider, Teseo; Tsai, Hsin-Ruey; Wang, Yu-Shuen; Zhang, Eugene
This paper proposes an Exposure Residual-guided Diffusion Model (ER-Diff) to address the performance limitations of existing image restoration methods in handling non-uniform exposure. Current exposure correction techniques struggle with detail recovery in extremely over- or underexposed regions and with global exposure balancing. While diffusion models offer powerful generative capabilities for image restoration, effectively leveraging exposure information to guide the denoising process remains underexplored. Additionally, content reconstruction fidelity in severely degraded regions is challenging to ensure. To tackle these issues, ER-Diff explicitly constructs exposure residual features to guide the diffusion process. Specifically, we design a multi-scale exposure residual guidance module that first computes the residual between the input image and an ideally exposed reference, then transforms it into hierarchical feature representations via a multi-scale extraction network, and finally integrates these features progressively into the denoising process. This design enhances feature representation in locally distorted exposure areas while maintaining global exposure consistency. By decoupling content reconstruction and exposure correction, our method achieves more natural exposure adjustment with better detail preservation while ensuring content authenticity. Extensive experiments demonstrate that ER-Diff outperforms state-of-the-art exposure correction methods in both quantitative and qualitative evaluations, particularly in complex lighting conditions, effectively balancing detail retention and exposure correction.

Item: Exploring Perceptual Homogenization through a VR-Based AI Narrative (The Eurographics Association, 2025)
Kao, Bing-Chen; Tsai, Tsun-Hung; Christie, Marc; Han, Ping-Hsuan; Lin, Shih-Syun; Pietroni, Nico; Schneider, Teseo; Tsai, Hsin-Ruey; Wang, Yu-Shuen; Zhang, Eugene
This research explores how the drive for cognitive efficiency in Artificial Intelligence (AI) may contribute to the homogenization of sensory experiences. We present Abstract.exe, a Virtual Reality (VR) installation designed as a critical medium for this inquiry. The experience places participants in a detailed virtual forest where their exploration triggers an AI-driven "simplification" of the world.
Visuals, models, and lighting progressively degrade, transforming the 3D scene into abstract 2D color fields. This work attempts to translate the abstract logic of AI-driven summarization into a tangible, immersive experience. This paper outlines the concept and technical implementation in Unreal Engine 5 (UE5), which utilizes a Procedural Content Generation (PCG) framework. Abstract.exe is intended as both an artistic inquiry and a cautionary exploration of how we might preserve experiential richness in an algorithmically influenced world.