Search Results

Now showing 1 - 10 of 35
  • Item
    Towards a Neural Graphics Pipeline for Controllable Image Generation
    (The Eurographics Association and John Wiley & Sons Ltd., 2021) Chen, Xuelin; Cohen-Or, Daniel; Chen, Baoquan; Mitra, Niloy J.; Mitra, Niloy and Viola, Ivan
    In this paper, we leverage advances in neural networks to form a neural rendering pipeline for controllable image generation, thereby bypassing the need for detailed modeling in the conventional graphics pipeline. To this end, we present Neural Graphics Pipeline (NGP), a hybrid generative model that brings together neural and traditional image formation models. NGP decomposes the image into a set of interpretable appearance feature maps, uncovering direct control handles for controllable image generation. To form an image, NGP generates coarse 3D models that are fed into neural rendering modules to produce view-specific interpretable 2D maps, which are then composited into the final output image using a traditional image formation model. Our approach offers control over image generation by providing direct handles controlling illumination and camera parameters, in addition to control over shape and appearance variations. The key challenge is to learn these controls through unsupervised training that links generated coarse 3D models with unpaired real images via neural and traditional (e.g., Blinn-Phong) rendering functions, without establishing an explicit correspondence between them. We demonstrate the effectiveness of our approach on controllable image generation of single-object scenes. We evaluate our hybrid modeling framework, compare with neural-only generation methods (namely, DCGAN, LSGAN, WGAN-GP, VON, and SRNs), report improvement in FID scores against real images, and demonstrate that NGP supports direct controls common in traditional forward rendering. Code is available at http://geometry.cs.ucl.ac.uk/projects/2021/ngp.
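
As a side note on the traditional image formation step mentioned above: the abstract names Blinn-Phong as an example of a classical rendering function. The sketch below shows the standard Blinn-Phong model applied to per-pixel normal and albedo maps; the function name, map layout, and parameter defaults are assumptions for illustration, not the NGP implementation.

```python
import numpy as np

def blinn_phong_shade(normals, albedo, light_dir, view_dir,
                      light_color=1.0, shininess=32.0, spec_strength=0.5):
    """Shade per-pixel maps with the classic Blinn-Phong model.

    normals : (H, W, 3) unit surface normals
    albedo  : (H, W, 3) diffuse reflectance
    light_dir, view_dir : 3-vectors pointing towards the light / camera
    """
    l = light_dir / np.linalg.norm(light_dir)
    v = view_dir / np.linalg.norm(view_dir)
    h = (l + v) / np.linalg.norm(l + v)            # half-vector

    n_dot_l = np.clip(normals @ l, 0.0, None)      # (H, W) diffuse term
    n_dot_h = np.clip(normals @ h, 0.0, None)      # (H, W) specular term

    diffuse = albedo * n_dot_l[..., None]
    specular = spec_strength * (n_dot_h ** shininess)[..., None]
    return np.clip(light_color * (diffuse + specular), 0.0, 1.0)
```

Exposing light_dir and shininess as explicit arguments mirrors the kind of direct illumination handles the abstract describes, though in NGP the corresponding maps are produced by learned modules.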
  • Item
    Interactive Facades - Analysis and Synthesis of Semi-Regular Facades
    (The Eurographics Association and Blackwell Publishing Ltd., 2013) AlHalawani, Sawsan; Yang, Yong-Liang; Liu, Han; Mitra, Niloy J.; I. Navazo, P. Poulin
    Urban facades regularly contain interesting variations due to allowed deformations of repeated elements (e.g., windows in different open or closed positions), posing challenges to state-of-the-art facade analysis algorithms. We propose a semi-automatic framework to recover both the repetition patterns of the elements and their individual deformation parameters to produce a factored facade representation. Such a representation enables a range of applications including interactive facade images, improved multi-view stereo reconstruction, facade-level change detection, and novel image editing possibilities.
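
The abstract does not spell out how element repetitions are detected, and the actual framework is semi-automatic. Purely to illustrate the general idea of finding repeated facade elements, here is a toy estimate of the horizontal repetition period from the autocorrelation of a column-summed intensity profile; this is an assumption-laden stand-in, not the paper's algorithm.

```python
import numpy as np

def estimate_horizontal_period(gray, min_period=8):
    """Estimate the dominant horizontal repetition period (in pixels) of a
    grayscale facade image via autocorrelation of its column profile.
    A toy illustration only; real facade analysis is far more involved."""
    profile = gray.astype(np.float64).sum(axis=0)        # collapse rows to a 1D profile
    profile -= profile.mean()
    ac = np.correlate(profile, profile, mode="full")[len(profile) - 1:]
    ac[:min_period] = -np.inf                            # ignore trivial small lags
    return int(np.argmax(ac))                            # lag with the strongest self-similarity
```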
  • Item
    Decomposing Single Images for Layered Photo Retouching
    (The Eurographics Association and John Wiley & Sons Ltd., 2017) Innamorati, Carlo; Ritschel, Tobias; Weyrich, Tim; Mitra, Niloy J.; Zwicker, Matthias and Sander, Pedro
    Photographers routinely compose multiple manipulated photos of the same scene into a single image, producing a fidelity difficult to achieve using any individual photo. Alternatively, 3D artists set up rendering systems to produce layered images that isolate individual aspects of the light transport, which are composed into the final result in post-production. Regrettably, these approaches either take considerable time and effort to capture, or remain limited to synthetic scenes. In this paper, we suggest a method to decompose a single image into multiple layers that approximate effects such as shadow, diffuse illumination, albedo, and specular shading. To this end, we extend the idea of intrinsic images along two axes: first, by complementing shading and reflectance with specularity and occlusion, and second, by introducing directional dependence. We do so by training a convolutional neural network (CNN) with synthetic data. Such decompositions can then be manipulated in any off-the-shelf image manipulation software and composited back. We demonstrate the effectiveness of our decomposition on synthetic (i.e., rendered) and real data (i.e., photographs), and use the resulting layers for photo manipulations that are otherwise impossible to perform based on single images. We provide comparisons with state-of-the-art methods and also evaluate the quality of our decompositions via a user study measuring the effectiveness of the resultant photo retouching setup. Supplementary material and code are available for research use at geometry.cs.ucl.ac.uk/projects/2017/layered-retouching.
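
Since the layers are meant to be edited in off-the-shelf software and composited back, a minimal sketch of one plausible recombination step is given below. The compositing equation and layer names are assumptions chosen to match the abstract's terminology (shadow/occlusion, diffuse illumination, albedo, specular shading), not necessarily the exact model used in the paper.

```python
import numpy as np

def composite_layers(albedo, diffuse_shading, occlusion, specular):
    """Recombine decomposed layers into an image.

    All inputs are (H, W, 3) arrays in [0, 1]. The assumed model is
        image = albedo * diffuse_shading * occlusion + specular,
    an intrinsic-image-style composition; retouching then amounts to
    editing one layer and calling this function again.
    """
    return np.clip(albedo * diffuse_shading * occlusion + specular, 0.0, 1.0)

# Example edit: strengthen only the specular layer, leaving the rest untouched.
# retouched = composite_layers(albedo, diffuse_shading, occlusion, 1.2 * specular)
```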
  • Item
    Factored Neural Representation for Scene Understanding
    (The Eurographics Association and John Wiley & Sons Ltd., 2023) Wong, Yu-Shiang; Mitra, Niloy J.; Memari, Pooran; Solomon, Justin
    A long-standing goal in scene understanding is to obtain interpretable and editable representations that can be constructed directly from a raw monocular RGB-D video, without requiring a specialized hardware setup or priors. The problem is significantly more challenging in the presence of multiple moving and/or deforming objects. Traditional methods have approached the setup with a mix of simplifications, scene priors, pretrained templates, or known deformation models. The advent of neural representations, especially neural implicit representations and radiance fields, opens the possibility of end-to-end optimization to collectively capture geometry, appearance, and object motion. However, current approaches produce a global scene encoding, assume multiview capture with limited or no motion in the scenes, and do not facilitate easy manipulation beyond novel view synthesis. In this work, we introduce a factored neural scene representation that can be learned directly from a monocular RGB-D video to produce object-level neural representations with an explicit encoding of object movement (e.g., rigid trajectory) and/or deformations (e.g., non-rigid movement). We evaluate our representation against a set of neural approaches on both synthetic and real data to demonstrate that it is efficient, interpretable, and editable (e.g., changing an object's trajectory). Code and data are available at: http://geometry.cs.ucl.ac.uk/projects/2023/factorednerf/.
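
To make the notion of a "factored" representation concrete, the sketch below shows one way such a scene could be organized in code: a latent code per object plus a per-frame rigid pose, so that editing a trajectory touches only the pose list. All names and fields here are assumptions for illustration; the released code at the project page is the authoritative reference.

```python
from dataclasses import dataclass, field
from typing import Dict, List

import numpy as np


@dataclass
class ObjectTrack:
    """One object: a shape/appearance code plus a rigid pose per frame."""
    latent_code: np.ndarray                                  # learned embedding vector (assumed)
    poses: List[np.ndarray] = field(default_factory=list)    # 4x4 object-to-world transform per frame


@dataclass
class FactoredScene:
    """Scene = background + a set of independently movable objects."""
    background_code: np.ndarray
    objects: Dict[str, ObjectTrack] = field(default_factory=dict)

    def edit_trajectory(self, name: str, new_poses: List[np.ndarray]) -> None:
        """Editing an object's motion only rewrites its pose list, leaving
        appearance codes and all other objects untouched."""
        self.objects[name].poses = list(new_poses)
```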
  • Item
    Dynamic SfM: Detecting Scene Changes from Image Pairs
    (The Eurographics Association and John Wiley & Sons Ltd., 2015) Wang, Tuanfeng Y.; Kohli, Pushmeet; Mitra, Niloy J.; Mirela Ben-Chen and Ligang Liu
    Detecting changes in scenes is important in many scene understanding tasks. In this paper, we pursue this goal from just a pair of image recordings. Specifically, our goal is to infer what the objects are, how they are structured, and how they moved between the images. The problem is challenging as large changes make point-level correspondence establishment difficult, which in turn breaks the assumptions of standard Structure-from-Motion (SfM). We propose a novel algorithm for dynamic SfM wherein we first generate a pool of potential corresponding points by hypothesizing over possible movements, and then use a continuous optimization formulation to obtain a low-complexity solution that best explains the scene recordings, i.e., the input image pair. We test the algorithm on a variety of examples to recover multiple object structures and their changes.
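
The abstract outlines a hypothesize-then-optimize strategy. The toy sketch below captures only that flavour, trading the paper's continuous formulation for a greedy selection with a data term plus a complexity penalty; every function, name, and weight here is an assumption rather than the published method.

```python
import numpy as np

def select_motions(candidate_motions, residual_fn, points, complexity_weight=0.1):
    """Greedy stand-in for sparsity-regularized motion selection.

    candidate_motions : list of 4x4 rigid transforms hypothesized for moved objects
    residual_fn       : residual_fn(motion, points) -> per-point alignment error
    points            : (N, 3) candidate corresponding points
    Keeps a motion only if it lowers data_error + complexity_weight * num_motions.
    """
    assigned_error = residual_fn(np.eye(4), points)     # start from a static-scene explanation
    best, selected = assigned_error.sum(), []
    for motion in candidate_motions:
        err = np.minimum(assigned_error, residual_fn(motion, points))
        cost = err.sum() + complexity_weight * (len(selected) + 1)
        if cost < best:                                  # the motion must pay for its complexity
            selected.append(motion)
            assigned_error, best = err, cost
    return selected
```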
  • Item
    Reforming Shapes for Material-aware Fabrication
    (The Eurographics Association and John Wiley & Sons Ltd., 2015) Yang, Yong-Liang; Wang, Jun; Mitra, Niloy J.; Mirela Ben-Chen and Ligang Liu
    As humans, we regularly associate the shape of an object with the material it is built from. In the context of geometric modeling, however, this inter-relation between form and material is rarely explored. In this work, we propose a novel data-driven reforming (i.e., reshaping) algorithm that adapts an input multi-component model for a target fabrication material. The algorithm adapts both the part geometry and the inter-part topology of the input shape to better align with material-aware fabrication requirements. As output, we produce the reshaped model along with respective part dimensions and inter-part junction specifications. We evaluate our algorithm on a range of man-made models and demonstrate a variety of model reshaping examples, focusing on metal and wooden materials.
  • Item
    Smart Variations: Functional Substructures for Part Compatibility
    (The Eurographics Association and Blackwell Publishing Ltd., 2013) Zheng, Youyi; Cohen-Or, Daniel; Mitra, Niloy J.; I. Navazo, P. Poulin
    As collections of 3D models continue to grow, reusing model parts allows the generation of novel model variations. Naïvely swapping parts across models, however, leads to implausible results, especially when mixing parts across different model families. Hence, the user has to manually ensure that the final model remains functionally valid. We claim that certain symmetric functional arrangements (SFARR-s), which are special arrangements among symmetrically related substructures, bear a close relation to object functions. Accordingly, we propose a purely geometric approach based on such substructures to match, replace, and position triplets of parts to create non-trivial, yet functionally plausible, model variations. We demonstrate that, even starting from a small set of models, such a simple geometric approach can produce a diverse set of non-trivial and plausible model variations.
  • Item
    Repetition Maximization based Texture Rectification
    (The Eurographics Association and John Wiley and Sons Ltd., 2012) Aiger, Dror; Cohen-Or, Daniel; Mitra, Niloy J.; P. Cignoni and T. Ertl
    Many photographs are taken in perspective. Techniques for rectifying the resulting perspective distortions typically rely on the existence of parallel lines in the scene. In scenarios where such parallel lines are hard to automatically extract or manually annotate, the unwarping process remains a challenge. In this paper, we introduce an automatic algorithm for rectifying images containing textures of repeated elements lying on an unknown plane. We unwarp the input by maximizing image self-similarity over the space of homography transformations. We map a set of detected regional descriptors to surfaces in a transformation space, compute the intersection points among triplets of such surfaces, and then use consensus among the projected intersection points to extract the correcting transform. Our algorithm is global, robust, and does not require explicit or accurate detection of similar elements. We evaluate our method on a variety of challenging textures and images. The rectified outputs are directly useful for various tasks including texture synthesis and image completion.
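
The paper's search uses descriptor surfaces and intersection-point consensus rather than brute-force scoring, but the underlying objective is self-similarity after warping. Purely as an illustration of that objective, one could score a candidate rectifying homography as below (OpenCV-based; the fixed test shift and normalized cross-correlation are assumptions, not the paper's measure).

```python
import cv2
import numpy as np

def self_similarity_score(image, homography, shift=16):
    """Score how self-similar the image becomes after warping with `homography`.

    A correctly rectified repeated texture should correlate strongly with a
    translated copy of itself; `shift` is an assumed test offset in pixels.
    """
    h, w = image.shape[:2]
    rectified = cv2.warpPerspective(image, homography, (w, h))
    shifted = np.roll(rectified, shift, axis=1)          # horizontally shifted copy
    a = rectified[:, shift:].astype(np.float64).ravel()
    b = shifted[:, shift:].astype(np.float64).ravel()
    a -= a.mean(); b -= b.mean()
    denom = np.linalg.norm(a) * np.linalg.norm(b) + 1e-8
    return float(a @ b / denom)                          # normalized cross-correlation in [-1, 1]
```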
  • Item
    Interactive Videos: Plausible Video Editing using Sparse Structure Points
    (The Eurographics Association and John Wiley & Sons Ltd., 2016) Chang, Chia-Sheng; Chu, Hung-Kuo; Mitra, Niloy J.; Joaquim Jorge and Ming Lin
    Video remains the method of choice for capturing temporal events. However, without access to the underlying 3D scene models, it remains difficult to make object-level edits in a single video or across multiple videos. While it may be possible to explicitly reconstruct the 3D geometries to facilitate these edits, such a workflow is cumbersome, expensive, and tedious. In this work, we present a much simpler workflow to create plausible editing and mixing of raw video footage using only sparse structure points (SSP) directly recovered from the raw sequences. First, we utilize user scribbles to structure the point representations obtained using structure-from-motion on the input videos. The resultant structure points, even when noisy and sparse, are then used to enable various video edits in 3D, including view perturbation, keyframe animation, object duplication, and transfer across videos. Specifically, we describe how to synthesize object images from new views by adopting a novel image-based rendering technique that uses the SSPs as a proxy for the missing 3D scene information. We propose a structure-preserving image warping on multiple input frames adaptively selected from the object's video, followed by a spatio-temporally coherent image stitching to compose the final object image. Simple planar shadows and depth maps are synthesized for objects to generate plausible video sequences mimicking real-world interactions. We demonstrate our system on a variety of input videos to produce complex edits which are otherwise difficult to achieve.
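
The abstract mentions synthesizing simple planar shadows from the sparse structure points. A textbook way to do this, shown below as an assumed illustration rather than the paper's exact formulation, is to project the proxy points onto the ground plane along rays from a point light.

```python
import numpy as np

def project_onto_plane(points, light_pos, plane_point, plane_normal):
    """Project 3D points onto a plane along rays from a point light.

    points       : (N, 3) proxy geometry (e.g., sparse structure points)
    light_pos    : (3,) light position
    plane_point  : (3,) any point on the shadow-receiving plane
    plane_normal : (3,) plane normal (assumed not perpendicular to the rays)
    Returns (N, 3) shadow positions on the plane.
    """
    n = plane_normal / np.linalg.norm(plane_normal)
    d = points - light_pos                        # ray directions from the light through each point
    t = ((plane_point - light_pos) @ n) / (d @ n) # ray parameter where each ray meets the plane
    return light_pos + t[:, None] * d
```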
  • Item
    Sketch-to-Design: Context-Based Part Assembly
    (The Eurographics Association and Blackwell Publishing Ltd., 2013) Xie, Xiaohua; Xu, Kai; Mitra, Niloy J.; Cohen-Or, Daniel; Gong, Wenyong; Su, Qi; Chen, Baoquan; Holly Rushmeier and Oliver Deussen
    Designing 3D objects from scratch is difficult, especially when the user intent is fuzzy and lacks a clear target form. We facilitate design by providing reference and inspiration from existing model contexts. We rethink model design as navigating through different possible combinations of part assemblies based on a large collection of pre-segmented 3D models. We propose an interactive sketch-to-design system, where the user sketches prominent features of parts to combine. The sketched strokes are analyzed individually and, more importantly, in context with the other parts to generate relevant shape suggestions via a design gallery interface. As a modeling session progresses and more parts get selected, contextual cues become increasingly dominant, and the model quickly converges to a final form. As a key enabler, we use pre-learned part-based contextual information to allow the user to quickly explore different combinations of parts. Our experiments demonstrate the effectiveness of our approach for efficiently designing new variations from existing shape collections.